Multiple-Fault and Degradation Degree Simultaneous Diagnosis Based on Artificial Neural Network and Hidden Semi-Markov Model

DOI : 10.17577/IJERTV3IS051620


Huang Darong, Chu Xiaoyan, Zhao Ling, Tang Jianping

Chongqing Jiaotong University

Institute of Information Science and Engineering, Chongqing, China

Abstract: Based on the theory of Artificial Neural Networks (ANN) and the Hidden Semi-Markov Model (HSMM), a hierarchical diagnosis network (ANN-HSMM) is proposed for the simultaneous diagnosis of multiple faults and their degradation degree. ANN-HSMM consists of several sub-networks and divides a large pattern space into several smaller subspaces, so that each sub-network can be trained on its own subspace while the whole network remains capable of diagnosing multiple faults and their degradation degree simultaneously. The experimental results show that ANN-HSMM can not only recognize multiple faults but also diagnose the degradation degree of the corresponding faults, and that it is suitable for real-time condition monitoring and diagnosis.

Keywords: Multiple-Fault; Degradation Degree; ANN; HSMM


    Fault diagnosis and the identification of the corresponding fault degradation degree are the basis for keeping the critical transmission parts of machinery and equipment operating smoothly and steadily. In [1], an innovative solution is proposed for fault diagnosis and fault-tolerant control of systems with delayed measurements and states; in [2], a fractal dimension calculation method for discrete signals is proposed and applied to extract fractal dimension feature vectors and classify various fault types; in [3], a safety performance evaluation method for fault-prediction technology is studied on the basis of misclassification cost, and the future development trend of fault-prediction technology is discussed.

    Meanwhile, the Artificial Neural Network (ANN) and the Hidden Markov Model (HMM) have been applied to fault diagnosis, studies of the Hidden Semi-Markov Model (HSMM) have followed, and diagnosis accuracy has improved to some extent. Because existing static multiple-fault models cannot meet the needs of increasingly complicated equipment, [4] put forward a dynamic multiple-fault model based on HSMM that simplifies the process: the original dynamic multiple-fault problem is decomposed into several independent sub-problems, and Binary Particle Swarm Optimization (BPSO) is used to solve them. For simultaneous diagnosis of multiple faults in rotating machines, [5] proposed a hierarchical diagnosis network (HDANN) based on artificial neural networks. In [6], the working status of a pump is divided into five categories, each of the four fault categories is subdivided into three health states, and an HMM is then used to recognize the actual working state among the thirteen states, realizing simultaneous diagnosis of a fault and its degradation degree. However, that model recognizes faults well but degradation degree poorly: under the Markov assumption the dwell time of each degradation state follows an exponential distribution, so the actual dwell time of each degraded state cannot be reflected in the evolution of equipment failure. HSMM extends HMM by introducing the dwell time of each degraded state. With its rigorous mathematical structure, HSMM can express the behavioral characteristics of the entire observation sequence completely. However, multiple faults are difficult to recognize with HSMM because of its characteristics and its maximum likelihood criterion. ANN has a sound biological and data basis as well as good generalization ability and fault tolerance, so it can handle multiple-fault diagnosis, but its accuracy in identifying the degraded state of a fault is not high. It is therefore important to study a model for simultaneous diagnosis of multiple faults and degradation degree that combines the advantages of ANN and HSMM.

    This work is supported by National Nature Science Foundation under Grant (NO. 61004118, 61304104) and the China Scholarship Council.

    This paper presents a new fault diagnosis method that combines ANN and HSMM to realize simultaneous diagnosis of multiple faults and their degradation degree.


    1. The Fundamental of ANN

      The Artificial Neural Network (ANN) [7], a mathematical model that abstractly reflects the structure and functions of the human brain, is a complex network formed by interconnecting a large number of neurons; its structure is shown in Fig.1. There are $m$ nodes in the input layer, and the output of each input node equals its input; the hidden layer $i$ has $q$ nodes, $f_1$ denotes its activation function, and $\omega_{ij}$ is the connection weight between the input layer $j$ and the hidden layer $i$; the output layer $k$ contains $l$ nodes, $f_2$ stands for their activation function, and $\omega_{ki}$ is the connection weight between the hidden layer $i$ and the output layer $k$. Firstly, the hidden input, obtained as the weighted sum of the inputs of the input layer, is $p_i = \sum_{j=1}^{m} \omega_{ij} y_j$, $i = 1, 2, \ldots, q$. Secondly, applying the activation function $f_1$ of the hidden layer, the output of the hidden layer is $m_i = f_1\left(\sum_{j=1}^{m} \omega_{ij} y_j - \theta_i\right)$, $i = 1, 2, \ldots, q$. Lastly, the actual output is obtained through the activation function $f_2$ of the output layer.

      Fig.1. The topology for ANN (input layer $j$ with inputs $y$, hidden layer $i$, output layer $k$ with outputs $o$)

      The model is an information system that can simulate the human ability to represent and store knowledge and imitate human reasoning over that knowledge. The objective function, established by comparing the actual output with the ideal output, is used to revise the interconnection weights between neurons through repeated learning until the weights converge within a stable range. The training algorithm of the network, starting from the initial weights connecting the related neurons, adjusts the weights automatically as training patterns are added, until the network reaches satisfactory performance. Common training algorithms include error-correction learning, competitive learning and so on; the DFP algorithm is adopted in this paper.

      The neural network acquires knowledge through its connection structure and the stable weight distribution among the related neurons. Meanwhile, the neural network itself can filter out noise and handle problems in the presence of noise. Therefore the neural network is suitable for online fault detection and diagnosis.

    2. Fundamental of Hidden Semi-Markov

      The Hidden Markov Model [8] is a doubly stochastic process. One process is the state process, a Markov chain that describes the state transfer and is the basic stochastic process; the other is the observation process, which describes the statistical correspondence between every state and its observed values. Because the states cannot be observed directly and their existence and properties can only be perceived through the random observation process, the model is called the Hidden Markov Model (HMM), denoted by $\lambda = (N, M, \pi, A, B)$.

      Here $A = (a_{ij})_{N \times N}$ denotes the state transition probability matrix, with $a_{ij} = P(s_{t+1} = j \mid s_t = i)$, $1 \le i, j \le N$; $B = (b_{ik})_{N \times M}$ is the observation probability matrix, with $b_{ik} = P(O_t = v_k \mid s_t = i)$, $1 \le i \le N$, $1 \le k \le M$; $\pi = (\pi_1, \pi_2, \ldots, \pi_N)$ is the initial probability distribution vector, with $\pi_i = P(s_1 = i)$, $1 \le i \le N$; $N$ is the number of states of the model, the $N$ states are denoted by $H = (H_1, H_2, \ldots, H_N)$, and the state at time $t$ is denoted by $s_t$; $M$ is the number of possible observation values for every state, and the $M$ observation values are denoted by $V = \{v_1, v_2, \ldots, v_M\}$.

      In an HMM, the probability of the model staying in a certain state for a duration $d$ follows the distribution $P_i(d) = a_{ii}^{d-1}(1 - a_{ii})$, so the probability decreases exponentially as the duration increases. In practice, however, the state staying time does not obey this distribution in most instances.

      HSMM [9] is one of the models used to overcome this disadvantage of HMM, that is, the modeling limitation caused by the Markov assumption. In HMM, a state corresponds to only one observation value; in HSMM, a state corresponds to a segment of observation values, which means that the state staying time is introduced and the observations are related to both the current state and its staying time when transferring from the current state to the next. The HSMM topological structure is shown in Fig.2, in which $q_{r-1}$ marks the start point of the $r$-th state and the staying time is $d = q_r - q_{r-1}$.

      Fig.2. HSMM topological structure (macro states $H_1, H_2, \ldots, H_N$; state $H_r$ covers the micro states $s_{q_{r-1}+1}, \ldots, s_{q_r}$ with observations $o_{q_{r-1}+1}, \ldots, o_{q_r}$)

      The properties of an HSMM can be described by the initial state distribution vector $\pi$, the state transition matrix $A$, the state staying time distribution $D$, and the observation probability matrix $B$; an HSMM can therefore be denoted by $\lambda = (N, M, \pi, A, D, B)$. In HSMM, transitions among macro states follow a Markov process but transitions among micro states do not; the model can nevertheless be trained according to the algorithm in [10].
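The contrast between the two duration models can be illustrated numerically. The following minimal sketch evaluates the HMM dwell-time probability $P_i(d) = a_{ii}^{d-1}(1 - a_{ii})$ and compares it with an explicit duration distribution of the kind an HSMM can carry. The self-transition probability 0.8 and the discretized Gaussian duration (mean 6, standard deviation 1.5) are illustrative assumptions, not values from this paper, and the function names are hypothetical.

```python
import math

def hmm_dwell_pmf(a_ii, max_d):
    """Implicit HMM dwell-time pmf: P(d) = a_ii**(d-1) * (1 - a_ii) for d = 1..max_d."""
    return [a_ii ** (d - 1) * (1.0 - a_ii) for d in range(1, max_d + 1)]

def explicit_dwell_pmf(mean, std, max_d):
    """Assumed HSMM duration pmf: a Gaussian sampled at d = 1..max_d and normalized."""
    weights = [math.exp(-0.5 * ((d - mean) / std) ** 2) for d in range(1, max_d + 1)]
    total = sum(weights)
    return [w / total for w in weights]

hmm_pmf = hmm_dwell_pmf(a_ii=0.8, max_d=20)
hsmm_pmf = explicit_dwell_pmf(mean=6.0, std=1.5, max_d=20)

# The HMM pmf always peaks at d = 1 and then decays;
# an explicit duration pmf can place its peak at any duration.
hmm_mode = hmm_pmf.index(max(hmm_pmf)) + 1
hsmm_mode = hsmm_pmf.index(max(hsmm_pmf)) + 1
print("HMM most likely dwell time:", hmm_mode)
print("HSMM most likely dwell time:", hsmm_mode)
```

Whatever value of $a_{ii}$ is chosen, the HMM distribution keeps its maximum at $d = 1$, which is why a degradation state that typically lasts several time steps is better described by an explicit duration distribution.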

    3. Fault Diagnosis Model Based on ANN-HSMM Network

    In the ANN-HSMM network, fault diagnosis is divided into two successive parts, which is equivalent to assigning a large-scale diagnosis task to several sub-networks. The first network uses an ANN model to diagnose multiple faults; based on its results, the second network uses HSMMs to analyze the measured signals again and obtain the degradation degree. This model can identify multiple faults, diagnose the fault degradation degree, and improve the learning efficiency of the network to meet practical online diagnosis requirements. Suppose the measured conditions can be divided into 5 classes $\{F_0, F_1, F_2, F_3, F_4\}$, in which, apart from $F_0$ (normal working condition), each of the other 4 faults can be further divided into three degradation degrees, $\{F_{11}, F_{12}, F_{13}\}$, $\{F_{21}, F_{22}, F_{23}\}$, $\{F_{31}, F_{32}, F_{33}\}$ and $\{F_{41}, F_{42}, F_{43}\}$. The conditions are thus divided into 13 states, and the extracted fault eigenvector is $Y = \{y_1, y_2, \ldots, y_m\}$.

    The first layer is a feed-forward neural network with a single hidden layer; the numbers of input and output nodes are $m$ and 5 (the number of fault classes). Every output node relates to a certain fault: an output value of 1 means that the corresponding fault exists, otherwise it does not. The network training modes are shown in TABLE I.

    TABLE I. The first network training mode

    | Training input | Desired output | Relative fault |
    |---|---|---|
    | $(y_1^{0}, y_2^{0}, \ldots, y_m^{0})$ | 0 0 0 0 0 | $F_0$ (no fault) |
    | $(y_1^{1}, y_2^{1}, \ldots, y_m^{1})$ | 1 1 0 0 0 | $F_1$ |
    | $(y_1^{2}, y_2^{2}, \ldots, y_m^{2})$ | 1 0 1 0 0 | $F_2$ |
    | $(y_1^{3}, y_2^{3}, \ldots, y_m^{3})$ | 1 0 0 1 0 | $F_3$ |
    | $(y_1^{4}, y_2^{4}, \ldots, y_m^{4})$ | 1 0 0 0 1 | $F_4$ |
    | $(y_1^{5}, y_2^{5}, \ldots, y_m^{5})$ | 1 1 1 0 0 | $F_1 F_2$ |
    | $(y_1^{6}, y_2^{6}, \ldots, y_m^{6})$ | 1 1 0 1 0 | $F_1 F_3$ |
    | $(y_1^{7}, y_2^{7}, \ldots, y_m^{7})$ | 1 1 0 0 1 | $F_1 F_4$ |
    | $(y_1^{8}, y_2^{8}, \ldots, y_m^{8})$ | 1 0 1 1 0 | $F_2 F_3$ |
    | $(y_1^{9}, y_2^{9}, \ldots, y_m^{9})$ | 1 0 1 0 1 | $F_2 F_4$ |
    | $(y_1^{10}, y_2^{10}, \ldots, y_m^{10})$ | 1 0 0 1 1 | $F_3 F_4$ |
    | $(y_1^{11}, y_2^{11}, \ldots, y_m^{11})$ | 1 1 1 1 0 | $F_1 F_2 F_3$ |
    | $(y_1^{12}, y_2^{12}, \ldots, y_m^{12})$ | 1 1 1 0 1 | $F_1 F_2 F_4$ |
    | $(y_1^{13}, y_2^{13}, \ldots, y_m^{13})$ | 1 1 0 1 1 | $F_1 F_3 F_4$ |
    | $(y_1^{14}, y_2^{14}, \ldots, y_m^{14})$ | 1 0 1 1 1 | $F_2 F_3 F_4$ |
    | $(y_1^{15}, y_2^{15}, \ldots, y_m^{15})$ | 1 1 1 1 1 | $F_1 F_2 F_3 F_4$ |

    The second layer uses the HSMM network. The output values of the first layer activate the second layer, which reclassifies the measured signals to identify the different degrees of fault severity; the network structure is shown in Fig.3.

    Fig.3. ANN-HSMM network structure (the input vectors feed the first-class network, whose outputs activate the second-class network)

    Every network is trained individually according to its level. For the trained networks, the inputs are the fault feature vectors extracted from the measured object, and the diagnosis is executed step by step: the ANN takes effect first, and the corresponding second-level HSMMs are then activated by the results of the first-class network. The diagnostic rules are as follows:

      • If the first output value of the ANN is 0, no fault exists in the equipment and the diagnosis ends; otherwise go to the second step.

      • Examine the remaining ANN outputs in order, excluding the first one, and put into an initially empty matrix A the serial numbers of all outputs whose value is greater than 0.5; if all of these output values are smaller than 0.5, put into A the serial number of the node whose output value is the maximum.

      • Activate the second network in turn according to the elements of matrix A. In the HSMM library associated with each activated fault, the state whose model gives the maximum output probability is taken as the degradation degree of the equipment for that fault.

      • The final diagnosis result is determined by the outputs of both the first and the second network. The outputs of the first-layer network that are greater than 0.5, or otherwise the maximum output, give the fault classes of the measured object. Activated by these results, the state associated with the model that has the maximum output probability in the HSMM library gives the degradation degree of the corresponding fault.
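The diagnostic rules above can be sketched as a small decision routine. This is a minimal sketch under stated assumptions rather than the authors' implementation: the names `select_faults`, `diagnose` and `hsmm_scores` are hypothetical, the HSMM likelihoods are assumed to be computed elsewhere and passed in, and the first output is tested against the 0.5 threshold instead of exact equality with 0 because a real-valued network output is rarely exactly 0.

```python
def select_faults(ann_output, threshold=0.5):
    """Return the 1-based indices (matching F1..F4) of the faults to pass to the second layer.

    ann_output[0] is the fault-present flag; ann_output[1:] correspond to F1..F4.
    """
    flag, fault_scores = ann_output[0], ann_output[1:]
    # Rule 1: a first output of 0 means no fault (thresholded here, an assumption).
    if flag < threshold:
        return []
    # Rule 2: collect every fault whose output exceeds the threshold ...
    selected = [k + 1 for k, s in enumerate(fault_scores) if s > threshold]
    # ... or fall back to the single largest output if none does.
    if not selected:
        best = max(range(len(fault_scores)), key=lambda k: fault_scores[k])
        selected = [best + 1]
    return selected

def diagnose(ann_output, hsmm_scores):
    """Rule 3: for each selected fault, pick the degradation state with the maximum score.

    hsmm_scores maps a fault index to a list of (log-)likelihoods, one per
    degradation state, produced by that fault's HSMM library.
    """
    result = {}
    for fault in select_faults(ann_output):
        scores = hsmm_scores[fault]
        result[fault] = max(range(len(scores)), key=lambda s: scores[s]) + 1
    return result

# Hypothetical network outputs in which F1 and F3 are flagged.
ann_out = [0.97, 0.81, 0.12, 0.66, 0.08]
scores = {1: [-120.4, -98.7, -131.0], 3: [-88.2, -140.5, -150.9]}
print(diagnose(ann_out, scores))
```

The returned dictionary pairs each detected fault index with the index of its most likely degradation state, which corresponds to the combined first-layer and second-layer result described in the last rule.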

    Because the sub-networks at each level are trained independently, in practice only the sub-networks concerned need to be retrained with measured fault samples. Each specific unit obtained in this way is easier to train and more adaptable.


    1. Acquisition of Training Samples

      The training eigenvector is constructed from the energy ratios of the frequency bands of the vibration signal of the measured equipment, and it is used to separate the fault categories and obtain the fault training modes. The frequency-domain features of the vibration signals of rotating machinery are acquired through the Fourier transform, and a time-domain signal can be represented as the superposition of several sinusoidal signals; if a fault exists, the original time-domain signal is superposed with one or several sinusoidal signals of different frequencies. Let $f(t)$ be the time-domain signal when the unit has two single-frequency faults:

      $$f(t) = f_1(t) + f_2(t) = \sin(2\pi f_1 t + \varphi_1) + \sin(2\pi f_2 t + \varphi_2) \qquad (1)$$

      According to the linearity of the Fourier transform,

      $$F(\omega) = F(f(t)) = F(f_1(t) + f_2(t)) = F_1(\omega) + F_2(\omega) \qquad (2)$$

      A multiple-fault eigenvector can be obtained by superposing single-fault eigenvectors. To preserve the fault characteristics, the double-fault eigenvector is obtained by weighting the two single-fault eigenvectors. Suppose $y_i^1$ and $y_i^2$ are the $i$-th components of the two single-fault eigenvectors; the double-fault eigenvector is then defined as

      $$y_i^{1,2} = \frac{y_i^1}{y_i^1 + y_i^2}\, y_i^1 + \frac{y_i^2}{y_i^1 + y_i^2}\, y_i^2 \qquad (3)$$

      Formula (3) acts as a filtering process that maintains the fault characteristics while restraining the non-fault features. The corresponding components of the multiple-fault feature vectors can be established similarly to (3), which gives the standard training samples of the network. Similarly, the state feature vectors of the various fault degradation degrees can be obtained under the different fault states.

    2. The Training for the Network

      A feed-forward network with a single hidden layer is adopted as the first-class network. The number of input nodes is $m$, the number of output nodes is 5, and the number of hidden nodes is computed as 2 × (number of input nodes) + 1, that is, $2m + 1$. The DFP approach [11] is then adopted to train the first network, and the learning objective function is defined as

      $$J(W) = \frac{1}{2}\sum_{i=1}^{M} E_i^2(W) = \frac{1}{2}\sum_{i=1}^{M} (O_i - d_i)^2 \qquad (4)$$

      where $W$ denotes the weight vector of the first-class network and $E_i$ is the error between the actual output and the ideal output. According to formula (4), the second-order Taylor polynomial around the minimum point can be taken as an approximation of the objective function, from which an estimate of the minimum point is obtained. After obtaining $W_{k+1}$, the connection weights of the neural network are modified according to formulas (5)-(8):

      $$W_{k+1} = W_k - H_k\, E_i(k)\, \nabla E_i(W_k) \,/\, [\ldots]$$

      $$H_{k+1} = [\ldots]^{-1}\left[ H_k - \frac{H_k\, \nabla E(W_k)\, \nabla E^{T}(W_k)\, H_k}{[\ldots] + \nabla E^{T}(W_k)\, H_k\, \nabla E(W_k)} \right]$$

      $$H_1 = I \ \text{(identity matrix)} \qquad (8)$$

      where $\nabla E(W_k)$ is the gradient of $E$ at $W_k$.

      For the training of the HSMMs, second-order HSMMs with continuous Gaussian densities are chosen, and a left-to-right Markov chain, which gives better results and high training speed, is selected. A Markov chain with three states is used in this paper, and the initial probability is $\pi = [1, 0, 0]$. The initial value of $A$ is chosen uniformly and $B$ is initialized randomly. The Baum-Welch algorithm is a hill-climbing algorithm, so the initial values strongly affect whether the best solution is reached. Therefore, the K-means algorithm and a clustering algorithm are used to obtain the initial values, and the Baum-Welch algorithm [12, 13] with several observation sequences is used to train the model to increase robustness. The initial parameters of each HSMM are estimated in this way, and the data obtained above are then used to train the various HSMMs.

    3. The Test Results and the Analysis for the Network

      The trained ANN-HSMM network is tested for simultaneous diagnosis of multiple faults and their degradation degree. The test results are as follows:

      • The diagnosis and classification of the major fault classes is realized by the first-class network, with a diagnostic accuracy of 95%.

      • Activated by the results of the first-class network, the corresponding second-level HSMMs perform the degradation degree diagnosis. Except for the two states $S_{10}$ and $S_{11}$, the states are identified correctly by the hierarchical diagnosis network (ANN-HSMM), and the identification accuracy is 90%.

      For the same diagnosis problem, the diagnostic performance of the HMM and of the ANN-HSMM described above is compared; the comparison results are shown in TABLE II.

      TABLE II. The comparison for the recognition rate (recognition rate per state, e.g. S1, S4, S8, for HMM and ANN-HSMM)

      As can be seen from the above table, both HMM and ANN-HSMM can diagnose and classify the major fault classes, but for degradation degree diagnosis the accuracy of ANN-HSMM is higher than that of HMM; in other words, simultaneous diagnosis of multiple faults and degradation degree can be achieved through the hierarchical diagnosis network (ANN-HSMM).

      The test results reveal that, for the same diagnosis problem, ANN-HSMM is superior to the HMM of reference [6] both in the diagnosis and classification of the major fault classes and in degradation degree diagnosis; the hierarchical diagnosis network (ANN-HSMM) established in this paper is therefore reasonable and practicable.
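The construction of training samples from band energy ratios, and the linearity property in formula (2) on which the superposition of fault features rests, can be checked with a short numerical sketch. It is illustrative only: the sampling rate, the two fault frequencies (5 Hz and 40 Hz), the phases, and the choice of four equal-width bands are assumptions, and the helper names `dft` and `band_energy_ratios` are hypothetical. A plain discrete Fourier transform is used so that no external library is needed.

```python
import cmath
import math

def dft(signal):
    """Plain O(n^2) discrete Fourier transform (no external libraries)."""
    n = len(signal)
    return [sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def band_energy_ratios(spectrum, n_bands):
    """Split the one-sided spectrum into equal-width bands and return each band's energy share."""
    half = spectrum[1:len(spectrum) // 2]      # drop DC, keep positive frequencies
    energy = [abs(c) ** 2 for c in half]
    width = len(energy) // n_bands             # trailing bins that do not fill a band are ignored
    bands = [sum(energy[b * width:(b + 1) * width]) for b in range(n_bands)]
    total = sum(bands)
    return [e / total for e in bands]

n, fs = 128, 128.0                              # assumed sample count and sampling rate (Hz)
t = [i / fs for i in range(n)]
f1 = [math.sin(2 * math.pi * 5.0 * ti + 0.3) for ti in t]    # single-frequency fault 1
f2 = [math.sin(2 * math.pi * 40.0 * ti + 1.1) for ti in t]   # single-frequency fault 2
f = [a + b for a, b in zip(f1, f2)]             # formula (1): superposed signal

F1, F2, F = dft(f1), dft(f2), dft(f)
# Formula (2): the spectrum of the sum equals the sum of the spectra.
max_dev = max(abs(F[k] - (F1[k] + F2[k])) for k in range(n))
print("max deviation from linearity:", max_dev)
print("band energy ratios:", band_energy_ratios(F, n_bands=4))
```

With these assumed values the two fault frequencies fall into different bands, so the energy-ratio vector of the combined signal shows both faults at once, which is the property the multiple-fault eigenvector relies on.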


Based on the theory of Artificial Neural Networks and the Hidden Semi-Markov Model, a hierarchical diagnosis network (ANN-HSMM) is proposed for the simultaneous diagnosis of multiple faults and their degradation degree. The experimental results show that, compared with the fault diagnosis method in [6], ANN-HSMM can not only recognize multiple faults but also diagnose the degradation degree of the corresponding faults with higher diagnostic accuracy, and it is suitable for real-time condition monitoring and diagnosis.


  1. Juan Li, Hong-wei Gao, Peng Zhang and Da-rong Huang. Fault Diagnosis and Optimal Fault-Tolerant Control for Systems with Delayed Measurements and States [J]. International Journal of Control, Automation, and Systems. 2012, 10(1): 150-157.

  2. Zhao Ling, Huang Darong, Song Jun. Fault diagnosis method based on fractal theory and application in wind power systems [J]. Journal of China Ordnance, 2012, 8(3): 167-173.

  3. HUANG Da-rong, SONG Jun, ZHAO Gang. Research on Safety Performance Evaluation Method of Fault-prediction Technology Based on Misclassification Cost [J]. Acta Armamentarii, 2011, 32(10): 1292-1297.

  4. Ge Pengyue, Huang Kaoli, Liu Xiaoqin, Lian Guangyao. Research on Dynamic Multiple Fault Diagnosis Model (DMFD) Based on Hidden semi-Markov Model [J]. Computer Measurement & Control, 2010, 18(6): 1280-1286.

  5. He Yongyong, Zhong Binglin, Huang Ren. Multiple-Fault Simultaneous Diagnosis Based on Artificial Neural Networks for Rotating Machine [J]. JOURNAL OF SOUTHEAST UNIVERSITY, 1996, 26(5): 39-43.

  6. YUE Xia. HMM-based Fault Diagnosis Technology under Complex Conditions [D]. Guang Zhou: South China University of Technology, 2012.

  7. LIU Jing-yan, LI Yu-dong, YANG Xiao-bang. Application of Genetic Neural Network to Gear Fault Diagnosis [J]. 2012, 33(3): 36-39.

  8. Hu Wei, Gao Lei, Fu Li. Research on motor fault detection method based on optimal order hidden Markov model [J]. Chinese Journal of Scientific Instrument, 2013, 34(3): 524-530.

  9. WANG Ning, SUN Shu-dong, CAI Zhi-qiang, LI Shu-min. Method of limitation state recognition for two stage equipment based on HSMM [J]. Application Research of Computers, 2011, 12(28): 4560-4563.

  10. YANG Zhi-bo, DONG Ming. Equipment Fault Diagnosis Using Auto-regression Hidden Markov Model [J]. JOURNAL OF SHANGHAI JIAOTONG UNIVERSITY, 2008, 42(3): 471-479.

  11. Wang Yaonan. Research of Fast Learning Algorithm for Neural Networks Based on DPF Approach [J]. ACTA SIMULATA SYSTEMATICAL SINICA, 1997, 3: 34-39.

  12. G. Georgoulas, M. O. Mustafa, I. P. Tsoumas, et al. Principal Component Analysis of the start-up transient and Hidden Semi-Markov Modeling for broken rotor bar fault diagnosis in asynchronous machines [J]. Expert Systems with Applications, 2013, 40: 7024-7033.

  13. Boutros, T. & Liang, M. Detection and diagnosis of bearing and cutting tool faults using hidden Markov models. Mechanical Systems and Signal Processing [J]. 2011, 25: 2102-2124.
