 Authors : Angaraj Das, Dr. S. R. Nirmala
 Paper ID : IJERTV6IS040210
 Volume & Issue : Volume 06, Issue 04 (April 2017)
 DOI : http://dx.doi.org/10.17577/IJERTV6IS040210
 Published (First Online): 07-04-2017
 ISSN (Online) : 2278-0181
 Publisher Name : IJERT
 License: This work is licensed under a Creative Commons Attribution 4.0 International License
Wavelet and PCA Based ECG Compression
Angaraj Das
Department of ECE, Gauhati University, Guwahati, India
Dr. S. R. Nirmala
Department of ECE, Gauhati University, Guwahati, India
Abstract – ECG is the most important biological signal for the diagnosis of cardiac diseases. In many cases, ECG monitoring devices generate a huge amount of data. Compression of the ECG signal is therefore an important objective of ECG signal processing, for the purpose of efficient storage and transmission. In this paper, a wavelet and PCA based ECG compression technique is proposed. Two statistical parameters, Percent Root Mean Square Difference (PRD) and Compression Ratio (CR), are calculated to evaluate the performance of the proposed method. The database used for testing is taken from MIT-BIH. The results show that, from a clinical point of view, the proposed compression technique provides good performance for the different ECG signals considered.
Index Terms – ECG, Compression, 2D Wavelet, PCA.

I. INTRODUCTION
ECG is considered the most efficient tool for the diagnosis of heart diseases. It is the graphical recording of the heart's electrical activity. The ECG is generated by two processes, depolarization and repolarization of cardiac cells. In the normal state, cardiac cells are electrically polarized. A flow of electric current is generated by the depolarization process. The ECG signal mainly contains five peaks and valleys, namely P, Q, R, S and T, separated by specific time intervals as shown in Fig. 1. The P wave represents atrial depolarization. The QRS complex is produced by ventricular depolarization. The T wave is generated by repolarization of the ventricles [1].
Fig.1: Components of ECG
In many cases, ECG monitoring devices have to monitor the heart condition of a patient for a long duration. In such cases, a large storage capacity is required for the recorded data, and transmitting such a large amount of data over a network is also impractical. The storage requirement increases further with the sampling rate, the sample resolution and the number of leads. Therefore, ECG compression is an essential operation and an important objective of ECG signal processing [2-15].
Generally, compression techniques can be divided into two classes: lossless and lossy. In lossless compression, the original signal can be reconstructed without any loss of information. In lossy compression, only an approximation of the original signal can be reconstructed, with a certain amount of distortion. In the compression of ECG signals, however, it is important that the decompressed signal contains all the clinically important information. Otherwise, the decompressed signal becomes misleading for the diagnosis of the heart condition of a patient. Furthermore, ECG compression techniques can be classified into direct data compression techniques, transformation approaches and parameter extraction techniques.
Principal component analysis (PCA) is a statistical methodology whereby the main linear factors underlying the variation of a given multivariate data series are extracted as distinct vectors. It is a method of dimensionality reduction that sacrifices little accuracy. PCA aims to summarize data with many variables by a smaller set of derived variables, in such a way that the first component has the maximum variance, followed by the second, the third and so on. Therefore, the first few principal components carry most of the information in the data, and the original data can be reconstructed very well even if the last few components are discarded.
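The variance ordering and the effect of discarding the trailing components can be illustrated with a minimal NumPy sketch. The function name `pca_truncate` and the synthetic data are assumptions made for illustration only; they are not part of the proposed method.

```python
import numpy as np

def pca_truncate(data, k):
    """Project the rows of `data` onto the first k principal directions and map back."""
    mean = data.mean(axis=0)
    centered = data - mean
    # Covariance between the variables (columns); eigh returns ascending eigenvalues.
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]            # reorder to descending variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    basis = eigvecs[:, :k]                       # keep the k leading directions
    scores = centered @ basis                    # principal components
    reconstructed = scores @ basis.T + mean
    return eigvals, reconstructed

rng = np.random.default_rng(0)
# Synthetic data: 200 observations of 5 variables driven by 2 hidden factors plus small noise.
factors = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 5))
data = factors @ mixing + 0.01 * rng.normal(size=(200, 5))

eigvals, approx = pca_truncate(data, k=2)
rel_error = float(np.linalg.norm(data - approx) / np.linalg.norm(data))
print("eigenvalues (descending):", np.round(eigvals, 4))
print("relative error keeping 2 of 5 components:", round(rel_error, 4))
```

Because the synthetic data has only two underlying factors, two components are enough to reconstruct it almost exactly; the remaining eigenvalues are close to zero.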
The organization of this work is as follows: in section II a brief review of literature is presented. Proposed compression method is explained in section III. Section IV presents the experimental results. Finally, conclusions are given in section V.

II. LITERATURE REVIEW
In ECG data compression algorithms, the main objective is to achieve a low information rate while preserving the relevant diagnostic information in the reconstructed signal. In recent years, many ECG data compression methods have been proposed. In [2], a multichannel ECG compression technique based on multiscale PCA in the wavelet domain is proposed, together with a new PC selection method based on the average fractional energy contribution of the eigenvalues of a data matrix. In this method, compression is achieved by uniform quantization and entropy coding of the PCA coefficients. To evaluate the compression result, two statistical parameters are calculated: the PRD of each lead and the wavelet energy-based diagnostic distortion (WEDD). The reconstructed signal quality was found to be very good, with the diagnostic information retained.
The authors in [3] proposed a 2D Discrete Wavelet Transform (DWT) based ECG compression technique. In the preprocessing stage, the QRS complex is detected. Then, a 2D matrix is formed by storing each period as a row of the matrix. This 2D matrix is wavelet transformed, and the resulting wavelet coefficients are segmented into groups and thresholded. Finally, the thresholded coefficients are encoded. This method gives a high compression ratio with relatively low distortion.
In [4], a 2D wavelet based ECG compression method is proposed. First, the 1D ECG signal is segmented and aligned to form a 2D data array. Then the 2D wavelet transform is applied to the constructed array, and a modified vector quantization (VQ) is applied to the resulting wavelet coefficients. The experimental results show a higher compression ratio with clinical features well preserved.
In [5], an ECG compression method based on the beta wavelet with lossless encoding is presented. The method uses run-length encoding together with a modified thresholding. Here, the compression of the signal is improved by wavelet filters based on the beta function and its derivative. The results show the superiority of this technique in terms of compression ratio and signal quality.
The authors in [6] proposed target distortion level (TDL) and target data rate (TDR) wavelet based ECG compression algorithms for real-time applications. Two statistical parameters, PRD and Root Mean Square Error (RMSE), are calculated as quality measures. Different conditions of different ECG signals are considered in order to evaluate data rate variability and reconstructed signal quality. The experimental results show that the TDR algorithm provides a high data rate for real-time applications in communication.
In this paper, a single lead ECG compression method based on wavelet and PCA is proposed. The method introduces a new PC selection technique based on the variance of the original signal and the level of wavelet decomposition.

III. PROPOSED METHOD
This work presents single lead ECG data compression using PCA. The method is based on the 2D DWT. A 2D matrix is formed from the original signal such that each column of the matrix contains approximately one heart beat. The 2D DWT of this matrix yields a set of 2D coefficient matrices, and PCA is applied to each of them. The block diagram of the proposed method is shown in Figure 2.
Figure 2: Block diagram for proposed method
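The first two blocks of the scheme, forming the beat matrix and decomposing it with a multi-level 2D DWT, can be sketched as follows. This is only an illustration under stated assumptions: the R-peak positions are taken as given, each beat is cut as a fixed-length window around its peak, and the Haar wavelet is used purely as a stand-in so that the sketch needs only NumPy. The helper names `build_data_matrix`, `haar_dwt2` and `wavedec2_haar` are hypothetical.

```python
import numpy as np

def build_data_matrix(signal, r_peaks, n_samples):
    """Stack fixed-length windows centred on each R peak as columns (N x M)."""
    half = n_samples // 2
    beats = []
    for p in r_peaks:
        start = p - half
        if start >= 0 and start + n_samples <= len(signal):
            beats.append(signal[start:start + n_samples])
    return np.stack(beats, axis=1)

def haar_dwt2(a):
    """One level of the 2D Haar DWT; both dimensions of `a` must be even."""
    s = np.sqrt(2.0)
    lo = (a[0::2, :] + a[1::2, :]) / s        # low-pass along axis 0
    hi = (a[0::2, :] - a[1::2, :]) / s        # high-pass along axis 0
    cA = (lo[:, 0::2] + lo[:, 1::2]) / s      # low / low
    cV = (lo[:, 0::2] - lo[:, 1::2]) / s      # low along axis 0, high along axis 1
    cH = (hi[:, 0::2] + hi[:, 1::2]) / s      # high along axis 0, low along axis 1
    cD = (hi[:, 0::2] - hi[:, 1::2]) / s      # high / high
    return cA, (cH, cV, cD)

def wavedec2_haar(x, levels):
    """Return [cA_L, (cH_L, cV_L, cD_L), ..., (cH_1, cV_1, cD_1)]."""
    approx, details = x, []
    for _ in range(levels):
        approx, det = haar_dwt2(approx)
        details.append(det)
    return [approx] + details[::-1]

# Synthetic stand-in for an ECG: one unit spike ("R peak") every 64 samples.
signal = np.zeros(1024)
r_peaks = np.arange(32, 1024, 64)
signal[r_peaks] = 1.0

X = build_data_matrix(signal, r_peaks, n_samples=64)
coeffs = wavedec2_haar(X, levels=4)
n_matrices = 1 + sum(len(d) for d in coeffs[1:])
energy = float(np.sum(coeffs[0] ** 2) + sum(np.sum(c ** 2) for d in coeffs[1:] for c in d))
print("data matrix shape:", X.shape)
print("number of coefficient matrices:", n_matrices)
print("energy preserved:", bool(np.isclose(energy, np.sum(X ** 2))))
```

With four levels the decomposition produces 3 x 4 + 1 = 13 coefficient matrices. A different wavelet, with boundary extension, would give sub-band sizes different from those of this Haar sketch.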

Preprocessing
In the preprocessing stage, denoising of the ECG signal is performed. Different types of noise are present in an ECG signal during acquisition. The major ones are high frequency noise, power line interference and baseline wander. High frequency noise is caused by muscle activity and the external environment. Power line interference occurs due to improper grounding of the power line. Baseline wander is a low frequency noise caused by offset voltages in the electrodes, respiration and body movement. Noise removal is thus an essential part of proper analysis [16]. The outcome of the ECG compression process is also influenced by the noise level in the signal: the more noise the signal contains, the lower the compression ratio. Here, a probability based ECG denoising technique is used for noise removal [17]. In this process, the acquired ECG signal is first decomposed using the wavelet transform. Then, the probabilities of the wavelet coefficients are determined at the different levels. The noisy coefficients are thresholded using a different probability based threshold for each level. Finally, the preprocessed ECG signal is obtained by the inverse wavelet transform (IWT) of the thresholded coefficients.
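The decompose, threshold and reconstruct structure of such a denoiser can be sketched as below. This is not the probability based method of [17]: the per-level thresholds are simply passed in as numbers, soft thresholding is assumed, and the Haar wavelet is used as a stand-in so that only NumPy is needed. All function names are hypothetical.

```python
import numpy as np

def haar_dwt(x):
    """One level of the 1D Haar DWT; len(x) must be even."""
    s = np.sqrt(2.0)
    return (x[0::2] + x[1::2]) / s, (x[0::2] - x[1::2]) / s

def haar_idwt(approx, detail):
    """Inverse of haar_dwt."""
    s = np.sqrt(2.0)
    out = np.empty(2 * len(approx))
    out[0::2] = (approx + detail) / s
    out[1::2] = (approx - detail) / s
    return out

def denoise(x, levels, thresholds):
    """Decompose, soft-threshold the detail coefficients of each level, reconstruct.

    len(x) must be divisible by 2**levels; thresholds[0] applies to the finest level.
    """
    approx, details = x, []
    for _ in range(levels):
        approx, d = haar_dwt(approx)
        details.append(d)
    for level in reversed(range(levels)):         # coarsest level first
        d = details[level]
        t = thresholds[level]
        d = np.sign(d) * np.maximum(np.abs(d) - t, 0.0)   # soft thresholding
        approx = haar_idwt(approx, d)
    return approx

def rms(e):
    return float(np.sqrt(np.mean(e ** 2)))

rng = np.random.default_rng(0)
t = np.arange(512)
clean = np.sin(2 * np.pi * t / 128)               # smooth test signal, not an ECG
noisy = clean + 0.2 * rng.normal(size=t.size)
denoised = denoise(noisy, levels=3, thresholds=[0.5, 0.5, 0.5])  # illustrative thresholds

print("RMS error before denoising:", round(rms(noisy - clean), 3))
print("RMS error after denoising: ", round(rms(denoised - clean), 3))
```

In the method of [17] the thresholds would instead be derived from the probabilities of the wavelet coefficients at each level.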

Wavelet PCA based Compression
In this scheme, compression is achieved by applying entropy coding to the selected principal components of each 2D coefficient array obtained from the 2D DWT of the ECG signal. The selection of PCs is based on the variance of the original signal and the level of wavelet decomposition. This compression technique has five major steps. First, the signal is segmented to form a 2D array such that each segment contains approximately one heart beat. In the second step, the 2D DWT is applied, up to 4 decomposition levels, to the 2D array constructed in the first step. In the third step, principal components (PCs) are extracted for each 2D coefficient array. In the next step, PCs are selected for each 2D coefficient matrix. Finally, the selected PCs are compressed by using entropy coding.
Each of the steps is discussed in detail in the following subsections.

1. Generating data matrix:

The signal segment of a heart beat is represented by the column vector
$$y = \begin{bmatrix} x(1) \\ x(2) \\ \vdots \\ x(N) \end{bmatrix}$$
where $N$ is the number of samples of the segment. If $M$ heart beats are considered, then the entire ensemble is compactly represented by the $N \times M$ data matrix
$$X = \begin{bmatrix} y_1 & y_2 & \cdots & y_M \end{bmatrix}$$
Thus $X$ is the data matrix.

2. Wavelet decomposition:

The 2D data matrix $X$ of size $N \times M$ is decomposed by applying the 2D DWT. An $L$-level wavelet decomposition results in the $L$-th level approximation coefficient array $cA_L$ and the $j$-th level detail coefficient arrays in three orientations (horizontal, vertical and diagonal), $cH_j$, $cV_j$ and $cD_j$, where $j = 1, 2, \ldots, L$. This results in $3L+1$ 2D coefficient matrices, arranged as
$$cA_L, (cH_L, cV_L, cD_L), \ldots, (cH_1, cV_1, cD_1).$$

3. Applying PCA on coefficient matrix:

After taking the 2D DWT, PCA is performed on each 2D coefficient matrix. This gives a set of eigenvalues and eigenvectors, which occur in pairs. The eigenvalues and their corresponding eigenvectors are arranged in descending order of eigenvalue. The principal components are selected from the eigenvectors with the larger eigenvalues, and the number of eigenvalues chosen determines the reduction in dimension. The ordered eigenvalues of the approximation (A) and detail (H, V, D) coefficient matrices are:
$$\lambda_{A_1}, \lambda_{A_2}, \lambda_{A_3}, \ldots, \lambda_{A_n}$$
$$\lambda_{H_{j1}}, \lambda_{H_{j2}}, \lambda_{H_{j3}}, \ldots, \lambda_{H_{jn}}$$
$$\lambda_{V_{j1}}, \lambda_{V_{j2}}, \lambda_{V_{j3}}, \ldots, \lambda_{V_{jn}}$$
$$\lambda_{D_{j1}}, \lambda_{D_{j2}}, \lambda_{D_{j3}}, \ldots, \lambda_{D_{jn}}$$
where $j$ is the wavelet scale and $n$ is the total number of eigenvalues for that particular 2D coefficient array.

4. PC selection method:

Here, a new PC selection scheme based on the variance of the original ECG signal is proposed. The PC selection is performed by deriving a different threshold for each coefficient matrix. These thresholds depend on the variance ($\sigma^2$) of the original ECG signal. If the variance is small, the thresholds are large and fewer PCs are selected, resulting in more compression. Conversely, if the variance is large, the thresholds are small and more PCs are selected, resulting in less compression. The PC selection thresholds are defined as:
$$T_{A} = \frac{\sum_{i=1}^{n} \lambda_{A_i}}{\sigma^{2} L} \tag{1}$$
$$T_{H_j} = \frac{\sum_{i=1}^{n} \lambda_{H_{ji}}}{\sigma^{2} L} \tag{2}$$
$$T_{V_j} = \frac{\sum_{i=1}^{n} \lambda_{V_{ji}}}{\sigma^{2} L} \tag{3}$$
$$T_{D_j} = \frac{\sum_{i=1}^{n} \lambda_{D_{ji}}}{\sigma^{2} L} \tag{4}$$
where $T_A$ is the threshold for the approximation coefficient matrix, and $T_{H_j}$, $T_{V_j}$ and $T_{D_j}$ are the thresholds for the horizontal, vertical and diagonal detail matrices at the $j$-th decomposition level. The eigenvalues that are greater than the threshold are selected from each coefficient matrix ($A$, $H_j$, $V_j$, $D_j$), and they decide the number of PCs.
After selection, the PCs are quantized and Huffman encoded. Different quantization resolutions, from 4 to 8 bits, are used for different PCs. The compressed data is obtained by Huffman coding the quantized PCs. The compression ratio (CR) is then calculated as:
$$CR = \frac{\text{Original File Size}}{\text{Compressed File Size}} \tag{5}$$
To reconstruct the signal, Huffman decoding, dequantization and PCA reconstruction are performed. The resulting wavelet coefficients are then passed through the reconstruction filters of the same wavelet. Finally, the PRD between the reconstructed and original ECG signals is calculated using equation (6).
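$$PRD = \sqrt{\frac{\sum_{n=1}^{N} \left(x[n] - \tilde{x}[n]\right)^{2}}{\sum_{n=1}^{N} \left(x[n]\right)^{2}}} \times 100 \tag{6}$$
where $x[n]$ is the original signal and $\tilde{x}[n]$ is the decompressed signal.

Table 1: PC selection for each coefficient matrix of record 105 (Arrhythmia database)

| 2D coefficients matrix | Total extracted PCs | Selected PCs |
|---|---|---|
| cA4 | 16 | 4 |
| cH1 | 14 | 9 |
| cH2 | 15 | 7 |
| cH3 | 16 | 5 |
| cH4 | 16 | 4 |
| cV1 | 14 | 8 |
| cV2 | 15 | 7 |
| cV3 | 16 | 6 |
| cV4 | 16 | 5 |
| cD1 | 14 | 8 |
| cD2 | 15 | 7 |
| cD3 | 16 | 6 |
| cD4 | 16 | 5 |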
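The PC selection rule can be sketched in a few lines. The sketch assumes that each threshold in equations (1) to (4) is the sum of the eigenvalues of the coefficient matrix divided by the product of the signal variance and the number of decomposition levels L; the function names and the eigenvalues below are made up for illustration and are not taken from any record.

```python
def pc_threshold(eigenvalues, variance, levels):
    """Assumed form of the threshold: sum of eigenvalues / (variance * L)."""
    return sum(eigenvalues) / (variance * levels)

def select_pcs(eigenvalues, variance, levels):
    """Return the indices of the eigenvalues that exceed the threshold."""
    threshold = pc_threshold(eigenvalues, variance, levels)
    return [i for i, lam in enumerate(eigenvalues) if lam > threshold]

eigenvalues = [8.0, 4.0, 2.0, 1.0, 0.5, 0.5]     # made-up values in descending order
low_var = select_pcs(eigenvalues, variance=2.0, levels=4)    # threshold 16 / 8 = 2.0
high_var = select_pcs(eigenvalues, variance=4.0, levels=4)   # threshold 16 / 16 = 1.0
print(len(low_var), "PCs kept at variance 2.0;", len(high_var), "PCs kept at variance 4.0")
```

The example reproduces the stated behaviour: a smaller variance gives a larger threshold and fewer selected PCs, and a larger variance gives a smaller threshold and more selected PCs.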


IV. EXPERIMENTAL RESULTS
In this paper, a new 2D DWT and PCA based ECG compression technique is proposed with minimum loss of clinical information in the reconstructed signal. This section presents the experimental results obtained with the proposed compression algorithm. The database used for the evaluation is taken from MIT-BIH [18]. Each file in the database consists of two-lead recordings sampled at 360 Hz with a resolution of 11 bits per sample. From each record, the first 3000 samples of the first lead are considered. Daubechies 7/9 biorthogonal wavelet filters are used for the decomposition, and the same wavelet is used for the reconstruction filters. The original ECG signal is segmented to form a 2D data matrix in which each segment contains approximately one heart beat, and the 2D DWT is then applied to this matrix. In this work, wavelet decomposition up to 4 levels is used, which yields 13 coefficient matrices. PCA is then applied to each of these matrices.
The efficiency of the proposed compression technique is measured by two parameters. The first is the compression ratio (CR), the ratio of the original ECG file size to the compressed file size. The second is the PRD, which measures the distortion between the original and the reconstructed ECG signals.
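The coding stage and the two measures can be illustrated with a small round trip in plain Python: uniform quantization, Huffman coding, decoding, dequantization, and then CR and PRD as in equations (5) and (6). This is a sketch under assumptions: the input values are toy numbers rather than real principal components, the fixed-length size of the quantized values stands in for the original file size, and the side information (code table, offset and step) is not counted. All function names are hypothetical.

```python
import heapq
import math
from collections import Counter

def quantize(values, bits):
    """Uniform quantizer: map floats to integer levels 0 .. 2**bits - 1."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / (2 ** bits - 1) or 1.0    # avoid a zero step for constant input
    return [round((v - lo) / step) for v in values], lo, step

def dequantize(levels, lo, step):
    return [lo + q * step for q in levels]

def huffman_code(symbols):
    """Build a prefix code {symbol: bit string} from the symbol frequencies."""
    freq = Counter(symbols)
    if len(freq) == 1:                            # only one distinct symbol
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tie-breaker, partial code table).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    return heap[0][2]

def encode(symbols, code):
    return "".join(code[s] for s in symbols)

def decode(bits, code):
    inverse = {c: s for s, c in code.items()}
    out, current = [], ""
    for b in bits:
        current += b
        if current in inverse:                    # prefix code: first match is a symbol
            out.append(inverse[current])
            current = ""
    return out

def compression_ratio(original_bits, compressed_bits):
    """Equation (5): original size divided by compressed size."""
    return original_bits / compressed_bits

def prd(original, reconstructed):
    """Equation (6): percent root-mean-square difference."""
    num = sum((x - y) ** 2 for x, y in zip(original, reconstructed))
    den = sum(x ** 2 for x in original)
    return 100.0 * math.sqrt(num / den)

values = [0.0, 0.1, 0.1, 0.1, 0.9, 0.1, 0.0, 0.1]   # toy values, not ECG data
bits = 4
levels, lo, step = quantize(values, bits)
code = huffman_code(levels)
bitstream = encode(levels, code)
restored = dequantize(decode(bitstream, code), lo, step)

cr = compression_ratio(len(levels) * bits, len(bitstream))
distortion = prd(values, restored)
print(f"levels={levels} coded bits={len(bitstream)} CR={cr:.3f} PRD={distortion:.2f}%")
```

Note that equation (6) does not subtract the signal mean, so a constant offset in x[n] enlarges the denominator and lowers the PRD for the same reconstruction error.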
Fig. 3(a) shows the original ECG signal of record 105 from the MIT-BIH Arrhythmia database and Fig. 3(b) shows the decompressed signal. The circled areas in Fig. 3(b) mark distortions observed in the decompressed signal. These distortions occur mainly in regions that carry little diagnostic information. Table 1 presents the principal components obtained for each of the coefficient matrices for record 105 of the Arrhythmia database. The first column lists the coefficient matrices obtained from the 2D DWT, the second column shows the total number of PCs extracted for each coefficient matrix, and the last column gives the number of PCs selected by the proposed PC selection procedure.
Table 2: Average PRD and CR values for the proposed compression method

| Database type | Record no. | PRD (without denoising) | CR (without denoising) | PRD (with denoising) | CR (with denoising) |
|---|---|---|---|---|---|
| Arrhythmia | 100-124, 200-207 | 0.617 | 8.192 | 0.527 | 10.305 |
| PTB Diagnostic | s0010, s0014-s0017, s0021 | 1.612 | 8.152 | 1.690 | 9.149 |
| Noise Stress | 118e00, 118e06, 119e00 | 2.035 | 8.060 | 2.022 | 9.297 |
| Non-Invasive Fetal ECG | ecgca102, ecgca192, ecgca252 | 3.743 | 8.401 | 3.568 | 8.898 |
| Sudden Cardiac Death Holter | 30-32 | 2.983 | 7.202 | 2.382 | 9.113 |
| Average | | 2.198 | 8.001 | 2.037 | 9.352 |
Figure 3: Reconstructed signal by proposed method: (a) Original signal, (b) Decompressed signal (record 105, Arrhythmia database from MIT-BIH)
The results obtained by the proposed method on various databases are given in Table 2. The first and second columns show the database types from MIT-BIH and the records used for testing. The next two columns give the results obtained without applying any denoising to the ECG signals, and the last two columns give the results obtained after denoising. With denoising, the average PRD improves from 2.198 to 2.037, although it rises slightly for the PTB Diagnostic records (1.612 to 1.690), and the CR increases for every database, from 8.001 to 9.352 on average. The proposed compression method is compared with existing data compression methods in Table 3. The proposed method provides an average PRD of 2.037 and an average CR of 9.352. It is observed from Table 3 that [3] and [6] provide a higher CR, but their PRD values are also higher. The proposed method therefore offers a good CR at a low PRD. It is also observed that the clinical information is preserved to a significant level in the reconstructed signals.
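As a rough illustration of what the average CR means for transmission, assume that the original size in equation (5) is counted as raw samples at the stated 360 Hz and 11 bits per sample. This is an assumption made only for this example, since the way the file sizes were measured is not specified here. The raw and compressed bit rates per lead would then be approximately

$$R_{\text{raw}} = 360 \times 11 = 3960 \ \text{bit/s}, \qquad R_{\text{comp}} \approx \frac{3960}{9.352} \approx 423 \ \text{bit/s}.$$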
Table 3: PRD and CR comparison with existing methods

| Methods | Database used | No. of channels | PRD | CR |
|---|---|---|---|---|
| Proposed method | MIT-BIH | 1 | 2.037 | 9.352 |
| L. N. Sharma et al. [2] | CSE Multilead Measurement Library | 12 | 2.090 | 5.980 |
| M. Abo-Zahhad [3] | MIT-BIH | 1 | 2.209 | 24.288 |
| Manikandan et al. [6] | MIT-BIH, Creighton University | 1 | 6.330 | 12.000 |

V. CONCLUSION
This paper presents a single lead ECG data compression method. In the proposed method, the 1D ECG data is first segmented and aligned to form a 2D data array, the 2D DWT of this array is computed, and compression is achieved by entropy coding of the selected principal components of the resulting coefficient matrices. It is observed that the proposed algorithm provides a good compression ratio with good reconstructed signal quality in terms of the clinical information present. Hence, the proposed method can be considered a simple but effective ECG compression method. In some cases, however, clinical components are not preserved correctly. In future, the algorithm can be improved by modifying the PC selection threshold for more effective reconstruction of the signals. Later, it can be extended to multichannel ECG signals.
REFERENCES
[1] A. Gacek, W. Pedrycz, ECG Signal Processing, Classification and Interpretation: A Comprehensive Framework of Computational Intelligence, Springer Science & Business Media, New York, 2011.
[2] L. N. Sharma, S. Dandapat, Multichannel ECG Data Compression Based on Multiscale Principal Component Analysis, IEEE Transactions on Information Technology in Biomedicine, vol. 16, no. 4, pp. 730-736, July 2012.
[3] M. Abo-Zahhad, S. M. Ahmed, A. Zakaria, An Efficient Technique for Compressing ECG Signals Using QRS Detection, Estimation, and 2D DWT Coefficients Thresholding, Modelling and Simulation in Engineering, Hindawi Publishing Corporation, vol. 1, pp. 1-10, 2012.
[4] X. Wang, J. Meng, A 2D ECG Compression Algorithm Based on Wavelet Transform and Vector Quantization, Digital Signal Processing, vol. 18, pp. 179-188, 2008.
[5] R. Kumar, A. Kumar, R. K. Pandey, Beta Wavelet Based ECG Signal Compression Using Lossless Encoding with Modified Thresholding, Computers and Electrical Engineering, vol. 39, pp. 130-140, 2013.
[6] M. S. Manikandan, S. Dandapat, Wavelet Threshold Based TDL and TDR Algorithms for Real-time ECG Signal Compression, Biomedical Signal Processing and Control, Elsevier, vol. 3, pp. 44-46, 2008.
[7] M. Abo-Zahhad, A. F. Al-Ajlouni, S. M. Ahmed, R. J. Schilling, A New Algorithm for the Compression of ECG Signals Based on Mother Wavelet Parameterization and Best-threshold Levels Selection, Digital Signal Processing, vol. 23, pp. 1002-1011, 2013.
[8] B. Huang, Y. Wang, J. Chen, ECG Compression Using the Context Modeling Arithmetic Coding with Dynamic Learning Vector-scalar Quantization, Biomedical Signal Processing and Control, Elsevier, vol. 8, pp. 59-65, 2013.
[9] A. Ibaida, D. Al-Shammary, I. Khalil, Cloud Enabled Fractal Based ECG Compression in Wireless Body Sensor Network, Future Generation Computer Systems, vol. 35, pp. 91-101, 2014.
[10] K. Ranjeet, A. Kumar, R. K. Pandey, ECG Signal Compression Using Optimum Wavelet Filter Bank Based on Kaiser Window, Procedia Engineering, vol. 38, pp. 2889-2902, 2012.
[11] R. Benzid, A. Messaoudi, A. Boussaad, Constrained ECG Compression Algorithm Using the Block-based Discrete Cosine Transform, Digital Signal Processing, vol. 18, pp. 56-64, 2008.
[12] A. S. Lalos, L. Alonso, C. Verikoukis, Model Based Compressed Sensing Reconstruction Algorithms for ECG Telemonitoring in WBANs, Digital Signal Processing, vol. 35, pp. 105-116, 2014.
[13] H. L. Chan, Y. C. Siao, S. W. Chen, S. F. Yu, Wavelet-based ECG Compression by Bit-field Preserving and Running Length Encoding, Computer Methods and Programs in Biomedicine, Elsevier, vol. 90, pp. 1-8, 2008.
[14] A. S. Lalos, L. Alonso, C. Verikoukis, Model Based Compressed Sensing Reconstruction Algorithms for ECG Telemonitoring in WBANs, Digital Signal Processing, Elsevier, vol. 35, pp. 105-116, 2014.
[15] A. Adamo, G. Grossi, R. Lanzarotti, J. Lina, ECG Compression Retaining the Best Natural Basis k-coefficients via Sparse Decomposition, Biomedical Signal Processing and Control, Elsevier, vol. 15, pp. 11-17, 2015.
[16] L. N. Sharma, S. Dandapat, A. Mahanta, Multiscale PCA Based Quality Controlled Denoising of Multichannel ECG Signals, International Journal of Information and Electronics Engineering, vol. 2, no. 2, pp. 107-111, March 2010.
[17] A. Das, S. R. Nirmala, J. P. Medhi, ECG Denoising Based on Probability of Wavelet Coefficients, International Symposium on Advanced Computing and Communication (ISACC), pp. 198-204, September 2015.
[18] MIT-BIH Database. Available online: http://www.physionet.org/physiobank/database/mitdb/.