Diagnosis of Chronic Obstructive Pulmonary Disease from Lung Sounds using Support Vector Machine

According to the World Health Organization, Chronic Obstructive Pulmonary Disease (COPD) is one of the leading causes of death in countries across all income groups. It is a non-communicable respiratory disease characterized by a persistent reduction of airflow to and from the lungs. The leading causes of COPD are tobacco smoking (active and passive), rising air pollution, and occupational exposure to dust and chemicals. Emphysema and chronic bronchitis are the most common conditions that make up COPD. The lung damage caused by COPD cannot be cured, but treatment can relieve symptoms, improve quality of life, and reduce the risk of death. Early diagnosis has therefore been an important area of research in recent years. A detailed Pulmonary Function Test with spirometry can support clinical diagnosis, but it requires the patient to exhale forcefully, which makes it unsuitable for heart patients and patients with certain other conditions. Lung sounds offer a more accessible and patient-friendly way of observing the condition and performing diagnosis. The lung sounds of COPD patients are characterized by wheezes, rhonchi, and crackles. To diagnose the condition, Mel Frequency Cepstral Coefficients (MFCC) are extracted from lung sounds and a Support Vector Machine (SVM) based diagnostic model is developed. Experiments were conducted to optimize the performance of the SVM by analyzing two parameters of feature extraction: the significance of the energy parameter and the window length. A base model was developed using the default MFCC settings discussed in the literature, and the analysis identified the feature parameters that best diagnose lung sounds. The proposed model achieved a classification accuracy of 92.5%, compared with 75.9% for the base model, an improvement of 16.6 percentage points.

Keywords: COPD, MFCC, SVM, wheeze, rhonchi, crackle


I. INTRODUCTION
Non-communicable diseases cause more deaths than all other causes combined. They are driven largely by four behavioral risk factors associated with economic transition, rapid urbanization, and 21st-century lifestyles: tobacco use, unhealthy diet, insufficient physical activity, and the harmful use of alcohol [6]. The main non-communicable diseases are cardiovascular diseases, cancers, diabetes, and chronic lung diseases. Among these, the World Health Organization (WHO) has reported that Chronic Obstructive Pulmonary Disease (COPD) kills more than four million people every year and affects hundreds of millions more [7]. Women and children are particularly affected, especially in low- and middle-income countries, where they are exposed to pollution from solid fuels used for cooking and heating [1]. In 2010, the Global Burden of Disease (GBD) study estimated that COPD was the third leading cause of death, accounting for 2.1% of total adult deaths. Over the next 10 years, global deaths from COPD are projected to increase by more than 30%. In the European Union, approximately 150,000 adults aged 40 years or older die of COPD each year, and total COPD costs are estimated at €141.4 billion annually [8].
II. MACHINE LEARNING PHASE:
The Mel-Frequency Cepstral Coefficients (MFCC) are obtained by applying a Discrete Cosine Transform (DCT) to the real logarithm of the short-term energy spectrum expressed on the Mel frequency scale. They provide a small set of features (about 10 to 20) that describe the overall shape of the spectral envelope and contain information about the rate of change in different spectrum bands. MFCC extraction typically uses a feature vector of 12 to 13 coefficients, because these cepstral features are well suited to separating the effects of source and filter in the signal.

B. Mel-Frequency Cepstral Coefficient Algorithm:
The MFCC function splits the input signal into overlapping segments. The length of each segment is set by 'WindowLength', and the overlap between consecutive segments is set by 'OverlapLength'. For each segment, the function computes the Mel Frequency Cepstral Coefficients and the log energy value [4]. The lung sound recorded with a stethoscope or microphone is the input to the system.
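The segmentation step can be sketched in a few lines of plain Python. This is a minimal illustration, not the MFCC function referred to above: the names `frame_signal` and `ms_to_samples` are hypothetical, trailing samples that do not fill a whole window are simply dropped, and any sampling rate passed in is an assumption because the text does not state one.

```python
def ms_to_samples(duration_ms, sample_rate_hz):
    """Convert a duration in milliseconds to a whole number of samples."""
    return int(round(duration_ms * sample_rate_hz / 1000.0))


def frame_signal(samples, window_length, overlap_length):
    """Split samples into overlapping frames of window_length samples.

    Consecutive frames share overlap_length samples, so each frame starts
    (window_length - overlap_length) samples after the previous one.
    """
    if not 0 <= overlap_length < window_length:
        raise ValueError("overlap_length must be in [0, window_length)")
    hop = window_length - overlap_length  # the time step between frames
    frames = []
    start = 0
    while start + window_length <= len(samples):
        frames.append(samples[start:start + window_length])
        start += hop
    return frames
```

For example, `frame_signal(list(range(10)), 4, 2)` yields four frames, each starting two samples after the previous one.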

Pre-emphasis:
The goal of pre-emphasis is to compensate for the high-frequency part of the signal that was suppressed. The sound appears sharper after pre-emphasis. It is computed as a first-order difference, $s'_n = s_n - K s_{n-1}$, where $K$ is the pre-emphasis coefficient and $s_n$ is the $n$-th sample.

Framing:
The pre-emphasized signal is segmented into frames with a suitable overlap. The inputs are the 'Window Length' and the 'Time Step' of the signal.

Windowing:
Each frame is multiplied by a Hamming window to preserve continuity between the first and last points of the frame. For a frame of $N$ samples $s_n$, $n = 1, \dots, N$, the windowed samples are $s'_n = \left\{0.54 - 0.46 \cos\left(\frac{2\pi (n-1)}{N-1}\right)\right\} s_n$.

Fast Fourier Transform:
The Fast Fourier Transform is applied to each frame to obtain its magnitude frequency response.
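The pre-emphasis, windowing, and transform steps can be sketched as follows. This is an illustrative sketch under stated assumptions: the default coefficient `k=0.97` is a commonly used value and is not given in the text, the code uses zero-based sample indices, and a naive discrete Fourier transform stands in for a real FFT routine to keep the example dependency-free.

```python
import cmath
import math


def pre_emphasize(samples, k=0.97):
    """Apply s'[n] = s[n] - k * s[n-1]; the first sample is kept as is."""
    if not samples:
        return []
    rest = [samples[n] - k * samples[n - 1] for n in range(1, len(samples))]
    return [samples[0]] + rest


def hamming_window(frame):
    """Scale each sample by 0.54 - 0.46*cos(2*pi*n/(N-1)), n = 0..N-1."""
    n_samples = len(frame)
    if n_samples < 2:
        return list(frame)
    return [
        (0.54 - 0.46 * math.cos(2.0 * math.pi * n / (n_samples - 1))) * s
        for n, s in enumerate(frame)
    ]


def magnitude_spectrum(frame):
    """Magnitudes of DFT bins 0..N//2, computed with a naive O(N^2) DFT."""
    n_samples = len(frame)
    magnitudes = []
    for k in range(n_samples // 2 + 1):
        total = sum(
            s * cmath.exp(-2j * math.pi * k * n / n_samples)
            for n, s in enumerate(frame)
        )
        magnitudes.append(abs(total))
    return magnitudes
```

In practice an FFT from a numerical library would replace `magnitude_spectrum`; the result is the same up to rounding, only faster.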
Mel filter bank:
The triangular filters are spread over the whole frequency range from zero to the Nyquist frequency. The magnitude coefficients are binned by correlating them with each triangular filter, that is, each FFT magnitude is multiplied by the corresponding filter gain and the products are summed per filter. The result is the filter bank amplitude.
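A triangular filter bank of this kind can be sketched as below. The Hz-to-mel conversion uses the widely used formula $m = 2595 \log_{10}(1 + f/700)$, which the text does not spell out, so treat it as an assumption. The function names, the number of filters, and the sampling rate are likewise illustrative choices rather than the settings used in this work.

```python
import math


def hz_to_mel(freq_hz):
    """Convert a frequency in Hz to the mel scale."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filter_bank(num_filters, num_bins, sample_rate_hz):
    """Build num_filters triangular filters, each a list of num_bins gains.

    Bin k is taken to represent frequency k * nyquist / (num_bins - 1).
    Filter edges are equally spaced on the mel scale from 0 to Nyquist.
    """
    nyquist = sample_rate_hz / 2.0
    max_mel = hz_to_mel(nyquist)
    edges_hz = [
        mel_to_hz(max_mel * i / (num_filters + 1))
        for i in range(num_filters + 2)
    ]
    bin_hz = [k * nyquist / (num_bins - 1) for k in range(num_bins)]
    bank = []
    for m in range(1, num_filters + 1):
        left, center, right = edges_hz[m - 1], edges_hz[m], edges_hz[m + 1]
        gains = []
        for f in bin_hz:
            if left < f <= center:
                gains.append((f - left) / (center - left))  # rising slope
            elif center < f < right:
                gains.append((right - f) / (right - center))  # falling slope
            else:
                gains.append(0.0)
        bank.append(gains)
    return bank


def apply_filter_bank(magnitudes, bank):
    """Bin the magnitude spectrum: one weighted sum per triangular filter."""
    return [sum(g * x for g, x in zip(gains, magnitudes)) for gains in bank]
```

Because the edges are equally spaced in mel rather than Hz, the filters are narrow at low frequencies and wider toward the Nyquist frequency.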

Cepstral features:
The cepstral features are calculated from the log filter bank amplitudes using a Discrete Cosine Transform (DCT).
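The DCT step can be sketched as follows. The sketch assumes a type-II DCT with a $\sqrt{2/M}$ scale factor over $M$ filter bank channels, which is one common convention and not necessarily the one used here; the `floor` argument is a hypothetical guard against taking the logarithm of zero.

```python
import math


def cepstral_coefficients(filter_bank_amplitudes, num_coefficients=12,
                          floor=1e-10):
    """DCT-II of the log filter bank amplitudes, coefficients 1..num_coefficients."""
    num_filters = len(filter_bank_amplitudes)
    log_amps = [math.log(max(a, floor)) for a in filter_bank_amplitudes]
    scale = math.sqrt(2.0 / num_filters)
    coefficients = []
    for i in range(1, num_coefficients + 1):
        total = sum(
            log_amp * math.cos(math.pi * i * (j + 0.5) / num_filters)
            for j, log_amp in enumerate(log_amps)
        )
        coefficients.append(scale * total)
    return coefficients
```

A flat filter bank output gives coefficients that are all (numerically) zero, which reflects the fact that these coefficients describe the shape of the spectral envelope rather than its overall level.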

Energy:
The energy of the signal is computed as the logarithm of the signal energy, $E = \log \sum_{n=1}^{N} s_n^2$.

Output:
The output consists of the cepstral coefficients and the energy parameter, computed according to the chosen window length and time step. Twelve cepstral coefficients and one energy parameter are extracted.
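The energy term and the assembly of the per-frame output can be sketched as below. The function names are hypothetical, the `floor` guard for silent frames is an added assumption, and placing the energy value first in the vector is an arbitrary layout choice for this illustration.

```python
import math


def log_energy(frame, floor=1e-10):
    """E = log(sum of squared samples); floor avoids log(0) on silent frames."""
    return math.log(max(sum(s * s for s in frame), floor))


def feature_vector(cepstral_coeffs, frame, include_energy=True):
    """Combine the cepstral coefficients with the optional energy parameter."""
    features = list(cepstral_coeffs)
    if include_energy:
        features.insert(0, log_energy(frame))  # energy stored at index 0
    return features
```

With twelve cepstral coefficients this gives a 13-element vector per frame, or 12 elements when `include_energy` is false.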

III. MACHINE LEARNING TECHNIQUE:
A. Support Vector Machine:
A Support Vector Machine (SVM) is a supervised machine learning algorithm that can be used for both classification and regression, although it is mostly used for classification. In the SVM algorithm, each data item is plotted as a point in n-dimensional space (where n is the number of features), with the value of each feature being the value of a particular coordinate. Classification is then performed by finding the hyperplane that best separates the two classes.
• The data points fed into the system are plotted in n-dimensional space (n is the number of features), with each feature value giving one coordinate.
• The classification is performed by finding the optimal hyperplane.
• The optimal hyperplane is the one with the maximal margin, that is, the greatest distance to the nearest data points (the support vectors).
• The support vectors are the data points closest to the decision surface; they determine the position and orientation of the hyperplane.
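The geometry described in these points can be illustrated with a small sketch. It does not train an SVM: the hyperplane (`weights`, `bias`) is assumed to be given, and the helper names are hypothetical. The code only shows how a point is classified by the sign of $w \cdot x + b$ and how the points nearest the hyperplane are found.

```python
import math


def decision_value(weights, bias, point):
    """Signed score w.x + b; its sign gives the predicted class."""
    return sum(w * x for w, x in zip(weights, point)) + bias


def classify(weights, bias, point):
    """Return +1 or -1 depending on the side of the hyperplane."""
    return 1 if decision_value(weights, bias, point) >= 0 else -1


def distance_to_hyperplane(weights, bias, point):
    """Geometric distance |w.x + b| / ||w|| from point to the hyperplane."""
    norm = math.sqrt(sum(w * w for w in weights))
    return abs(decision_value(weights, bias, point)) / norm


def closest_points(weights, bias, points, tolerance=1e-9):
    """Points at minimal distance from the hyperplane (support vector candidates)."""
    distances = [distance_to_hyperplane(weights, bias, p) for p in points]
    smallest = min(distances)
    return [p for p, d in zip(points, distances) if d - smallest <= tolerance]
```

As a hand-picked example, for the points (0, 0), (1, 1), (3, 3), (4, 4) and the hyperplane with weights (0.5, 0.5) and bias -2, the closest points are (1, 1) and (3, 3), one from each side, and they alone fix where the separating line lies.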

A. Support Vector Machine:
Experiments were conducted to optimize the performance of the SVM by analyzing two parameters of feature extraction: the significance of the energy parameter and the window length. A base model was developed using the default MFCC settings discussed in the literature, and the analysis aimed to identify the feature parameters that best diagnose lung sounds.
The following analyses were performed to optimize the performance of the system.

B. Inference:
In the above section, we have shown the recognition performance obtained with an arbitrarily chosen configuration of the cepstral coefficients for lung sound classification. In addition, we have optimized the parameters for better classification and analysis.
1. First, optimization based on window length is analyzed. We measured the variation in accuracy while varying the window length and the time step of the signal during MFCC extraction. The window length was varied between 15 ms and 45 ms, and the best accuracy, 92.5%, was found at 35 ms. Lung sound is a non-stationary signal, so its feature components are analyzed with the short-term Fourier transform, which treats the signal as stationary within each window. At 35 ms, the signal is found to be stationary and matches the breath cycle of lung sound (inspiration lasting approximately 2 s and expiration approximately 3 s), which increases the accuracy of the classifier.
2. Second, optimization based on the energy parameter is analyzed. We found that the classifier performs better without the energy parameter (zeroth coefficient) of the MFCC extraction method. The energy parameter values of COPD and healthy sounds are not distinct enough to separate the two classes, so the energy parameter is a poor discriminating feature.
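The two optimizations above can be sketched as a simple parameter sweep. The `evaluate_accuracy` callable is a hypothetical stand-in for extracting MFCCs at a given window length, training the SVM, and returning its accuracy. The 5 ms step between candidates is also an assumption, since the text gives only the 15 ms to 45 ms range, and `drop_energy` assumes the energy parameter is stored at index 0 of each feature vector.

```python
# Candidate window lengths in ms: 15, 20, ..., 45 (step size is an assumption).
CANDIDATE_WINDOWS_MS = list(range(15, 50, 5))


def select_window_length(window_lengths_ms, evaluate_accuracy):
    """Return (best_window_ms, best_accuracy) over the candidate lengths."""
    best_window, best_accuracy = None, float("-inf")
    for window_ms in window_lengths_ms:
        accuracy = evaluate_accuracy(window_ms)
        if accuracy > best_accuracy:
            best_window, best_accuracy = window_ms, accuracy
    return best_window, best_accuracy


def drop_energy(feature_vectors, energy_index=0):
    """Remove the energy parameter from every feature vector."""
    return [v[:energy_index] + v[energy_index + 1:] for v in feature_vectors]
```

Keeping the evaluation behind a single callable makes it easy to rerun the same sweep with and without the energy parameter and compare the two sets of accuracies.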

C. Pros:
• It works well with a clear margin of separation.
• It is effective in high dimensional spaces.
• It is effective in cases where the number of dimensions is greater than the number of samples.
• It uses a subset of training points in the decision function (called support vectors), so it is also memory efficient.

D. Cons:
• It does not perform well on large data sets, because the required training time is high.
• It does not perform well when the data set is noisy, i.e. when the target classes overlap.
• SVM does not directly provide probability estimates; these are calculated using an expensive five-fold cross-validation, as implemented in the SVC class of the Python scikit-learn library.
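The last point can be made concrete with a short sketch. Libraries typically turn an SVM score into a probability with Platt scaling, a sigmoid fitted to the scores on held-out folds; the function below shows only that mapping. The parameters `a` and `b` are hypothetical placeholders for fitted values, and fitting them is the costly cross-validated step that the sketch leaves out.

```python
import math


def platt_probability(decision_value, a, b):
    """P(y = 1 | f) = 1 / (1 + exp(a * f + b)) for an SVM score f.

    a and b are assumed to have been fitted beforehand; a is normally
    negative so that larger scores map to larger probabilities.
    """
    return 1.0 / (1.0 + math.exp(a * decision_value + b))
```

With `a = -1` and `b = 0`, a score of 0 (a point on the decision surface) maps to a probability of 0.5, and positive scores map to values above 0.5.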
VI. CONCLUSION:
This paper presented an overview of the feature extraction and classification techniques. SVMs are motivated by statistical learning theory, which characterizes the performance of learning machines using bounds on their ability to predict future data. An SVM is trained by solving a constrained optimization problem. Optimization techniques make it possible to train SVMs efficiently on large data sets, and SVMs have been used successfully for medical diagnosis. The ideas presented in this paper suggest several directions for future research, from tuning the basic statistical learning theory results to developing efficient training methods.