Diabetic Retinopathy Detection using Machine Learning

DOI : 10.17577/IJERTV9IS060170


Revathy R1, Nithya B S2, Reshma J J3, Ragendhu S S4, Sumithra M D5

1,2,3,4,5 Dept. of Computer Science and Engineering

1,2,3,4,5 LBS Institute of Technology for Women, Thiruvananthapuram, Kerala.

Abstract: Diabetic retinopathy is a disease caused by uncontrolled chronic diabetes, and it can cause complete blindness if not treated in time. Early diagnosis and treatment of diabetic retinopathy are therefore essential to prevent its severe effects. Manual detection of diabetic retinopathy by an ophthalmologist takes a long time, and patients suffer in the meantime. An automated system can detect diabetic retinopathy quickly, so that follow-up treatment can begin before further damage to the eye occurs. This study proposes a machine learning method that extracts three features (exudates, hemorrhages, and microaneurysms) and classifies images using a hybrid classifier that combines support vector machine, k-nearest neighbour, random forest, logistic regression, and multilayer perceptron network. In the experiments, the highest accuracy value obtained was 82%. The hybrid approach produced a precision score of 0.8119, a recall score of 0.8116 and an F-measure score of 0.8028.

Keywords: Diabetic Retinopathy, KNN, SVM, Random Forest, Retinal Fundus Images

I. INTRODUCTION

Diabetic retinopathy is a complication of diabetes in which high blood glucose damages the eye. It can cause vision loss and, in severe cases, complete blindness. Early symptoms of diabetic retinopathy include blurred vision, dark areas in the field of vision, eye floaters and difficulty in perceiving colours. Detecting diabetic retinopathy at an early stage is extremely important to prevent complete blindness. Of an estimated 285 million people with diabetes mellitus worldwide, approximately one third have signs of diabetic retinopathy. Globally, the number of people affected by diabetic retinopathy is expected to increase from 126.6 million in 2010 to 191.0 million by 2030.

Non-proliferative diabetic retinopathy (NPDR) is an early stage of the disease in which tiny red spots occur in the retina. These spots may be haemorrhages or microaneurysms, which are abnormal pouchings of the blood vessels. The lining of these blood vessels can become damaged enough to allow leakage of fluid and fatty material, called exudates.

Available clinical tests for diabetic retinopathy include pupil dilation, visual acuity testing, optical coherence tomography, etc. However, they are time consuming and burdensome for patients. This paper focuses on automated computer-aided detection of diabetic retinopathy using a hybrid machine learning model that extracts the features haemorrhages, microaneurysms and exudates. The classifier used in the proposed model is a hybrid combination of SVM and KNN.

II. LITERATURE REVIEW

[Farrikh Alzami, 2019] described a system for diabetic retinopathy grade classification based on fractal analysis and random forest using the MESSIDOR dataset. Their system segmented the images and then computed the fractal dimensions as features. It failed to distinguish mild diabetic retinopathy from severe diabetic retinopathy.

[Qomariah, 2019] proposed an automated system for classifying diabetic retinopathy and normal retinal images using a convolutional neural network (CNN) and a support vector machine (SVM). The features comprised exudates, haemorrhages and microaneurysms. The authors partitioned the proposed system into two parts: the first performed feature extraction based on neural networks and the second performed classification using SVM.

[Kumar, 2018] proposed a system for improved diabetic retinopathy detection by extracting the area and number of microaneurysms from colour fundus images of the DIARETDB1 dataset. Pre-processing of the fundus images was performed using green channel extraction, histogram equalization and morphological processing. Principal component analysis (PCA), contrast limited adaptive histogram equalization (CLAHE), morphological processing and averaging filtering were applied for microaneurysm detection, and classification was done by a linear support vector machine (SVM).

[Mohamed Chetoui, 2018] proposed a system that detects diabetic retinopathy using different texture features and a machine learning classification model. Two features, haemorrhages and exudates, are extracted using local ternary pattern (LTP) and local energy-based shape histogram (LESH). SVM is used for learning and classification of the extracted histograms using the feature vectors of LTP and LESH.

[S Choudhury, 2016] proposed a system for fuzzy C means based feature extraction and classification of diabetic retinopathy using SVM. Blood vessel extraction is performed using a top hat filter and mathematical morphology. Retinal vessel density and exudates are chosen as the features. Exudate extraction is done by fuzzy C means segmentation. A Gaussian radial basis function is used to map the training data into the SVM kernel space.

[Sangwan, 2015] described a system that identifies different stages of diabetic retinopathy based on blood vessels, haemorrhages and exudates. The features are extracted using image pre-processing and fed into a neural network. SVM based training is applied to the data to classify the images into three categories: mild non-proliferative, moderate non-proliferative and proliferative diabetic retinopathy. However, the system could not give the expected results if the exudate area in the fundus image exceeds the size of the optic disc.

[Morium Akter, 2014] described a system for morphology based exudate detection from colour fundus images. The model uses grayscale conversion, histogram equalisation, thresholding, erosion, dilation, logical AND operation and watershed transformation. The system outputs the ranges of exudate-affected regions in diabetic retinopathy.

[Handayani, 2013] proposed a system for the classification of non-proliferative diabetic retinopathy using soft margin SVM. Hard exudates in the retinal fundus images are used to classify the severity level of non-proliferative diabetic retinopathy. Mathematical morphology is applied to segment the hard exudates. However, the system does not include microaneurysms and haemorrhages as features.

[Saravanan, 2013] proposed an automated system for red lesion detection in diabetic retinopathy based on microaneurysms using a GMM classifier. The feature is extracted using mathematical morphology, a filter based method and a supervised learning method. The severity level of candidate microaneurysms is graded into four stages.

[Venkatalakshmi, 2011] described an automated system for hard exudate detection using sharp edges and colour highlights as two features. The methods involved in the detection process were colour based classification, sharp edge detection, and extraction of the optic disc. Training and testing were done using the DRIVE and DIARETDB0 datasets. The system used MATLAB 7.8 for the graphical user interface (GUI).

III. DATASET AND METHODS

    1. Dataset

      This study used the publicly available Kaggle dataset for diabetic retinopathy detection. The database was created with images taken from publicly available retinopathy detection datasets. The Kaggle dataset contains 1000 images with diabetic retinopathy and 1000 images without diabetic retinopathy. From the total images we chose 122 images with diabetic retinopathy and 122 normal images. The chosen abnormal images contain exudates, hemorrhages, and microaneurysms.


      The presence of diabetic retinopathy is judged from the appearance, number, spread, size and area of exudates, microaneurysms, and hemorrhages, as shown in Fig. 1. Exudates are bright areas with a yellowish appearance whose colour differs only slightly from that of the optic disc. Exudates occur when lipid leaks from ruptured blood vessels. Hemorrhages form when microaneurysms in the blood vessels rupture. Widespread exudates and hemorrhages appear in images of severe diabetic retinopathy, which is the last stage of the disease.

      Fig 1. Normal image and an image with exudates, hemorrhages and microaneurysms

    2. Steps

      Fig 1c) Flow chart of proposed model: Input Image -> Pre-processing -> Image Segmentation -> Feature Extraction -> Classification -> Normal / Abnormal

      1. Pre-processing

        In image pre-processing, to find exudates, the image from the dataset is first converted to an HSV image. Colour space conversion translates an image represented in one colour space into another colour space, the goal being to make the translated image look as similar as possible to the original. Here the red, green and blue channels of the given image are converted to hue, saturation and value. Converting RGB to HSV makes it easier to extract the yellow-coloured exudates. Edge zero padding, median filtering and adaptive histogram equalization are then applied. Fig 2 shows images before pre-processing and Fig 3 shows images after pre-processing.

        Fig: Flowchart of pre-processing: Colour Space Conversion -> Edge Zero Padding -> Median Filtering -> Adaptive Histogram Equalisation

        Fig 2 a) Abnormal image before pre-processing

        b) Normal image before pre-processing

        Fig 3 a) Abnormal image after pre-processing

        b) Normal image after pre-processing
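The pre-processing chain described in this subsection can be sketched in plain Python as follows. This is only an illustrative sketch, not the authors' implementation: it assumes an image stored as rows of (R, G, B) tuples in the range 0..255, all function names are hypothetical, and a global histogram equalisation is used as a simplified stand-in for the adaptive variant named in the text.

```python
import colorsys
from statistics import median

def rgb_to_value_channel(image):
    """Convert RGB pixels (0..255) to HSV and return the V channel scaled to 0..255."""
    return [[round(colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)[2] * 255)
             for (r, g, b) in row] for row in image]

def zero_pad(channel, pad=1):
    """Surround a 2-D channel with a border of zeros (edge zero padding)."""
    width = len(channel[0]) + 2 * pad
    blank = [0] * width
    padded = [blank[:] for _ in range(pad)]
    for row in channel:
        padded.append([0] * pad + list(row) + [0] * pad)
    padded.extend(blank[:] for _ in range(pad))
    return padded

def median_filter(channel, size=3):
    """Apply a size x size median filter; zero padding keeps the output shape."""
    pad = size // 2
    padded = zero_pad(channel, pad)
    rows, cols = len(channel), len(channel[0])
    return [[int(median(padded[y + dy][x + dx]
                        for dy in range(size) for dx in range(size)))
             for x in range(cols)] for y in range(rows)]

def equalize_histogram(channel, levels=256):
    """Global histogram equalisation (simplified stand-in for the adaptive variant)."""
    flat = [v for row in channel for v in row]
    hist = [0] * levels
    for v in flat:
        hist[v] += 1
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    total = len(flat)
    if total == cdf_min:  # constant image: nothing to stretch
        return [row[:] for row in channel]
    return [[round((cdf[v] - cdf_min) / (total - cdf_min) * (levels - 1))
             for v in row] for row in channel]
```

In practice an image library would normally provide these operations; the pure-Python version is meant only to make each step explicit.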

      2. Image Segmentation

        After image pre-processing, exudates are segmented by smoothing, masking and a bitwise AND. Smoothing removes high spatial frequency noise from the image; the blurring is achieved by convolving the image with a low-pass filter kernel. Masking is an image processing method in which we define a small 'image piece' and use it to modify a larger image. Here we are masking yellow coloured ([60,255,255]) exudates and optic disc in the smoothed image with blue ([0,0,255]) colour.

        Bitwise AND operations are used in image manipulation to extract the essential parts of an image, and they are the basis of image masking. They can also be used to compose images and to enhance properties of the input images. Here we combine the input image with the masked image, thereby eliminating everything other than the optic disc and exudates from the original image. Fig 4 shows abnormal and normal images after exudate segmentation.

        Fig: Flowchart of exudate segmentation: Smoothing -> Masking -> Bitwise-AND
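The smoothing, masking and bitwise AND steps of exudate segmentation might look roughly like the sketch below. It is written in plain Python under stated assumptions: the image is a list of rows of (R, G, B) tuples, the function names are hypothetical, and the hue, saturation and value thresholds used to pick out yellowish pixels are illustrative guesses rather than the values used in the paper.

```python
import colorsys

def box_blur(image, size=3):
    """Smooth an RGB image by averaging each pixel with its neighbours (low-pass kernel)."""
    rows, cols, pad = len(image), len(image[0]), size // 2
    out = []
    for y in range(rows):
        out_row = []
        for x in range(cols):
            window = [image[yy][xx]
                      for yy in range(max(0, y - pad), min(rows, y + pad + 1))
                      for xx in range(max(0, x - pad), min(cols, x + pad + 1))]
            out_row.append(tuple(sum(p[c] for p in window) // len(window)
                                 for c in range(3)))
        out.append(out_row)
    return out

def yellow_mask(image, hue_range=(40, 70), min_sat=0.4, min_val=0.4):
    """Binary mask (255 or 0) of pixels whose hue in degrees lies in hue_range.

    The thresholds are illustrative assumptions, not values from the paper.
    """
    mask = []
    for row in image:
        mask_row = []
        for r, g, b in row:
            h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
            inside = (hue_range[0] <= h * 360 <= hue_range[1]
                      and s >= min_sat and v >= min_val)
            mask_row.append(255 if inside else 0)
        mask.append(mask_row)
    return mask

def bitwise_and(image, mask):
    """Keep pixels where the mask is 255 and zero out every other pixel."""
    return [[tuple(c & m for c in pixel) for pixel, m in zip(img_row, mask_row)]
            for img_row, mask_row in zip(image, mask)]
```

A typical call order would be `bitwise_and(image, yellow_mask(box_blur(image)))`, so that the mask is computed on the smoothed image but applied to the original.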

        To segment hemorrhages and microaneurysms, median blurring, thresholding, image erosion and image dilation are performed. Thresholding partitions an image into foreground and background; it is a type of image segmentation that isolates objects by converting grayscale images into binary images. Erosion and dilation are morphological operations: erosion sets a pixel to the minimum over all pixels in its neighbourhood, and dilation sets a pixel to the maximum over all pixels in its neighbourhood. Morphological opening is defined as an erosion followed by a dilation. Opening can remove small bright spots and connect small dark cracks, which tends to open up gaps between features. Fig 5 shows abnormal and normal images after hemorrhage and microaneurysm segmentation. The segmented images are binary images in which the white spots represent the features. These white regions are counted for the subsequent classification.

        Fig: Flowchart of hemorrhages and micro aneurysms segmentation: Green Channel Extraction -> Morphological Opening -> Image Complement -> Smoothing -> Thresholding

        Fig 4. a) Abnormal images with segmented exudates

        b) Normal images without exudates

        Fig.5 a) Abnormal images with hemorrhages and micro aneurysms

        b) Normal images without hemorrhages and micro aneurysms

        Fig 5. c) Abnormal images with segmented hemorrhages and micro aneurysms

        d) Normal images after segmentation.
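The thresholding and morphological operations used for hemorrhage and microaneurysm segmentation can be illustrated with a small pure-Python sketch. It assumes a grayscale image stored as a list of rows of integers, a square neighbourhood that is clipped at the image border, and hypothetical function names; the neighbourhood size and threshold level are assumptions, not values from the paper.

```python
def threshold(gray, level):
    """Convert a grayscale image to binary: 255 above the level, 0 otherwise."""
    return [[255 if v > level else 0 for v in row] for row in gray]

def _neighbourhood(image, y, x, size):
    """Collect the pixel values in a size x size window clipped to the image."""
    pad = size // 2
    rows, cols = len(image), len(image[0])
    return [image[yy][xx]
            for yy in range(max(0, y - pad), min(rows, y + pad + 1))
            for xx in range(max(0, x - pad), min(cols, x + pad + 1))]

def erode(image, size=3):
    """Erosion: each pixel becomes the minimum over its neighbourhood."""
    return [[min(_neighbourhood(image, y, x, size)) for x in range(len(image[0]))]
            for y in range(len(image))]

def dilate(image, size=3):
    """Dilation: each pixel becomes the maximum over its neighbourhood."""
    return [[max(_neighbourhood(image, y, x, size)) for x in range(len(image[0]))]
            for y in range(len(image))]

def opening(image, size=3):
    """Opening: an erosion followed by a dilation."""
    return dilate(erode(image, size), size)
```

Applied to a binary image, `opening` removes white specks smaller than the neighbourhood while leaving larger white regions in place, which matches the behaviour described in the text.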

      3. Feature Extraction

        For binary classification we use two features: the amount of exudates as the first parameter and the amount of hemorrhages and microaneurysms as the second. Each is computed by counting the number of white pixels in the corresponding segmented image and dividing it by the total number of pixels in the image.
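As a minimal sketch of this feature computation, assuming each segmented image is a binary list of rows in which white pixels have the value 255 (the function names are hypothetical):

```python
def white_pixel_ratio(binary_image, white=255):
    """Fraction of pixels in a binary image that are white."""
    total = sum(len(row) for row in binary_image)
    white_count = sum(1 for row in binary_image for v in row if v == white)
    return white_count / total if total else 0.0

def feature_vector(exudate_mask, lesion_mask):
    """Two features: exudate ratio, then hemorrhage/microaneurysm ratio."""
    return [white_pixel_ratio(exudate_mask), white_pixel_ratio(lesion_mask)]
```

For example, `feature_vector([[255, 0], [0, 0]], [[255, 255], [0, 0]])` gives `[0.25, 0.5]`.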

      4. Classification

    The proposed method implements a hybrid classifier, that is, a combination of five classifiers including support vector machines, k-nearest neighbours and random forest. Each classifier classifies each of the 244 images as either normal or abnormal. The SVM classifier uses a radial basis function kernel with degree 3. The five classifiers are trained and tested separately, with the training and testing sets prepared in the ratio 80:20, and their outputs are then combined by voting as the hybrid method.
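The 80:20 split could be produced along the following lines. This is a sketch under assumptions: the function name and the fixed seed are hypothetical, and the split is a plain random shuffle without stratification, since the paper does not say how the split was drawn.

```python
import random

def train_test_split(samples, labels, test_ratio=0.2, seed=0):
    """Shuffle indices reproducibly and split them into training and testing sets."""
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)
    n_test = round(len(samples) * test_ratio)
    test_idx, train_idx = indices[:n_test], indices[n_test:]

    def pick(idx):
        return [samples[i] for i in idx], [labels[i] for i in idx]

    return pick(train_idx), pick(test_idx)
```

With 244 images and a test ratio of 0.2 this yields 195 training samples and 49 testing samples.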

    SVM: Support vector machine is a supervised machine learning algorithm that is widely used for both classification and regression problems, though mostly for classification. In the SVM algorithm each data item has n features. We plot each data item as a point in n-dimensional space, where the value of each feature is the value of a particular coordinate. The algorithm then separates the plotted data points into classes by means of a hyperplane.
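To make the role of the kernel and the separating hyperplane concrete, the sketch below evaluates the decision function of an already trained SVM with a radial basis function kernel. It does not perform training. The support vectors, coefficients, bias and gamma are all hypothetical values chosen for illustration, and the function names are not taken from any library.

```python
import math

def rbf_kernel(a, b, gamma=1.0):
    """Radial basis function kernel: exp(-gamma * squared distance)."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))

def svm_decision(x, support_vectors, dual_coefs, bias, gamma=1.0):
    """Signed score: sum of coef_i * K(sv_i, x) plus bias.

    Each coef_i stands for alpha_i * y_i from a trained model (assumed given).
    """
    return sum(c * rbf_kernel(sv, x, gamma)
               for sv, c in zip(support_vectors, dual_coefs)) + bias

def svm_predict(x, support_vectors, dual_coefs, bias, gamma=1.0):
    """Class 1 (abnormal) on the non-negative side of the hyperplane, else 0."""
    return 1 if svm_decision(x, support_vectors, dual_coefs, bias, gamma) >= 0 else 0

# Hypothetical trained model over the two features of this paper.
SUPPORT_VECTORS = [[0.1, 0.1], [0.8, 0.7]]
DUAL_COEFS = [-1.0, 1.0]
BIAS = 0.0
```

The hyperplane lives in the kernel-induced feature space, so in the original two-dimensional feature space the decision boundary can be curved.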

    KNN: The k-nearest neighbours (KNN) algorithm is a simple, easy-to-implement supervised machine learning algorithm that can be used to solve both classification and regression problems. A supervised machine learning algorithm relies on labelled input data to learn a function that produces an appropriate output when new unlabelled data is fed to it. KNN captures the idea of similarity, often called distance, proximity or closeness. Here we calculate the distance between points on a graph and use it to classify the given data: a smaller distance to a data point suggests higher similarity.
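A minimal KNN classifier along these lines can be written in a few lines of Python. The sketch assumes Euclidean distance and k = 3, neither of which is stated in the paper, and the function name is hypothetical.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Label the query by majority vote among its k nearest training points."""
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]
```

Because every prediction compares the query against all training points, the cost grows with the size of the training set, which is acceptable for a dataset of a few hundred images.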

    Random Forest: A random forest consists of a large number of individual decision trees. Decision trees are drawn upside down, with the root at the top. A decision tree contains condition (internal) nodes, at which the tree splits into branches (edges). The end of a branch that does not split any further is the decision (leaf). The fundamental principle behind random forest is the wisdom of crowds, i.e. a large number of relatively uncorrelated models (trees) operating as a committee will outperform any of the individual constituent models.

    Voting: Voting is one of the simplest methods of combining the outputs of multiple machine learning algorithms. First, two or more standalone machine learning models are created from the training dataset. A voting classifier then combines the standalone models and averages the predictions of the sub-models when new data is given to the model. The predictions of the sub-models can be weighted, with the weight for each model set manually or heuristically.
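The combination step can be sketched as a weighted majority vote over the class labels predicted by the individual classifiers. Note the assumption: this is hard voting on labels, whereas the description above speaks of averaging predictions, and the paper does not state which variant or which weights were used. The function name is hypothetical.

```python
from collections import defaultdict

def vote(predictions, weights=None):
    """Return the label with the largest total weight among the predictions."""
    if weights is None:
        weights = [1.0] * len(predictions)
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    return max(totals, key=totals.get)
```

With equal weights, `vote([1, 0, 1])` returns 1; giving the first classifier a weight of 3 makes `vote([1, 0, 0], weights=[3.0, 1.0, 1.0])` also return 1 even though it is outnumbered.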

IV. RESULTS AND DISCUSSION

    Sensitivity is defined as the ability of a test to correctly detect people with the disease.

    Sensitivity = TP / (TP + FN)

    Specificity is defined as the ability of a test to correctly exclude people without the disease.

    Specificity = TN / (TN + FP)

    A true positive (TP) occurs when the test result is positive and the individual has the disease. A true negative (TN) occurs when the test result is negative and the individual does not have the disease. A false positive (FP) occurs when the test result is positive but the individual does not have the disease. A false negative (FN) occurs when the test result is negative but the individual has the disease.

    SVM results in 68% accuracy, the KNN classifier in 76% and random forest in 90%. After voting of the three classifiers, the testing set results in 82% accuracy. The hybrid approach produced a precision score of 0.8619, a recall score of 0.8116 and an F-measure score of 0.8028. That is, out of 49 test samples, 36 produced correct predictions.
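The evaluation measures used in this section can be computed from the four confusion counts as sketched below. The sensitivity and specificity formulas are the ones given above; precision and F-measure follow their standard definitions, precision = TP / (TP + FP) and F-measure = 2 * precision * recall / (precision + recall). The function names are hypothetical and the code uses no data from the paper.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, TN and FN for a binary classification problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn

def _ratio(numerator, denominator):
    """Division that returns 0.0 instead of failing on an empty denominator."""
    return numerator / denominator if denominator else 0.0

def evaluation_scores(tp, fp, tn, fn):
    """Sensitivity (recall), specificity, precision, F-measure and accuracy."""
    sensitivity = _ratio(tp, tp + fn)
    specificity = _ratio(tn, tn + fp)
    precision = _ratio(tp, tp + fp)
    f_measure = _ratio(2 * precision * sensitivity, precision + sensitivity)
    accuracy = _ratio(tp + tn, tp + fp + tn + fn)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f_measure": f_measure,
        "accuracy": accuracy,
    }
```

Recall is the same quantity as sensitivity, so the two names can be used interchangeably when reading the reported scores.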

V. CONCLUSION

In the proposed method, hemorrhages, exudates and microaneurysms are detected. For exudate detection, green channel extraction, masking, smoothing and bitwise AND are applied, which results in better calculation and extraction of exudates. For detection of hemorrhages and microaneurysms, morphological operations such as opening are performed, using the dilation and erosion operators. For diabetic retinopathy detection, the microaneurysms, hemorrhages and exudates that occur in the image are counted so that the condition of the image can be decided. The features are then calculated and fed to the SVM, KNN and random forest classifiers, and the vote of the three classifiers is taken as the final prediction. From the extracted features the system thus directly classifies the image as normal or abnormal. Earlier detection and diagnosis of diabetic retinopathy can help protect patients from blindness and reduce the severe effects of the disease.

REFERENCES

  1. Farrikh Alzami, Abdussalam, Rama Arya Megantara and Ahmad Zainul Fanani, Diabetic Retinopathy Grade Classification based on Fractal Analysis and Random Forest, International Seminar on Application for Technology of Information and Communication, 2019.

  2. Dinial Utami Nurul Qomariah, Handayani Tjandrasa and Chastine Fatichah, Classification of Diabetic Retinopathy and Normal Retinal Images using CNN and SVM, 12th International Conference on Information and Communication Technology and System, 2019.

  3. Shailesh Kumar and Basant Kumar, Diabetic Retinopathy Detection by Extracting Area and Number of Microaneurysms from Colour Fundus Images, 5th International Conference on Signal Processing and Integrated Networks, 2018.

  4. Mohamed Chetoui, Moulay A Akhloufi, Mustapha Kardoucha, Diabetic Retinopathy Detection using Machine Learning and Texture Features, IEEE Canadian Conference on Electrical and Computer Engineering, 2018.

  5. S Choudhury, S Bandyopadhyay, SK Latib, DK Kole, C Giri, Fuzzy C Means based Feature Extraction and Classification of Diabetic Retinopathy using Support Vector Machines, International Conference on Communication and Signal Processing, April 2016.

  6. Surbhi Sangwan, Vishal Sharma, Misha Kakkar, Identification of Different Stages of Diabetic Retinopathy, International Conference on Computer and Computational Sciences, 2015.

  7. Morium Akter, Mohammed Shorif Uddin, Mahmudul Hasan Khan, Morphology based Exudate Detection from Colour Fundus Images in Diabetic Retinopathy, International Conference on Electrical Engineering and Information and Communication Technology, 2014.

  8. Handayani Tjandrasa, Ricky Eka Putra, Arya Yudhi Wijaya, Isye Arieshanti, Classification of Non-Proliferative Diabetic Retinopathy based on Hard Exudates using Soft Margin SVM, IEEE International Conference on Control System, Computing and Engineering, November 2013.

  9. V Saravanan, B Venkatalakshmi, Vithiya Rajendran, Automated Red Lesion Detection in Diabetic Retinopathy, IEEE Conference on Information and Communication Technologies, 2013.

  10. Prof B Venkatalakshmi, V Saravanan, G Jenny Niveditha, Graphical User Interface for Enhanced Retinal Image Analysis for Diagnosing Diabetic Retinopathy, 2013.

  11. V. Gulshan, L. Peng, M. Coram, M. C. Stumpe, D. Wu, A. Narayanaswamy, S. Venugopalan, K. Widner, T. Madams, and J. Cuadros, Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs, JAMA, vol. 316, no. 22, p. 2402, 2016.

  12. R. Gargeya and T. Leng, Automated identification of diabetic retinopathy using deep learning, Ophthalmology, vol. 124, no. 7, pp. 962-969, 2017.

  13. B. Graham, Kaggle diabetic retinopathy detection competition report, https://kaggle2.blob.core.windows.net/forum-messageattachments/88655/2795/competitionreport.pdf/, August 6, 2015, accessed May 20, 2018.

  14. C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, and Z. Wojna, Rethinking the inception architecture for computer vision, in Proceedings of the IEEE conference on computer vision and pattern recognition, 2016, pp. 2818-2826.

  15. M. Lin, Q. Chen, and S. Yan, Network in network, arXiv preprint arXiv:1312.4400, 2013.

  16. C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, Going deeper with convolutions, in Proceedings of the IEEE conference on computer vision and pattern recognition, 2015, pp. 1-9.
