Classifying Diabetic Retinopathy using Deep Learning Architecture

DOI : 10.17577/IJERTV5IS060055


T. Chandrakumar, M.E., Post-Graduate, Department of EEE, RVS College of Engineering, Dindigul, Tamil Nadu

R. Kathirvel, M.E., (Ph.D), Assistant Professor, Department of EEE, RVS College of Engineering, Dindigul, Tamil Nadu

Abstract: Recent developments in machine learning play a vital role in image processing applications such as biomedical imaging, satellite image processing, and artificial intelligence tasks such as object identification and recognition. Globally, the number of patients suffering from diabetic retinopathy is growing rapidly, and the disease often cannot be diagnosed at its earliest stage because vision is still normal. Detecting diabetic retinopathy as early as possible can prevent vision loss in long-term diabetes patients, including young patients. The severity of diabetic retinopathy is graded by the presence of microaneurysms, exudates, neovascularization and haemorrhages. Experts categorize diabetic retinopathy into five stages: normal, mild, moderate and severe non-proliferative diabetic retinopathy (NPDR), and proliferative diabetic retinopathy (PDR). The proposed deep learning approach, a Deep Convolutional Neural Network (DCNN), gives high accuracy in classifying these stages through spatial analysis. A DCNN is a complex architecture inspired by human visual perception. Unlike other supervised algorithms, the proposed solution aims to classify fundus images in an optimized way with little pre-processing. The proposed architecture, deployed with dropout layers, yields around 94-96 percent accuracy. It was tested on the publicly available STARE, DRIVE and Kaggle fundus image datasets.

Keywords: Diabetic Retinopathy, Convolutional Neural Network, ReLU, Dropout

    Diabetic retinopathy, also known as diabetic eye disease, is damage to the retina caused by diabetes. It is a systemic disease that affects up to 80 percent of all patients who have had diabetes for 20 years or more. Despite these intimidating statistics, research indicates that at least 90% of new cases could be avoided with proper and vigilant treatment and monitoring of the eyes. The longer a person has diabetes, the higher his or her chances of developing diabetic retinopathy.

    According to the International Diabetes Federation, the number of adults with diabetes in the world was estimated at 366 million in 2011 and is expected to rise to 552 million by 2030. The number of people with type 2 diabetes is increasing in every country, and 80% of people with diabetes live in low- and middle-income countries. India ranks first, with an increase of 195% (from 18 million in 1995 to 54 million in 2025). Previously, diabetes mellitus (DM) was considered to be present largely among the urban population in India. Recent studies clearly show an increasing prevalence in rural areas as well. Indian studies show a 3-fold increase in the presence of diabetes among the rural population over the last decade or so (2.2% in 1989 to 6.3% in 2003).

    A study in India of the estimated prevalence of type 2 diabetes mellitus and diabetic retinopathy in a rural population of south India found that nearly 1 in 10 individuals above the age of 40 years showed evidence of type 2 diabetes mellitus.

Fig 1. Diabetic retinopathy stages

    There are five major levels of clinical DR severity. Many patients have no clinically observable DR early after DM diagnosis, yet there are known structural and physiologic changes in the retina, including slowing of retinal blood flow, increased leukocyte adhesion, thickening of basement membranes, and loss of retinal pericytes. The earliest clinically apparent stage of DR is mild non-proliferative diabetic retinopathy (NPDR), characterized by the development of microaneurysms. The disease can progress to moderate NPDR, where additional DR lesions develop, including venous calibre changes and intraretinal microvascular abnormalities. The severity and extent of these lesions increase in severe NPDR, and the retinal blood supply becomes increasingly compromised. As a consequence, the non-perfused areas of the retina send signals stimulating new blood vessel growth, leading to proliferative diabetic retinopathy (PDR). The new blood vessels are abnormal and friable, and can bleed easily, often causing severe visual loss. Diabetic macular edema (DME) occurs when there is swelling of the retina due to leaking of fluid from blood vessels within the macula, and can occur during any stage of DR.

    The progression from no retinopathy to PDR can take two decades or more, and this slow rate enables DR to be identified and treated at an early stage. The development and progression of DR are related to the duration and control of diabetes. DR in its early form is often asymptomatic but amenable to treatment. The Diabetic Retinopathy Study and the Early Treatment Diabetic Retinopathy Study (ETDRS) showed that treatment with laser photocoagulation can more than halve the risk of developing visual loss from PDR.


    Supervised classification assigns test images to classes using a model learned from training data with labeled classes. In general, classification is done by extracting features from the images and then identifying the class of each image based on the labeled training data.

    In classification, the extracted features indicate the severity of diabetic retinopathy. The five categories of diabetic retinopathy, from non-proliferative to proliferative diabetic retinopathy, are assigned based on the extracted feature values.

    Some of the popular methodologies used for feature extraction and classification in diabetic retinopathy analysis are:

    S. Wang et al. [1] use a convolutional neural network as a trainable hierarchical feature extractor and a Random Forest (RF) as a trainable classifier. The network has 6 stacked convolution layers followed by subsampling layers for feature extraction. The Random Forest algorithm is used as an ensemble classifier and applied to retinal blood vessel segmentation. This architecture was evaluated on the DRIVE and STARE databases and achieved around 0.98 and 0.97, respectively. Mrinal Haloi [2] proposes a new deep-learning-based computer-aided system for microaneurysm detection. Compared with other deep neural networks, it requires less preprocessing and vessel extraction, and uses more deep layers for training and testing on the fundus image dataset. It consists of five layers, including convolutional, max-pooling and softmax layers, with additional dropout training to improve accuracy. It achieved a low false positive rate, with a measured performance of 0.96 accuracy, 0.96 specificity and 0.97 sensitivity.

    M. Melinscak et al. [3] propose an automatic segmentation of blood vessels in fundus images using a deep max-pooling convolutional neural network. A 10-layer architecture is deployed to maximize accuracy, but it works on small image patches. Preprocessing consists of resizing and reshaping the fundus images. The network uses 4 convolutional and 4 max-pooling layers with 2 additional fully connected layers for vessel segmentation. This method achieved an accuracy of around 0.94.

    Gardner et al. [4] pioneered a diabetic retinopathy screening tool using an artificial neural network with preprocessing techniques. The method learns features from sub-images and relies heavily on a back-propagation neural network. It is trained on a set of diabetic features in fundus images and compared against an ophthalmologist's screening of a set of fundus images. The detection rates for vessels, exudates and haemorrhages were 91.7%, 93.1% and 73.8%, respectively.

    Sohini Roychowdhury et al. [5] proposed a novel two-stage hierarchical classification algorithm for automatic detection and classification (the DREAM system). For automated detection, a two-step hierarchical binary classification is used, and lesions are separated from non-lesions using GMM, SVM, KNN and AdaBoost classifiers. The 30 top features are used, such as area, variance of the Ired channel, Igreen channel and Isat of the object, major and minor axis lengths, mean pixel values for Igreen, Ired and intensity, solidity, etc. The DREAM system achieved 100 percent sensitivity and 0.5316 specificity. Feature reduction also lowered the average computation time for DR severity grading per image from 59.54 s to 3.46 s.

    Jayakumar Lachure et al. [6] detect red and bright lesions in digital fundus photographs; the abnormalities of interest are retinal microaneurysms, hemorrhages, exudates and cotton wool spots. Preprocessing and morphological operations are performed to find microaneurysms, and GLCM and structural features are extracted for classification. The optimized SVM classifier is reported to reach 100 percent and 90 percent sensitivity.

    R. Priya and P. Aruna [7] diagnose diabetic retinopathy using two models, a Probabilistic Neural Network (PNN) and a Support Vector Machine (SVM). The input color retinal images are pre-processed using grayscale conversion, adaptive histogram equalization, discrete wavelet transform, matched filtering and fuzzy C-means segmentation. Features are then extracted from the pre-processed images for classification. The PNN achieved an accuracy of 89.6 percent and the SVM around 97.608 percent.

    Giraddi et al. [8] detect exudates in retinal images with varying color and contrast. A comparative analysis of SVM and KNN classifiers is made for early detection. GLCM texture features are extracted to reduce the number of false positives. The true positive rates are around 83.4% for the SVM classifier and around 92% for the KNN classifier, so KNN outperforms SVM with color as well as texture features.

    Srivastava et al. [9] introduce the key idea of randomly dropping units, along with their connections, during training. Their work significantly reduces over-fitting and gives improvements over other regularization techniques. It also improves the performance of neural networks in vision, document classification, speech recognition, etc.

    In the other methods [12][13][14], the key objective is to maximize the accuracy of identifying microaneurysms, exudates and vessel segments. These methods increase complexity by adding more preprocessing stages, such as a deblurring algorithm prior to detection, segmentation of blood vessels, rotating cross sections, mathematical modeling to enhance light intensity, and morphological reconstruction.


In recent years, most image processing researchers have taken up machine learning, especially deep learning approaches, for tasks such as handwritten digit recognition on the MNIST dataset and image classification on ImageNet. Our proposed methodology for classifying disease severity from fundus images builds on these developments.

In general, classifying diseases with the proposed DCNN architecture[add citation] follows these basic steps to achieve maximum accuracy on the image dataset: i) data augmentation, ii) pre-processing, iii) network initialization, iv) training, v) activation function selection, vi) regularization, and vii) ensembling of multiple methods.

In our proposed diabetic retinopathy classification model, shown in Fig. 3.1, the architecture is condensed into the following building blocks:

  1. Data augmentation

  2. Preprocessing

  3. Deep Convolutional Neural Network Classification

    Fig 3.1 Block diagram of proposed model


      The fundus images are obtained from different datasets and were taken with different cameras, so they vary in field of view, clarity, blurring, contrast and image size. In data augmentation, contrast adjustment, image flipping and brightness adjustment are applied.
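The augmentation operations named above (flipping, contrast and brightness adjustment) can be sketched with NumPy as follows. This is a minimal illustration, not the image-editor workflow used in this work; the function names, the contrast factor 1.5, the brightness shift 0.1 and the random stand-in image are all assumptions.

```python
import numpy as np

def adjust_contrast(img, factor):
    """Scale pixel deviations from the mean intensity by `factor` (assumed value)."""
    mean = img.mean()
    return np.clip(mean + factor * (img - mean), 0.0, 1.0)

def adjust_brightness(img, delta):
    """Shift all pixel intensities by `delta` (assumed value)."""
    return np.clip(img + delta, 0.0, 1.0)

def augment(img):
    """Return the original image plus four augmented variants (values in [0, 1])."""
    return [
        img,
        np.fliplr(img),               # horizontal flip
        np.flipud(img),               # vertical flip
        adjust_contrast(img, 1.5),    # stronger contrast
        adjust_brightness(img, 0.1),  # slightly brighter
    ]

rng = np.random.default_rng(0)
fundus = rng.random((8, 8, 3))  # random stand-in for a real RGB fundus image
variants = augment(fundus)
print(len(variants), variants[0].shape)
```

Each variant keeps the shape and value range of the input, so the augmented images can be passed to the same preprocessing step as the originals.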


      The deep convolutional neural network works on the spatial data of the fundus images. The primary step in preprocessing is resizing the images. Before being fed into the architecture for classification, the images are converted to gray scale and then to the L model, a monochrome image that highlights the microaneurysms and vessels in the fundus images. Finally, the images are flattened into a single dimension for further processing.
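A minimal NumPy sketch of these preprocessing steps (monochrome conversion, resizing, flattening) is given below. The luminance weights, the nearest-neighbour resizing, the 256 x 256 target size and the helper names are assumptions chosen for illustration; the original pipeline may differ in these details.

```python
import numpy as np

def to_grayscale(rgb):
    """Convert an RGB image to one luminance channel (assumed ITU-R 601 weights)."""
    weights = np.array([0.299, 0.587, 0.114])
    return rgb @ weights

def resize_nearest(img, size):
    """Resize a 2-D image to size x size by nearest-neighbour sampling."""
    h, w = img.shape[:2]
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return img[rows][:, cols]

def preprocess(rgb, size=256):
    """Monochrome conversion, resizing and flattening of one fundus image."""
    gray = to_grayscale(rgb.astype(np.float64) / 255.0)
    small = resize_nearest(gray, size)
    return small.ravel()  # one-dimensional vector of size * size values

rng = np.random.default_rng(0)
# Random stand-in for a 700 x 605 colour fundus image (height 605, width 700).
fundus = rng.integers(0, 256, size=(605, 700, 3), dtype=np.uint8)
vector = preprocess(fundus)
print(vector.shape)
```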


      In image recognition, a Convolutional Neural Network (CNN) is a type of feed-forward artificial neural network in which the connectivity pattern between neurons is inspired by the organization of the animal visual cortex, whose individual neurons are arranged so that they respond to overlapping regions tiling the visual field.

      In deep learning [10][11], the convolutional neural network uses a complex architecture composed of stacked layers, which is particularly well adapted to classifying images. For multi-class classification, this architecture is robust and sensitive to each feature present in the images.

      The common layers used to build a Deep Convolutional Neural Network (DCNN) architecture, shown in Fig. 3.2, are:

      1. Convolutional Layer

      2. Pooling Layer

      3. ReLU Layer

      4. Dropout layer

      5. Fully connected Layer

      6. Classification Layer


  1. CONVOLUTIONAL LAYER :

    The convolutional layer is the first layer after the input image that is to be classified. The backbone of the convolutional neural network consists of local receptive fields and shared weights, which make deep convolutional neural networks suitable for image recognition.

    Local receptive field :

    During image recognition, the convolutional neural network consists of multiple layers of small neuron collections, each of which looks at a small portion of the input image.

    Shared weights and bias :

    All neurons in a feature map of the convolutional neural network share the same weights and bias values. These shared values detect the same feature all over the image. The way the feature maps are generated varies with the application.

    Fig 3.2 Deep Convolutional Neural Network (DCNN) Architecture

    The convolutional layer consists of a kernel, or set of filters (local receptive fields). Each filter is convolved with the input image and extracts features by forming a new layer, or activation map. Each activation map represents some significant characteristic or feature of the input image.

    In a convolutional layer, an N x N input neuron layer is convolved with an m x m filter, so the output of the convolutional layer has size (N-m+1) x (N-m+1). A non-linearity is then applied through a neural activation function.
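The output size (N-m+1) x (N-m+1) can be checked with a small NumPy sketch. It assumes stride 1 and no padding, and, like most CNN libraries, it slides the filter without flipping it (cross-correlation); the 6 x 6 input and the 3 x 3 averaging filter are illustrative values only.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide an m x m kernel over an N x N image (stride 1, no padding)."""
    n, m = image.shape[0], kernel.shape[0]
    out = np.empty((n - m + 1, n - m + 1))
    for i in range(n - m + 1):
        for j in range(n - m + 1):
            # Weighted sum over the local receptive field at position (i, j).
            out[i, j] = np.sum(image[i:i + m, j:j + m] * kernel)
    return out

image = np.arange(36, dtype=float).reshape(6, 6)  # N = 6
kernel = np.ones((3, 3)) / 9.0                    # m = 3, averaging filter
feature_map = conv2d_valid(image, kernel)
print(feature_map.shape)  # (4, 4), i.e. (N - m + 1) x (N - m + 1)
```

The same kernel values are reused at every position, which is the shared-weights property of a feature map.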


  2. POOLING LAYER :

    The pooling layer is one of the most significant layers: it helps the network avoid over-fitting by reducing the number of parameters and the amount of computation in the network.

    It works as a form of non-linear down-sampling. Pooling partitions the activation maps into a set of rectangles and keeps the maximum value in each sub-region, which simply down-sizes the feature maps. For instance, an N x N input layer pooled over K x K regions gives an output layer of size N/K x N/K.

    The main significance of this layer is to ask whether a given feature is found anywhere in a region of the image. It then throws away the exact positional information.
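A minimal NumPy sketch of non-overlapping max pooling, assuming a square activation map whose side N is divisible by the pooling size K; the 4 x 4 example values are made up for illustration.

```python
import numpy as np

def max_pool(activation, k):
    """Non-overlapping k x k max pooling of an N x N activation map."""
    n = activation.shape[0]
    if n % k != 0:
        raise ValueError("N must be divisible by k in this sketch")
    # Split rows and columns into (N/k) blocks of k, then take the block maximum.
    blocks = activation.reshape(n // k, k, n // k, k)
    return blocks.max(axis=(1, 3))

activation = np.array([[1, 3, 2, 0],
                       [4, 2, 1, 5],
                       [0, 1, 7, 2],
                       [6, 2, 3, 3]], dtype=float)
pooled = max_pool(activation, 2)
print(pooled)  # 2 x 2 output: [[4, 5], [6, 7]]
```

Each output value records only that a strong response occurred somewhere in its 2 x 2 region, not where.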

  3. ReLU LAYER :

    The Rectified Linear Unit (ReLU) layer applies the activation function

    $f(x) = \max(0, x)$

    where $x$ is the input to the neuron; this is also known as a ramp function.

    A smooth approximation to the rectifier is the analytic function

    $f(x) = \ln(1 + e^{x})$

    This activation function induces sparsity in the hidden units. It has also been shown that deep neural networks can be trained more efficiently with ReLU than with sigmoid-type (logistic) activation functions.
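The rectifier and its smooth approximation can be written in a few lines of NumPy. The sketch uses the standard definitions max(0, x) and ln(1 + e^x) (the latter is commonly called softplus); the sample inputs are arbitrary.

```python
import numpy as np

def relu(x):
    """Rectifier (ramp function): max(0, x), applied element-wise."""
    return np.maximum(0.0, x)

def softplus(x):
    """Smooth approximation ln(1 + exp(x)), computed in a numerically stable way."""
    return np.logaddexp(0.0, x)

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu(x))      # negative inputs are mapped to exactly zero (sparsity)
print(softplus(x))  # always positive, approaches relu(x) for large |x|
```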


  4. DROPOUT LAYER :

    A crucial part of the deep convolutional neural network is handling the large number of parameters generated by the stacked layers, which may cause over-fitting. To avoid this, some neurons in a layer are dropped out before their outputs are passed to the next layer. Dropout is mainly used near the fully connected layers, where most of the parameters are generated. It is a widely used regularization technique.

    The feed-forward operation of a network with dropout [9] can be described as follows, for $n \in \{0, 1, \ldots, N-1\}$ and any hidden unit $i$:

    $r_j^{(n)} \sim \mathrm{Bernoulli}(p)$

    $\tilde{y}^{(n)} = r^{(n)} * y^{(n)}$

    $z_i^{(n+1)} = w_i^{(n+1)} \tilde{y}^{(n)} + b_i^{(n+1)}$

    $y_i^{(n+1)} = f(z_i^{(n+1)})$

    where $*$ denotes an element-wise product and

    $r_j^{(n)}$: Bernoulli random variables, each of which has probability $p$ of being 1;

    $\tilde{y}^{(n)}$: dropout outputs from layer $n$;

    $w_i^{(n)}$: weights at layer $n$;

    $b_i^{(n)}$: bias at layer $n$ of hidden unit $i$;

    $f(z_i^{(n)})$: activation function of layer $n$.

    Dropout also has a drawback: some information from the previous layers is not passed on to the next layers. This affects how the model learns its parameters through back-propagation of the error.
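The feed-forward dropout step described above can be sketched in NumPy as follows. Here p is the probability of keeping a unit, as in [9]; the layer sizes, the random weights and the choice of ReLU for f are assumptions made only for this illustration.

```python
import numpy as np

def dropout_forward(y, w, b, p, rng):
    """One feed-forward step with dropout applied to the input activations y."""
    r = rng.binomial(1, p, size=y.shape)  # r_j ~ Bernoulli(p), 1 means "keep"
    y_tilde = r * y                       # thinned outputs of layer n
    z = w @ y_tilde + b                   # pre-activations of layer n + 1
    return np.maximum(0.0, z), r          # f is assumed to be ReLU here

rng = np.random.default_rng(0)
y = rng.random(1000)                            # activations of layer n
w = rng.standard_normal((10, 1000)) * 0.01      # hypothetical weights
b = np.zeros(10)                                # hypothetical biases
out, mask = dropout_forward(y, w, b, p=0.5, rng=rng)
print(out.shape, mask.mean())  # about half of the units are kept
```

A new mask is drawn for every training example; at test time no units are dropped and, following [9], the weights are scaled by p instead.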


  5. FULLY CONNECTED LAYER :

    The layer that comes after the cascaded convolutional and max/average pooling layers is called the fully connected layer. The high-level reasoning during classification is done in this layer. A fully connected layer takes all neurons of the previous (max-pooling) layer and connects them to every one of its own neurons. Fully connected layers are no longer spatially arranged and can be visualized as one-dimensional layers.

  6. CLASSIFICATION LAYER :

    After the stacked deep layers, the final layer is a softmax layer, which takes the output of the fully connected layer and classifies the fundus image. This layer determines whether the task is treated as single-class or multi-class classification.
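As an illustration of the final classification step, the sketch below applies a fully connected layer followed by softmax to a feature vector and picks the most probable of five DR stages. The stage labels, the 16-dimensional feature vector and the random weights are hypothetical placeholders, not trained values.

```python
import numpy as np

# Hypothetical labels for the five DR severity levels.
STAGES = ["normal", "mild", "moderate", "severe", "proliferative"]

def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    shifted = scores - np.max(scores)  # subtract the maximum for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

def classify(features, w, b):
    """Fully connected layer followed by softmax over the five stages."""
    probs = softmax(w @ features + b)
    return STAGES[int(np.argmax(probs))], probs

rng = np.random.default_rng(0)
features = rng.random(16)         # stand-in for flattened pooled features
w = rng.standard_normal((5, 16))  # hypothetical, untrained weights
b = np.zeros(5)
label, probs = classify(features, w, b)
print(label, probs.round(3))
```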


      The accuracy of the network architecture is estimated from the number of correctly classified DR images out of the pool of images in the different datasets. Whether the algorithm suffers from over-fitting or under-fitting can be visualized by plotting the training and validation loss. The overall objective is to minimize the cost function of the deep convolutional neural network so that the results carry over to the testing datasets.

      For diabetic retinopathy, Specificity (SP), Sensitivity (SE) and Accuracy (Acc) are the crucial performance measures for comparing algorithms. Four quantities take part in measuring this performance:

      True Positive (TP): number of DR images correctly detected as DR images

      True Negative (TN): number of non-DR images correctly detected as non-DR images

      False Positive (FP): number of non-DR images wrongly detected as DR images

      False Negative (FN): number of DR images wrongly detected as non-DR images

      Finally, the sensitivity, specificity and accuracy are measured over the fundus images available in each database.

      Sensitivity (true positive rate or recall) measures how likely the test is to be positive for someone who has diabetic retinopathy. Specificity (true negative rate) measures how likely the test is to be negative for someone who does not have diabetic retinopathy. The positive predictive value is also called precision. Accuracy measures the proportion of images, with and without DR, that are correctly classified in the database.
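These measures follow directly from the four counts using the standard definitions SE = TP/(TP+FN), SP = TN/(TN+FP), precision = TP/(TP+FP) and Acc = (TP+TN)/(TP+TN+FP+FN). The sketch below computes them; the counts in the example are made up for illustration and are not results from this work.

```python
def screening_metrics(tp, tn, fp, fn):
    """Sensitivity, specificity, precision and accuracy from confusion counts."""
    return {
        "sensitivity": tp / (tp + fn),               # true positive rate (recall)
        "specificity": tn / (tn + fp),               # true negative rate
        "precision": tp / (tp + fp),                 # positive predictive value
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }

# Illustrative counts only: 50 DR images and 50 non-DR images.
metrics = screening_metrics(tp=45, tn=40, fp=10, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```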


      Hardware and Software requirements :

      For augmentation, an image editor tool is used for contrast adjustment, color balance adjustment, rotation and cropping. At the preprocessing stage, monochrome conversion and resizing are done with the numpy package. The Convolutional Neural Network (CNN) and the multi-layer deep architecture are implemented using the Theano and Lasagne libraries. Simple datasets are handled on an Intel i5 @ 2.30 GHz with 4 GB RAM running Ubuntu 14.04, using the Theano Python libraries. Handling the large Kaggle dataset requires a Graphics Processing Unit, so an Amazon EC2 web service instance is used.

      Dataset :

      Kaggle dataset [16] : High-resolution retina images taken under a variety of imaging conditions. A clinician rated the presence of diabetic retinopathy in each image on a scale of 0-4. It contains 35126 training images and 53576 test images.

      DRIVE dataset [17] : This database contains 40 color eye fundus images taken with a Canon CR5 3CCD camera with a 45 degree field of view. It is separated into training and test images by two experts.

      STARE [18] : This dataset contains 20 color eye fundus images taken with the TopCon TRV camera with a 35 degree field of view. Each image has a resolution of 700*605.

      RESULTS :

        1. Pre-processing :

          Input images are scaled down to 256×256. Fig. 5.1 shows a database input image and Fig. 5.2 shows its resized monochrome version.

          Fig 5.1 Input image Fig 5.2 Monochrome image

        2. Deep Convolutional Neural Network Architecture:

      Table 5.1 describes the implementation of the deep convolutional neural network architecture. This architecture is trained with the Kaggle dataset images.

      Table 5.1 Deep convolutional neural network architecture

      Layer No. | Maps & Neurons | Kernel Size

      Table 5.2 shows the confusion matrix, i.e. the classification report for the trained datasets.

      Table 5.2 Confusion matrix results

      Actual Results | Predicted Results

      The performance evaluation reports obtained are shown in Table 5.3.

      Table 5.3 Performance evaluation reports

      An overall accuracy of around 94% was achieved for classifying the diabetic retinopathy stages with the STARE and DRIVE datasets.


Among existing supervised algorithms, most require additional pre-processing or post-processing stages to identify the different stages of diabetic retinopathy, and they also require manual feature extraction to classify the fundus images. In our proposed solution, the Deep Convolutional Neural Network (DCNN) is a holistic approach covering all stages of diabetic retinopathy, and no manual feature extraction is needed. Our network architecture with dropout yielded significant classification accuracy, and the true positive rate (recall) also improved. The architecture has some drawbacks: an additional augmentation stage is needed for images taken with different cameras and different fields of view, and the network is complex and computation-intensive, requiring a high-end graphics processing unit to process high-resolution images as more layers are stacked.


  1. S. Wang et al., Hierarchical retinal blood vessel segmentation based on feature and ensemble learning, Neurocomputing (2014),

  2. Mrinal Haloi, Improved Microaneurysm detection using Deep Neural Networks, Cornell University Library (2015), arXiv:1505.04424.

  3. M. Melinscak, P. Prentasic, S. Loncaric, Retinal Vessel Segmentation using Deep Neural Networks, VISAPP (1), (2015): 577-582.

  4. G. Gardner, D. Keating, T.H. Williamson, A.T. Elliott, Automatic detection of diabetic retinopathy using an artificial neural network: a screening tool, British Journal of Ophthalmology (1996); 80: 940-944.

  5. S. Roychowdhury, D.D. Koozekanani, Keshab K. Parhi, DREAM: Diabetic Retinopathy Analysis Using Machine Learning, IEEE Journal of Biomedical and Health Informatics, Vol. 18, No. 5, September (2014).

  6. J. Lachure, A.V. Deorankar, S. Lachure, S. Gupta, R. Jadhav, Diabetic Retinopathy using Morphological operations and Machine Learning, IEEE International Advance Computing Conference (IACC), (2015).

  7. R. Priya, P. Aruna, SVM and Neural Network based Diagnosis of Diabetic Retinopathy, International Journal of Computer Applications (0975-8887), Volume 41, No. 1, (March 2012).

  8. S. Giraddi, J. Pujari, S. Seeri, Identifying Abnormalities in the Retinal Images using SVM Classifiers, International Journal of Computer Applications (0975-8887), Volume 111, No. 6, (2015).

  9. N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, R. Salakhutdinov, Dropout: A simple way to prevent neural networks from overfitting, Journal of Machine Learning Research (2014) 1929-1958.

  10. G. Lim, M.L. Lee, W. Hsu, Transformed Representations for Convolutional Neural Networks in Diabetic Retinopathy Screening, Modern Artificial Intelligence for Health Analytics: Papers from the AAAI (2014).

  11. P. Kulkarni, J. Zepeda, F. Jurie, P. Perez, L. Chevallier, Hybrid Multi-layer Deep CNN/Aggregator feature for Image Classification, Computer Vision and Pattern Recognition, ICASSP conference, (2015).

  12. E.M. Shahin, T.E. Taha, W. Al-Nuaimy, S. El Raaie, O.F. Zahran, F.E. Abd El-Samie, Automated Detection of Diabetic Retinopathy in Blurred Digital Fundus Images, IEEE International Computer Engineering Conference, pages 20-25, (2012).

  13. Xiang Chen et al., A novel method for automatic hard exudates detection in color retinal images, Proceedings of the 2012 International Conference on Machine Learning and Cybernetics, Xian (2012).

  14. Vesna Zeljkovi et al., Classification Algorithm of Retina Images of Diabetic Patients Based on Exudates Detection, 978-1-4673-2362-8/12, IEEE (2012).

  15. Michael Nielsen,

  16. Kaggle dataset: detection/data

  17. DRIVE dataset [online]: J.J. Staal, M.D. Abramoff, M. Niemeijer, M.A. Viergever, B. van Ginneken, "Ridge based vessel segmentation in color images of the retina", IEEE Transactions on Medical Imaging, 2004, vol. 23, pp. 501-509.

  18. STARE dataset [online]:
