Deep Insights in to Diabetic Retinopathy Detection: A Review of Techniques in Deep Learning

DOI : 10.17577/IJERTV13IS010098


Dr. Praveen Blessington Thummalakunta

Sheetal Patil, Bhageshree Supekar, Atharva Ranjane, Vaishnavi Phule

Department of Information Technology,

Zeal College of Engineering and Research, Pune, Maharashtra

Abstract: Diabetic retinopathy (DR) is a leading global cause of blindness and affects a large proportion of people with diabetes. Early detection and treatment are essential to prevent irreversible vision loss. Deep learning algorithms have recently shown considerable promise in automating the analysis of retinal images to detect DR. This study reviews the deep learning methods used in the diagnosis of diabetic retinopathy, with a detailed examination of several convolutional neural network (CNN) architectures. It also explores preprocessing methods, including image augmentation and normalization, and their effects on model performance, and assesses how dataset diversity and size affect the efficacy of DR detection methods. The examined algorithms are compared using evaluation measures including sensitivity, specificity, and the area under the receiver operating characteristic curve. The article also discusses the limitations of current methods and recommends directions for further study. This review is intended as a resource for researchers, physicians, and practitioners working toward reliable and accurate automated systems for diabetic retinopathy identification.

Keywords: Convolutional neural network, Diabetic Retinopathy, retinal images, fundus images, Deep Learning.


    Diabetic retinopathy (DR) is a major complication of diabetes mellitus that damages the blood vessels of the retina and seriously threatens eyesight. It is the primary cause of vision loss in working-age adults worldwide, and its prevalence is directly related to the rising number of people with diabetes (463 million at present). Given this figure, early detection of DR is essential for effective treatment and for preventing visual damage.

    The goal of this review paper is to present an overview of diabetic retinopathy identification using multiple deep learning algorithms. Its main aim is to investigate and assess the efficacy of various convolutional neural network (CNN) architectures intended for retinal image analysis. Prominent models include AlexNet, VGG19, InceptionV3, ResNet18, and DenseNet121. Each of these models has distinctive features and adjustments that affect how well it diagnoses diabetic retinopathy.

    By addressing the shortcomings of current approaches and highlighting the progress made so far, this study aims to prompt discussion of possible future research directions. It is intended as a resource for researchers, clinicians, and practitioners committed to building accurate and trustworthy automated techniques for identifying diabetic retinopathy. The ultimate goal is to contribute to ongoing work in this direction, since better early diagnosis and intervention will improve the prognosis and quality of life of people with diabetes and its related complications.

    According to World Health Organization projections, 347 million people worldwide have diabetes, and by 2030 that figure is expected to rise to 552 million. Diabetic macular edema (DME), cataracts, glaucoma, and diabetic retinopathy (DR) are among the eye conditions to which diabetics are most susceptible. DR, the most frequent cause of vision loss, is brought on by bleeding from the retina's tiny blood vessels. Left untreated, bleeding retinal vessels can result in blindness or varying degrees of vision loss. Signs of DR include hard exudates, vitreous hemorrhages, microaneurysms, and retinal detachment.

    Fig.1 Normal Vs Diabetic Retina.

    Over 44 million adults over the age of 40 have DR problems at some point. Because DR is often silent, it may cause only minor vision problems or no symptoms at all. Physicians recommend yearly eye exams for diabetic patients, since early detection increases the likelihood that treatment will avert blindness. With an early diagnosis and a precise assessment of DR severity, diabetic eye care can be coordinated and suitable treatment provided promptly to prevent blindness and vision loss. Current research, however, indicates that the availability of this kind of healthcare varies from 60% to 90% in wealthy nations, with far lower rates in poorer nations.


M. Abramoff et al. [1] integrated a CNN with the IDx-DR device to detect and classify DR images. They applied data augmentation to the 1748-image Messidor-2 dataset. A Random Forest classifier was used to merge the outputs of several CNNs in order to identify both normal retinal structure and DR abnormalities. The images were divided into three categories: no DR, referable DR, and vision-threatening DR. They reported a sensitivity of 96.8%, a specificity of 87.0%, and an area under the curve of 0.980. However, they did not consider the five DR stages and treated images of the mild DR stage as if there were no DR.

X. Wang and colleagues [2] examined three pre-trained CNN architectures (VGG16, AlexNet, and InceptionNet V3) to identify the five stages of DR in the Kaggle dataset. During the preprocessing phase, the images were downsized to 224 × 224 pixels for VGG16, 227 × 227 pixels for AlexNet, and 299 × 299 pixels for InceptionNet V3. Only 166 images were used, so the networks were trained on a small sample, which could prevent the CNNs from learning additional features, and the images needed further preprocessing to improve them. As a result, they reported an average accuracy of 50.03% for VGG16, 37.43% for AlexNet, and 63.23% for InceptionNet V3. In addition, their evaluation was based on a single dataset.

Gao Huang et al.[3] introduce a novel architecture with dense connectivity, where each layer receives input from all preceding layers. DenseNet addresses vanishing gradients and promotes feature reuse, achieving competitive performance with fewer parameters than traditional architectures. Its impact extends to various computer vision tasks, making it a foundational reference in deep learning literature. This review will highlight DenseNet's key contributions, architectural insights, experimental results, and its lasting influence on subsequent research.
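The dense connectivity described above can be made concrete by counting channels. The following is a minimal sketch, not code from the DenseNet paper: the function names are hypothetical, and the example values (64 initial channels, growth rate 32, 6 layers) are illustrative assumptions. Because each layer receives the concatenation of all earlier feature maps, the input width of a layer grows linearly with its depth in the block.

```python
def dense_block_input_channels(initial_channels, growth_rate, num_layers):
    """Input channel count seen by each layer of one dense block.

    Each layer consumes the concatenation of the block input and the
    outputs of all preceding layers, and contributes `growth_rate`
    new feature maps of its own.
    """
    counts = []
    current = initial_channels
    for _ in range(num_layers):
        counts.append(current)
        current += growth_rate  # this layer's output is appended for later layers
    return counts


def dense_block_output_channels(initial_channels, growth_rate, num_layers):
    """Channel count after concatenating the outputs of every layer in the block."""
    return initial_channels + growth_rate * num_layers


# Illustrative values only (assumed, not taken from the reviewed text).
example_inputs = dense_block_input_channels(64, 32, 6)
example_output = dense_block_output_channels(64, 32, 6)
```

Each layer only has to produce a small number of new feature maps, which is one way to read the claim that dense connectivity encourages feature reuse with fewer parameters.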

Sehrish Qummar, Fiaz Gul Khan, et al. [4] used five pre-trained models (ResNet50, DenseNet169, InceptionV3, DenseNet121, and Xception) on the Kaggle dataset. For multi-class classification, they obtained an accuracy of 80.8%. They employed a stacking technique in which several strong classifiers are combined into an ensemble. In attempting to build a model that corrects for individual errors, the ensemble becomes excessively complex and opaque, which limits the applicability of their work. They also did not discuss which model works best at which stage of DR, nor did they report results for the individual models.

Mobeen-ur-Rehman et al. [5] used a custom CNN architecture and pre-trained models, such as AlexNet, VGG-16, and SqueezeNet, to detect the DR levels of the MESSIDOR dataset. This dataset contains 1200 images covering four DR stages. During the preprocessing phase, the images were cropped, resized to a fixed size of 244 by 244 pixels, and enhanced using histogram equalization (HE). The custom CNN has five learned layers (two CONV layers and three FC layers) together with two max-pooling layers. With their custom CNN, they reported the best accuracy (98.15%), specificity (97.87%), and sensitivity (98.94%). However, their CNN did not identify the DR lesions and was evaluated on only one dataset.
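Histogram equalization, the enhancement step mentioned above, spreads pixel intensities across the full range by mapping each gray level through the cumulative histogram. The following is a minimal pure-Python sketch under stated assumptions: the image is a non-empty 8-bit grayscale image stored as a list of rows, and the function name is hypothetical. It illustrates the general technique and is not the reviewed authors' implementation.

```python
def equalize_histogram(image, levels=256):
    """Global histogram equalization for a grayscale image (list of rows of ints)."""
    pixels = [p for row in image for p in row]

    # Histogram of gray levels.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Cumulative distribution of gray levels.
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)

    cdf_min = min(c for c in cdf if c > 0)  # CDF value at the darkest level present
    total = len(pixels)
    if total == cdf_min:
        # Constant image: nothing to spread out, return a copy.
        return [list(row) for row in image]

    # Map each level so that the darkest present level goes to 0
    # and the brightest goes to levels - 1.
    scale = (levels - 1) / (total - cdf_min)
    lookup = [max(0, round((c - cdf_min) * scale)) for c in cdf]
    return [[lookup[p] for p in row] for row in image]


# A low-contrast patch occupying only levels 50 to 150.
patch = [[50, 50], [100, 150]]
equalized = equalize_histogram(patch)
```

On color fundus images the same mapping is typically applied to a luminance or single color channel; which channel the reviewed work used is not stated in the text above.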

Zubair Khan et al. [6] introduce a hybrid model that combines VGG and NIN (Network in Network) for improved diabetic retinopathy detection, and they underscore the need for better solutions in this domain. The paper reports the model's accuracy, which potentially surpasses existing methods, and discusses evaluation metrics such as sensitivity and specificity. Possible drawbacks include computational complexity, extended training times, and limitations in real-world applications. The work therefore shows the model's strengths while leaving room for improvement in efficiency and practical deployment.

W. Zhang et al. [7] proposed a system to detect DR in their own dataset, which contains 13,767 images organized into four classes. The images were cropped and resized to fit the input requirements of each network, then enhanced using HE and adaptive HE. In addition, contrast stretching was applied to dark images to increase their contrast, and data augmentation was used to enlarge the training set. To identify DR, they fine-tuned the pre-trained CNN architectures ResNet50, InceptionV3, InceptionResNetV2, Xception, and DenseNet. They first trained additional FC layers placed on top of these CNNs and then retrained some of the CNN layers. Finally, the strongest models were combined. They reported the best accuracy of 98.15%, specificity of 97.87%, and sensitivity of 98.94% by their custom CNN.

Using the AlexNet architecture on the Messidor dataset, T. Shanthi and R. Sabeenian [8] obtained 96.35% accuracy in recognizing the stages of diabetic retinopathy. However, their approach was tested with only a single dataset and a single architecture. The preprocessing methods, although successful, may not generalize, and the paper acknowledges that it cannot detect lesions in retinal images. Future studies could test on larger datasets and improve lesion recognition to broaden applicability and make the diagnostic method more reliable.

Dolly Das et al. [9] introduced the Diabetic Retinopathy Feature Extraction and Classification (DRFEC) approach, which employs various deep learning CNN models for feature extraction and image classification in diabetic retinopathy. Their study evaluated models such as VGG-16, VGG-19, Xception, InceptionV3, ResNetV2, and MobileNet. Notably, VGG19 exhibited the lowest accuracy, while DenseNet achieved the highest. This broad comparison of CNN architectures provides useful guidance for developing automated systems for early diabetic retinopathy detection. Future work may focus on refining and combining these models to improve overall performance in clinical applications.

To extend the comparison, we also studied existing methods and the settings they report. These papers are summarized as follows:

Table 1. Algorithms reported in the reviewed papers

Authors | Architecture | Other reported detail | Activation function
Sehrish Qummar, Fiaz Gul Khan [10] | ResNet50, InceptionV3, Xception, DenseNet121, DenseNet169 | Stochastic Gradient Descent (SGD) | -
Suvajit Dutta, Bonthala CS Manideep [11] | Feed Forward Neural Network (FNN) for classification, Deep Neural Network (DNN), VGG19 | - | Sigmoid, ReLU
Gabriel García, Jhair Gallardo [12] | VGG16, VGG16noFC1, VGG16noFC2 | - | -
Xianglong Zeng, Haiquan Chen [13] | InceptionV3, Siamese-like Network Structure | - | Leaky ReLU
Waleed M. Gondal, Jan M. Kohler [14] | CNN, Class Activation Maps (CAM) | - | Softmax, ReLU
Yehui Yang, Tao Li [15] | Two-stage DCNN | - | -
Hidenori Takahashi, Hironobu Tampo [16] | GoogleNet DCNN | Public Source | Softmax, ReLU



In developing the Diabetic Retinopathy Detection System, retinal fundus images were collected from various sources, chiefly 3662 images from the APTOS dataset available on Kaggle. These images, originally scaled to 224 by 224 pixels, cover a spectrum of diabetic retinopathy severity levels, ranging from normal to severe cases. To ensure uniformity and enhance feature visibility, a preprocessing step resized each image to a resolution of 256 x 256 pixels.

The preprocessing procedure extended beyond resizing; it incorporated contrast enhancement techniques to emphasize crucial details and elevate overall image quality. Additionally, standardization of pixel intensity values through normalization was applied, ensuring consistency in image representation. This meticulous preprocessing not only facilitates a standardized format but also provides a foundation for more effective feature extraction.
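The preprocessing chain described above (resizing, contrast enhancement, intensity normalization) can be sketched in a few lines. This is a minimal illustration under stated assumptions rather than the system's actual code: the text does not name the resizing or contrast method, so nearest-neighbor resizing and linear contrast stretching are assumed here, the image is a grayscale list of rows, and all function names are hypothetical.

```python
def resize_nearest(image, out_h, out_w):
    """Resize a grayscale image (list of rows) with nearest-neighbor sampling."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


def stretch_contrast(image, max_value=255.0):
    """Linearly stretch intensities so they span 0..max_value."""
    lo = min(min(row) for row in image)
    hi = max(max(row) for row in image)
    if hi == lo:
        # Flat image: there is no contrast to stretch.
        return [[0.0 for _ in row] for row in image]
    return [[(p - lo) * max_value / (hi - lo) for p in row] for row in image]


def normalize_unit_range(image, max_value=255.0):
    """Scale intensities from 0..max_value to the range 0..1."""
    return [[p / max_value for p in row] for row in image]


def preprocess(image, out_h=256, out_w=256):
    """Resize, enhance contrast, then normalize, in that order."""
    resized = resize_nearest(image, out_h, out_w)
    stretched = stretch_contrast(resized)
    return normalize_unit_range(stretched)


# Tiny stand-in for a fundus image, upscaled to 4 x 4 for illustration.
sample = preprocess([[10, 20], [30, 40]], out_h=4, out_w=4)
```

In practice an image library would normally handle resizing with interpolation; the point of the sketch is only the order of operations and the fact that every image ends up with the same shape and intensity range.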

The Diabetic Retinopathy Detection System focuses on extracting specific features from retinal images, including Microaneurysms, Hemorrhages, and Exudates. Moreover, the system examines the features and structure of retinal blood vessels, aiming to identify anomalies that could signify potential issues. This comprehensive approach extends to scrutinizing abnormalities in the optic disc, termed optic disc anomalies, to provide a holistic assessment of diabetic retinopathy symptoms.


(This work is licensed under a Creative Commons Attribution 4.0 International License.)

By incorporating multiple features and systematically examining retinal structures, this detection system aims to enhance the accuracy and reliability of diabetic retinopathy diagnosis. The emphasis on preprocessing and feature extraction underscores the importance of meticulous image analysis in developing effective automated systems for identifying and categorizing diabetic retinopathy. Future improvements could involve refining feature extraction algorithms or incorporating advanced machine learning techniques to further enhance the system's diagnostic capabilities.

Fig.3 Extracted Features

We divide diabetic retinopathy into five stages based on the extent of retinal damage. The first stage, no DR, applies when no diabetic retinopathy is found in the patient's eye. The earliest disease stage, mild non-proliferative retinopathy, is characterized by small areas of vascular swelling in the retina. Small volumes of blood or fluid may leak from these sites into the retina, leading to moderate non-proliferative retinopathy. As the condition progresses, the blood vessels supplying the retina may become blocked. In severe non-proliferative retinopathy, more vessels are blocked, which drastically reduces the blood supply to parts of the retina (ischemia). This can trigger the growth of abnormal blood vessels, leading to proliferative retinopathy. At this advanced stage, new, fragile blood vessels develop on the surface of the retina and in the vitreous gel that fills the eye. These vessels are prone to leaking, which can cause bleeding inside the eye.
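For a classifier, the five stages above become five class labels. The sketch below shows one possible encoding; the integer coding 0 to 4 is an assumption that follows the convention commonly used in public Kaggle DR datasets, and the names `DRGrade` and `grade_from_label` are hypothetical.

```python
from enum import IntEnum


class DRGrade(IntEnum):
    """Five-stage DR severity scale, ordered from least to most severe."""
    NO_DR = 0
    MILD_NPDR = 1          # mild non-proliferative retinopathy
    MODERATE_NPDR = 2      # moderate non-proliferative retinopathy
    SEVERE_NPDR = 3        # severe non-proliferative retinopathy
    PROLIFERATIVE_DR = 4   # proliferative retinopathy


def grade_from_label(label):
    """Convert an integer class label to a DRGrade.

    Raises ValueError if the label is outside the range 0..4.
    """
    return DRGrade(label)


predicted = grade_from_label(2)
is_worse_than_mild = predicted > DRGrade.MILD_NPDR
```

Using an ordered enumeration keeps the severity ordering explicit, which matters when errors between adjacent stages should be treated differently from errors between distant stages.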

We propose a hybrid approach to identify diabetic retinopathy. CNNs are adept at recognizing patterns in images and extracting the pertinent features. LSTMs excel at processing sequences and can retain long-term dependencies within sequential data. For diabetic retinopathy detection, they can be applied to temporal sequences or patterns in the progression of the disease. In the combined model, the CNN extracts visual features from retinal images, and these features are passed to the LSTM, which models how they change over time or how they relate to one another temporally. This supports both the identification of diabetic retinopathy and the tracking of its progression. By using the spatial image information captured by the CNN together with the temporal dependencies captured by the LSTM, the combined method aims to improve precision and understanding of the disease's course.
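The data flow of the hybrid idea above can be sketched without any deep learning framework. The following is a minimal illustration, not the proposed system: the CNN is replaced by precomputed feature vectors (one per image in a sequence), the LSTM cell uses small random untrained weights with biases omitted, and the names `TinyLSTMCell` and `encode_sequence` are hypothetical. It shows only how per-image features are consumed step by step and summarized in a final hidden state.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


class TinyLSTMCell:
    """Single LSTM cell with random, untrained weights (biases omitted)."""

    def __init__(self, input_size, hidden_size, seed=0):
        rng = random.Random(seed)
        self.hidden_size = hidden_size
        width = input_size + hidden_size  # gates see [features, previous hidden]

        def init():
            return [[rng.uniform(-0.1, 0.1) for _ in range(width)]
                    for _ in range(hidden_size)]

        self.w_forget = init()
        self.w_input = init()
        self.w_output = init()
        self.w_candidate = init()

    def step(self, features, hidden, cell):
        z = list(features) + list(hidden)
        forget = [sigmoid(v) for v in matvec(self.w_forget, z)]
        write = [sigmoid(v) for v in matvec(self.w_input, z)]
        read = [sigmoid(v) for v in matvec(self.w_output, z)]
        candidate = [math.tanh(v) for v in matvec(self.w_candidate, z)]
        # Keep part of the old memory and add part of the new candidate.
        new_cell = [f * c + w * g
                    for f, c, w, g in zip(forget, cell, write, candidate)]
        new_hidden = [r * math.tanh(c) for r, c in zip(read, new_cell)]
        return new_hidden, new_cell


def encode_sequence(lstm, feature_sequence):
    """Run the cell over a sequence of feature vectors; return the last hidden state."""
    hidden = [0.0] * lstm.hidden_size
    cell = [0.0] * lstm.hidden_size
    for features in feature_sequence:
        hidden, cell = lstm.step(features, hidden, cell)
    return hidden


# Stand-ins for CNN features of three images of the same eye taken over time.
visits = [[0.1, 0.2, 0.3, 0.4], [0.2, 0.1, 0.0, 0.5], [0.4, 0.3, 0.1, 0.6]]
summary = encode_sequence(TinyLSTMCell(input_size=4, hidden_size=3), visits)
```

In a trained system the final hidden state would feed a classification layer that predicts the DR stage; that layer and the training procedure are outside this sketch.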

In evaluating the hybrid model, precision, sensitivity, and F1-score are the key metrics. Precision measures how many predicted positives are truly positive, sensitivity (also called recall) measures the model's ability to capture all actual positives, and the F1-score is their harmonic mean, which gives a balanced evaluation. Together these metrics are especially informative on imbalanced datasets. With TP, FP, and FN denoting true positives, false positives, and false negatives, the formulae are:

Precision = TP / (TP + FP)

Sensitivity = TP / (TP + FN)

F1-score = (2 × Precision × Sensitivity) / (Precision + Sensitivity)
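The formulae above translate directly into code. The following is a minimal sketch with hypothetical function names; returning 0.0 when a denominator is zero is a convention assumed here, not something stated in the text. Note that in the F1-score the whole sum of precision and sensitivity is the denominator.

```python
def precision(tp, fp):
    """Fraction of predicted positives that are truly positive."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def sensitivity(tp, fn):
    """Fraction of actual positives that the model detects (recall)."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and sensitivity."""
    p = precision(tp, fp)
    r = sensitivity(tp, fn)
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Made-up counts for illustration: 90 true positives,
# 10 false positives, 30 false negatives.
p = precision(90, 10)
r = sensitivity(90, 30)
f1 = f1_score(90, 10, 30)
```

With these illustrative counts, precision is 0.9 and sensitivity is 0.75, so the F1-score falls between the two and closer to the lower value, as expected of a harmonic mean.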


A diabetic retinopathy detection system that uses a hybrid model of ResNet18 and LSTM is expected to identify indications of diabetic retinopathy in retinal images more reliably. The combination of the LSTM, which can capture temporal dependencies, and the ResNet18 architecture, which is known for its deep convolutional features, allows a fuller understanding of the retinal data. This hybrid strategy is anticipated to strengthen both sequence modeling and feature extraction, improving the sensitivity and specificity of diabetic retinopathy detection. The system's goal is to give patients with diabetic retinopathy access to a more reliable and accurate diagnostic tool for early identification and treatment.


Diabetic retinopathy (DR) affects a sizable fraction of people with diabetes and is one of the primary causes of blindness globally. Preventing irreversible vision loss requires early detection and prompt management. In recent years, deep learning algorithms have shown encouraging results in automating the diagnosis of diabetic retinopathy (DR) from retinal images. We therefore propose a hybrid model of ResNet18 and LSTM. Combining robust feature extraction from ResNet18 with temporal dependency modeling from the LSTM is intended to address the challenge of identifying subtle indications of diabetic retinopathy in image sequences. With this hybrid model, retinal anomalies suggestive of diabetic retinopathy can be identified early. By drawing on the advantages of both architectures, the method aims to improve sensitivity and specificity, which helps diabetic patients receive faster diagnosis and treatment. Although the approach is promising, obstacles remain, such as the need for computational efficiency in real-world deployment and the adaptation of the model to a variety of patient datasets.


[1] Michael David Abramoff, Yiyue Lou, Ali Erginay, Warren Clarida, Ryan Amelon, James C. Folk, Meindert Niemeijer. Improved Automated Detection of Diabetic Retinopathy on a Publicly Available Dataset Through Integration of Deep Learning. Invest. Ophthalmol. Vis. Sci. 2016;57(13):5200-5206.

[2] Wang, Xiaoliang; Lu, Yongjin; Wang, Yujuan; Chen, Wei-Bang. (2018). Diabetic Retinopathy Stage Classification Using Convolutional Neural Networks. pp. 465-471. doi: 10.1109/IRI.2018.00074

[3] Huang, G.; Liu, Z.; van der Maaten, L.; Weinberger, K.Q. Densely connected convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21-26 July 2017; pp. 4700-4708.

[4] Khan, Z.; Khan, F.G.; Khan, A.; Rehman, Z.U.; Shah, S.; Qummar, S.; Ali, F.; Pack, S. Diabetic Retinopathy Detection Using VGG-NIN a Deep Learning Architecture. IEEE Access 2021, 9, 61408-61416.

[5] Mobeen-ur-Rehman, Khan, S. H., Abbas, Z., & Danish Rizvi, S. M. (2019). Classification of Diabetic Retinopathy Images Based on Customized CNN Architecture. 2019 Amity International Conference on Artificial Intelligence (AICAI). cai.2019.8701231.

[6] Khan, Z.; Khan, F.G.; Khan, A.; Rehman, Z.U.; Shah, S.; Qummar, S.; Ali, F.; Pack, S. Diabetic Retinopathy Detection Using VGG-NIN a Deep Learning Architecture. IEEE Access 2021, 9, 61408-61416.

[7] Wei Zhang, Jie Zhong, Shijun Yang, Zhentao Gao, Junjie Hu, Yuanyuan Chen, Zhang Yi. Automated identification and grading system of diabetic retinopathy using deep neural networks. Knowledge-Based Systems, Volume 175, 2019, Pages 12-25, ISSN 0950-7051.

[8] T. Shanthi, R.S. Sabeenian. Modified Alexnet architecture for classification of diabetic retinopathy images. Computers & Electrical Engineering, Volume 76, 2019, Pages 56-64, ISSN 0045-7906.

[9] Das D, Biswas SK, Bandyopadhyay S. Detection of Diabetic Retinopathy using Convolutional Neural Networks for Feature Extraction and Classification (DRFEC). Multimed Tools Appl. 2022 Nov 29:1-59. doi: 10.1007/s11042-022-14165-4. Epub ahead of print. PMID: 36467440; PMCID: PMC9708148.

[10] S. Qummar, F. G. Khan, S. Shah, A. Khan, S. Shamshirband, Z. U. Rehman, I. Ahmed Khan, and W. Jadoon, A deep learning ensemble approach for diabetic retinopathy detection, IEEE Access, vol. 7, pp. 150530-150539, 2019.

[11] S. Dutta, B. C. Manideep, S. M. Basha, R. D. Caytiles, and N. C. S. N. Iyengar, Classification of diabetic retinopathy images by using deep learning models, Int. J. Grid Distrib. Comput., vol. 11, no. 1, pp. 89-106, Jan. 2018.

[12] G. García, J. Gallardo, A. Mauricio, J. López, and A. D. Carpio, Detection of diabetic retinopathy based on a convolutional neural network using retinal fundus images, in Proc. Int. Conf. Artif. Neural Netw. Springer, 2017, pp. 635-642.

[13] X. Zeng, H. Chen, Y. Luo, and W. Ye, Automated diabetic retinopathy detection based on binocular siamese-like convolutional neural network, IEEE Access, vol. 7, pp. 30744-30753, 2019.

[14] W. M. Gondal, J. M. Kohler, R. Grzeszick, G. A. Fink, and M. Hirsch, Weakly-supervised localization of diabetic retinopathy lesions in retinal fundus images, in Proc. IEEE Int. Conf. Image Process. (ICIP), Sep. 2017, pp. 2069-2073.

[15] Y. Yang, T. Li, W. Li, H. Wu, W. Fan, and W. Zhang, Lesion detection and grading of diabetic retinopathy via two-stages deep convolutional neural networks, in Proc. Int. Conf. Med. Image Comput. Comput.-Assist. Intervent. Springer, 2017, pp. 533-540.

[16] H. Takahashi, H. Tampo, Y. Arai, Y. Inoue, and H. Kawashima, Applying artificial intelligence to disease staging: Deep learning for improved staging of diabetic retinopathy, PLoS ONE, vol. 12, no. 6, Jun. 2017, Art. no. e0179790.

[17] Mahmoud Abdallah, Nhien An Le Khac, Hamed Jahromi, and Anca Delia Jurcut. 2021. A Hybrid CNN-LSTM Based Approach for Anomaly Detection Systems in SDNs. In The 16th International Conference on Availability, Reliability and Security (ARES 2021), August 17-20, 2021, Vienna, Austria. ACM, New York, NY, USA, 7 pages.

[18] Nadeem, M.W.; Goh, H.G.; Hussain, M.; Liew, S.-Y.; Andonovic, I.; Khan, M.A. Deep Learning for Diabetic Retinopathy Analysis: A Review, Research Challenges, and Future Directions. Sensors 2022, 22, 6780. s22186780

[19] Romero-Aroca, P.; Verges-Puig, R.; De La Torre, J.; Valls, A.; Relario-Barambio, N.; Puig, D.; Baget-Bernaldiz, M. Validation of a deep learning algorithm for diabetic retinopathy. Telemed. e-Health 2020, 26, 1001-1009.

[20] Ruamviboonsuk, P.; Krause, J.; Chotcomwongse, P.; Sayres, R.; Raman, R.; Widner, K.; Campana, B.J.; Phene, S.; Hemarat, K.; Tadarati, M.; et al. Deep learning versus human graders for classifying diabetic retinopathy severity in a nationwide screening program. npj Digit. Med. 2019, 2, 25.

[21] Beede, E.; Baylor, E.; Hersch, F.; Iurchenko, A.; Wilcox, L.; Ruamviboonsuk, P.; Vardoulakis, L.M. A human-centered evaluation of a deep learning system deployed in clinics for the detection of diabetic retinopathy. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, Honolulu, HI, USA, 25-30 April 2020; pp. 1-12.

[22] Hacisoftaoglu, R.E.; Karakaya, M.; Sallam, A.B. Deep learning frameworks for diabetic retinopathy detection with smartphone-based retinal imaging systems. Pattern Recognit. Lett. 2020, 135, 409-417.

[23] Rego, S.; Dutra-Medeiros, M.; Soares, F.; Monteiro-Soares, M. Screening for Diabetic Retinopathy Using an Automated Diagnostic System Based on Deep Learning: Diagnostic Accuracy Assessment. Ophthalmologica 2021, 244, 250-257.

[24] Benmansour, F.; Yang, Q.; Damopoulus, D.; Anegondi, N.; Neubert, A.; Novosel, J.; Armendariz, B.G.; Ferrera, D. Automated screening of moderately severe and severe nonproliferative diabetic retinopathy (NPDR) from 7-field color fundus photographs (7F-CFP) using deep learning (DL). Invest. Ophthalmol. Vis. Sci. 2021, 62, 115.

[25] Saha, S.K.; Fernando, B.; Cuadros, J.; Xiao, D.; Kanagasingam, Y. Automated quality assessment of color fundus images for diabetic retinopathy screening in telemedicine. J. Digit. Imaging 2018, 31, 869-

[26] Li, T.; Gao, Y.; Wang, K.; Guo, S.; Liu, H.; Kang, H. Diagnostic assessment of deep learning algorithms for diabetic retinopathy screening. Inf. Sci. 2019, 501, 511-522.