Detection of Retinal Diseases using Smartphone Fundoscopy and Deep Learning

DOI : 10.17577/IJERTV11IS090011



Chiranthana M, Suhaas Gummalam, Tharun Sekar

Information Science and Engineering, Dayananda Sagar College of Engineering, Bangalore, Karnataka, India


Abstract: In recent times, there has been a lot of research on diabetic retinopathy and other eye diseases. Our project helps in the diagnosis of several retinal diseases using deep learning techniques. We also built a simple smartphone fundoscopy device. Images taken with the assistance of this ophthalmoscope are fed to our web app, which runs a deep learning model in the backend and returns a diagnosis. The low-cost device and web app are meant to be an alternative to regular hospital visits for people with retinal diseases.


    1. Overview

      Diabetic Retinopathy and Glaucoma are leading causes of blindness. Diabetic Retinopathy occurs when diabetes damages the tiny blood vessels inside the retina. A healthy retina is necessary for good vision. Over time, diabetic retinopathy gets worse and can cause blindness. Everyone with diabetes should get a comprehensive dilated eye exam at least once every year, which can help avoid its progression. Also, during the first three stages of diabetic retinopathy, treatment is typically not needed. But regular access to specialists is difficult for most people. Our project is threefold: (1) We develop a low-cost device that can be used directly with a phone to take retinal images; (2) We build a web application that takes the image from the user and outputs the result of the diagnosis; (3) We build a deep learning model to detect and classify diabetic retinopathy and other retinal diseases.

    2. Problem Statement

      Diabetic retinopathy is a perilous eye disease that can lead to blindness if not treated at the early stages. In recent times, there has been a spike in the number of registered cases of Diabetic Retinopathy. From 2010 to 2050, the number of Americans with diabetic retinopathy is expected to nearly double, from 7.7 million to 14.6 million. The accessibility of Indirect Ophthalmoscopes (devices to produce fundus images) is limited. Glaucoma is a group of eye diseases that can cause vision loss and blindness by damaging the optic nerve, a nerve in the back of the eye. The symptoms can start so slowly that patients may not notice them. The only way to find out whether a person has glaucoma is a comprehensive dilated eye exam. There is no cure for glaucoma, but early treatment can often stop the damage and protect vision.

    3. Objectives

      1. To analyze empirical retina data for the detection of Diabetic retinopathy and Glaucoma.

      2. Application of Deep learning techniques for the detection of Diabetic retinopathy and other eye diseases.

      3. To carry out analysis of retina data for prediction of an aneurysm in the retinal veins.

      4. To detect eye diseases with high accuracy.

    4. Motivation

    Diabetic retinopathy and Glaucoma are frequent causes of blindness in adults in various countries, and their occurrence continues to increase. The global diabetes prevalence in 2019 was estimated to be 9.3% (463 million people), rising to 10.2% (578 million) by 2030 and 10.9% (700 million) by 2045. Prompt treatment of Diabetic Retinopathy can prevent blindness in more than 90% of cases, but the right treatment depends on a timely diagnosis, and that continues to be a challenge worldwide. Diabetic Retinopathy affects the retina, the part of the eye that processes light into images. High blood glucose levels can cause the blood vessels in the eye to burst, swell, or leak, thereby damaging the eye. Researchers estimate that as many as half of all patients with diabetes remain undiagnosed. In many cases, the diagnosis is made only with the onset of complications. Regular and frequent ophthalmic examinations for patients with diabetes are crucial to detect the earliest signs of Diabetic Retinopathy and begin prompt treatment. However, a large number of barriers, including a lack of qualified ophthalmologists, keep many of these patients from receiving the care they need to reduce the risk of blindness.


    Kumar [1] evaluated the diagnostic capability of a smartphone handset compared with a standard office computer workstation for teleophthalmology fundus photo assessments of diabetic retinopathy. Ophthalmic images transmitted through both smartphone and Internet techniques matched well with each other. Despite current limitations, smartphones could represent a tool for fundus photo assessments of diabetic retinopathy. Carrera [2] proposes a computer-assisted diagnosis based on the digital processing of retinal images to help detect diabetic retinopathy in advance. They use a support vector machine to determine the retinopathy grade of each retinal image. Priya [5] describes three models, a Probabilistic Neural Network (PNN), Bayesian Classification, and a Support Vector Machine (SVM), and compares their performances. The extent of the disease spread in the retina can be identified by extracting the features of the retina. Shanmugam [7] used a mobile phone camera as a video indirect ophthalmoscope and a direct ophthalmoscope, particularly to document the fundus findings of infants undergoing examination under anesthesia and also as a tool for fundus screening in camps. Abramoff [8] presents a brief overview of the most prevalent causes of blindness in the industrialized world, including age-related macular degeneration, diabetic retinopathy, and glaucoma; the review is devoted to retinal imaging and image analysis methods and their clinical implications. Throughout the paper, aspects of image acquisition, image analysis, and clinical relevance are treated together, considering their mutually interlinked relationships. Das's [6] survey discusses the possibilities of diagnosing retinal diseases from retinal images using machine learning techniques. Abtahi's [4] paper discusses the probable impacts of the pandemic on ophthalmic disorders and surgical indications over the coming years. Galiero's [3] paper discusses the impact and importance of telemedicine during the COVID-19 pandemic, with a special emphasis on Diabetic Retinopathy, given the drastic increase in DR cases and the need to reduce the cumbersome screening procedure.


    1. Functional Requirements

      1. The system must allow the user to upload images using the upload button on the diagnosis page.

      2. The system must return the result of the diagnosis to the user on the diagnosis page after the user has uploaded the image and clicked on the submit button.

      3. The system must allow the user to contact the development team by entering information into the contact us page.

      4. The system must allow the user to sign in using their credentials on the sign-in page if they have already signed up or created an account using the signup page.

    2. Non-Functional Requirements

      1. When the submit button is clicked, the user should get the diagnosis results within a minute.

      2. The user should only be able to sign in using the correct credentials.

      3. The user should be able to sign into their account using any device.

      4. The user should be able to use the web application using the latest versions of the operating systems and common browsers.

    3. Software Requirements

      1. Front-end: React.js, Firebase.

      2. Back-end: Tensorflow.js, Flask.

      3. Deployment: Google Cloud.

      4. Languages: Python, JavaScript.

    4. Hardware Requirements

      1. 28D Double Aspheric Lens.

      2. Eye model.

      3. Google Colab or GPU.

      4. 8GB RAM.

      5. Windows 10 or Linux or Mac OS.


    1. Analysis

      The patient uses the equipment to capture images of the retina and uploads them to the web app. From there, the image is sent to the deep learning model. The image is pre-processed, and the pre-trained deep learning model is then used to classify it. The model also reports the confidence with which the prediction is made, referred to in this paper as the accuracy of the prediction. The result is sent to the web app to be displayed to the user. The user can also connect with doctors using our Contact Us form.
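As a rough illustration of the final classification step, the sketch below turns a vector of class probabilities into a label and a confidence value. It assumes the model ends in a softmax over the five severity grades used by the dataset in this paper; the function name `decode_prediction`, the label strings, and the example probabilities are hypothetical and not taken from the project code.

```python
# Severity grades 0-4, in the order used by the dataset labels.
LABELS = ["No DR", "Mild", "Moderate", "Severe", "Proliferative DR"]


def decode_prediction(probabilities):
    """Return (label, confidence) for one vector of class probabilities.

    `probabilities` is assumed to be the softmax output of the model,
    with one value per class.
    """
    if len(probabilities) != len(LABELS):
        raise ValueError("expected one probability per class")
    # Index of the most probable class.
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return LABELS[best], float(probabilities[best])


if __name__ == "__main__":
    # Hypothetical model output for a single image.
    label, confidence = decode_prediction([0.05, 0.10, 0.70, 0.10, 0.05])
    print(f"{label} ({confidence:.0%})")
```

The confidence returned here describes a single prediction; it is a different quantity from the accuracy measured over a whole test set.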

    2. System Design

    We have designed the system to fulfill all the functional requirements. The following diagrams describe the design in more detail.

    Fig. 1: System Architecture Diagram

    System Architecture Diagram: Our proposed system architecture (Fig. 1) consists of a user interface, which is a website, and the core program, which is the deep learning model. The user takes the image through his/her cellphone camera with the assistance of our apparatus. The image then undergoes the necessary preprocessing and is fed to our model. The preprocessing includes feature extraction and image augmentation, among other steps. The model then outputs a prediction.


    Data Flow Diagram: Fig. 2 depicts the flow of the proposed system, showing all the involved entities and their connections in order. The usual lifecycle is as follows. The patient takes a photo of his/her eye through the phone camera, with the indirect ophthalmoscope attached to the camera to take close-up images of the retina. The patient then signs in and uploads the image on the website, where a choice of the disease to be checked is offered. After the patient uploads the image and hits submit, the deep learning model returns its prediction, along with the accuracy, on the website itself. The patient can then either consult a doctor through our website or visit a doctor at their own discretion.

    Fig. 2: Data Flow Diagram

    Flow Chart: Fig. 3 is the flow chart of the complete program. It starts with the input dataset taken through a fundus camera. A deep learning model is then trained on these images. The accuracy of the model is assessed and, if it is not satisfactory, the input dataset undergoes image augmentation and the pre-trained model is adjusted. The same process is repeated until the desired accuracy is achieved. Once the model is ready, it is used on the website to make predictions on actual patient data.

    Fig. 3: Flow Chart Diagram

    Use Case Diagram: Fig. 4 shows the use case diagram. The actors are the patients and the doctors. The Web Application Login use case pertains to signing up on the website. The Upload Image use case means uploading an image file to the website. The doctor is associated only with the Doctor's Feedback use case, which applies when the user is willing to consult the doctor. The user is also included in the Result Analysis use case.

    Fig. 4: Use Case Diagram

    Sequence Diagram: Fig. 5 shows the sequence diagram of the proposed system. The actors here are patients and doctors. The figure explains the sequence of events from logging on to the website to contacting a doctor if necessary.

    Fig. 5: Sequence Diagram

    1. Introduction


    1. Web Implementation

    The main concept is to perform the exam on the smartphone screen rather than through an indirect ophthalmoscope. The web page is currently hosted on localhost but can be deployed onto any platform. The device and the web app can be used as follows:

    Enable the smartphone's video mode.

    Set the flash to on for uninterrupted illumination.

    Fit the device with the lens on your smartphone and keep it at an appropriate distance from the eye to start recording.

    Use the smartphone's pinch zoom feature to focus as needed.

    Once the exam is complete, stop the recording. To obtain a still image from the sequence, replay the video and capture a screenshot at the desired time.

    Upload the image to the web app for instant diagnosis using our deep learning models or send it to a doctor for remote consultation.

    1. System Implementation

      The images of the retina are captured using the apparatus attached to the smartphone. The image is digitally processed if necessary. The image is then uploaded to the web application, where it is sent to the deep learning model. The deep learning model is a pre-trained model that classifies whether or not the retina has a retinal disease. The prediction and its accuracy are sent to the user. To train the deep learning model, the images from the dataset are first augmented using the TensorFlow image data generator, which increases the size of the dataset. Once the final dataset is ready, a deep learning model is trained on the images until the desired accuracy is obtained.

      Programming Language Used:

      • Python: for deep learning.

      • JavaScript: for the web application.

      Libraries Used:

      • Pandas & NumPy: for image preprocessing.

      • TensorFlow: to build deep learning models.

      • TensorFlow image data generator: to increase the size of the dataset.

      • Flask: for the back-end of the web application.

      • React.js: for the front end of the web application.

      • Firebase: for user authentication.

    Fig 8: Diagnosis Page

    Fig 10: Result Page

    1. Dataset:

      The training set of the dataset consisted of 3,662 labeled high-resolution color photos, whereas the test set consisted of 1,928 unlabeled images. In each batch, images were divided into 5 groups based on the severity of Diabetic Retinopathy. The group with label 0 is the control group. The labels 1-4 stand for mild, moderate, severe, and proliferative Diabetic Retinopathy, respectively. With more than 1,800 photographs representing the control group (label = 0) and fewer than 300 images representing the most severe category (label = 4), there is a glaring mismatch in the sizes of the groups. Even though such a mismatch is expected in real-world data, it presents a challenge for many machine learning algorithms. In addition to the uneven classes, the dataset's images are varied.
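One common way to compensate for uneven class sizes is to weight each class inversely to its frequency during training. The paper does not state that this was done, so the following is only a sketch of the idea; the function name `class_weights` and the toy label list are illustrative and do not reproduce the real class counts.

```python
from collections import Counter


def class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {
        cls: n_samples / (n_classes * count)
        for cls, count in sorted(counts.items())
    }


if __name__ == "__main__":
    # Toy labels only; these are NOT the class counts of the real dataset.
    toy_labels = [0] * 6 + [1] * 2 + [2] * 2 + [3] * 1 + [4] * 1
    print(class_weights(toy_labels))
```

A dictionary of this form is what the `class_weight` argument of Keras `Model.fit` accepts, so rarer grades contribute more to the loss than the control group.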

    2. Image Pre-processing:

    1) Resizing: Resizing is an important pre-processing step because it should preserve as much information as possible. Here, the images are reduced to 224 × 224 pixels to be compatible with the pre-trained models used for transfer learning. Reducing the size of an image discards pixels, which loses some information but decreases the training time. The images therefore need to be resized to a point that balances a quicker training period against a loss of information large enough to affect the accuracy.
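To make the resizing step concrete, here is a minimal NumPy sketch that reduces an image to 224 × 224 pixels by nearest-neighbour sampling. This is a stand-in chosen for brevity; the project most likely relied on a library resize routine, and the function name `resize_nearest` and the random input image are assumptions.

```python
import numpy as np


def resize_nearest(image, size=(224, 224)):
    """Resize an H x W x C array to `size` by nearest-neighbour sampling."""
    height, width = image.shape[:2]
    new_h, new_w = size
    # For each output row/column, find the source pixel index it maps to.
    rows = np.arange(new_h) * height // new_h
    cols = np.arange(new_w) * width // new_w
    # Broadcasting rows (new_h, 1) against cols (new_w,) selects the grid.
    return image[rows[:, None], cols]


if __name__ == "__main__":
    # Random stand-in for a high-resolution fundus photograph.
    photo = np.random.randint(0, 256, size=(1050, 1050, 3), dtype=np.uint8)
    print(resize_nearest(photo).shape)
```

Nearest-neighbour sampling simply drops pixels, which shows the information loss described above; interpolating resizers trade some of that loss for smoothing.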

    1. Image Normalization:

      Colors are usually represented numerically in the range 0 to 255, with one value for each primary color channel (Red, Green, or Blue). Working directly with these values means the model has to compute over a larger range of numbers, which increases the computation required. Instead, we divide each pixel value of an image by 255 so that it falls in the range 0 to 1. This greatly enhances the efficiency without losing any of the original data.
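The normalization described here amounts to a single division. The sketch below assumes 8-bit images held in NumPy arrays; the function name `normalize` is illustrative.

```python
import numpy as np


def normalize(image):
    """Scale 8-bit pixel values from [0, 255] to [0.0, 1.0]."""
    return image.astype(np.float32) / 255.0


if __name__ == "__main__":
    pixels = np.array([[0, 128, 255]], dtype=np.uint8)
    scaled = normalize(pixels)
    print(scaled.min(), scaled.max())
    # Multiplying back by 255 and rounding recovers the original values.
    print(np.rint(scaled * 255).astype(np.uint8))
```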

      Fig. 6: Device set-up to take images of the retina.

      ID | Test Case Description | Expected Output | Actual Output | Test Case Result
      1 | Sign in without creating an account first | Account does not exist | Account does not exist | Pass
      2 | Sign in with an account already created | Diagnostic page opens | Diagnostic page opens | Pass
      3 | Sign up with the same credentials as existing account | Account already exists | Account already exists | Pass
      4 | Submitting without choosing an image file | Please upload an image file | Please upload an image file | Pass
      5 | Submitting a PDF instead of an image file | Please upload an image file | Please upload an image file | Pass
      6 | Submitting other retinal diseases instead of diabetic retinopathy and Glaucoma | Please upload the specified image file | Uploaded Successfully | Fail
      7 | Adding multiple files | Adding multiple files | Adding multiple files | Pass

        TABLE I: Test Cases
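Test cases 4 and 5 in Table I exercise the upload validation. A minimal sketch of such a check is given below, assuming validation by file extension; the function name `validate_upload` and the set of accepted extensions are hypothetical, and only the error message is taken from the table.

```python
import os

# Assumed set of accepted image types; the paper does not list them.
ALLOWED_EXTENSIONS = {".png", ".jpg", ".jpeg"}
ERROR_MESSAGE = "Please upload an image file"


def validate_upload(filename):
    """Return an error message for a bad upload, or None if it looks valid."""
    if not filename:
        # Test case 4: nothing was chosen before submitting.
        return ERROR_MESSAGE
    extension = os.path.splitext(filename)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        # Test case 5: a non-image file such as a PDF.
        return ERROR_MESSAGE
    return None


if __name__ == "__main__":
    for name in ["", "report.pdf", "retina.JPG"]:
        print(repr(name), "->", validate_upload(name))
```

An extension check cannot tell which disease an image shows, which is consistent with test case 6 failing: rejecting unrelated retinal images would require inspecting the image content.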

    (a) The device. (b) Image of the retina. (c) Image of the retina. (d) Augmented image.

    Fig. 7: Implementation



    Image classification using Deep Learning generally requires a large input dataset to make accurate predictions. In most cases, the classes are imbalanced, which implicitly biases the model: because the model learns through repeated passes over the data, it sees the underrepresented groups far less often than the overrepresented groups. To counter this, data augmentation is used to subtly alter existing images and produce additional training examples. The alterations can range from resizing to flipping an image.
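The idea of augmentation can be sketched with plain NumPy flips and rotations. The project itself used the TensorFlow image data generator, so this is an illustrative stand-in rather than the actual pipeline; the function name `augment` and the choice of transformations are assumptions.

```python
import numpy as np


def augment(image):
    """Return simple label-preserving variants of one H x W x C image."""
    return [
        image,
        np.fliplr(image),      # horizontal flip
        np.flipud(image),      # vertical flip
        np.rot90(image, k=1),  # 90-degree rotation (shape kept if square)
    ]


if __name__ == "__main__":
    # Random stand-in for a resized fundus image.
    retina = np.random.randint(0, 256, size=(224, 224, 3), dtype=np.uint8)
    variants = augment(retina)
    print(len(variants), [v.shape for v in variants])
```

Applying such transformations mainly to the underrepresented grades is one way to reduce the imbalance described above.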


    A method called transfer learning can be used to perform computer vision tasks on small datasets very effectively. Here, a network that was previously trained on a very large image dataset is saved and reused. One such database, called ImageNet, gives users access to more than 14 million manually labeled pictures for computer vision applications. The goal of ImageNet's annual challenge was to promote research into cutting-edge algorithms and advance computer vision research. The most advanced models have extremely complicated architectures with dozens to hundreds of layers. It is not viable to train these models on a personal computer without access to extremely powerful computing resources. Because these models have been trained on such enormous datasets, the spatial hierarchy of features learned by the pre-trained model can serve as a general representation of the visual world.

    1. Feature extraction

      Feature extraction refers to taking the convolutional base of a pre-trained network and placing a new classifier (dense layer), trained from scratch, on top of it. The convolutional base contains useful maps of generic concepts that can be transferred between tasks even if the tasks are unrelated. The convolutional bases of VGG16 (Visual Geometry Group), ResNet50, and Xception are compared to see which achieves the maximum accuracy.

    2. Fine-tuning

    Early layers of the convolutional base tend to extract less abstract features such as edges and dots, while higher layers tend to extract features that may be more task-specific. A pre-trained model can be fine-tuned by unfreezing some of the top layers while keeping the base layers frozen. Fine-tuning is proposed as a beneficial method for enhancing model accuracy: modifying these more abstract, task-relevant layers may improve the model's performance.
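Feature extraction and fine-tuning differ only in how many layers of the convolutional base remain trainable. The sketch below shows this with VGG16, one of the three bases compared in this paper. The classifier head, optimizer, learning rate, and the names `set_trainable` and `build_classifier` are assumptions rather than the authors' configuration, and TensorFlow must be installed to build the model.

```python
def set_trainable(layers, n_unfrozen=0):
    """Freeze every layer except the last `n_unfrozen` ones."""
    cutoff = len(layers) - n_unfrozen
    for index, layer in enumerate(layers):
        layer.trainable = index >= cutoff


def build_classifier(n_classes=5, n_unfrozen=0, weights="imagenet"):
    """VGG16 convolutional base plus a new dense classifier head."""
    import tensorflow as tf  # imported lazily; only needed to build the model

    base = tf.keras.applications.VGG16(
        include_top=False, weights=weights, input_shape=(224, 224, 3)
    )
    # n_unfrozen=0 gives pure feature extraction;
    # n_unfrozen>0 unfreezes the top layers of the base for fine-tuning.
    set_trainable(base.layers, n_unfrozen)
    model = tf.keras.Sequential([
        base,
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(n_classes, activation="softmax"),
    ])
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

A typical workflow would first train with `n_unfrozen=0` so that only the new head learns, then rebuild or recompile with a few top layers unfrozen and a small learning rate so the pre-trained weights are adjusted gently.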


    The complete implementation code can be found at

    The pseudocode is as follows:

    Step 1: Take a photo of the eye using the artificial ophthalmoscope attached to the mobile phone.

    Step 2: Upload the image to the website, which is responsible for providing a result along with its accuracy.

    Step 3: Shrink the original image to (224, 224) pixels to fit the deep learning model.

    Step 4: Preprocess the images, i.e. apply data augmentation using the TensorFlow image data generator to increase the size of the dataset.

    Step 5: Feed the images into the deep learning model. Here the model learns the patterns in the given dataset through repeated passes through the convolutional and dense layers.

    Step 6: Evaluate the trained model against the ground truth to obtain its accuracy.

    Step 7: If the accuracy is lower than desired, experiment with the parameters and go to Step 4.

    Step 8: Display the result and its accuracy on the website for the patient to view.

    Step 9: Provide an option to mail the results to a doctor, in case the patient is looking for a further review.
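The loop formed by Steps 4 to 7 can be sketched as a search over candidate settings that stops once the target accuracy is reached. This is a schematic only: `tune_until`, the settings strings, and the accuracy values in the usage example are hypothetical, and `train_and_evaluate` stands for whatever routine trains the model and scores it against the ground truth.

```python
def tune_until(target_accuracy, candidate_settings, train_and_evaluate):
    """Try settings in order until one reaches `target_accuracy`.

    `train_and_evaluate` is a caller-supplied function that trains a model
    with the given settings and returns its accuracy on held-out data.
    Returns (settings, accuracy) for the first success, or for the best
    attempt if no candidate reaches the target.
    """
    best_settings, best_accuracy = None, -1.0
    for settings in candidate_settings:
        accuracy = train_and_evaluate(settings)
        if accuracy > best_accuracy:
            best_settings, best_accuracy = settings, accuracy
        if accuracy >= target_accuracy:
            break  # Step 7 condition satisfied; stop experimenting.
    return best_settings, best_accuracy


if __name__ == "__main__":
    # Made-up accuracies standing in for real training runs.
    fake_results = {"baseline": 0.60, "more augmentation": 0.85, "fine-tuned": 0.90}
    print(tune_until(0.80, list(fake_results), fake_results.get))
```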


    The different test cases that we used to test our project are as follows:


    Detection of retinal diseases like Diabetic Retinopathy and Glaucoma.

    Creation of a web interface using MERN stack for patients and doctors to access and interact.

    Deep learning model selection by analyzing various parameters.

    Establishing a hardware model through trial and error.

    Understanding the lenses used in an indirect ophthalmoscope.



    The fundus images of patients having retinal diseases are classified with high accuracy. Images of patients having diabetic retinopathy are classified along with the stage of the disease. Everyone with retinal diseases should get a comprehensive dilated eye exam at least once in 4 months, which can help avoid progression of the disease. But regular access to specialists is difficult for most people. Our project addresses this problem with hardware that attaches to a smartphone camera and acts as an indirect ophthalmoscope. Our web app gives an instant diagnosis on retinal images using deep learning. Every patient diagnosed with a retinal disease can contact a doctor by sending them the prediction results for further diagnosis or treatment plans.


We plan on including more retinal diseases as well as detecting them at every stage.

We plan on including in-app doctor consultations by partnering with doctors.

We plan on further improving the accuracy of our deep learning models and experimenting with other models.

We plan on studying HIPAA guidelines to understand how best we can protect our users' medical data.

We plan on doing outreach and making this extremely low-cost device accessible not only in urban but also in suburban and rural areas in India.

We plan on making the project open source so that other people can improve it.


[1] Kumar, S., Wang, E. H., Pokabla, M. J., Noecker, R. J. (2012). Teleophthalmology assessment of diabetic retinopathy fundus images: smartphone versus standard office computer workstation. Telemedicine journal and e-health: the official journal of the American Telemedicine Association, 18(2), 158


[2] Carrera, E.V., González, A., Carrera, R.A. (2017). Automated detection of diabetic retinopathy using SVM. 2017 IEEE XXIV International Conference on Electronics, Electrical Engineering and Computing (INTERCON), 1-4.

[3] Galiero, R., Pafundi, P. C., Nevola, R., Rinaldi, L., Acierno, C., Caturano, A., Salvatore, T., Adinolfi, L. E., Costagliola, C., Sasso, F. C. (2020). The Importance of Telemedicine during COVID-19 Pandemic: A Focus on Diabetic Retinopathy. Journal of diabetes research, 2020, 9036847.

[4] Abtahi, S. H., Nouri, H., Moradian, S., Yazdani, ., Ahmadieh, H. (2021). Eye Disorders in the Post-COVID Era. Journal of ophthalmic vision research, 16(4), 527-530.


[6] Das, S., Malathy, C. (2018). Survey on the diagnosis of diseases from retinal images.

[7] Shanmugam, M. P., Mishra, D. K. C., Madhukumar, R., Ramanjulu, R., Reddy, S. Y., Rodrigues, G. Fundus imaging with a mobile phone: A review of techniques.

[8] Abràmoff, M. D., Garvin, M. K., Sonka, M. (2010). Retinal imaging and image analysis. IEEE reviews in biomedical engineering, 3, 169-208.

[9] Gagnon, L., Lalonde, M., Beaulieu, M., Boucher, M.C. (2001). Procedure to detect anatomical structures in optical fundus images. SPIE Medical Imaging.

[10] Shabbir, A., Rasheed, A., Shehraz, H., Saleem, A., Zafar, B., Sajid, M., Ali, N., Dar, S. H., Shehryar, T. (2021). Detection of glaucoma using retinal fundus images: A comprehensive review. Mathematical biosciences and engineering: MBE, 18(3), 2033-2076.



[13] classification-with-TensorFlow-data-augmentation-on- streaming-data-part-2/

[14] transfer- learning-for-medical-image-classification- fd772054fdc7


We would like to express our deep gratitude to Professor Suma V for her patient guidance, enthusiastic encouragement, and useful critiques of this research work.

We would also like to thank our parents, teachers, and all the faculty of Dayananda Sagar College of Engineering for their constant support and encouragement.