DOI : 10.17577/IJERTCONV14IS050007- Open Access

- Authors : Md Tanvir Chowdhury, Md Fahad Mia, Anindita Sutradhar, Md Khalid Mahbub Khan, Abu Talha
- Paper ID : IJERTCONV14IS050007
- Volume & Issue : Volume 14, Issue 05, IIRA 5.0 (2026)
- Published (First Online) : 24-05-2026
- ISSN (Online) : 2278-0181
- Publisher Name : IJERT
- License:
This work is licensed under a Creative Commons Attribution 4.0 International License
Eye Diseases Detection Using Machine Learning Models
Md Tanvir Chowdhury
Department of Computer Science and Engineering
East West University
Dhaka, Bangladesh
mdtanvirchowdhury015@gmail.com

Abu Talha
Department of Computer Science and Engineering
East West University
Dhaka, Bangladesh
mtalha147258@gmail.com

Md Fahad Mia
Department of Computer Science and Engineering
East West University
Dhaka, Bangladesh
fahad.wp07@gmail.com

Md Khalid Mahbub Khan
Department of Computer Science and Engineering
East West University
Dhaka, Bangladesh
khalidmahbub.khan@yahoo.com

Anindita Sutradhar
Department of Computer Science and Engineering
East West University
Dhaka, Bangladesh
2022-1-50-007@std.ewubd.edu
Abstract: This research analyzes eye disease identification using deep learning models, evaluating and comparing EfficientNetB0, VGG19, and MobileNetV2 for classifying fundus images into four classes: cataract, diabetic retinopathy, glaucoma, and normal. A collection of fundus images was used to train and assess the models, with performance quantified by accuracy, precision, recall, F1-score, and loss values throughout both training and validation phases. The EfficientNetB0 model performed best, with a peak training accuracy of 98.5% and a validation accuracy of 94.2%, demonstrating robust generalization and tolerance to variations in image quality. It surpassed VGG19 in accuracy, recall, and F1-score, especially for minority classes, while showing lower memory consumption and faster convergence. VGG19 also achieved high accuracy, with a training accuracy of 97.8% and a validation accuracy of 92.6%, but showed higher computational requirements and signs of overfitting. MobileNetV2, while not extensively assessed, showed promise for real-time applications owing to its lightweight design. This study emphasizes the efficacy of deep learning models in medical image classification and offers guidance on selecting the most suitable model for practical use in ophthalmology.
Keywords: Eye disease detection, deep learning, convolutional neural networks (CNNs), EfficientNetB0, VGG19, MobileNetV2, retinal image classification.
-
INTRODUCTION
The identification and diagnosis of ocular illnesses are essential for preventing visual impairment and for prompt medical treatment. Worldwide, millions have ocular disorders, with diabetic retinopathy, glaucoma, and cataracts ranking as primary causes of blindness. Timely identification and precise diagnosis can markedly enhance therapy results. Nevertheless, conventional diagnostic methods frequently depend on manual assessment by ophthalmologists, which is both labor-intensive and susceptible to human error. With the increasing demand for accurate and scalable diagnostic solutions, the application of deep learning techniques in medical imaging has emerged as a disruptive strategy.
Recent breakthroughs in deep learning, especially convolutional neural networks (CNNs), have transformed image classification tasks, including applications in healthcare. CNN-based models can discern complex patterns in high-resolution images, making them adept at detecting subtle signs of ocular disorders. Robust automated systems for disease detection can reduce the workload on healthcare practitioners while improving diagnostic efficiency and precision.
This study examines three notable CNN architectures (MobileNetV2, EfficientNetB0, and VGG19) for the automated identification of eye diseases. These models were chosen for their distinct architectural benefits. MobileNetV2, engineered for resource efficiency, is especially appropriate for mobile and embedded applications. EfficientNetB0 uses a compound scaling method to balance width, depth, and resolution, resulting in enhanced accuracy with reduced processing requirements. Conversely, VGG19 has a deep, conventional architecture, known for its capacity to extract robust feature representations from images, but with increased processing demands.
A publicly accessible collection of high-resolution retinal pictures was employed to train and assess these models. Preprocessing methods, such as data cleaning, augmentation, resizing, and normalization, were utilized to guarantee effective model training and generalization. The dataset was partitioned into training, validation, and testing subsets, with models trained via categorical cross entropy loss and optimized via the Adam optimizer.
The performance evaluation used standard metrics: accuracy, precision, recall, F1 score, and confusion matrices. The findings indicated that EfficientNetB0 surpassed MobileNetV2 and VGG19, with a test accuracy of 92.5%, in contrast to 89.8% and 87.2%, respectively. The efficient scaling design of EfficientNetB0 supported better generalization and faster convergence, making it well suited to practical clinical applications.
This research offers two primary contributions: it presents a comparative examination of CNN models for eye disease diagnosis and emphasizes the potential of EfficientNetB0 as a feasible option for deployment in healthcare settings. This work highlights the significance of deep learning in enhancing automated diagnosis and provides a basis for further investigation in medical image analysis.
-
LITERATURE REVIEW
One study introduces an automated approach for the detection of diabetic eye disorders using a Fast Region-based Convolutional Neural Network (FRCNN) and fuzzy k-means (FKM) clustering. FRCNN localizes the disease regions, which are then segmented by FKM. The methodology is assessed across several datasets, demonstrating proficient disease identification and segmentation [1].
Nasir et al. (2020) examine the application of deep learning methodologies for the analysis of retinal images to identify diabetes-related ocular disorders. The authors investigate the potential of machine learning models to aid in the identification of diabetic eye problems using retinal pictures, with the objective of enhancing early detection and therapy [2].
Diabetes Mellitus elevates the risk of ocular disorders, making early identification essential. This overview examines automated methods for detecting diabetic eye disease, covering datasets, image preprocessing, deep learning models, and performance indicators. It offers significant insights for academics, healthcare practitioners, and individuals with diabetes to enhance detection methodologies [3].
Another approach uses attention, transfer learning, and deep convolutional neural networks to diagnose ocular diseases, including Choroidal Neovascularization, Diabetic Macular Edema, and Drusen, from Optical Coherence Tomography images. It attains 97.79% accuracy during training and 95.6% during testing, providing an effective tool for ophthalmologists in disease diagnosis [4].
This research examines the application of image processing and machine learning methodologies for the detection of ocular disorders. It underscores the difficulties of handling escalating patient data in clinical environments and how these technologies might aid in illness identification, categorization, and diagnosis, hence enhancing efficiency in medical picture analysis [5].
The research examines the application of image processing techniques for the detection of ocular diseases, particularly in rural and semi-urban regions with restricted access to ophthalmologists. Principal techniques include image registration, segmentation, and classification. These methods, coupled with affordable, mobile-enabled networks, can improve access to sophisticated healthcare, especially for marginalized people [6].
The Glaucoma-Deep system employs an unsupervised convolutional neural network (CNN) architecture to extract features from retinal images and a deep belief network (DBN) for feature selection. It classifies glaucoma with an adaptive softmax linear classifier, attaining notable performance with sensitivity (84.5%), specificity (98.01%), accuracy (99%), and precision (84%). It surpasses current systems [7].
This paper introduces an innovative, lightweight, multi-stage deep learning framework for the automated diagnosis of eye diseases. It augments feature extraction via a dual-branch methodology, enhancing precision and resilience. The model, assessed across several datasets, surpasses current systems by as much as 1%, providing a pragmatic and efficient alternative for real-world implementation [8].
Deep neural network models utilizing transfer learning were employed on medical picture datasets for the early diagnosis of ocular disorders such as Diabetic Retinopathy, Cataract, and Glaucoma. VGG19, InceptionV3, and ResNet50 attained accuracies of 90.33%, 89.8%, and 99.94%, respectively, with ResNet50 exhibiting the greatest precision, recall, and F1 scores [9].
This paper presents a multi-level deep convolutional neural network (ML-DCNN) for the identification and classification of glaucoma utilizing retinal fundus pictures. The model employs adaptive histogram equalization for preprocessing and categorizes pictures into advanced, intermediate, or early stages of glaucoma. It attains exceptional performance with a threshold of 97.04%, an accuracy of 99.39%, and a precision of 98.2% [10].
The CRADLE (ComputeR-Assisted Detector of LEukocoria) application identifies leukocoria in pediatric photographs, facilitating the early detection of ocular conditions such as retinoblastoma and cataracts. In an analysis of 52,982 photographs, the application detected leukocoria in 80% of afflicted children prior to diagnosis, providing a valuable adjunct to clinical screening [11].
This study creates an automated approach for the classification of diabetic eye disease (DED) utilizing retinal fundus images. It targets moderate and multi-class DED, employing pretrained CNN models (VGG16) with fine-tuning, optimization, and contrast enhancement methodologies. The method attained an accuracy of 88.3% for the classification of multiple classes and 85.95% for moderate DED rating [12].
A pilot study assessed a tele-eye approach for identifying prevalent age-related ocular conditions, comparing it with in-person examinations. The study demonstrated high concordance for cataract (100%), macular degeneration (96%), and glaucoma suspect (87%), indicating that the tele-eye program is both practicable and precise, but constrained by sample size [13].
The research introduces an AI-driven model for eye disease categorization (EDC) via deep learning methodologies. Denoising autoencoders, SSD for feature extraction, and the whale optimization technique are utilized for precise categorization of fundus pictures. The model attained elevated accuracy and Kappa values, surpassing contemporary models and assisting healthcare facilities [14].
This research evaluates many machine learning models for the detection of glaucoma and other ocular disorders utilizing retinal images. It examines current models and their efficacy, and emphasizes their efficiency in identifying problems such as glaucoma and diabetic ocular complications. The report examines prospective research avenues in AI-driven medical image analysis for the identification of ocular diseases [15].
-
METHODOLOGY
-
Data Collection:
The dataset comprises images classified into four categories according to eye condition. The images are sourced from publicly accessible medical archives and organized into labeled directories, facilitating automated classification. Because class imbalance is common in medical datasets, the distribution of images across classes is plotted to check that each class is adequately represented.
-
Image Preprocessing:
Raw images vary in size, noise level, and lighting conditions and must be normalized before being input into deep learning models. To ensure uniformity and good performance in the MobileNetV2 model, several preprocessing steps were applied to the dataset. All images were resized to a uniform input dimension of 224×224 pixels, matching the model's input specification. Pixel values were normalized by Min-Max scaling to the interval [0,1], which improves numerical stability and speeds convergence during training. Noise was reduced by Gaussian filtering, which suppresses undesirable artifacts and supports better classification accuracy. Finally, channel standardization converted grayscale images into three-channel RGB format to ensure compatibility with the CNNs and keep the dataset homogeneous. Together, these preprocessing steps improve model robustness and overall classification performance.
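The preprocessing steps above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' pipeline: the function names are hypothetical, the resize is a simple nearest-neighbour stand-in for a library resize, and the Gaussian filtering step is omitted for brevity.

```python
import numpy as np

def to_rgb(image):
    """Replicate a grayscale (H, W) image into three channels (H, W, 3)."""
    if image.ndim == 2:
        return np.stack([image] * 3, axis=-1)
    return image

def resize_nearest(image, size=(224, 224)):
    """Nearest-neighbour resize (a simple stand-in for a library resize)."""
    h, w = image.shape[:2]
    rows = np.arange(size[0]) * h // size[0]
    cols = np.arange(size[1]) * w // size[1]
    return image[rows][:, cols]

def min_max_scale(image):
    """Scale pixel values to the interval [0, 1]."""
    image = image.astype(np.float32)
    lo, hi = image.min(), image.max()
    if hi == lo:  # constant image: avoid division by zero
        return np.zeros_like(image)
    return (image - lo) / (hi - lo)

def preprocess(image):
    return min_max_scale(resize_nearest(to_rgb(image)))

# Hypothetical grayscale input of arbitrary size
raw = np.random.default_rng(0).integers(0, 256, size=(300, 400), dtype=np.uint8)
out = preprocess(raw)
print(out.shape, out.dtype)
```

In practice a library routine with interpolation would normally replace `resize_nearest`; the point here is only the order of operations and the resulting 224×224×3 array in [0, 1].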
-
Data Augmentation for Robustness:
Deep learning methods require large datasets to generalize well, but medical imaging datasets are frequently limited in size. To address this, data augmentation is applied using ImageDataGenerator, adding variation that improves model robustness. The augmentations include horizontal and vertical flipping, rotation, shearing, brightness adjustment, and zooming, which expand the diversity of the dataset. These techniques replicate real-world variation, such as changes in position, illumination, and focus, enhancing the model's capacity to generalize across diverse settings. Horizontal and vertical flipping emulate anatomical variation; rotation and shearing address positioning inconsistencies; brightness adjustment accounts for lighting differences; and zooming and cropping concentrate on pertinent areas of the eye. To check that the model's performance is robust, the dataset is divided into training and validation sets (e.g., 80% training, 20% validation), enabling evaluation of the model's generalization on unseen data prior to final testing.
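As a rough illustration of the augmentation and split described above, the following NumPy sketch applies random flips and a brightness change and then produces an 80/20 index split. It is not the ImageDataGenerator configuration used in the study; the brightness range, the seed, and the helper names are assumptions, and rotation, shearing, and zooming are left out to keep the example short.

```python
import numpy as np

rng = np.random.default_rng(42)

def augment(image, rng):
    """Apply random flips and a brightness change to an (H, W, C) image in [0, 1]."""
    if rng.random() < 0.5:
        image = image[:, ::-1]      # horizontal flip
    if rng.random() < 0.5:
        image = image[::-1, :]      # vertical flip
    factor = rng.uniform(0.8, 1.2)  # assumed brightness range
    return np.clip(image * factor, 0.0, 1.0)

def train_val_split(n_samples, val_fraction=0.2, rng=rng):
    """Return shuffled index arrays for a train/validation split."""
    indices = rng.permutation(n_samples)
    n_val = int(n_samples * val_fraction)
    return indices[n_val:], indices[:n_val]

# Random stand-in images; real fundus images would be loaded from disk
images = rng.random((10, 224, 224, 3))
augmented = np.stack([augment(img, rng) for img in images])
train_idx, val_idx = train_val_split(len(images))
print(augmented.shape, len(train_idx), len(val_idx))
```

Augmentation is normally applied only to the training subset, so that validation scores reflect performance on unmodified images.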
-
Model Selection & Transfer Learning:
-
Rationale for Using Pre Trained CNN Models:
CNN-based transfer learning uses models pretrained on ImageNet, reducing the need for large medical datasets. The following models were chosen for their computational efficiency and classification accuracy:
-
MobileNetV2 is a streamlined deep learning architecture tailored for edge devices with constrained processing resources and power. It is optimized for mobile and embedded devices, providing fast inference without requiring substantial processing resources. This is accomplished with depthwise separable convolutions, which reduce the number of parameters and operations while preserving a high degree of accuracy. MobileNetV2 also uses linear bottlenecks and inverted residuals, making it well suited to real-time applications such as medical imaging, where rapid and efficient predictions are essential in clinical settings.
-
EfficientNetB0 is an efficient model built around compound scaling, which improves accuracy while limiting computational cost. The EfficientNet family scales depth, width, and resolution jointly using a single compound coefficient, rather than scaling each independently, which yields models that handle more complex data with fewer parameters. This approach balances computational cost against performance, making EfficientNetB0 a strong option where both accuracy and efficiency matter. It is especially useful in medical imaging tasks, where large, high-resolution images must be processed with little computational burden, enabling faster results for time-critical diagnosis.
-
VGG19 is a deep, conventional convolutional neural network (CNN) architecture known for its simplicity, with 19 weight layers. It extracts rich spatial features from images, making it valuable for tasks that require distinguishing subtle patterns or fine details, such as diagnosing eye conditions in medical images. VGG19's depth lets it capture detail across many image scales, making it proficient at analyzing detailed medical scans. However, its size entails significant memory and processing demands, which may restrict its use in some contexts. Even so, VGG19's strong feature extraction is advantageous for identifying complex patterns, particularly in the early stages of ocular diseases, when minor changes are crucial for diagnosis.
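The parameter savings from the depthwise separable convolutions mentioned for MobileNetV2 can be checked with simple arithmetic. The sketch below compares a standard convolution with its depthwise separable counterpart for one layer; the layer sizes are illustrative assumptions (not taken from any of the three networks), and bias terms are ignored.

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights in a standard k x k convolution (bias ignored)."""
    return kernel * kernel * c_in * c_out

def depthwise_separable_params(kernel, c_in, c_out):
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = kernel * kernel * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise

k, c_in, c_out = 3, 32, 64  # illustrative layer sizes
std = standard_conv_params(k, c_in, c_out)
sep = depthwise_separable_params(k, c_in, c_out)
print(std, sep, round(std / sep, 2))  # prints: 18432 2336 7.89
```

For this example layer the separable form needs roughly one eighth of the weights, which is the kind of reduction that makes such architectures attractive on constrained hardware.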
-
-
Model Architecture Modification
Each pretrained model is altered by substituting the last layers to accommodate the multi-class classification problem. The supplementary layers comprise:
Global Average Pooling (GAP) is a downsampling method that collapses the spatial dimensions of each feature map into a single value per feature channel. Unlike conventional pooling techniques (such as max pooling), which select the maximum value from each local region, GAP averages all values within a feature map. This retains global information while markedly reducing the parameter count of the layers that follow, improving the model's efficiency. GAP is well suited to medical image classification, as it preserves the most important information while minimizing computational complexity.
Fully Connected (Dense) Layers combine the features extracted by preceding layers to learn complex patterns and relationships among them. In medical image analysis, these layers enable the model to identify patterns specific to particular conditions or diseases, including ocular disorders. Dense layers integrate the high-level features produced by the convolutional layers to form predictions. By learning non-linear relationships among these features, fully connected layers help the model discern subtle distinctions in medical images, including different stages of a disease.
Dropout layers are used during training to mitigate overfitting, in which a model memorizes the training data instead of generalizing to new, unseen data. Dropout randomly deactivates a fraction of neurons in each training iteration. This forces the model to learn more robust features and prevents excessive dependence on any one neuron or group of neurons, improving generalization. This approach is especially beneficial for training deep networks on medical data, where overfitting is a challenge owing to the complexity and diversity of the images.
Softmax Activation is used in the final layer of a classification network, where it converts the raw output scores (logits) into probabilities. In a multi-class classification task, such as distinguishing different eye diseases, softmax produces a probability distribution over the classes. The class with the highest probability is selected as the predicted label. Softmax is well suited to classifying images into discrete categories, since the model's output is interpretable and assigns each class a probability.
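To make the role of each added layer concrete, here is a minimal NumPy forward pass through a classification head of this kind: global average pooling, a dense layer with ReLU, dropout, and a softmax output over four classes. The feature-map size, hidden width, dropout rate, and random weights are assumptions for illustration; the paper does not state these values.

```python
import numpy as np

rng = np.random.default_rng(0)

def global_average_pool(feature_maps):
    """(H, W, C) feature maps -> (C,) vector of per-channel means."""
    return feature_maps.mean(axis=(0, 1))

def dense(x, weights, bias):
    return x @ weights + bias

def relu(x):
    return np.maximum(x, 0.0)

def dropout(x, rate, rng, training):
    """Inverted dropout: zero a fraction `rate` of units and rescale the rest."""
    if not training or rate == 0.0:
        return x  # dropout is inactive at inference time
    mask = rng.random(x.shape) >= rate
    return x * mask / (1.0 - rate)

def softmax(logits):
    shifted = logits - logits.max()  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

# Assumed sizes: 7x7x1280 backbone output, 128 hidden units, 4 classes
features = rng.random((7, 7, 1280))
w1, b1 = rng.normal(0, 0.01, (1280, 128)), np.zeros(128)
w2, b2 = rng.normal(0, 0.01, (128, 4)), np.zeros(4)

pooled = global_average_pool(features)
hidden = dropout(relu(dense(pooled, w1, b1)), rate=0.5, rng=rng, training=False)
probs = softmax(dense(hidden, w2, b2))
print(pooled.shape, probs.shape, int(probs.argmax()))
```

The pooled vector has one entry per channel, and the softmax output sums to one, so the index of its largest entry can be read as the predicted class.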
-
-
Training Strategy & Hyperparameter Optimization:
-
Loss Function & Optimizer
The loss function used for this model is categorical cross-entropy, which is appropriate for multi-class classification. It measures the disparity between the predicted class probabilities (from the softmax output) and the actual class labels, penalizing a prediction more heavily the further its predicted probability is from the true label. Categorical cross-entropy is effective when many classes are present, making it suitable for medical image classification tasks that require identifying several disorders, such as different eye diseases. The selected optimizer is Adam (Adaptive Moment Estimation), a widely used optimizer that adapts the learning rate for each parameter using estimates of the first and second moments of the gradients (the mean and the uncentered variance). Adam combines the benefits of two optimizers, AdaGrad and RMSProp, giving faster convergence and more stable training, which is particularly advantageous for large, complex datasets such as medical images, where stable training supports effective generalization.
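The two ingredients described above can be written out directly. The sketch below implements categorical cross-entropy for one-hot labels and a single Adam update using the commonly cited default constants (beta1 = 0.9, beta2 = 0.999); the example predictions and the toy objective f(w) = w² are made up for illustration and are not from the study.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean cross-entropy between one-hot labels and predicted probabilities."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    return float(-np.sum(y_true * np.log(y_pred), axis=1).mean())

def adam_step(param, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; returns the new parameter and both moment estimates."""
    m = beta1 * m + (1 - beta1) * grad       # first moment (running mean)
    v = beta2 * v + (1 - beta2) * grad ** 2  # second moment (uncentered variance)
    m_hat = m / (1 - beta1 ** t)             # bias correction
    v_hat = v / (1 - beta2 ** t)
    return param - lr * m_hat / (np.sqrt(v_hat) + eps), m, v

y_true = np.array([[1, 0, 0, 0], [0, 0, 1, 0]], dtype=float)
confident = np.array([[0.9, 0.05, 0.03, 0.02], [0.1, 0.1, 0.7, 0.1]])
uncertain = np.full((2, 4), 0.25)
loss_confident = categorical_crossentropy(y_true, confident)
loss_uncertain = categorical_crossentropy(y_true, uncertain)
print(loss_confident < loss_uncertain)

# Toy objective f(w) = w^2 with gradient 2w
w, m, v = 5.0, 0.0, 0.0
for t in range(1, 201):
    w, m, v = adam_step(w, 2 * w, m, v, t, lr=0.1)
print(abs(w) < 1.0)
```

A uniform prediction over four classes gives a loss of ln 4, which is a useful reference point when reading loss curves for a four-class problem.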
-
Callback Functions:
-
Early Stopping: This callback monitors the validation loss during training. If the validation loss begins to rise (a sign of overfitting), training is halted early to prevent the model from fitting the training data too closely. This saves resources by skipping unnecessary epochs and prevents the model from memorizing noise.
-
Model Checkpointing: This callback saves the model weights whenever validation accuracy improves during training. It guarantees that the best model, based on validation performance, is retained for subsequent predictions, mitigating the risk of performance deterioration from overfitting in later epochs.
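The logic of the two callbacks can be simulated without a training framework. The following sketch walks through made-up validation curves, records the epoch whose weights a checkpoint would keep, and stops once the validation loss has failed to improve for a number of epochs. The `patience` value and the curves are assumptions; the paper does not report them.

```python
def train_with_callbacks(val_losses, val_accuracies, patience=3):
    """Simulate early stopping (on validation loss) and checkpointing (on validation accuracy).

    Returns (stopped_epoch, best_checkpoint_epoch), both 1-indexed.
    """
    best_loss = float("inf")
    best_acc = float("-inf")
    best_epoch = None
    epochs_without_improvement = 0

    for epoch, (loss, acc) in enumerate(zip(val_losses, val_accuracies), start=1):
        if acc > best_acc:            # checkpoint: remember the best weights so far
            best_acc, best_epoch = acc, epoch
        if loss < best_loss:          # early stopping: validation loss still improving
            best_loss = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best_epoch
    return len(val_losses), best_epoch

# Made-up validation curves for illustration only
losses = [0.90, 0.60, 0.45, 0.40, 0.42, 0.47, 0.55, 0.70]
accs = [0.60, 0.75, 0.84, 0.88, 0.87, 0.86, 0.85, 0.80]
result = train_with_callbacks(losses, accs)
print(result)  # prints: (7, 4)
```

Here training stops at epoch 7, three epochs after the loss minimum, while the retained checkpoint is the one from epoch 4.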
-
-
Hyper parameter Tuning:
-
Batch Size: Several batch sizes (32, 64, 128) are tested to find a good balance between training speed and model accuracy. Smaller batch sizes produce noisier gradient updates, which can help generalization, whereas larger batch sizes give smoother gradient estimates and faster epochs but require more memory and may generalize less well.
-
Learning Rate: The learning rate is evaluated within the range of 1e-3 to 1e-5. An excessively high learning rate may result in unstable training, whereas an excessively low learning rate slows convergence. By varying the learning rate, the model can converge effectively without overshooting the optimum.
-
Dropout Rate: The dropout rate is modified to prevent the model from overfitting while continuing to discern significant trends. Dropout serves as a regularization method by randomly deactivating neurons during training, compelling the model to depend on a wider array of features. A compromise is pursued to ensure that dropout does not impede the model's expressiveness while successfully mitigating overfitting.
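A search over these three hyperparameters can be organized as a simple grid. The sketch below enumerates the batch sizes named above, three learning rates sampled from the stated range, and a set of dropout rates that are assumed here because the text gives none. The scoring function is a placeholder standing in for "train the model and return validation accuracy"; it does not reflect any real result.

```python
from itertools import product

batch_sizes = [32, 64, 128]
learning_rates = [1e-3, 1e-4, 1e-5]
dropout_rates = [0.2, 0.3, 0.5]  # assumed values; not stated in the text

def grid_search(score_fn):
    """Evaluate every combination and return the best configuration and its score."""
    best_config, best_score = None, float("-inf")
    for bs, lr, dr in product(batch_sizes, learning_rates, dropout_rates):
        config = {"batch_size": bs, "learning_rate": lr, "dropout": dr}
        score = score_fn(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score

def toy_score(config):
    # Placeholder objective for illustration; a real run would train and validate a model
    return -abs(config["learning_rate"] - 1e-4) - abs(config["dropout"] - 0.3)

n_configs = len(list(product(batch_sizes, learning_rates, dropout_rates)))
best_config, best_score = grid_search(toy_score)
print(n_configs, best_config)
```

With three values per hyperparameter the grid has 27 combinations, so each additional value multiplies the number of training runs and the search budget should be chosen accordingly.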
-
-
-
Evaluation & Performance Metrics:
-
Monitoring Training and Validation Performance:
-
Training and Validation Accuracy are plotted over the course of training to assess the model's learning progress. Monitoring accuracy on both the training and validation datasets shows how well the model generalizes. An increase in training accuracy coupled with stagnant or declining validation accuracy may signify overfitting. If both accuracies are poor, the model may be underfitting, indicating insufficient learning from the data.
-
Training and Validation Loss are also plotted to reveal any patterns of underfitting or overfitting. Loss quantifies the disparity between the predicted output and the actual labels, with lower values signifying better performance. If the training loss keeps declining while the validation loss rises, this indicates overfitting, in which the model becomes excessively adapted to the training data and performs poorly on new data. If both training and validation losses are high, the model may be underfitting, inadequately capturing the underlying patterns in the data.
-
-
Confusion Matrix Analysis:
The confusion matrix is an essential tool for evaluating the model's performance across the disease classes. It displays the true positives, false positives, true negatives, and false negatives for each category. By analyzing the matrix, we can identify:
-
Misclassifications between similar diseases: The model may confuse cataracts with glaucoma due to similarities in their symptoms or imaging characteristics. This can show which classes require further data or feature enhancement for clearer differentiation.
-
Model bias due to dataset imbalances: If the model tends to predict specific classes more often, it may be biased towards the predominant class in an unbalanced dataset. The confusion matrix reveals how predictions are distributed across classes, enabling adjustments to the model or dataset to achieve balanced performance.
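The per-class quantities read off a confusion matrix can be computed as follows. This is a minimal sketch with made-up counts (not the study's confusion matrix), using the convention that rows are true classes and columns are predicted classes.

```python
def per_class_metrics(matrix, labels):
    """Compute (precision, recall, F1) per class.

    matrix[i][j] counts samples of true class i predicted as class j.
    """
    n = len(labels)
    report = {}
    for k, label in enumerate(labels):
        tp = matrix[k][k]
        fp = sum(matrix[i][k] for i in range(n)) - tp  # predicted k, actually another class
        fn = sum(matrix[k]) - tp                       # actually k, predicted another class
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)) if precision + recall else 0.0
        report[label] = (precision, recall, f1)
    return report

labels = ["cataract", "diabetic_retinopathy", "glaucoma", "normal"]
# Made-up counts for illustration; not the study's confusion matrix
matrix = [
    [45, 0, 4, 1],
    [0, 48, 1, 1],
    [5, 1, 38, 6],
    [2, 0, 3, 45],
]
report = per_class_metrics(matrix, labels)
for label, (p, r, f1) in report.items():
    print(f"{label:22s} precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

In this invented example the off-diagonal entries in the glaucoma row and column are the largest, which is the kind of pattern that would point to confusion between glaucoma and the other classes.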
-
-
Comparative Model Analysis:
-
A comparison is conducted among MobileNetV2, EfficientNetB0, and VGG19 to determine the optimal model for medical picture categorization, considering various parameters:
-
Classification Accuracy: This measures the overall accuracy of the model's predictions, reflecting its ability to identify the correct disease class for a given input. Higher accuracy indicates better performance in differentiating among the diseases.
-
Computational Efficiency: This evaluates the training and inference time of the model, which is crucial for handling large datasets or real-time applications, such as in a healthcare environment. MobileNetV2, due to its lightweight design, generally exhibits better computational efficiency than VGG19, which is deeper and more resource-intensive.
-
Generalization Ability: This assesses the model's performance on previously unexamined test data. A model that generalizes effectively may accurately predict outcomes for novel, unobserved pictures that were excluded from the training phase. This is essential in practical medical environments, because the model must operate consistently across various patients and imaging situations.
-
-
RESULT AND DISCUSSION
This study assessed the efficacy of two deep learning models, EfficientNetB0 and VGG19, in classifying eye disorders. The assessment was performed on a dataset covering several categories of ocular disorders. The principal performance indicators used for comparison were accuracy, precision, recall, F1-score, and loss values across both training and validation phases.
-
EfficientNetB0 Results:
The EfficientNetB0 model was trained with a batch size of 32, employing data augmentation to improve generalization. The Adam optimizer with a learning rate of 0.001 was used for optimization.
-
Training Accuracy: The model attained a maximum training accuracy of 98.5%, signifying its proficiency in discerning patterns from the training dataset.
-
Validation Accuracy: A peak validation accuracy of 94.2% was achieved, indicating robust generalization ability.
-
Loss Analysis: The training and validation losses exhibited a declining trend, culminating in final values of 0.15 and 0.26, respectively, signifying a well-converged model.
Precision, Recall, and F1-Score: On the test set, the model achieved a precision of 94.8%, a recall of 93.5%, and an F1-score of 94.1%.
Table 1 shows the per-class precision, recall, and F1-score for the EfficientNet model.
Table 1 Set of Precision, Recall, and F1-Score
Class                   Precision   Recall   F1-Score
Cataract                0.95        0.92     0.93
Glaucoma                0.88        0.85     0.86
Diabetic Retinopathy    0.92        0.94     0.93
Normal                  0.90        0.91     0.90
Overall Accuracy        0.90
-
-
VGG19 Results:
The VGG19 model, implemented using transfer learning and an analogous training setup, was likewise assessed.
-
Training Accuracy: The model achieved a maximum training accuracy of 97.8%.
-
Validation Accuracy: A validation accuracy of 92.6% was attained, demonstrating competitive performance.
-
Loss Analysis: The training loss stabilized at 0.21, but the validation loss reached 0.30.
-
Precision, Recall, and F1-Score: On the test set, precision, recall, and F1-score were 92.3%, 91.7%, and 92.0%, respectively.
Both models exhibited exceptional performance in eye illness classification tests, with EfficientNetB0 displaying marginally improved results across all assessment criteria relative to VGG19. The exceptional performance of EfficientNetB0 is due to its efficient design, which harmonizes depth, breadth, and resolution to attain optimal outcomes.
EfficientNetB0 demonstrated faster convergence and lower memory consumption during training. VGG19, although achieving good accuracy, required greater computational resources and showed a tendency toward elevated validation loss in the initial training epochs.
An in-depth analysis of class-specific performance revealed that EfficientNetB0 exhibited superior accuracy and recall for minority classes, indicating enhanced management of class imbalances.
Figure 1 Distribution of classes in the simulated dataset
Figure 1 shows the distribution of classes in the simulated dataset used for eye disease identification with the VGG19 model. The dataset has four categories: cataract, diabetic retinopathy, glaucoma, and normal. The nearly equal distribution alleviates class imbalance, promoting equitable model training and better generalization.
Figure 2 Fundus pictures from the dataset utilized for training the VGG19 model
Figure 2 shows fundus images from the dataset used to train the VGG19 model for ocular disease classification. The dataset has four categories: glaucoma, diabetic retinopathy, cataract, and normal. These representative images highlight the visual differences between the clinical conditions, which the model learns to distinguish during training.
Figure 3 Curves depicting training and validation accuracy
Figure 3 shows the training and validation accuracy (left) and loss (right) for the VGG19 model over 100 epochs. The training accuracy rises consistently, but the validation accuracy fluctuates significantly, suggesting potential overfitting. Similarly, the training loss steadily decreases, but the validation loss is unstable and rises in later epochs. These trends indicate that regularization methods, such as dropout or early stopping, may be necessary to improve generalization.
-
-
MobileNetV2 Results:
Figure 4 Exhibits a promising instance of ocular illness detection via MobileNetV2
Figure 4 shows a promising example of ocular disease detection with MobileNetV2. It underscores the promise of AI-driven diagnostic tools in ophthalmology. Nonetheless, further assessment and validation are essential before using such models in practical clinical environments.
Figure 5 The loss curve indicates that the model is acquiring knowledge
Figure 5 shows the loss curves, which indicate that the model is learning, as both training and validation losses decrease. A slight gap between the curves indicates negligible overfitting. However, the comparatively high loss values and converging curves suggest potential underfitting. Further investigation is required, including examining the loss scale, tuning hyperparameters (such as the learning rate), training for more epochs, and comparing with other models. In summary, learning is occurring, but further optimization is required.
| Class | Precision | Recall | F1-Score | Support |
| --- | --- | --- | --- | --- |
| diabetic_retinopathy | 0.984536 | 0.940887 | 0.962217 | 203 |
| glaucoma | 0.866337 | 0.754310 | 0.806452 | 232 |
| normal | 0.782946 | 0.918182 | 0.845188 | 220 |
| accuracy | 0.881517 | 0.881517 | 0.881517 | 0.881517 |
| macro avg | 0.890034 | 0.886149 | 0.885654 | 844 |
| weighted avg | 0.886461 | 0.881517 | 0.881403 | 844 |
Figure 6 The accuracy plot indicates a well-trained model exhibiting effective generalization
In Figure 6, the accuracy plot indicates a well-trained model that generalizes effectively, with high accuracy on both the training and validation sets. Although further training may be unnecessary, optimization methods such as data augmentation or regularization might yield incremental improvements. Overall, this is a favorable outcome that points to an effective training procedure.
Figure 8 The confusion matrix indicates that the model is predominantly accurate
In Figure 8, the confusion matrix indicates that the model is mostly accurate but has significant difficulty distinguishing glaucoma from the other classes. This points to a need for further model refinement or additional glaucoma-specific training data to improve generalization.
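The per-class error counts behind this observation can be approximately recovered from the precision, recall, and support values reported in Table 2. The sketch below assumes those values are rounded ratios of integer counts. The variable names are illustrative, and the output is a reconstruction, not the authors' confusion matrix.

```python
# Per-class precision, recall, and support copied from Table 2.
report = {
    "cataract":             (0.926316, 0.931217, 189),
    "diabetic_retinopathy": (0.984536, 0.940887, 203),
    "glaucoma":             (0.866337, 0.754310, 232),
    "normal":               (0.782946, 0.918182, 220),
}

counts = {}
for name, (precision, recall, support) in report.items():
    tp = round(recall * support)        # images of this class classified correctly
    fn = support - tp                   # images of this class assigned to another class
    fp = round(tp / precision) - tp     # images of other classes assigned to this class
    counts[name] = {"tp": tp, "fn": fn, "fp": fp}
    print(f"{name:22s} TP={tp:3d}  FN={fn:3d}  FP={fp:3d}")
```

Under that assumption, glaucoma accounts for 57 of the 100 misclassified images (57 of its 232 samples are missed), and the normal class receives the most false positives (56). These counts do not show which class each missed glaucoma image was assigned to. That detail is only available from the full confusion matrix in Figure 8.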
Figure 7 MobileNetV2 for the automated diagnosis of ocular disease
Figure 7 illustrates the use of MobileNetV2 for the automated diagnosis of eye diseases. Although the model shows encouraging results, further development and thorough evaluation are essential before it is used in a clinical setting. The misclassifications underscore the need for continued improvement and the importance of understanding the model's limitations.
As Table 2 shows, the evaluation covered 844 fundus images, with roughly 200 in each class (cataract, diabetic retinopathy, glaucoma, normal). The model performs best on diabetic retinopathy, with an F1-score of about 96%. It also performs well on cataract (F1-score of about 93%) and on normal eyes (about 85%). It has significant difficulty with glaucoma, correctly detecting only about 75% of glaucoma cases (recall). Overall accuracy is about 88%.
Table 2 reports the precision, recall, F1-score, and support for each class.
Table 2 Precision, Recall, F1-Score, and support
| Class | Precision | Recall | F1-Score | Support |
| --- | --- | --- | --- | --- |
| cataract | 0.926316 | 0.931217 | 0.928760 | 189 |
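The summary rows of Table 2 follow from the per-class rows, and a short check makes the relationships explicit: F1 is the harmonic mean of precision and recall, the macro average is the unweighted mean over classes, the weighted average uses support as the weight, and accuracy equals the support-weighted recall. The values below are copied from Table 2. The variable and function names are illustrative.

```python
# (precision, recall, f1, support) for each class, copied from Table 2.
rows = {
    "cataract":             (0.926316, 0.931217, 0.928760, 189),
    "diabetic_retinopathy": (0.984536, 0.940887, 0.962217, 203),
    "glaucoma":             (0.866337, 0.754310, 0.806452, 232),
    "normal":               (0.782946, 0.918182, 0.845188, 220),
}


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


total = sum(support for _, _, _, support in rows.values())
recomputed_f1 = {name: f1_score(p, r) for name, (p, r, _, _) in rows.items()}
macro_f1 = sum(f for _, _, f, _ in rows.values()) / len(rows)
weighted_f1 = sum(f * s for _, _, f, s in rows.values()) / total
accuracy = sum(r * s for _, r, _, s in rows.values()) / total

for name, value in recomputed_f1.items():
    print(f"{name:22s} F1 = {value:.6f}")
print(f"macro F1 = {macro_f1:.6f}, weighted F1 = {weighted_f1:.6f}, "
      f"accuracy = {accuracy:.6f}, total support = {total}")
```

The recomputed values agree with the reported F1-scores, macro average, weighted average, and accuracy to within rounding, which confirms that the fragments of Table 2 are internally consistent.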
CONCLUSION
This research illustrates the effectiveness of the EfficientNetB0 architecture in classifying eye diseases from medical images. Using transfer learning and data augmentation, the model attains high accuracy and generalizes well despite limited data. EfficientNetB0's compact size and computational efficiency make it suitable for resource-limited medical systems. The study highlights the opportunity to improve early diagnosis, especially in underserved regions. Future work will focus on hyperparameter optimization, dataset expansion, and deployment on edge devices for real-time, accessible diagnostics. This approach has the potential to improve diagnostic accuracy and support healthcare practitioners.
