
AI-Driven Cotton Leaf Disease Detection and Management System

DOI : https://doi.org/10.5281/zenodo.18163336

Manohar Patil, Sushil Patil, Sarang Patil, Kalpesh Patil, Rajesh Patil

Department of Computer Engineering R.C.Patel Institute of Technology, Shirpur, India

Abstract: As important as agriculture is in maintaining food supplies around the globe, crop diseases remain a major challenge to agricultural productivity, especially for farmers who lack access to timely expert diagnosis. Manual detection techniques are often slow, unreliable, and dependent on the presence of specialists, which can delay corrective measures and cause considerable yield losses. To overcome this obstacle, this paper introduces an AI-based Cotton Leaf Disease Detection and Management System that uses the VGG16 model to deliver fast, accurate, and accessible diagnostic services. The system applies the VGG16 convolutional neural network to cotton leaf images with high accuracy, supported by preprocessing and data augmentation strategies that sustain performance under changing lighting and environmental conditions. Furthermore, a conversational chatbot interface allows users to upload leaf images, obtain immediate predictions, and receive clear, actionable disease management recommendations. The platform is developed with React and can be accessed easily from any device. Experimental results indicate high prediction accuracy and positive user interaction, suggesting that the system can support early detection, promote smart farming, and contribute to sustainable crop management.

Index Terms: Crop Disease Detection, Artificial Intelligence, VGG16, Deep Learning, Chatbot System, React Interface, Smart Farming, Sustainable Agriculture.

  1. Introduction

    The agricultural sector has continued to be a key source of food and economic stability in areas where agriculture serves most households. Nevertheless, agricultural production is becoming increasingly threatened by unpredictable weather patterns, soil erosion, and the proliferation of plant diseases. The most dangerous of these factors are leaf-borne diseases, which may lower yield, impair crop quality, and inflict grave financial damage on farming communities. These illnesses are commonly caused by fungi, bacteria, or viruses and spread rapidly in favorable environments; therefore, quick and accurate diagnosis is necessary to respond promptly and efficiently to them [1], [2].

    Conventional disease identification methods depend heavily on visual examination by farmers or on recommendations from farm experts. Although this approach has long been used, it has several limitations: the results can be slow, subjective, or inaccurate, especially when different diseases exhibit similar visual symptoms [3], [4]. The lack of access to trained specialists in many rural locations further contributes to delayed treatment, resulting in disease progression and expanded crop damage. These long-standing issues demonstrate the need for scalable automated tools that can deliver rapid and reliable disease detection in real-world environments.

    More recent developments in Artificial Intelligence (AI) and computer vision have presented promising solutions to these challenges. Convolutional Neural Networks (CNNs), a type of deep learning model, have proven to be highly effective in extracting visual features from plant data and can classify different crop diseases with high accuracy when visual images are provided as input [5]-[7]. CNN-based systems are capable of identifying subtle texture, color, and shape variations that may be missed by the human eye, making them particularly suitable for automated disease recognition [8], [9].

  2. Literature Survey

    Many developing economies have agriculture as a foundation, and crop productivity is often influenced by plant diseases, which may lead to massive harvest losses. The early and accurate detection of these diseases is important for achieving sustainable agriculture. With improvements in artificial intelligence (AI) and computer vision, automated plant disease detection systems have emerged, based on machine learning and deep learning algorithms that interpret leaf images and detect infection-related signs.

    1. Traditional Machine Learning Approaches

      Early research focused on classical machine learning algorithms such as Support Vector Machines (SVM), Decision Trees, and Random Forest classifiers for plant disease classification [1]. These techniques relied on manually extracted features such as texture, color, and shape. Although they performed reasonably well in controlled environments, their accuracy degraded in real field conditions due to variations in lighting, background noise, and leaf orientation [3].

      Handcrafted feature extraction methods, including the Gray-Level Co-occurrence Matrix (GLCM) and Local Binary Patterns (LBP), were widely applied for leaf image analysis [4]. However, these methods were computationally expensive and lacked robustness when applied to diverse crop species or large datasets [2]. These limitations motivated the shift toward deep learning techniques that automatically learn discriminative features.

    2. Emergence of Deep Learning

      The introduction of Convolutional Neural Networks (CNNs) resulted in significant improvements in disease identification accuracy. CNNs automatically extract spatial and hierarchical patterns from raw images, outperforming traditional machine learning approaches [5]. Models such as AlexNet, VGG16, and ResNet have been widely used in plant disease classification tasks on datasets such as PlantVillage [6].

      Transfer learning further enhanced performance by enabling pre-trained ImageNet networks to be fine-tuned on agricultural datasets, thereby reducing training time and computational cost [8], [10]. Among these architectures, VGG16 gained popularity due to its simple structure and ability to extract deep and meaningful features. Its 16 weighted layers, uniform 3 × 3 filters, and ReLU activation functions make it effective for capturing complex disease patterns [9].

    3. VGG16 in Crop Disease Prediction

      VGG16 has been widely used in crop disease classification tasks due to its proven stability and high accuracy. Researchers have demonstrated that VGG16 performs exceptionally well across multiple crop disease datasets, including tomato, maize, and cotton [5]. Transfer learning is commonly applied by retaining the ImageNet-trained convolutional base and adding new fully connected layers for crop-specific classification [11]. Several studies, such as [6] and [12], show that VGG16 consistently achieves accuracy levels above 95% for multi-class disease datasets. Although computationally heavier than lightweight models such as MobileNet and EfficientNet, VGG16 remains a strong baseline due to its interpretability and feature extraction capability [7].

      Data augmentation techniques such as rotation, flipping, scaling, and brightness adjustment are often applied during training to enhance generalization [13]. Public datasets such as PlantVillage and the Kaggle Cotton Disease Dataset further support the training and evaluation of deep models for diseases such as Bacterial Blight, Fusarium Wilt, and Leaf Curl Virus [14].

    4. Comparative Studies with Other CNN Models

      Comparative studies have evaluated VGG16 against more recent architectures. Arsenovic et al. found that VGG16 performed comparably to ResNet and Inception architectures on several agricultural datasets [15]. Hybrid models combining VGG16 with encoder networks have also been proposed to improve classification performance [11].

      Nagasubramanian et al. demonstrated the potential of 3D CNN models based on the VGG architecture for hyperspectral image analysis, enabling early disease detection [16]. While newer architectures offer better efficiency, VGG16 remains widely used in agricultural systems due to its simplicity, interpretability, and ease of deployment [7].

    5. Integration with Management Systems

    Recent studies emphasize the integration of deep learning-based disease detection with agricultural management systems. Systems combining CNN predictions with IoT sensors measuring temperature, humidity, and soil parameters can forecast disease occurrence under specific environmental conditions [11], [14].

    Some frameworks also integrate CNN outputs with fuzzy logic or expert systems to generate recommendations for irrigation, pesticide application, and nutrient management [12]. Although the present study focuses on disease classification, future advancements may combine VGG16 predictions with real-time sensor data to support decision-making in precision agriculture [17], [18].

  3. Methodology

    1. Dataset Preparation

      The sample data used in this research consists of cotton leaf photographs obtained from two major sources. Some images were captured manually in nearby cotton fields using a mobile camera, while the remaining images were obtained from publicly available agricultural datasets. After collection, all images were carefully examined, and those that were blurred, duplicated, or of low quality were removed. The cleaned dataset was divided into four categories, namely Healthy, Bacterial Blight, Fusarium Wilt, and Curl Virus, which represent the major diseases affecting cotton crops.

      Fig. 1. Some sample images from the cotton leaf dataset representing four classes: Healthy, Bacterial Blight, Fusarium Wilt, and Curl Virus.

      A stratified sampling method was used to split the dataset into three subsets: training, validation, and testing.
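A stratified split can be sketched in two stages with scikit-learn's `train_test_split`; the 70/20/10 proportions follow those reported later in the experimental setup, and the helper below is illustrative rather than the authors' exact code:

```python
from sklearn.model_selection import train_test_split

def stratified_split(images, labels, seed=42):
    """Split data into 70% train, 20% validation, 10% test, stratified by class."""
    # Stage 1: carve off the 30% that will become validation + test.
    x_train, x_rest, y_train, y_rest = train_test_split(
        images, labels, test_size=0.30, stratify=labels, random_state=seed)
    # Stage 2: split the remainder 2:1 into validation (20%) and test (10%).
    x_val, x_test, y_val, y_test = train_test_split(
        x_rest, y_rest, test_size=1 / 3, stratify=y_rest, random_state=seed)
    return (x_train, y_train), (x_val, y_val), (x_test, y_test)
```

Because both stages stratify on the labels, each of the four disease classes keeps the same proportion in every subset.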

    2. Image Preprocessing

      The collected images varied in terms of environment, resolution, and lighting conditions. To standardize the inputs, all images were resized to 224 × 224, which is the required input size for the VGG16 architecture. Pixel normalization was applied to scale pixel values between 0 and 1, improving numerical stability during training.

      To improve generalization and reduce overfitting, several data augmentation techniques were applied during training, including rotation, flipping, zooming, and adjustments to brightness and contrast. Early stopping was applied to prevent overfitting by monitoring the validation accuracy plateau, and training and validation accuracy and loss values were recorded across epochs.
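The normalization and augmentation steps above can be expressed with Keras' `ImageDataGenerator`; the parameter ranges below are illustrative assumptions, since the paper does not report exact values:

```python
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Rescaling implements the [0, 1] pixel normalization; the remaining
# arguments mirror the augmentations listed above (illustrative ranges).
datagen = ImageDataGenerator(
    rescale=1.0 / 255,
    rotation_range=20,
    horizontal_flip=True,
    zoom_range=0.2,
    brightness_range=(0.8, 1.2),
)

# Normalize one synthetic 224 x 224 RGB image as a quick sanity check.
image = np.random.randint(0, 256, size=(224, 224, 3)).astype("float32")
scaled = datagen.standardize(image.copy())
print(scaled.min() >= 0.0, scaled.max() <= 1.0)
```

During training, such a generator is typically connected to the image folders via `flow_from_directory` with `target_size=(224, 224)`, so augmented batches are produced on the fly.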

      Fig. 4. Dataset class distribution showing the number of images in each category: Bacterial Blight, Curl Virus, Fusarium Wilt, and Healthy.

      Fig. 2. Examples of cotton leaf images after preprocessing and data augmen- tation.

    3. Model Setup (VGG16)

      The VGG16 deep convolutional neural network was selected due to its strong feature extraction capabilities. The pre-trained convolutional base, trained on the ImageNet dataset, was retained, while the original fully connected layers were replaced with custom dense layers suitable for four-class cotton disease classification. A final softmax layer was used to produce class probability outputs.

      Fig. 3. Modified VGG16 architecture for cotton leaf disease classification.

      Transfer learning significantly reduced training time while improving classification accuracy, even with a moderately sized dataset.
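A minimal Keras sketch of this transfer-learning setup, assuming a frozen ImageNet base; the sizes of the custom dense head are illustrative, as the paper does not specify them:

```python
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

def build_model(num_classes=4, weights="imagenet"):
    """VGG16 base with a custom classification head for the four cotton classes."""
    base = VGG16(weights=weights, include_top=False, input_shape=(224, 224, 3))
    base.trainable = False  # keep the pre-trained convolutional features frozen
    return models.Sequential([
        base,
        layers.Flatten(),
        layers.Dense(256, activation="relu"),   # illustrative head size
        layers.Dropout(0.5),
        layers.Dense(num_classes, activation="softmax"),  # class probabilities
    ])
```

Freezing the base means only the new dense layers are trained at first, which is what makes transfer learning fast on a moderately sized dataset.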

    4. Training and Testing

    The model was trained using the training subset, while the validation dataset was used to monitor model performance.

    Fig. 5. Training and validation loss curves over multiple epochs.

    After training, the model was evaluated using the test dataset. A confusion matrix was generated to analyze class-wise prediction accuracy and identify misclassification patterns.

    Finally, the trained model was exported and integrated into a user interface that allows users to upload cotton leaf images and receive real-time disease predictions.

  4. Experimental Setup

    The aim of the experimental analysis was to determine the effectiveness, stability, and practicality of the proposed VGG16-based cotton leaf disease detection system. This section describes how the dataset was prepared and divided into training and testing sets, the evaluation metrics, the model parameters, and the hardware and software environment in which the model was implemented.

    1. Dataset Description

      The dataset used in this study consists of cotton leaf images in four classes: Bacterial Blight, Curl Virus, Fusarium Wilt, and Healthy. The photos were gathered both from publicly available agricultural repositories and through manual field sampling. Low-quality, duplicate, and blurred photos were removed to improve the reliability of the dataset. The final dataset was balanced across all four classes. To reflect real-world conditions, the image set included photographs taken under dissimilar lighting conditions, backgrounds, and viewing angles.

    2. Data Splitting

      The dataset was split into three subsets to provide a fair assessment of the model:

      • 70% training set – used to learn feature representations,

      • 20% validation set – used to monitor overfitting and tune the model,

      • 10% test set – held out for final evaluation.

      This split enabled efficient training and ensured that the model generalized well to unseen samples.

    3. Preprocessing Pipeline

      Each image was resized to 224 × 224 pixels to match the input specification of the VGG16 network, and pixel values were normalized to the range [0, 1]. Data augmentation was applied during training to increase data diversity and reduce overfitting, including:

      • random rotation,

      • horizontal flipping,

      • zoom transformation,

      • adjustment of brightness and contrast.

        These preprocessing steps helped the model adapt to the variations present in real field-captured leaf images.

        The model used in this study is presented below.

        The ImageNet pre-trained weights of VGG16 were used. The original fully connected layers were replaced with custom dense layers for four-class classification, and an output layer with a softmax activation function was added. The key training parameters were:

      • Optimizer: Adam,

      • Learning rate: 0.0001,

      • Batch size: 32,

      • Epochs: 25 (with early stopping).

        An early stopping mechanism halted training once validation accuracy showed no further improvement over successive epochs.
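With the parameters above, compilation and training with early stopping might look like the following sketch; the `patience` value is an assumption, since the paper only states that training stops when validation accuracy plateaus:

```python
import tensorflow as tf
from tensorflow.keras.callbacks import EarlyStopping

def train(model, x_train, y_train, x_val, y_val):
    """Compile and fit with the reported settings (Adam, lr 1e-4, batch 32, 25 epochs)."""
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
                  loss="categorical_crossentropy", metrics=["accuracy"])
    early_stop = EarlyStopping(monitor="val_accuracy", patience=3,
                               restore_best_weights=True)  # patience is assumed
    return model.fit(x_train, y_train, validation_data=(x_val, y_val),
                     batch_size=32, epochs=25, callbacks=[early_stop], verbose=0)
```

`restore_best_weights=True` returns the model to the epoch with the best validation accuracy rather than the last epoch before stopping.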

    4. Evaluation Metrics

      Several metrics were used to assess model performance:

      • Accuracy – the overall correctness of predictions.

      • Precision – the proportion of correctly predicted positive cases within each disease class.

      • Recall – the ability of the model to correctly identify diseased samples.

      • F1-score – the harmonic mean of precision and recall.

      • Confusion matrix – a visualization of class-wise prediction performance.

      These metrics together provide a complete evaluation of the model across the different disease types.
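The metrics listed above map directly onto scikit-learn helpers; macro averaging across the four classes is an assumption, as the paper does not state the averaging mode:

```python
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score)

def evaluate(y_true, y_pred):
    """Return the five measures listed above for integer class labels."""
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "precision": precision_score(y_true, y_pred, average="macro", zero_division=0),
        "recall": recall_score(y_true, y_pred, average="macro", zero_division=0),
        "f1": f1_score(y_true, y_pred, average="macro", zero_division=0),
        "confusion_matrix": confusion_matrix(y_true, y_pred),
    }
```

The confusion matrix row i, column j counts samples of true class i predicted as class j, so off-diagonal entries expose the misclassification patterns discussed in the results.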

      Hardware and Software Environment

        All experiments were conducted on a workstation equipped with:

      • Intel Core i5 processor,

      • 16 GB RAM,

      • NVIDIA GTX 1650 GPU.

        The software stack included:

      • Python 3.10,

      • TensorFlow and Keras for deep learning,

      • NumPy and OpenCV for image processing,

      • Matplotlib for visualization,

      • React for the user interface,

      • MongoDB for back-end data storage.

      This setup enabled efficient training, evaluation, and deployment of the disease detection system.

    5. Implementation Details

    During deployment, the trained model was exported and served behind a prediction API. Users upload cotton leaf images through the web interface; the backend preprocesses each image, runs the model, and returns the predicted disease along with a confidence score. Prediction results are stored in the backend database for tracking and analysis.
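A minimal sketch of such a prediction endpoint using Flask; the route name, form field, and class ordering are illustrative assumptions (the paper does not publish its backend code), and database logging is omitted:

```python
import io

import numpy as np
from flask import Flask, jsonify, request
from PIL import Image

CLASSES = ["Bacterial Blight", "Curl Virus", "Fusarium Wilt", "Healthy"]

def create_app(model):
    """Wrap a trained Keras-style model (predict(batch) -> probabilities) in an API."""
    app = Flask(__name__)

    @app.route("/predict", methods=["POST"])
    def predict():
        file = request.files["image"]
        img = Image.open(io.BytesIO(file.read())).convert("RGB").resize((224, 224))
        x = np.asarray(img, dtype="float32")[None] / 255.0  # match training preprocessing
        probs = model.predict(x)[0]
        idx = int(np.argmax(probs))
        return jsonify({"disease": CLASSES[idx],
                        "confidence": round(float(probs[idx]), 4)})

    return app
```

The React frontend would POST the uploaded file to this route and render the returned disease label and confidence; persisting the result to MongoDB would be an additional step inside the handler.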

  5. Results and Discussion

    This section summarises the performance of the VGG16-based cotton leaf disease classification system. The model was evaluated using unseen test samples, and its ability to recognise the four target classes (Bacterial Blight, Curl Virus, Fusarium Wilt, and Healthy leaves) was examined. Standard evaluation measures such as accuracy, precision, recall, and F1-score were used to understand the behaviour and reliability of the model in practical settings.

    1. Comparative Model Performance

      After dataset preparation, the data were divided into training, validation, and testing sets following preprocessing and augmentation procedures. The fine-tuned VGG16 model demonstrated strong classification capability, achieving an overall accuracy of approximately 92%. The precision and recall values were closely aligned at 91% and 90%, respectively, indicating effective identification of diseased leaves while maintaining low false detection rates. An F1-score of 0.90 further confirms the consistency of predictions across all disease categories.

      When compared with lighter architectures such as MobileNet and shallow convolutional neural networks used in related studies, the deeper feature extraction layers of VGG16 contributed to improved representation of disease-specific patterns. This makes the model suitable for applications where classification accuracy is prioritised over computational efficiency.

      Fig. 6. Confusion matrix showing class-wise performance of the trained VGG16 model.

    2. Interpretation of Model Behaviour

      The evaluation results indicate that the model effectively captures significant visual features such as texture variations, colour changes, leaf curling patterns, and fungal marks. These visual cues enable the classifier to differentiate between disease categories that often appear similar during early stages, particularly Bacterial Blight and Fusarium Wilt.

      1. Prediction Confidence: During testing, most predictions were associated with confidence scores ranging from 85% to 99%. Higher confidence values were generally observed in images displaying clear disease symptoms, while reduced confidence occurred in samples affected by poor lighting, partial visibility, or mild infections.

      2. Impact of Image Quality: The performance of the system was influenced by the quality of input images. Conditions such as shadows, motion blur, excessive sunlight, and cluttered backgrounds occasionally reduced prediction confidence. This observation highlights the importance of capturing clear and focused leaf images to obtain reliable diagnostic results.

    3. Deployment and Practical Evaluation

      Following evaluation, the trained model was deployed on a web-based platform that allows users to upload cotton leaf images and receive instant diagnostic feedback. The deployed application operated smoothly, providing rapid predictions along with the predicted disease class and associated confidence score.

      The backend system maintains a record of all predictions, including the uploaded image, detected disease category, confidence percentage, and timestamp. This record-keeping feature supports long-term monitoring of disease trends and aids in identifying recurring infection patterns. User testing indicated that the interface was easy to use and understandable even for individuals without technical expertise. The fast response time further supports effective decision-making in field conditions.

      Fig. 7. Example of model prediction on a cotton leaf image (Predicted class: Healthy).

    4. Strengths and Limitations

      1. Strengths:

        • The model achieved high accuracy and reliable performance across all four cotton leaf disease classes.

        • Balanced precision and recall values indicate good generalisation capability.

        • The system effectively handled diverse disease patterns and symptom variations.

        • Real-time prediction capability makes the model suitable for practical field deployment.

        • Backend data storage enables long-term disease monitoring and analysis.

      2. Limitations:

        • VGG16 requires higher computational resources compared to lightweight models, limiting deployment on low-end devices.

        • Prediction confidence decreases for images captured under poor lighting or unclear conditions.

        • The system currently supports only four cotton leaf conditions, which restricts its applicability.

        • Visual explanation techniques such as heatmaps are not included to highlight influential regions in predictions.

    5. Summary of Findings

    Overall, the proposed system demonstrated reliable performance across varying input conditions and showed strong capability in identifying cotton leaf diseases. The combination of accurate classification, real-time usability, and data logging features makes the system a practical tool for early disease detection in cotton farming. Although certain limitations remain, primarily related to image quality and computational requirements, the results suggest that the approach can support timely decision-making and help reduce crop losses through early diagnosis and intervention.

  6. Conclusion

This study demonstrates that deep learning methods, particularly the VGG16 architecture, can effectively identify major cotton leaf diseases with high accuracy. The model achieved strong performance across all four targeted classes (Bacterial Blight, Curl Virus, Fusarium Wilt, and Healthy leaves), showing that it was able to learn relevant visual features and distinguish between similar symptom patterns. The close alignment of precision and recall values indicates that the classifier maintained stable behaviour and produced reliable results with minimal misclassification. Beyond the model's accuracy, its integration into a web-based application highlights its practical utility, enabling users to obtain rapid and clear disease predictions directly from uploaded images. Such real-time accessibility is valuable in agricultural settings where early diagnosis plays a key role in preventing crop damage. Overall, the findings of this work show that integrating deep learning into crop monitoring workflows can support farmers in making informed decisions, improving crop management, and reducing yield losses. With further dataset expansion and inclusion of more disease categories, the proposed system has the potential to evolve into a comprehensive solution for supporting sustainable and technology-driven cotton farming.

Acknowledgements

We express our sincere gratitude to our project guide and the faculty members of the Department of Computer Science and Engineering for their continuous guidance, constructive feedback, and encouragement throughout the development of this research work. Their insights were invaluable in shaping the methodology, experimentation, and overall direction of our project on AI-driven cotton leaf disease detection.

We also acknowledge the support provided by our institution for offering the necessary facilities, resources, and technical environment required to complete this study. We thank the creators of publicly available agricultural datasets and the open-source tools that contributed significantly to the successful implementation of our VGG16-based model.

We are grateful to our families and friends for their constant motivation and support during the course of this project. Finally, we appreciate the collaborative efforts of all team members, whose dedication and teamwork made this research possible.

References

  1. B. V. Gokulnath and G. U. Devi, A survey on plant disease prediction using machine learning and deep learning techniques, Inteligencia Artificial, vol. 22, no. 63, pp. 0-19, 2017.

  2. K. N. Reddy et al., A literature survey: plant leaf diseases detection using image processing techniques, IOSR Journal of Electronics and Communication Engineering, 2017.

  3. S. Ramesh et al., Plant disease detection using machine learning, in 2018 International Conference on Design Innovations for 3Cs Compute Communicate Control (ICDI3C). IEEE, 2018.

  4. G. Dhingra et al., Study of digital image processing techniques for leaf disease detection and classification, Multimedia Tools and Applications, vol. 77, no. 15, pp. 19951-20000, 2018.

  5. K. P. Ferentinos, Deep learning models for plant disease detection and diagnosis, Computers and Electronics in Agriculture, vol. 145, pp. 311-318, 2018.

  6. M. H. Saleem et al., Plant disease detection and classication by deep learning, Plants, vol. 8, no. 11, p. 468, 2019.

  7. E. C. Too et al., Comparative study of deep learning models for plant disease identification, Computers and Electronics in Agriculture, vol. 161, pp. 272-279, 2019.

  8. S. Ashok et al., Tomato leaf disease detection using deep learning techniques, in 2020 5th International Conference on Communication and Electronics Systems (ICCES). IEEE, 2020, pp. 979-983.

  9. G. Wang et al., Automatic image-based plant disease severity estimation using deep learning, Computational Intelligence and Neuroscience, 2017.

  10. V. Prashanthi and K. Srinivas, Plant disease detection using convolutional neural networks, International Journal of Advanced Trends in Computer Science and Engineering, vol. 9, no. 3, 2020.

  11. A. Khamparia et al., Seasonal crops disease prediction and classification using deep convolutional encoder network, Circuits, Systems, and Signal Processing, vol. 39, no. 2, pp. 818-836, 2020.

  12. G. Ghosh and S. Chakravarty, Grapes leaf disease detection using convolutional neural network, International Journal of Modern Agriculture, vol. 9, no. 3, pp. 1058-1068, 2020.

  13. A. Lowe et al., Hyperspectral image analysis techniques for detection and classification of early onset plant disease and stress, Plant Methods, vol. 13, no. 1, pp. 1-12, 2017.

  14. X. Yang and T. Guo, Machine learning in plant disease research, European Journal of BioMedical Research, vol. 3, no. 1, pp. 6-9, 2017.

  15. M. Arsenovic et al., Solving current limitations of deep learning-based approaches for plant disease detection, Symmetry, vol. 11, no. 7, p. 939, 2019.

  16. K. Nagasubramanian et al., Plant disease identification using explainable 3d deep learning on hyperspectral images, Plant Methods, vol. 15, no. 1, pp. 1-10, 2019.

  17. J. Smith, L. Rodriguez, and P. Kumar, Review of the state of the art of deep learning for plant diseases, Frontiers in Plant Science, 2024.

  18. M. Johnson, S. Ahmed, and T. Liu, Advances in deep learning applications for plant disease and pest detection, Remote Sensing Review, 2023.