 Open Access
 Authors : Dr. Chandra Prakash Patidar
 Paper ID : IJERTV13IS090032
 Volume & Issue : Volume 13, Issue 09 (September 2024)
 Published (First Online): 28-09-2024
 ISSN (Online) : 2278-0181
 Publisher Name : IJERT
 License: This work is licensed under a Creative Commons Attribution 4.0 International License
A Data Pre-Processing and Deep Neural Network Approach for Potato Leaf Blight Detection
Dr. Chandra Prakash Patidar
Department of Information Technology, Institute of Engineering and Technology, DAVV, Indore
Abstract: With machine learning being applied to agriculture, a new domain termed precision agriculture has emerged. It combines data science, analytics, AI and ML technologies to enhance conventional agricultural practices. This paper addresses the challenge of identifying blight (early and late) in potato leaves using a machine learning approach. The image is first pre-processed by converting it from RGB to grayscale and is subsequently denoised. Next, the statistical features of the image are computed to train a machine learning model based on a probabilistic approach employing Bayes' theorem of conditional probability. A penalty factor, termed regularization, is included during training to optimize the weight update mechanism. The final classification accuracy, computed from the TP, TN, FP and FN counts, is 97.69%.
Keywords: Potato Leaf Disease (blight), Image Denoising, Feature Extraction, Deep Neural Networks, Classification Accuracy.
 INTRODUCTION
Machine learning and deep learning based approaches are being extensively explored to automate the identification of blight (early and late) in potato, a staple crop in various regions of the world [1]. Machine learning techniques provide an effective collection of tools for the early identification of potato leaf blight. ML algorithms may be trained to discriminate between healthy and diseased potato leaf classes based on subtle visual signals, with advantages including [2]:
 Minimal need for human intervention.
 Quicker decision making.
 Lower chances of erroneous decisions.
This has led to machine learning and deep learning algorithms being employed in several precision agriculture applications, including the detection of crop diseases. The use of data analytics and machine learning in agriculture is widely termed precision agriculture. Precision agriculture and IoT are being explored for various applications [3].
Figure 1 depicts the applications of IoT and precision agriculture, which are gaining momentum due to the rapid mass exodus of large populations towards urban areas, leading to a shortage of manpower in the agriculture sector [4]. The need to provide food security to increasing populations worldwide is also a serious constraint, which has directed research towards automation in the agricultural sector [5].
Precision agriculture has several pertinent applications such as disease detection, pest detection, identification of the level of ripeness of crops, and sensing and deciding the amount of water and nutrients needed in farms [6]. This paper examines the salient features of machine learning based models which can be used for crop disease identification, focusing on the potato leaf blight disease [7]. The crop was chosen because potato is one of the most significant staple crops worldwide and is adversely affected by the blight disease [8].
Fig.1 Applications of Precision agriculture and IoT [1]
 EXISTING MODELS
Several machine learning and deep learning algorithms have been employed to identify potato leaf diseases, the most common being the Support Vector Machine (SVM), Random Forests (RF), Convolutional Neural Networks (CNN) and the Residual Network (ResNet) [9]. A brief summary of noteworthy contributions in the field is presented in Table I.
Table I.
Brief Summary of Noteworthy Contributions in the Field
| S.No. | Authors | Findings |
|---|---|---|
| 1 | Bonik et al. [10] | CNN model used for potato leaf blight detection. Accuracy of 94.2% achieved. |
| 2 | Singh et al. [11] | Blight detection in tomato leaves using different algorithms. CNN achieves 94.07% accuracy, Support Vector Machine (SVM) achieves 92.2% accuracy and Random Forests (RF) achieves 96.1% accuracy. |
| 3 | A. Singh et al. [12] | Blight detection in tomato leaves using K-Means clustering and SVM. The proposed approach attains a classification accuracy of 95.9%. |
| 4 | Afzaal et al. [13] | GoogleNet, VGGNet and EfficientNet used for classifying potato blight, with F-Scores of 0.84-0.98, 0.79-0.94 and 6.88, respectively. |
| 5 | Tiwari et al. [14] | Feature extraction followed by classification using the Random Forest algorithm. Classification accuracy of 97% achieved. |
| 6 | Iqbal et al. [15] | VGG19 for feature extraction followed by Logistic Regression rendered an accuracy of 97.7%. |
| 7 | Tarik et al. [16] | Image processing followed by Convolutional Neural Networks used to obtain an accuracy of 99%. |
| 8 | Akther et al. [17] | Transfer learning through the VGG16 deep learning model to obtain an accuracy of 96.88%. |
Based on the analysis of existing research in the field, it can be observed that the neural network model is particularly good at automatically learning hierarchical representations of image features, which eliminates the need for manual feature extraction. However, completely bypassing feature extraction has its own disadvantages [18]:
 Need for extensive datasets to effectively train deep learning models [19].
 Reduced classification accuracy due to variations in image texture and background.
 Possibility of vanishing gradients and overfitting.
Fig.2 A typical healthy image (a) and blighted image (b)
Figures 2(a) and 2(b) depict typical normal and blight-infested images. A machine learning based approach for potato crop blight detection would necessitate capturing images through unmanned aerial vehicles (UAVs), which introduces noise and degradation effects in the captured image data. Hence, the proposed approach incorporates image denoising (to filter out noise effects), feature extraction and subsequent classification using a deep neural network model. Each of these concepts is presented in detail next.
The proposed methodology consists of 3 major parts:

 Image Pre-Processing
 Image Feature Extraction
 Classification
Pre-Processing: The pre-processing part consists of RGB to grayscale conversion followed by denoising of the image using the discrete wavelet transform (DWT). The mathematical analysis is presented here [20]. The images are converted from RGB to grayscale using the following relation:
$$I_y = 0.333\,F_r + 0.5\,F_g + 0.1666\,F_b \tag{1}$$
Where,
$F_r$, $F_g$ and $F_b$ are the intensities of the R, G and B components respectively, and
$I_y$ is the intensity of the equivalent gray level image of the RGB image.
The benefit of this process is that it reduces a function of 3 variables to a function of one variable and renders homogeneity.
The next step is denoising the image based on the DWT, which filters the image in the transform domain using wavelet decomposition. The approximate low frequency components are retained to preserve the actual information, while the detailed high frequency components are discarded to remove noise effects [21].
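The two pre-processing steps can be sketched in a few lines. The following is a minimal illustration only, assuming the grayscale weights 0.333, 0.5 and 0.1666 for the R, G and B components and a single-level Haar decomposition in which only the approximation band is kept. The function names and the tiny sample image are hypothetical, and the wavelet family and number of levels used in a real pipeline may differ.

```python
def rgb_to_gray(pixel):
    """Weighted grayscale intensity of one (R, G, B) pixel."""
    r, g, b = pixel
    return 0.333 * r + 0.5 * g + 0.1666 * b


def haar_approximation(gray):
    """One level of a 2-D Haar decomposition, keeping only the approximation band.

    Each output value is the average of a 2x2 block, i.e. the low-frequency
    coefficient rescaled to the original intensity range. The three detail
    bands are discarded. Assumes even image dimensions.
    """
    rows, cols = len(gray), len(gray[0])
    return [
        [
            (gray[i][j] + gray[i][j + 1] + gray[i + 1][j] + gray[i + 1][j + 1]) / 4.0
            for j in range(0, cols, 2)
        ]
        for i in range(0, rows, 2)
    ]


# Hypothetical 2x4 RGB image used only for illustration.
rgb_image = [
    [(120, 200, 80), (122, 198, 82), (60, 90, 40), (58, 92, 41)],
    [(119, 201, 79), (121, 199, 81), (61, 91, 39), (59, 89, 42)],
]
gray_image = [[rgb_to_gray(p) for p in row] for row in rgb_image]
denoised = haar_approximation(gray_image)
```

Applying `haar_approximation` repeatedly to its own output gives deeper decomposition levels, at the cost of halving the resolution each time.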
Feature Extraction: The feature extraction process computes important statistical features of the images for the final classification process. The features computed in this work are energy, mean, median, standard deviation, variance, entropy, skewness, kurtosis, contrast, correlation, homogeneity, smoothness and RMS value. These features are then annotated with the target variable. The computation of image statistical features is essential to overcome the difficulties associated with image classification. These features are vital for creating precise and dependable classification models because they capture important traits, improve discriminative power, and support robustness, efficiency and interpretability. Feature extraction techniques are therefore needed to realise the full potential of image-based classification systems [23].
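As an illustration of how several of these first-order features may be computed, the sketch below operates on a flat list of gray levels using only the Python standard library. It is not the original implementation: the function name is hypothetical, energy is assumed here to be the sum of squared gray-level probabilities, and the texture features (contrast, correlation, homogeneity), which require a co-occurrence matrix, are omitted.

```python
import math
from collections import Counter


def statistical_features(pixels):
    """Return first-order statistical features of a flat list of gray levels."""
    n = len(pixels)
    mean = sum(pixels) / n
    ordered = sorted(pixels)
    mid = n // 2
    median = ordered[mid] if n % 2 else (ordered[mid - 1] + ordered[mid]) / 2
    variance = sum((p - mean) ** 2 for p in pixels) / n  # population variance
    std = math.sqrt(variance)
    # Standardized third and fourth moments; defined as 0 for a constant image.
    skewness = sum((p - mean) ** 3 for p in pixels) / (n * std ** 3) if std > 0 else 0.0
    kurtosis = sum((p - mean) ** 4 for p in pixels) / (n * std ** 4) if std > 0 else 0.0
    rms = math.sqrt(sum(p * p for p in pixels) / n)
    # Empirical probability of each gray level.
    probabilities = [count / n for count in Counter(pixels).values()]
    entropy = -sum(p * math.log2(p) for p in probabilities)  # in bits
    energy = sum(p * p for p in probabilities)  # assumed definition
    return {
        "mean": mean,
        "median": median,
        "variance": variance,
        "standard_deviation": std,
        "skewness": skewness,
        "kurtosis": kurtosis,
        "rms": rms,
        "entropy": entropy,
        "energy": energy,
    }


# Hypothetical gray levels used only for illustration.
features = statistical_features([12, 15, 15, 20, 200, 15, 12, 20])
```

The returned dictionary can be flattened into one row of a feature matrix, with the class label stored alongside it.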
Final Classification: The final classification is based on the design of the deep neural network model which classifies the image as:
 Healthy
 Blight (early) or blight (late)
For this purpose, the image statistical features are computed and fed to the deep neural network. The image statistical features are measurable attributes that are taken from images and represent different facets of their texture, spatial relationships, and pixel intensity distribution [24]. These characteristics enable efficient differentiation between groups or categories by offering information about the underlying patterns and structures inside images. Image statistical features also provide resilience against changes in lighting, noise, and geometric alterations. Statistical features encode higher-level properties that are more resistant to distortions, in contrast to raw pixel values, which are susceptible to such alterations [25].
Classifiers generate succinct yet useful representations of visual content by computing statistical parameters including mean, variance, skewness, and kurtosis. These characteristics strengthen the discriminative ability of classification models by encapsulating important statistical characteristics that separate one class from another. As there is no clear demarcation between the normal and blighted potato leaf images, a probabilistic classifier is designed and used for the final classification based on the Bayes rule [26]. The weights of the network are updated such that the condition for maximization of the following conditional probability is satisfied:
$$P(w \mid D) = \frac{P(D \mid w)\,P(w)}{P(D)} \tag{2}$$
Here,
$P$ denotes the probability of occurrence of an event.
$w$ denotes the vector corresponding to the bias and weight values of the network.
$D$ denotes the training data set.
The essence of the algorithm is the factor termed the penalty, $\rho = \mu/\nu$, which controls the movement of the weight vector based on the modified cost function for the network (as a function of weights) given by [27]:
$$F(w) = \mu\,w^{T}w + \nu\left[\frac{1}{n}\sum_{i=1}^{n}(p_i - a_i)^2\right] \tag{3}$$
Here,
$\mu$ and $\nu$ denote the regularization parameters.
$p_i$ and $a_i$ denote the predicted and actual values for the $i$-th of $n$ samples.
Two major conditions arise in this case:
If ($\mu \ll \nu$): the errors exhibit a low magnitude.
Else if ($\mu \geq \nu$): the errors exhibit a higher magnitude.
Figure 3 depicts the flowchart of the proposed approach. The first step is collecting the annotated dataset with three categories of images: normal, early blight and late blight. The next step is image filtration to remove the effects of noise and disturbance through the discrete wavelet transform (DWT), defined as [28]:
$$W_{\varphi}(j_0, k) = \frac{1}{\sqrt{M}}\sum_{n} X(n)\,\varphi_{j_0,k}(n) \tag{4}$$
Here,
$W_{\varphi}$ represents the variable in the transform domain.
$j_0$ is the scaling factor.
$k$ is the shifting factor.
$X$ is the original raw sampled dataset.
$\varphi$ is the kernel of the transform.
$M$ is the number of samples in $X$.
The DWT can be visualized as a filter/sieve which can be used iteratively for data filtering through iterative decomposition of the data, by retaining the approximate and removing the detailed coefficient values, and acts as an effective image filter.
The subsequent step is computing the image features (statistical attributes). Subsequently, the features for the dataset are annotated as [29]:
$$V = [V_n, V_e, V_l] \tag{5}$$
Here,
$V$ denotes the overall feature vector.
$V_n$ denotes the features of the normal class.
$V_e$ denotes the features of the early blight class.
$V_l$ denotes the features of the late blight class.
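To illustrate how a regularization penalty trades off weight magnitude against prediction error, the following sketch evaluates a cost of the form (regularization weight × sum of squared weights) + (error weight × mean squared error). This is a minimal sketch under that assumed form; the function name, the parameters `mu` and `nu`, and the sample values are illustrative and are not taken from the original implementation.

```python
def regularized_cost(weights, predicted, actual, mu, nu):
    """Weight penalty plus mean squared error, each scaled by its own factor."""
    weight_penalty = sum(w * w for w in weights)
    mean_squared_error = sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)
    return mu * weight_penalty + nu * mean_squared_error


# Hypothetical weights and outputs used only for illustration.
weights = [0.5, -1.0, 2.0]
predicted = [0.9, 0.2, 0.8]
actual = [1.0, 0.0, 1.0]

# mu much smaller than nu: the cost is dominated by the error term.
cost_error_dominated = regularized_cost(weights, predicted, actual, mu=0.01, nu=1.0)
# mu comparable to nu: large weights are penalized heavily.
cost_weight_dominated = regularized_cost(weights, predicted, actual, mu=1.0, nu=1.0)
```

With a small `mu` the optimizer is free to grow the weights to reduce the error, whereas a larger `mu` pushes the weights towards smaller magnitudes, which limits overfitting at the expense of a higher training error.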
Fig.3. Flowchart of Proposed System
Next, the features are applied to the deep neural network for classification. The network is trained based on the regularization-based Bayes model with the penalty factor. The probabilistic classification is done as follows. For N samples in a set U, the probability for a sample to belong to a category can be given by [30]:
$$P(X \mid C_1),\; P(X \mid C_2),\; \ldots,\; P(X \mid C_n)$$
Here,
$X$ denotes the new random testing sample.
$C_1, \ldots, C_n$ denote the classes.
$P$ denotes probability.
The decision is based on finding the maxima among:
$$P_{\max}(X) = \max\{P(X \mid C_1),\, P(X \mid C_2),\, \ldots,\, P(X \mid C_n)\} \tag{6}$$
The maximum probability of any category results in the decision being in favour of that particular category:
$$C^{*} = \arg\max_{i = 1, \ldots, n} P(X \mid C_i) \tag{7}$$
Where,
$$P_T = \sum_{i=1}^{n} P(X \mid C_i) \tag{8}$$
Here,
$P_T$ = cumulative overall probability.
The back propagation based training rule for the network is given by [31]:
$$w_{k+1} = w_k - \alpha\,\frac{\partial e_k}{\partial w_k} \tag{9}$$
Here,
$w_k$ and $w_{k+1}$ denote the weights of the present and subsequent iterations.
$\alpha$ denotes the learning rate.
$e_k$ denotes the error in the present iteration.
$e$ denotes the error vector.
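The maximum-probability decision described above can be sketched as follows. This is an illustrative example only: the function name, the class labels and the per-class scores are hypothetical, and the scores are simply normalized by their sum (the cumulative overall probability) before the largest one is selected.

```python
def classify(class_probabilities):
    """Pick the class with the largest probability after normalizing by the total."""
    total = sum(class_probabilities.values())  # cumulative overall probability
    normalized = {label: p / total for label, p in class_probabilities.items()}
    decision = max(normalized, key=normalized.get)
    return decision, normalized


# Hypothetical per-class scores for one test sample.
scores = {"healthy": 0.10, "early blight": 0.65, "late blight": 0.25}
decision, normalized = classify(scores)
```

Normalization does not change which class wins, but it makes the scores of different samples comparable on a common 0 to 1 scale.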
Fig.4. Original Test Image
Figure 4 depicts an image for analysis. The subsequent step is the histogram analysis of the images.
Based on the classifications of the network in the testing phase, the classification accuracy can be calculated as:
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{10}$$
Here,
TP represents true positive
TN represents true negative
FP represents false positive
FN represents false negative
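As a worked example of the accuracy measure, the sketch below uses the counts that appear in the accuracy computation of the experimental results (64 and 63 correct classifications, 1 and 2 incorrect ones). The assignment of these counts to TP, TN, FP and FN is an assumption made for illustration, as is the function name; the resulting accuracy does not depend on that assignment.

```python
def classification_accuracy(tp, tn, fp, fn):
    """Fraction of correctly classified samples among all tested samples."""
    return (tp + tn) / (tp + tn + fp + fn)


# Assumed mapping of the reported counts onto the four outcomes.
accuracy = classification_accuracy(tp=64, tn=63, fp=1, fn=2)
print(f"{accuracy:.2%}")  # prints 97.69%
```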
The next section presents the results associated with the proposed approach.

 EXPERIMENTAL RESULTS
The experiment has been performed in MATLAB 2020a on a PC with 16 GB RAM and an Intel i7 processor, coupled with an NVIDIA GTX GPU unit. The results obtained are presented in this section sequentially. The data has been fetched from:
data.mendeley.com/datasets.
The data is annotated in three categories i.e. normal, early and late blight. The 70:30 splitting ratio has been adopted in this case as the generic rule. The subsequent figures depict the step by step process.
Fig.5. Wavelet Analysis of Image (3rd Level)
Fig.6. Histogram and Cumulative Histogram of Original Image at 3rd level
Table II.
Tabulation of statistical values for original image I
| S.No. | Parameters | Values |
|---|---|---|
| 1 | Maximum | 254.4 |
| 2 | Minimum | 1.47 |
| 3 | Mean | 152.2 |
| 4 | Median | 134.8 |
| 5 | Standard Deviation | 64.17 |
| 6 | Median Absolute Deviation | 55.92 |
| 7 | L1 Norm | 9.97 × 10^6 |
| 8 | L2 Norm | 4.23 × 10^4 |
Fig.7. Histogram and Cumulative Histogram of Approximations (3rd Level)
Fig.8. Histogram and Cumulative Histogram of Details at Level 3 of the Haar Wavelet
Figures 5, 6, 7 and 8 depict the DWT decomposition of the original image followed by the histogram analysis of the original image, the approximate coefficients and the detailed coefficients respectively.
Table II lists the statistical values of the original image. A similar analysis can be done for the approximate and detailed coefficient values. It can be observed that the values for the original image are close to those of the approximations while being completely different from those of the details. This indicates the statistical dissimilarity of the details with respect to the original image, which can hence be considered exogenous noise effects that can be filtered through the DWT approach. The same inference can be drawn from the respective histograms.
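The statistics tabulated for the original image can be computed for any set of coefficients in the same way. The sketch below is an illustration using the Python standard library; the function name and the two small coefficient lists are hypothetical, and the population standard deviation is assumed (a toolbox may report the sample standard deviation instead).

```python
import math
import statistics


def summary_statistics(values):
    """Descriptive statistics of a flat list of pixel or coefficient values."""
    median = statistics.median(values)
    return {
        "maximum": max(values),
        "minimum": min(values),
        "mean": statistics.fmean(values),
        "median": median,
        "standard_deviation": statistics.pstdev(values),  # population form (assumed)
        "median_absolute_deviation": statistics.median([abs(v - median) for v in values]),
        "l1_norm": sum(abs(v) for v in values),
        "l2_norm": math.sqrt(sum(v * v for v in values)),
    }


# Hypothetical approximation and detail coefficients used only for illustration.
approximation_stats = summary_statistics([150.0, 152.0, 149.0, 151.0])
detail_stats = summary_statistics([0.5, -0.4, 0.3, -0.6])
```

Comparing the two dictionaries side by side mirrors the comparison made in the text: approximation statistics stay close to the image intensity range, while detail statistics cluster around zero.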
The next phase is feature extraction, which computes the statistical features of the denoised image: energy, entropy, mean, median, variance, standard deviation, skewness, kurtosis, inverse difference, homogeneity, correlation and contrast. The features of the annotated image set are computed and the values are depicted in Figure 9.
Fig.9. Feature extraction
Figure 9 depicts the feature extraction phase, with the annotated feature vectors used to train the network. Of the 430 images in total, 300 are used for training and the remaining 130 are used for testing.
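A partition of this kind can be sketched as a seeded shuffle followed by a cut at the training-set size. The example below is illustrative only: the function name, the integer image identifiers and the seed are hypothetical, and the actual partitioning procedure may differ (for instance, it may be stratified by class).

```python
import random


def split_dataset(items, n_train, seed=0):
    """Shuffle reproducibly and split into training and testing subsets."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


# 430 hypothetical image identifiers: 300 for training, 130 for testing.
image_ids = list(range(430))
train_ids, test_ids = split_dataset(image_ids, n_train=300)
```

Fixing the seed keeps the split reproducible across runs, so reported accuracies refer to the same held-out images.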
Fig.10. Training Parameters
Figure 10 depicts the training parameters of the network, which are the Bayes network parameters such as the training gradient, the combination coefficient, the changing parameters, the sum squared parameters and the validation checks to convergence. It can be observed that the model attains convergence at 15 iterations without any validation fail event.
Fig.11. Confusion Matrix
The confusion matrix is generated after the network is tested on the new samples reserved for testing, and records the true and false classifications. The network's confusion matrix can be used to compute the system's classification accuracy as:
$$\text{Accuracy} = \frac{64 + 63}{64 + 63 + 1 + 2} = 97.69\%$$
The accuracy of the proposed approach is thus 97.69%.
A summary of the results is presented next:
Table III
Summary of Experimental Results
| S.No. | Parameters | Values |
|---|---|---|
| 1 | Data Source | https://data.mendeley.com/datasets/v4w72bsts5/1 |
| 2 | Image Type | jpg |
| 3 | Split Ratio | 70:30 |
| 4 | Feature Extraction | 12 statistical features |
| 5 | ML Model | Neural Network |
| 6 | Algorithm | Back Propagation with Bayesian Regularization |
| 7 | Accuracy: Bonik et al., 2023 [10] | 94.2% |
| 8 | Accuracy: Singh et al., 2022 [11] | 94.07% |
| 9 | Accuracy: A.K. Singh et al., 2022 [12] | 95.9% |
| 10 | Accuracy (Proposed Work) | 97.69% |
Table III summarizes the results obtained from the implementation of the proposed approach. A comparison with existing work also shows that the proposed approach outperforms the compared existing work in terms of classification accuracy. The feature extraction phase allows control over the features or attributes based on which the neural network model classifies the images, rather than being dependent on conventional architectures. This allows much more flexibility compared to baseline approaches.
 CONCLUSION
In conclusion, the potato plant (especially the leaf) is prone to blight disease. If left untreated, potato leaf blight, which is brought on by pathogens such as Phytophthora infestans, can seriously harm potato crops all over the world and result in large yield losses. Traditional disease detection techniques rely on visual inspection by agronomists, which is subjective and time-consuming. Machine learning (ML) and deep learning (DL) approaches offer a chance to transform the identification and treatment of potato leaf blight. This paper does not merely present a machine learning based approach, but integrates it with image denoising and statistical feature extraction to train a deep neural network which attains a classification accuracy of 97.69%. Back propagation with Bayesian regularization has been used to train the probabilistic neural network model with annotated statistical features.
REFERENCES
[1] R. K. Singh, R. Berkvens and M. Weyn, “AgriFusion: An Architecture for IoT and Emerging Technologies Based on a Precision Agriculture Survey,” in IEEE Access, vol. 9, pp. 136253-136283, 2021.
[2] ASMM Hasan, D Diepeveen, H Laga, MGK Jones, “Image patch-based deep learning approach for crop and weed recognition,” Ecological Informatics, Elsevier, 2023, vol. 78, 102361.
[3] BK Hu, Z Wang, G Coleman, A Bender, T Yao, S Zeng, “Deep learning techniques for in-crop weed recognition in large-scale grain production systems: a review,” Precision Agriculture, Springer Nature, 2024, vol. 25, pp. 1-29.
[4] FH Juwono, WK Wong, S Verma, N Shekhawat, “Machine learning for weed-plant discrimination in agriculture 5.0: An in-depth review,” Artificial Intelligence in Agriculture, Elsevier, 2023, vol. 10, pp. 13-25.
[5] H. L. Lee, C. L. Chang and C. C. Chang, “Precision Agriculture and Its Applications in Taiwan,” in IEEE Technology and Engineering Education (ITEE), 2016, pp. 200-205.
[6] D. Li, Z. Zhou and S. Li, “A Review of UAV Applications in Precision Agriculture,” in Journal of Sensors, vol. 2017, pp. 1-13, 2017.
[7] J. R. Rosell-Polo, D. Sanz-Robinson and J. J. Vallés-Ros, “A Review of Methods and Applications of the Geometric Characterization of Tree Crops in Agriculture,” in Sensors, vol. 15, no. 6, pp. 13035-13061, 2015.
[8] A. Nayyar, J. Irvin and C. A. Grift, “A Review of UAV-Based Precision Agriculture and Its Contributions to Sustainability in Crop Production,” in Sustainability, vol. 11, no. 17, pp. 4706-4722, 2019.
[9] S. Arabi and S. G. McClure, “Integration of Unmanned Aerial Vehicles (UAVs) in Agriculture: A Review,” in Computers and Electronics in Agriculture, vol. 144, pp. 49-61, 2017.
[10] C. C. Bonik, F. Akter, M. H. Rashid and A. Sattar, “A Convolutional Neural Network Based Potato Leaf Diseases Detection Using Sequential Model,” 2023 International Conference for Advancement in Technology (ICONAT), Goa, India, 2023, pp. 1-6.
[11] AK Singh, SVN Sreenivasu, U Mahalaxmi, H Sharma, DD Patil, E Asenso, “Hybrid feature-based disease detection in plant leaf using convolutional neural network, bayesian optimized SVM, and random forest classifier,” Artificial Intelligence in Food Quality Improvement, Hindawi, 2022, Article ID 2845320.
[12] A Singh, H Kaur, “Potato plant leaves disease detection and classification using machine learning methodologies,” IOP Conference Series: Materials Science and Engineering, 2022, vol. 1022, pp. 1-9.
[13] H Afzaal, AA Farooque, AW Schumann, N Hussain, “Detection of a potato disease (early blight) using artificial intelligence,” Remote Sensing, MDPI, 2021, vol. 13, pp. 1-17.
[14] M. A. Iqbal and K. H. Talukder, “Detection of Potato Disease Using Image Segmentation and Machine Learning,” 2020 International Conference on Wireless Communications Signal Processing and Networking (WiSPNET), Chennai, India, 2020, pp. 43-47.
[15] D. Tiwari, M. Ashish, N. Gangwar, A. Sharma, S. Patel and S. Bhardwaj, “Potato Leaf Diseases Detection Using Deep Learning,” 2020 4th International Conference on Intelligent Computing and Control Systems (ICICCS), Madurai, India, 2020, pp. 461-466.
[16] M. I. Tarik, S. Akter, A. A. Mamun and A. Sattar, “Potato Disease Detection Using Machine Learning,” 2021 Third International Conference on Intelligent Communication Technologies and Virtual Mobile Networks (ICICV), Tirunelveli, India, 2021, pp. 800-803.
[17] J. Akther, M. Harun-Or-Roshid, A. A. Nayan and M. G. Kibria, “Transfer learning on VGG16 for the Classification of Potato Leaves Infected by Blight Diseases,” 2021 Emerging Technology in Computing, Communication and Electronics (ETCCE), Dhaka, Bangladesh, 2021, pp. 1-5.
[18] G. Franchi, A. Bursuc, E. Aldea, S. Dubuisson and I. Bloch, “Encoding the Latent Posterior of Bayesian Neural Networks for Uncertainty Quantification,” in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 46, no. 4, pp. 2027-2040, April 2024.
[19] J. Zhang, L. Liu, R. Zhao and Z. Shi, “A Bayesian Meta-Learning Based Method for Few-Shot Hyperspectral Image Classification,” in IEEE Transactions on Geoscience and Remote Sensing, vol. 61, pp. 1-13, 2023.
[20] R. Dhivya and N. Shanmugapriya, “An Analysis Study of Various Image Preprocessing Filtering Techniques based on PSNR for Leaf Images,” 2022 International Conference on Advanced Computing Technologies and Applications (ICACTA), Coimbatore, India, 2022, pp. 1-8.
[21] U. Tuba and D. Zivkovic, “Image Denoising by Discrete Wavelet Transform with Edge Preservation,” 2021 13th International Conference on Electronics, Computers and Artificial Intelligence (ECAI), Pitesti, Romania, 2021, pp. 1-4.
[22] P. Lottes, R. Khanna, J. Pfeifer, R. Siegwart and C. Stachniss, “UAV based crop and weed classification for smart farming,” 2017 IEEE International Conference on Robotics and Automation (ICRA), Singapore, 2017, pp. 3024-3031.
[23] P. Bosilj, T. Duckett and G. Cielniak, “Analysis of Morphology Based Features for Classification of Crop and Weeds in Precision Agriculture,” in IEEE Robotics and Automation Letters, 2018, vol. 3, no. 4, pp. 2950-2956.
[24] V. N. Thanh Le, G. Truong and K. Alameh, “Detecting weeds from crops under complex field environments based on Faster R-CNN,” 2020 IEEE Eighth International Conference on Communications and Electronics (ICCE), 2021, pp. 350-355.
[25] M. Alam, M. S. Alam, M. Roman, M. Tufail, M. U. Khan and M. T. Khan, “Real-Time Machine-Learning Based Crop/Weed Detection and Classification for Variable-Rate Spraying in Precision Agriculture,” 2020 7th International Conference on Electrical and Electronics Engineering (ICEEE), 2020, pp. 273-280.
[26] S Sabzi, Y Abbaspour-Gilandeh, “Using video processing to classify potato plant and three types of weed using hybrid of artificial neural network and particle swarm algorithm,” Measurement, Elsevier, 2018, vol. 126, pp. 22-36.
[27] Tin-Yau Kwok and Dit-Yan Yeung, “Constructive algorithms for structure learning in feedforward neural networks for regression problems,” in IEEE Transactions on Neural Networks, vol. 8, no. 3, pp. 630-645.
[28] M. Alam, M. S. Alam, M. Roman, M. Tufail, M. U. Khan and M. T. Khan, “Real-Time Machine-Learning Based Crop/Weed Detection and Classification for Variable-Rate Spraying in Precision Agriculture,” 2020 7th International Conference on Electrical and Electronics Engineering (ICEEE), Antalya, Turkey, 2020, pp. 273-280.
[29] A. Sharma, A. Jain, P. Gupta and V. Chowdary, “Machine Learning Applications for Precision Agriculture: A Comprehensive Review,” in IEEE Access, vol. 9, pp. 4843-4873, 2021.
[30] A. Naveed et al., “Saliency-Based Semantic Weeds Detection and Classification Using UAV Multispectral Imaging,” in IEEE Access, vol. 11, pp. 11991-12003, 2023.
[31] J Torres-Sospedra, P Nebot, “Two-stage procedure based on smoothed ensembles of neural networks applied to weed detection in orange groves,” Biosystems Engineering, Elsevier, 2014, vol. 123, pp. 40-55.