Malaria Detection using Image Processing and Machine Learning

Prof. Kirti Motwani, Computer Science Department, Xavier Institute of Engineering, Mahim, Mumbai, India

Abhishek Kanojiya, Computer Science Department, Xavier Institute of Engineering, Mahim, Mumbai, India

Cynara Gomes, Computer Science Department, Xavier Institute of Engineering, Mahim, Mumbai, India

Abhishek Yadav, Computer Science Department, Xavier Institute of Engineering, Mahim, Mumbai, India

Abstract: Malaria is a deadly, infectious, mosquito-borne blood disease caused by Plasmodium parasites. The conventional and most standard way of diagnosing malaria is for qualified technicians to visually examine blood smears under a microscope for parasite-infected red blood cells. This method is inefficient and time-consuming, and the diagnosis depends on the experience and knowledge of the person doing the examination. Automatic image recognition technologies based on image processing have been applied to malaria blood smears for diagnosis before; however, their practical performance has not been up to the mark so far. This motivates us to make malaria detection and diagnosis fast, easy, and efficient. Our main aim is to build a model that uses image processing to detect individual cells in images of thin blood smears on standard microscope slides and classify them as either infected or uninfected, enabling early and effective testing. We then classify the infected cell images using machine learning.

Keywords: Malaria, Falciparum, Watershed, Morphological Segmentation, Edge Detection, Segmentation.

  1. INTRODUCTION

    Malaria is a deadly, infectious disease caused by the Plasmodium parasite which is transmitted by the bites of female Anopheles mosquitoes. According to the World Malaria Report 2019 published by WHO [1], there were an estimated 405,000 malaria related deaths in the preceding year. The disease is curable but early detection holds the key. Existing methods used to detect Malaria include microscopic detection of infected cells in a laboratory. The method is both expensive and tedious. An estimated 93 percent of all Malaria cases in 2018 were reported in the WHO African region. The region also has one of the lowest per capita incomes across the world. A faster, low cost, and reliable alternative to microscopic detection of Malaria is proposed.

    The study proposes an image processing model for the detection of malaria-infected cells. We use image processing techniques to detect parasite-infected red blood cells in thin smears on standard microscope slides. The most widely used present-day method is to examine thin blood smears under a microscope and visually search for infected cells. A clinician manually counts the number of parasitized red blood cells, sometimes up to 5,000 cells (according to WHO protocol) [2]. Malaria could be prevented, controlled, and cured more effectively if more accurate and efficient diagnostic techniques were available. We have used image processing techniques to identify the presence of malaria-infected cells, and we use machine learning to classify the parasite in an infected cell as falciparum or non-falciparum.

  2. RELATED WORK

    Malaria, being one of the most fatal diseases, has been the focus of several major studies in the recent past. Some of them are described briefly here. The paper [3] uses a CNN model to detect parasite-infected red blood cells in thin smears on standard microscope slides prepared using routine methods. CNNs are inspired by experiments on the physiological mechanisms that the visual cortex of felines uses to recognize objects. Another study [4] notes that many computerized image analysis systems involve three main phases. In the first phase, preprocessing, the luminance of the image is corrected and the image is transformed to a constant color space. In the second phase, a histogram-based image segmentation process is used, which helps to avoid most artifacts and over-stained objects. Finally, a back-propagation neural network is used to classify objects. A more accurate method of counting blood cells using Python OpenCV is explored in [5]. It uses images of blood samples captured under a microscope to calculate the number of cells. Image processing is a method that involves signal processing and mathematical procedures. In that study, the images were processed and a blob detection algorithm was used to detect and differentiate RBCs from WBCs. A cell counting method was also used to provide an actual count of the RBCs and WBCs detected. The automation comes with a GUI backed by a database. The literature [7] proposes a parasite detection technique based on digital image processing: images of thin blood smears are used, and the parasites in the cells are identified with an image processing approach.

  3. METHODOLOGY

    At present, the recognition of the malaria parasite on a blood smear slide is entirely manual. This procedure could be simplified by capturing an image of the blood smear and then using the proposed model to classify whether the cells are infected or not. The proposed model uses image processing techniques to improve on existing methods and shorten the time taken to recognize the malaria parasite in blood samples. The dataset is manually collected from the CDC's Division of Parasitic Diseases and Malaria [6]. Our focus here is to build an automated capability to detect the presence of the malaria parasite in thin blood smears and to measure the fraction of RBCs in the sample that are infected. The primary task is to segment the infections, for which segmentation of the cells is a prerequisite. The segmentation techniques involve methodologies based on Edge Detection [8], Watershed Segmentation [9, 11], and Morphological Segmentation [10]. Once the cells are segmented, the infections are segmented using a threshold range of pixel intensities for the infection: if a pixel value falls within the threshold range, it is identified as part of an infection.

    1. Segmentation of cells

      When we "segment" an image, we distinguish the regions of interest (ROIs) from the non-ROI portion, generally creating a binary mask of what we want to qualify, quantify, or track. Segmentation is a critical part of many image processing problems and is worth considering in some depth. Here, too, we segment our cell images. One of the major issues in segmenting human blood cells is the contiguity of the cells: many cells touch or overlap. Segmenting touching objects is one of the most difficult tasks in image processing. We use three approaches: Edge Detection, Watershed Segmentation, and Morphological Segmentation.

      1. Edge Detection (Gradient-Based Techniques): Gradient-based edge detection looks for the maxima and minima in the first derivative of an image. These techniques use the Sobel, Prewitt, and Roberts cross operators to find the edges.

      2. Edge Detection (Gaussian-Based Techniques): The main purpose of these techniques is to find edges by detecting the zero crossings in the second-order derivative of an image. The Gaussian-based techniques are Laplacian of Gaussian (LoG) and Canny edge detection.

        Among the edge detection techniques we compared, the LoG operator gave the best results on our images.

        Fig 1. Comparison between edge detection techniques

      3. Watershed Segmentation: The watershed segmentation algorithm is applied to the images in the dataset to separate the different cells present in each image. These divisions simplify the detection and classification of the malaria parasite in the blood samples.

        Fig 2. Watershed Segmentation

      4. Morphological Segmentation: The term morphology refers to the shape and size of the objects within the image. In our study we have used the shapes of the cells, as most of the cells were circular and fell within a particular range of radii. A circle detection algorithm was therefore designed to detect the cells based on their shape.

        Fig 3. Morphological Segmentation
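
As an illustration of the gradient-based edge detection described in item 1 above, the following is a minimal sketch of the Sobel operator in plain Python. It is not the MATLAB implementation used in this study; the function names, the toy image, and the threshold value are assumptions made for the example.

```python
import math

# Sobel kernels for horizontal (GX) and vertical (GY) intensity changes.
GX = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
GY = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(image):
    """Return the gradient magnitude of a 2-D grayscale image (list of rows).

    Border pixels are left at 0 because the 3x3 window does not fit there.
    """
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = gy = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    pixel = image[r + dr][c + dc]
                    gx += GX[dr + 1][dc + 1] * pixel
                    gy += GY[dr + 1][dc + 1] * pixel
            out[r][c] = math.hypot(gx, gy)
    return out


def edge_mask(image, threshold):
    """Mark pixels whose gradient magnitude exceeds a chosen threshold."""
    return [[1 if g > threshold else 0 for g in row]
            for row in sobel_magnitude(image)]


# Toy image (assumed data): dark left half, bright right half.
toy = [[0, 0, 10, 10]] * 4
mask = edge_mask(toy, threshold=20)  # threshold chosen only for this example
```

In this toy image the interior pixels next to the dark-to-bright transition are marked as edges, while the border rows stay at zero. On real smear images the threshold would have to be tuned.
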

    2. Segmentation of infection

      After successful detection of the cells, the next task is to spot the infections inside the cells. For this we use the pixel intensity values and observe the range of intensities in which the infections occur. The problem with this approach is that the illumination and color scale of one image may differ from those of another. We therefore use histogram matching [12] to obtain a uniform color scale.

      Fig 4. Histogram Matching
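
The idea of histogram matching can be sketched as follows. This is a minimal plain-Python illustration on flat lists of 8-bit intensities, not the routine used in this study; the function names and the toy pixel values are assumptions.

```python
def cumulative_distribution(pixels, levels=256):
    """Normalized cumulative histogram of a flat list of integer intensities."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running / len(pixels))
    return cdf


def match_histogram(source, reference, levels=256):
    """Remap source intensities so their distribution resembles the reference.

    Each source level s is mapped to the smallest reference level whose
    cumulative probability is at least that of s.
    """
    src_cdf = cumulative_distribution(source, levels)
    ref_cdf = cumulative_distribution(reference, levels)
    mapping, j = [], 0
    for s in range(levels):
        # Both CDFs are non-decreasing, so j only ever moves forward.
        while j < levels - 1 and ref_cdf[j] < src_cdf[s]:
            j += 1
        mapping.append(j)
    return [mapping[p] for p in source]


# Toy example (assumed data): a dark image is remapped towards a brighter one.
dark = [10, 10, 20, 30]
bright = [100, 150, 150, 200]
matched = match_histogram(dark, bright)
```

After matching, the remapped intensities lie in the range of the reference image, so a single infection threshold range can be applied to images taken under different illumination.
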

    3. Classification of infected cells

    The approach uses machine learning to classify the cells that are found to be infected. The Support Vector Machine (SVM) is a supervised machine learning algorithm that can be employed for both classification and regression. An SVM classifies data by finding the hyperplane that best separates the data points of one class from those of the other class. The best hyperplane is the one with the largest margin between the two classes, where the margin is the maximal width of the slab parallel to the hyperplane that has no interior data points [4].

    Here a cubic SVM classifier is employed, whose kernel function is the cubic (third-degree polynomial) kernel [12]:

    $k(x_i, x_j) = (1 + x_i^{\top} x_j)^3$
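
For illustration, a cubic kernel can be evaluated as in the sketch below. It assumes the common form k(x_i, x_j) = (1 + x_i . x_j)^3 without any kernel scaling; the function names and the toy feature vectors are hypothetical and are not taken from this study.

```python
def cubic_kernel(x, y):
    """Cubic polynomial kernel, assumed form: (1 + x . y) ** 3."""
    dot = sum(a * b for a, b in zip(x, y))
    return (1 + dot) ** 3


def gram_matrix(samples):
    """Pairwise kernel values for a list of feature vectors."""
    return [[cubic_kernel(a, b) for b in samples] for a in samples]


# Toy feature vectors (assumed data, two features per cell).
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
K = gram_matrix(features)
```

The resulting matrix is symmetric, and an SVM solver works with these kernel values instead of explicitly computing third-degree feature combinations.
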
  4. MODEL EVALUATION

    Fig 5. Flowchart

    1. Data Source

      The image set consists of archived blood smear images acquired from the CDC's Division of Parasitic Diseases and Malaria [6]. Our dataset contains images of different malaria parasites: Plasmodium falciparum, Plasmodium knowlesi, Plasmodium malariae, Plasmodium ovale, and Plasmodium vivax.

    2. Steps used in image processing

      1. We apply an auto-generated segmenter that performs the following steps: (1) Convert to grayscale. (2) Initialize segmentation with Otsu's threshold. (3) Filter components by area. (4) Form a masked image from the input image and the segmented image.

      2. Detect edges. We tried various edge detection techniques, namely the Canny, Laplacian of Gaussian (LoG), Prewitt, Roberts, Sobel, and zero-cross operators. LoG gave the best result.

      3. Combine the edges (logically) with the segmented regions.

      4. Improve the edge mask with morphological operations such as imclose and skeletonization.

      5. Clean and refine the mask.

      6. Try different segmentation techniques: watershed, thresholding, and k-means clustering based segmentation. We select watershed segmentation, since the other techniques leave tiny pores in the result and do not give good input for further segmentation.

      7. To improve the watershed result and flatten the shallow pools (minima), use the imhmin function.

      8. Use another segmentation technique, morphological segmentation, to overcome the shortcomings of watershed segmentation. This technique detects circles within a specified range of radii.

      9. Perform histogram matching, since the images have different intensities and classification requires all images to be standardized to the same intensity scale.

      10. Finally, flag the cells whose intensity values fall within the infection threshold range and display the percentage of infected cells.
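
The initialization with Otsu's threshold mentioned in step 1 can be sketched as follows. This is a plain-Python illustration of the standard between-class variance criterion, not the auto-generated MATLAB segmenter; the function name and the toy pixel values are assumptions.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t that maximizes the between-class variance.

    Pixels with value <= t form one class, pixels with value > t the other.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    total_sum = sum(level * count for level, count in enumerate(hist))

    best_t, best_between = 0, -1.0
    w0 = 0    # number of pixels with value <= t
    sum0 = 0  # sum of those pixel values
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, variance is undefined
        mean0 = sum0 / w0
        mean1 = (total_sum - sum0) / w1
        between = w0 * w1 * (mean0 - mean1) ** 2
        if between > best_between:
            best_between, best_t = between, t
    return best_t


# Toy grayscale values (assumed data): three dark and three bright pixels.
pixels = [10, 12, 11, 200, 205, 198]
t = otsu_threshold(pixels)
# Assumption: stained cells are darker than the background, so they are <= t.
mask = [1 if p <= t else 0 for p in pixels]
```

Which side of the threshold counts as foreground depends on the staining and illumination, so the polarity used here is only an assumption.
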

    3. Classification Stage

      The study uses the Classification Learner app to automatically train a selection of machine learning (classification) models on the data for cells found to be infected. We use automated training to quickly try a selection of model types and then explore promising models interactively. Here, the two classes are falciparum and non-falciparum: if a cell is found to be infected in the image processing stage, it is then classified as containing either the falciparum malaria parasite or some other malaria parasite (non-falciparum).

      Machine learning algorithms such as Cubic SVM, Linear SVM, and Cosine KNN are used, and the algorithm with the highest accuracy score is selected for classification.
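
To make the Cosine KNN option concrete, the sketch below shows a k-nearest-neighbour classifier that uses cosine distance. It is a simplified stand-in, not the Classification Learner model; the value k=3, the two-dimensional feature vectors, and the function names are assumptions chosen for illustration.

```python
import math
from collections import Counter


def cosine_distance(a, b):
    """1 minus the cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training samples closest to the query."""
    ranked = sorted(zip(train_x, train_y),
                    key=lambda pair: cosine_distance(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical feature vectors for infected cells (assumed data).
train_x = [[1.0, 0.1], [0.9, 0.2], [1.0, 0.0],
           [0.1, 1.0], [0.2, 0.9], [0.0, 1.0]]
train_y = ["falciparum"] * 3 + ["non-falciparum"] * 3

label = knn_predict(train_x, train_y, [0.8, 0.1])
```

Cosine distance compares the direction of feature vectors rather than their magnitude, which is one reason it can behave differently from Euclidean KNN on intensity-based features.
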

    4. Outputs

      All the images are read in MATLAB. A target image is chosen on which all the tasks are performed first, and the same tasks are then performed on all the images. After all the image processing steps, each blood cell image is displayed as infected or not infected; if infected, the number of infected cells and their percentage are shown. We then train the model to classify the infected cells.

      Fig 6. Input images

      Fig 7. Detection of cells

      Fig 8. Output(Percentage of infected cells)
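
The percentage reported in the output can be computed along the lines of the sketch below. The intensity range, the minimum pixel count, the function names, and the toy cell data are all assumptions for illustration; the study does not state the actual threshold values.

```python
def is_infected(cell_pixels, low, high, min_pixels=1):
    """Flag a cell when at least min_pixels intensities fall in [low, high]."""
    hits = sum(1 for p in cell_pixels if low <= p <= high)
    return hits >= min_pixels


def infection_summary(cells, low, high, min_pixels=1):
    """Return the number of infected cells and their percentage."""
    infected = sum(1 for cell in cells
                   if is_infected(cell, low, high, min_pixels))
    percentage = 100.0 * infected / len(cells) if cells else 0.0
    return infected, percentage


# Each inner list holds the intensities of one segmented cell (assumed data).
cells = [[200, 210, 205], [198, 90, 85], [202, 207, 199], [80, 95, 201]]
count, pct = infection_summary(cells, low=60, high=120, min_pixels=2)
```

Requiring more than one pixel inside the range is a simple way to reduce false detections caused by isolated noisy pixels.
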

    5. Result

    Table I presents the accuracy of detecting whether a cell is infected or not.

    TABLE I. ACCURACY OUTCOME OF INFECTION DETECTION

                Actual data    System's detection (IP)    Accuracy
    Infected    113            102                        90.2%
    Normal      11             11                         100%

    Table II presents the overall accuracy of the machine learning algorithms. The performance of the model was evaluated using the following performance measures: Accuracy, Precision, Recall, and F-Score.

    TABLE II. OVERALL ACCURACY

    Algorithm     Accuracy (%)    Precision (%)    Recall (%)    F-Score (%)
    Cubic SVM     86.1            71.2             86.3          77.9
    Linear SVM    79.2            51.2             84.3          63.87
    Cosine KNN    74.4            70.2             64.7          67.33
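
For reference, the four performance measures can be computed from the counts of a binary confusion matrix as sketched below. The counts in the example are hypothetical and are not the results of this study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F-score from confusion-matrix counts.

    tp/fp: true/false positives, fn/tn: false/true negatives.
    """
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_score = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f_score


# Hypothetical counts, with falciparum taken as the positive class.
acc, prec, rec, f1 = classification_metrics(tp=30, fp=10, fn=20, tn=40)
```

The F-score is the harmonic mean of precision and recall, so it stays low when either of the two is low.
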

  5. CONCLUSION

Based on the methodology above, we conclude that the newly designed system of image processing techniques is suitable for parasite detection. Morphological segmentation proved to be the best technique for detecting the cells and segmenting them from the non-region-of-interest portion. To characterize the type of Plasmodium parasite, we used machine learning techniques, and Cubic SVM proved to be the best, with the highest accuracy score.

The system has been tested with around 110 thin-film blood smear images and the results are satisfactory. The inconsistencies among the images were a challenge, but we have tried to make our system robust.

REFERENCES

  1. World Health Organization, Malaria, https://www.who.int/news-room/fact-sheets/detail/malaria-report-2019.

  2. World Health Organization, Malaria, https://www.who.int/news-room/fact-sheets/detail/malaria (2018).

  3. CNN-based image analysis for malaria diagnosis. https://ieeexplore.ieee.org/document/7822567.

  4. Afkhami, S., Rashidi Heram-Abadi, H., (2017). Detection of Malarial Parasite in Blood Images by two classification Methods:Support Vector Machine (SVM) and Artificial Neural Network (ANN), Int. J. of Comp. & Info. Tech. (IJOCIT), 5(2): 81-100. www.ijocit.org/journal/v05_i02/IJOCIT-V05I02P01.pdf.

  5. Blood Cells Counting using Python OpenCV https://ieeexplore.ieee.org/document/8652384.

  6. CDC DPDx, Malaria, https://www.cdc.gov/dpdx/malaria/index.html.

  7. Bashir, A., Mustafa, Z. A., Abdelhameid, I., & Ibrahem, R. (2017). Detection of malaria parasites using digital image processing. 2017 International Conference on Computing and Electronics Engineering (ICCCCEE) doi:10.1109/iccccee.2017.7867644.

  8. Amer, G. M. H., & Abushaala, A. M. (2015). Edge detection methods. 2015 2nd World Symposium on Web Applications and Networking (WSWAN). doi:10.1109/wswan.2015.7210349.

  9. Chourasiya, S. (2014). Automatic Red Blood Cell Counting using Watershed Segmentation. https://pdfs.semanticscholar.org/ef24/ae99ca56cdb65d66d23cf3767ee8a73972d7.pdf

  10. Bhusare, M.P., & Bhosale, S.D. (2019). Automatic Blood Cell Counting using Morphological Image Processing operations. doi: https://doi.org/10.1007/978-3-319-96139-2_9

  11. Mohamad Sharif, Johan & Miswan, M. & Ngadi, Md & Abdul Jamil, Muhammad Mahadi. (2012). Red blood cell segmentation using masking and watershed algorithm: A preliminary study. 2012 International Conference on Biomedical Engineering, ICoBE 2012. doi:10.1109/ICoBE.2012.6179016.

  12. U. Jain, K. Nathani, N. Ruban, A. N. Joseph Raj, Z. Zhuang and V. G.V. Mahesh, "Cubic SVM Classifier Based Feature Extraction and Emotion Detection from Speech Signals," 2018 International Conference on Sensor Networks and Signal Processing (SNSP), Xi'an, China, 2018, pp. 386-391. doi: 10.1109/SNSP.2018.00081
