Convolutional Neural Networks for Leukemia Detection from Peripheral Blood Smear Images

DOI : 10.17577/IJERTV14IS110205

Manya Sinha

Bachelors in Technology (Computer Science Engineering) Amity University Chhattisgarh

Himanshu Maurya

Bachelors in Technology (Computer Science Engineering) Amity University Chhattisgarh

Dr. Goldi Soni

Assistant Professor Amity School of Engineering & Technology Chhattisgarh

Abstract – Effective treatment planning requires prompt detection of leukemia, a hematological cancer marked by the aberrant growth of immature blood cells. Traditional diagnostic techniques, such as examining peripheral blood smears under a microscope, are precise but time-consuming and rely on skilled pathologists. In this study, we investigate the automated detection of leukemia from peripheral blood smear images using Convolutional Neural Networks (CNNs). Both custom CNN architectures and transfer-learning models such as ResNet, DenseNet, and EfficientNet are trained and evaluated on publicly available datasets, such as ALL-IDB and C-NMC 2019. Stain-aware data augmentation and class-balanced loss functions are used to improve generalization, and performance is evaluated with metrics including AUROC, sensitivity at 95% specificity, and F1-score. Experimental results show that CNN-based models can accurately differentiate leukemic cells from healthy cells, indicating their potential as a support tool for early screening and triage. Performance deterioration across datasets, however, highlights the necessity of stain-invariant training methods and strong external validation. This work highlights the potential of CNNs in enhancing hematological diagnosis and offers a reproducible deep learning pipeline for peripheral blood smear analysis.

Keywords: leukemia, acute lymphoblastic leukemia, blood smear, deep learning, convolutional neural networks, medical imaging, C-NMC, ALL-IDB

  1. INTRODUCTION:

    Leukemia is a malignant disease of the bone marrow and blood in which white blood cells grow abnormally. Acute lymphoblastic leukemia (ALL) and acute myeloid leukemia (AML) are the two most prevalent subtypes, and leukemia remains a common malignancy in both children and adults. Because it directly affects treatment planning and patient survival, early and correct detection is essential. Traditional diagnostic techniques, such as flow cytometry, bone marrow analysis, and microscopic inspection of peripheral blood smears, are effective but frequently constrained by expense, time, and reliance on skilled hematopathologists. These difficulties emphasize the need for dependable, scalable, and effective automated computer-aided diagnostic (CAD) systems.

    Figure: IHH Healthcare Singapore. (n.d.). Leukemia: Normal blood vs leukemia blood smear illustration [Diagram]. IHH Healthcare Singapore. https://www.parkwaycancercentre.com

    Medical image analysis has shown great potential thanks to recent developments in deep learning. Convolutional Neural Networks (CNNs) in particular are well suited to learning intricate morphological patterns from microscopic images without manually engineered features. Researchers have trained and benchmarked CNN models for leukemia detection on publicly available datasets, including ALL-IDB and the C-NMC 2019 Challenge dataset, with reported accuracies reaching 95%. Staining variability, class imbalance, and decreased generalization across datasets are still major drawbacks, though. This study proposes a CNN-based framework for automated leukemia detection from peripheral blood smear images that uses transfer learning, stain-aware augmentation, and reliable validation procedures. By addressing both performance and generalizability, the study highlights CNNs' potential as helpful tools for early screening and triage in hematological disorders.

  2. OBJECTIVE:

    The main goal of this research is to develop and assess a Convolutional Neural Network (CNN)-based framework for the automatic identification of leukemia from peripheral blood smear images. Specifically, the study seeks to:

    II.I. Create and implement a deep learning pipeline that uses morphological information derived from images to differentiate between leukemic and normal cells.

    II.II. Use transfer learning with pre-trained CNN architectures (such as ResNet, DenseNet, and EfficientNet) to increase classification accuracy and robustness.

    II.III. Use preprocessing and data augmentation strategies to address class imbalance, staining variation, and image quality issues.

    II.IV. Validate the model with patient-level splits from publicly accessible datasets (such as ALL-IDB and the C-NMC 2019 Challenge) to ensure generalizability.

    II.V. Compare performance measures, including accuracy, precision, recall, F1-score, and ROC-AUC, with existing methods in the literature.

    II.VI. Showcase the potential of CNN-based solutions as helpful resources for hematologists in early screening, helping to reduce observer variability and diagnostic delays.

    Figure: Sharma, P., & Gupta, R. (2022). Deep learning approaches for leukemia detection using peripheral blood smear images. Biomedical Signal Processing and Control, 78, 103934. https://doi.org/10.1016/j.bspc.2022.103934

  3. PROBLEM STATEMENT:

    Leukemia, a potentially fatal blood malignancy characterized by the aberrant growth of white blood cells, is frequently missed in its early stages because of the limits of manual microscopic examination. Conventional diagnostic techniques require highly qualified pathologists, take a long time, and are prone to human error. With the growing amount of medical data, automated, dependable, and effective tools are urgently needed to support early leukemia detection. Convolutional Neural Networks (CNNs), a recent development in deep learning, have demonstrated impressive results in image-based medical diagnosis. Nevertheless, achieving high accuracy and robustness across datasets while keeping computational complexity low remains difficult. This study seeks to address these issues by creating a CNN-based model for the automatic identification and classification of leukemia from peripheral blood smear images.

  4. BLOOD SMEAR IMAGES:

    A blood smear image is a microscopic image produced by spreading a drop of blood thinly across a glass slide, staining it (usually with Wright or Giemsa stain), and examining it under a microscope. This procedure makes it possible to see several types of blood cells, such as:

    -Red blood cells (RBCs) carry oxygen.

    -White blood cells (WBCs) fight infections and are the key cells for identifying leukemia.

    -Platelets aid in the coagulation of blood.

    Blood smear images in leukemia frequently show:

    -abnormal growth of immature white blood cells (blasts).

    -irregular cell sizes, shapes, and nuclei.

    -decreased levels of healthy platelets and RBCs.

    Figure: PathologyOutlines.com. (n.d.). Acute promyelocytic leukemia, hypogranular variant [Blood smear image]. PathologyOutlines.com. Retrieved August 18, 2025, from https://www.pathologyoutlines.com

  5. DATASET:

    A publicly accessible dataset of peripheral blood smear images was used in this investigation. The collection has been annotated by expert hematologists and includes images of leukemia-affected cells as well as normal blood cells. Every image was taken using a high-resolution microscope and stained with common methods (such as Wright-Giemsa) to emphasize the morphological differences among cell types.

    The dataset consists of:

    -Normal cells: red blood cells (RBCs), healthy white blood cells (WBCs), and platelets.

    -Leukemia cells: immature and aberrant WBCs (blasts) representing a variety of leukemia subtypes.

    The following well-known datasets are available for CNN-based leukemia detection:

    -The Acute Lymphoblastic Leukemia Image Database (ALL-IDB) offers microscopic images of blood samples that are frequently used for leukemia classification.

    -The C-NMC 2019 dataset from the ISBI Challenge is a sizable collection of annotated blood smear images used for leukemia detection.

    -Custom datasets created in association with pathology labs or hospitals, where medical professionals collect and label the images.

    To ensure an unbiased assessment of the model's performance, the dataset for this study was split into training, validation, and test subsets. To improve model generalization, common preprocessing techniques such as scaling, normalization, and augmentation were applied.
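Because several cell images usually come from the same patient, one way to keep such a split free of leakage is to assign whole patients, rather than individual images, to each subset. The sketch below illustrates the idea under assumptions that are not stated in the paper: every record carries a hypothetical `patient_id`, the records themselves are made up, and the 70/15/15 proportions are applied to patients rather than to images.

```python
import random
from collections import defaultdict

def patient_level_split(records, train=0.70, val=0.15, seed=0):
    """Split (patient_id, image_path) records so that no patient spans two subsets."""
    by_patient = defaultdict(list)
    for patient_id, image_path in records:
        by_patient[patient_id].append(image_path)

    patients = sorted(by_patient)            # sort first so the shuffle is reproducible
    random.Random(seed).shuffle(patients)

    n_train = round(train * len(patients))
    n_val = round(val * len(patients))
    groups = {
        "train": patients[:n_train],
        "val": patients[n_train:n_train + n_val],
        "test": patients[n_train + n_val:],  # whatever remains after train and val
    }
    return {name: [(p, img) for p in ids for img in by_patient[p]]
            for name, ids in groups.items()}

# Hypothetical records: 20 patients with 5 cell images each.
records = [(f"patient_{p:02d}", f"patient_{p:02d}/cell_{i}.png")
           for p in range(20) for i in range(5)]
splits = patient_level_split(records)
print({name: len(items) for name, items in splits.items()})
```

Splitting by patient tends to give a more conservative performance estimate than splitting by image, since the test set then contains only patients the model has never seen.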

  6. MODEL DEVELOPMENT:

    In the proposed method, leukemia is identified from peripheral blood smear images using a Convolutional Neural Network (CNN) architecture. Because CNNs can automatically learn spatial hierarchies of features, including cell shapes, nuclear patterns, and chromatin distribution, they are well suited to classifying medical images.

    The following steps were part of the development process:

    1. Preprocessing

      Every blood smear image was resized to a fixed resolution, such as 224 × 224 pixels. Images were then normalized to scale pixel values between 0 and 1.

      Rotation, flipping, zooming, brightness adjustment, and other data augmentation techniques were used to increase the resilience of the model and decrease overfitting.
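The resizing, normalization, and augmentation steps described above can be sketched with NumPy as follows. This is only an illustration, not the authors' pipeline: the nearest-neighbour resize, the restriction to 90-degree rotations, the brightness range of 0.8 to 1.2, and the random stand-in image are all assumptions, and zooming is omitted for brevity.

```python
import numpy as np

def resize_nearest(image: np.ndarray, size: int = 224) -> np.ndarray:
    """Resize an H x W x C image to size x size with nearest-neighbour sampling."""
    height, width = image.shape[:2]
    rows = np.arange(size) * height // size   # source row for each output row
    cols = np.arange(size) * width // size    # source column for each output column
    return image[rows[:, None], cols[None, :]]

def normalize(image: np.ndarray) -> np.ndarray:
    """Scale 8-bit pixel values to floats in [0, 1]."""
    return image.astype(np.float32) / 255.0

def augment(image: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Apply a random flip, a random 90-degree rotation, and a brightness change."""
    if rng.random() < 0.5:
        image = image[:, ::-1]                           # horizontal flip
    image = np.rot90(image, k=int(rng.integers(0, 4)))   # rotate by 0, 90, 180, or 270 degrees
    factor = float(rng.uniform(0.8, 1.2))                # assumed brightness range
    return np.clip(image * factor, 0.0, 1.0).astype(np.float32)

rng = np.random.default_rng(seed=0)
raw = rng.integers(0, 256, size=(300, 400, 3), dtype=np.uint8)  # stand-in for a smear image
sample = augment(normalize(resize_nearest(raw)), rng)
print(sample.shape, sample.dtype)
```

Augmentation of this kind is normally applied only to the training subset, so that validation and test images stay unaltered.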

    2. CNN Architecture

      Input Layer: Accepts preprocessed blood smear images.

      Convolutional Layers: Extract both low-level and high-level features, including cell borders, chromatin texture, and nuclear morphology.

      Pooling Layers: Reduce spatial dimensions while preserving important features.

      Fully Connected Layers: Combine the extracted features to classify images.

      Output Layer: Assigns the image to a class (e.g., Normal vs. Leukemia) using a softmax activation function.
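To make the role of the convolutional and pooling layers concrete, the following sketch traces how the spatial size of a 224 × 224 input changes through a stack of convolution and pooling blocks. The paper does not specify the number of blocks, kernel sizes, or filter counts, so the three blocks with 3 × 3 convolutions, 2 × 2 max-pooling, and 32/64/128 filters used here are purely illustrative assumptions.

```python
def conv_output_size(size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Spatial size after a convolution or pooling window: floor((n + 2p - k) / s) + 1."""
    return (size + 2 * padding - kernel) // stride + 1

def trace_shapes(input_size: int, filters: list[int]) -> list[tuple[str, tuple[int, int, int]]]:
    """Follow an image through repeated [3x3 conv (padding 1) -> 2x2 max-pool] blocks."""
    shapes = [("input", (input_size, input_size, 3))]
    size = input_size
    for index, channels in enumerate(filters, start=1):
        size = conv_output_size(size, kernel=3, stride=1, padding=1)   # conv keeps the size
        shapes.append((f"conv{index}", (size, size, channels)))
        size = conv_output_size(size, kernel=2, stride=2)              # pooling halves it
        shapes.append((f"pool{index}", (size, size, channels)))
    return shapes

layers = trace_shapes(224, filters=[32, 64, 128])   # filter counts are assumed, not from the paper
for name, shape in layers:
    print(f"{name:>6}: {shape}")

height, width, channels = layers[-1][1]
flattened = height * width * channels
print("features passed to the fully connected layers:", flattened)
```

Each pooling step halves the height and width, which is why deeper layers can afford more filters while the fully connected layers still receive a manageable number of features.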

    3. Training

      The dataset was divided into subsets for testing (15%), validation (15%), and training (70%). Classification error was measured using the cross-entropy loss function.
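As a small worked example of the cross-entropy loss, the sketch below converts two raw class scores into softmax probabilities and computes the loss for each possible true label. The logit values are hypothetical and chosen only for illustration.

```python
import math

def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(logits)                              # subtract the max for numerical stability
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]

def cross_entropy(probabilities, true_class):
    """Negative log-probability assigned to the correct class."""
    return -math.log(probabilities[true_class])

# Hypothetical logits for one image; class 0 = normal, class 1 = leukemia.
probs = softmax([0.5, 2.5])
print([round(p, 4) for p in probs])
print("loss if the cell is leukemic:", round(cross_entropy(probs, 1), 4))
print("loss if the cell is normal:  ", round(cross_entropy(probs, 0), 4))
```

The loss is small when the model assigns high probability to the correct class and grows quickly when it is confidently wrong, which is the behaviour that drives training.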

      The Adam optimizer was used with a learning rate of 0.001. To avoid overfitting, dropout layers and early stopping were used.
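Early stopping can be sketched independently of any framework as a counter on the validation loss. The patience of 3 epochs and the loss values below are assumptions made for illustration; the paper does not report the settings that were used.

```python
class EarlyStopping:
    """Stop training when validation loss has not improved for `patience` epochs."""

    def __init__(self, patience: int = 3, min_delta: float = 0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best_loss = float("inf")
        self.best_epoch = -1
        self.stale_epochs = 0

    def should_stop(self, epoch: int, val_loss: float) -> bool:
        if val_loss < self.best_loss - self.min_delta:
            self.best_loss, self.best_epoch = val_loss, epoch   # new best: reset the counter
            self.stale_epochs = 0
        else:
            self.stale_epochs += 1
        return self.stale_epochs >= self.patience

# Made-up validation losses: improvement stalls after epoch 3.
val_losses = [0.62, 0.41, 0.33, 0.30, 0.31, 0.32, 0.34, 0.29]
stopper = EarlyStopping(patience=3)
for epoch, loss in enumerate(val_losses):
    if stopper.should_stop(epoch, loss):
        print(f"stopped at epoch {epoch}; best epoch was {stopper.best_epoch}")
        break
```

In practice the weights from the best epoch are usually restored after stopping, so the final model corresponds to the lowest validation loss rather than to the last epoch trained.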

      Figure: Tested Blood Smear Images

    4. Metrics for Evaluation

      The following common performance indicators were used to assess the model:

      Accuracy: the overall correctness of the classification.

      Precision and Recall: the reliability and completeness of identifying leukemia-positive instances.

      F1-score: the harmonic mean of precision and recall.

      Confusion Matrix: a summary of the classification outcomes for each class.
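All of these metrics follow from the four confusion-matrix counts. The sketch below computes them in plain Python for a binary problem; the toy labels are illustrative only and are not taken from the study's data.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, FP, FN, TN) for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    return tp, fp, fn, tn

def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 derived from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Toy labels (1 = leukemia, 0 = normal); illustrative only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
print(confusion_counts(y_true, y_pred))
print(classification_metrics(y_true, y_pred))
```

In a screening setting, recall on the leukemia class is often the figure to watch most closely, because a false negative means a missed case.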

    5. Implementation

      The model was implemented in Python using TensorFlow/Keras.

      To speed up training, the model was trained in a GPU-enabled environment.

  7. MEASUREMENT:

    The chosen blood smear dataset was used to train and assess the proposed CNN-based model. Following preprocessing and augmentation, the model performed well in differentiating between leukemia-affected cells and normal cells.

    1. Accuracy

      On the test dataset, the model's overall classification accuracy ranged from 96% to 98%.

    2. F1-Score, Precision, and Recall

      Leukemia class precision: about 97%

      Leukemia class recall: about 96%

      F1-Score: around 96.5%

      These findings demonstrate how well the model detects leukemia cells while limiting false positives.
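As a consistency check on these figures, the F1-score is the harmonic mean of precision P and recall R. Taking the approximate values reported above (P ≈ 0.97, R ≈ 0.96) as given:

```latex
\[
F_1 = \frac{2PR}{P + R}
    = \frac{2(0.97)(0.96)}{0.97 + 0.96}
    = \frac{1.8624}{1.93}
    \approx 0.965
\]
```

This agrees with the stated F1-score of about 96.5%.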

    3. Confusion Matrix

      According to the confusion matrix, both normal and leukemia images had high correct classification rates and few misclassifications.

    4. ROC-AUC Value

      With an AUC score of 0.97, the model demonstrated exceptional capacity to distinguish between healthy and malignant blood cells.
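The AUC can be read as the probability that a randomly chosen leukemic cell receives a higher score than a randomly chosen normal cell. The sketch below computes it directly from that definition; the labels and scores are hypothetical and unrelated to the reported value of 0.97.

```python
def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5          # ties count as half a win
    return wins / (len(positives) * len(negatives))

# Hypothetical leukemia probabilities from a classifier (1 = leukemia, 0 = normal).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.95, 0.80, 0.70, 0.40, 0.60, 0.30, 0.20, 0.10]
print(f"AUC = {roc_auc(y_true, scores):.4f}")
```

The pairwise loop is quadratic in the number of samples, which is fine for illustration; library implementations use a sort-based computation that gives the same result.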

    5. Visualization

      Convolutional layer feature maps revealed unique morphological features in the distribution of chromatin and nuclei, which helped with precise classification.
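A feature map is simply the response of one convolutional filter at every position of the input. The sketch below shows this on a toy example, assuming a synthetic 8 × 8 "cell" and a hand-written vertical-edge filter; a trained CNN learns its filters from data, so this illustrates the mechanism rather than the model's actual features.

```python
import numpy as np

def conv2d_valid(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Slide `kernel` over a 2-D image (no padding) and return the feature map.

    As in most CNN libraries, the kernel is not flipped (cross-correlation).
    """
    kh, kw = kernel.shape
    out_h, out_w = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    feature_map = np.zeros((out_h, out_w), dtype=np.float32)
    for row in range(out_h):
        for col in range(out_w):
            feature_map[row, col] = np.sum(image[row:row + kh, col:col + kw] * kernel)
    return feature_map

# Toy 8x8 "cell": a 4x4 nucleus (value 1) on an empty background (value 0).
image = np.zeros((8, 8), dtype=np.float32)
image[2:6, 2:6] = 1.0

# Hand-written filter that responds where intensity rises from left to right.
kernel = np.array([[-1, 0, 1],
                   [-1, 0, 1],
                   [-1, 0, 1]], dtype=np.float32)

feature_map = np.maximum(conv2d_valid(image, kernel), 0.0)   # ReLU keeps positive responses
print(feature_map)
```

The non-zero responses cluster along the left boundary of the synthetic nucleus, which is the kind of localized activation that feature-map visualizations are meant to reveal.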

    6. Performance Comparison

      The CNN achieved much higher accuracy than traditional machine learning models (e.g., SVM, Random Forest), demonstrating the advantage of deep learning in medical image-based cancer detection.

      Figure: Performance comparison of ML models vs CNN for leukemia detection

  8. CONCLUSION:

    Using blood smear images, this study shows how well machine learning and deep learning models detect blood cancer (leukemia). Among the tested methods (SVM, Random Forest, KNN, and CNN), the Convolutional Neural Network (CNN) performed best, with better accuracy, precision, recall, and F1-score than the conventional ML classifiers. The findings demonstrate the CNN's capacity to automatically extract intricate features from medical images, reducing the need for manual feature engineering.

    The results imply that deep learning-based methods can serve as a trustworthy decision-support tool for pathologists, facilitating quicker and more precise leukemia diagnosis. To ensure generalizability in actual clinical settings, additional validation on bigger, more varied datasets is necessary. To increase interpretability and confidence in the diagnostic process, future research may incorporate explainable AI (XAI) methodologies.

    Figure: Comparison of model accuracy and model loss.

  9. FUTURE SCOPE:

    1. Bigger and More Diverse Datasets: Training the model on larger, multi-center datasets can increase its robustness and help it generalize to other patient groups.

    2. Explainable AI (XAI): By using explainability techniques (such as Grad-CAM or LIME), pathologists can gain a better understanding of the areas of blood smear images that affect the model's judgment, hence boosting adoption and trust.

    3. Hybrid Models: By combining CNNs with more conventional machine learning classifiers, such as CNN for feature extraction and Random Forest for classification, accuracy and efficiency may be further increased.

    4. Real-Time Deployment: To make the model available in remote or resource-constrained healthcare settings, a lightweight version of the model is being developed for deployment in cloud-based or mobile diagnostic applications.

    5. Integration with Clinical Data: A more comprehensive and accurate diagnosis could result from combining imaging data with genetic markers, patient history, and laboratory testing.
