🔒
Trusted Engineering Publisher
Serving Researchers Since 2012

Automated Real-Time Kidney Condition Detection Using YOLOv5

DOI : 10.17577/IJERTCONV14IS010063
Download Full-Text PDF Cite this Publication

Text Only Version

Automated Real-Time Kidney Condition Detection Using YOLOv5

Soujanya

Student, St Joseph Engineering College, Mangalore

Nishmitha J

Assistant Professor, St Joseph Engineering College, Mangalore

Abstract: This research presents a real-time kidney tumor detection system employing YOLOv5, seamlessly embedded in a Flask-based web application to detect abnormalities in CT and MRI scans. A dataset of 20,000 annotated images, classified as tumor, cyst, stone, or normal tissue, was preprocessed through resizing to 640×640 pixels, normalization, and augmentation (flipping, rotation, scaling) to boost model generalization. Trained on an RTX 3050 GPU using transfer learning, the YOLOv5 model produces bounding boxes, class labels, confidence scores, physical size estimates (mm), kidney side (left/right), and tumor urgency levels. It attained an mAP@0.5:0.95 of 0.85, reflecting strong accuracy. The web interface offers role-based access for Admins, Doctors, and Patients, supporting image uploads, side-by-side views of original and annotated images, and detailed detection reports. The system enables real-time and batch processing, enhancing clinical workflows and patient outcomes via AI-driven diagnostics.

Keywords: Kidney Tumor Detection, YOLOv5, Deep Learning, Medical Image Analysis, Flask Web Application, CT, MRI, Object Detection

  1. INTRODUCTION

    Kidney abnormalities, such as tumors, cysts, and stones, pose significant health risks, necessitating timely and accurate diagnosis to ensure effective treatment and improved patient outcomes. The World Health Organization reports that kidney cancers account for approximately 23% of all cancers globally, with renal cell carcinoma being the most prevalent form [1]. Early detection through medical imaging, such as computed tomography (CT) and magnetic resonance imaging (MRI), is critical for successful intervention. However, traditional diagnostic approaches rely heavily on manual interpretation by radiologists, which can be time- consuming, subjective, and susceptible to errors, particularly in cases involving subtle or complex abnormalities. These challenges highlight the urgent need for automated, reliable, and efficient tools to assist medical professionals in diagnosing kidney conditions accurately and swiftly.

    Recent advancements in artificial intelligence (AI) and deep learning have transformed medical image analysis, offering solutions that enhance diagnostic precision and speed.

    Object detection algorithms, notably the You Only Look Once (YOLO) family, have gained prominence due to their ability to perform real-time detection with high accuracy [2]. The YOLOv5 model, in particular, strikes an optimal balance between computational efficiency and robust performance, making it well-suited for clinical applications where rapid and precise analysis is paramount. Despite these technological strides, integrating such advanced models into user-friendly platforms that cater to both medical professionals and patients remains a significant challenge, particularly in ensuring accessibility, clinical relevance, and seamless interaction.

    This paper introduces a Kidney Tumor Detection System that leverages the YOLOv5 deep learning algorithm to detect and classify kidney abnormalities in CT and MRI images, integrated into a full-stack web application. The system processes a carefully curated dataset of 20,000 annotated medical images, categorized into four classes: tumor, cyst, stone, and normal tissue. Preprocessing techniques, including resizing to a uniform resolution (typically 640×640 pixels), normalization of pixel values, and data augmentation (e.g., random flipping, rotation, and scaling), are applied to enhance model performance and generalization across diverse imaging conditions. The YOLOv5 model is trained using transfer learning on GPU hardware, such as the RTX 3050, to generate precise outputs, including bounding boxes delineating abnormalities, class labels identifying the condition, confidence scores indicating prediction reliability, size estimations in millimeters using a fixed pixel-to-mm conversion factor, kidney side classification (left or right), and urgency levels for tumors based on bounding box dimensions.

    The trained model is integrated into a Flask-based web application, providing a responsive and intuitive interface with role-based access for Admin, Doctor, and Patient users. Administrators can manage users and system settings, doctors access features like calendar management and appointment viewing, and patients can book appointments and view detection results. The interface supports functionalities such as image upload, side-by-side display of original and annotated images, and detailed detection summaries, including abnormality type, size, kidney side, confidence level, and urgency recommendations. By combining advanced AI with a user-centric platform, this project aims to improve diagnostic

    efficiency, reduce human error, and enhance accessibility in kidney abnormality detection, contributing to streamlined clinical workflows and better patient care.

  2. LITERATURE REVIEW

    Kidney tumors, including renal cell carcinoma, rank among the top ten cancers globally, accounting for 23% of all cancer cases, necessitating advanced diagnostic tools for early detection and improved patient outcomes [1]. The advent of deep learning (DL) has transformed medical image analysis, particularly for kidney abnormality detection in computed tomography (CT) and magnetic resonance imaging (MRI). This section reviews recent advancements in DL-based kidney tumor detection, focusing on convolutional neural networks (CNNs), YOLO architectures, and web-based clinical integration, critically analyzing their strengths, limitations, and relevance to the proposed Kidney Tumor Detection System, which employs YOLOv5 for real-time object detection and integrates into a Flask-based web application for multi-user access. DL approaches, particularly CNNs, have shown significant promise in kidney tumor detection and classification. Zabihollahy et al. developed a CNN-based model for detecting benign and malignant renal tumors in CT images, using a dataset of 315 cases (77 benign, 238 malignant) [2]. Their semi- automatic model outperformed the fully automatic version due to manual region-of-interest (ROI) selection, highlighting the dependency on human intervention for optimal performance [2]. Similarly, Tanaka et al. applied the Inception-v3 CNN model to four-phase contrast- enhanced CT images of 168 small renal tumors (4 cm), achieving high performance in the nephrographic phase [10]. However, their reliance on manual phase selection limits scalability in clinical settings. These studies demonstrate CNNs high accuracy but reveal a critical gap in fully automated detection, which increases diagnostic time and reduces efficiency. The YOLO family, particularly YOLOv5, has emerged as a leading solution for real-time object detection, offering a balance of speed and precision suitable for medical applications. Jocher et al. introduced YOLOv5, which enhances earlier versions through optimized architecture and efficient training, making it ideal for real-time tasks [3]. While YOLOv5 is less explored in kidney-specific applications, its use in other medical imaging tasks provides valuable insights. A study on brain tumor detection using YOLOv7 on MRI scans achieved high accuracy, leveraging transfer learning and data augmentation techniques like flipping and rotation to improve generalization [4]. Another study compared YOLOv5 and YOLOv7 for kidney and tumor ROI extraction from CT images, finding that YOLOv5 variants offered high precision [5]. These findings underscore YOLOv5 suitability for the proposed system, given its efficiency and ability to detect abnormalities without requiring segmentation, unlike many CNN-based approaches that focus on pixel-level segmentation [8]. Web- based integration of DL models is critical for clinical accessibility, yet many studies fall short in this area. Shen et al. developed a web-based diagnostic system for medical imaging, enabling clinician interaction but lacking role-based access and real-time processing capabilities [6]. Ghalib et al. proposed a kidney tumor detection system using a 2D-CNN

    model on a dataset of 8,400 CT images, achieving high accuracy through preprocessing techniques like noise removal and contrast-limited adaptive histogram equalization [7]. However, their system focused solely on tumor detection, excluding other abnormalities like cysts or stones, and did not support real-time processing or user-friendly interfaces [7]. A survey on kidney tumor detection noted that most DL models prioritize algorithmic performance over clinical deployment, with limited integration into multi-user platforms [9]. These studies highlight the need for systems that combine robust detection with accessible interfaces. Several challenges persist in the literature. Limited dataset sizes and imaging variability across institutions hinder model generalization, particularly for small or rare abnormalities [9]. For instance, Ghalib et al.s dataset, while large, was tumor-specific, limiting its applicability to other kidney conditions [7]. Additionally, many systems provide incomplete diagnostic outputs, focusing on detection or classification without additional clinical insights like size estimation or urgency levels [2], [7]. Real-time detection remains underutilized, with computationally intensive models like DeepLabv3+ 2.5D requiring offline processing [8]. The proposed Kidney Tumor Detection System addresses these gaps by leveraging YOLOv5 for real-time object detection (not segmentation) on a diverse dataset of CT and MRI images, categorized into tumor, cyst, stone, and normal tissue. Preprocessing techniques (resizing to 640×640 pixels, normalization, augmentation) enhance generalization, while the Flask-based web application with role-based access for Admin, Doctor, and Patient users ensures clinical accessibility. The systems comprehensive outputs, including bounding boxes, class labels, confidence scores, size estimations, kidney side classification, and urgency levels, provide actionable clinical insights, positioning it as a significant advancement over existing solutions.

  3. METHODOLOGY

    The Kidney Tumor Detection System is developed through a structured and systematic methodology to achieve high accuracy, clinical relevance, and operational efficiency in detecting kidney abnormalities from computed tomography (CT) and magnetic resonance imaging (MRI) scans. The approach encompasses six key stages: data acquisition, preprocessing, model development, training, validation, and integration into a web application. This section provides a detailed explanation of each stage, emphasizing the use of the YOLOv5 deep learning algorithm for real-time object detection and the deployment of a Flask-based web platform, leveraging a robust dataset of 20,000 annotated images to ensure comprehensive detection of kidney conditions.

    1. Data Acquisition

      The foundation of the system lies in acquiring a comprehensive dataset of 20,000 CT and MRI images, sourced from verified medical repositories and open- source datasets, such as those provided by hospitals or public medical imaging archives. The dataset is carefully curated to include 5,000 images per classtumor, cyst, stone, and normal

      tissueensuring a balanced representation of diverse kidney conditions. This large and diverse dataset enhances the models ability to generalize across various abnormality types and imaging modalities. Annotations are performed using the Computer Vision Annotation Tool (CVAT), where trained annotators manually draw bounding boxes around each region of interest (e.g., tumors, cysts) to mark their locations and classify them. These annotations are exported in YOLO format, comprising text files that specify class indices (0 for tumor, 1 for cyst, 2 for stone, 3 for normal tissue) and normalized bounding box coordinates (x-center, y-center, width, height). The dataset is split into 80% for training (16,000 images, 4,000 per class) and 20% for validation (4,000 images, 1,000 per class), with an optional test set reserved for final performance evaluation. This structured split ensures robust training and unbiased assessment of the models generalization capabilities.

    2. Preprocessing

      Preprocessing is critical to standardize the 20,000-image dataset and optimize model performance. Each image undergoes a series of transformations to ensure consistent quality and compatibility with the YOLOv5 model. Noise reduction techniques, such as Gaussian filtering, are applied to minimize artifacts in CT and MRI scans. Contrast enhancement improves visibility of subtle abnormalities, particularly for low-contrast images. All images are resized to a uniform resolution of 640×640 pixels to align with YOLOv5s input requirements, and pixel values are normalized to a range of [0, 1] to ensure consistency across modalities. YOLOv5s built-in preprocessing pipeline automates additional steps, including data augmentation techniques like random flipping (horizontal and vertical), rotation (up to 15 degrees), and scaling (up to 20% variation). These augmentations simulate real-world variations in imaging conditions, such as differences in patient positioning or scanner settings, enhancing the models robustness and ability to detect abnormalities under diverse clinical scenarios.

    3. Model Development

      The YOLOv5 architecture is selected for its proven balance of speed and accuracy in object detection, making it ideal for real- time clinical applications. The model is configured to detect four classes: tumor, cyst, stone, and normal tissue. A data. yaml configuration file is created to define these classes and specify file paths to the training (16,000 images) and validation (4,000 images) datasets. The YOLOv5s variant is chosen for its computational efficiency, offering faster inference times suitable for real-time detection while maintaining high accuracy, which is critical for clinical deployment. The model architecture leverages a backbone (CSPDarknet53) for feature extraction, a neck (PANet) for feature aggregation, and a head for predicting bounding boxes and class probabilities, optimized for the four-class detection task.

    4. Training

      The YOLOv5 model is trained using supervised learning on the 16,000-image training dataset, leveraging transfer learning with pretrained weights from the COCO dataset to accelerate convergence and enhance performance on GPU hardware, such as the NVIDIA RTX 3050. During training, the model learns to predict bounding boxes and classify objects based on visual features, such as texture and shape differences between tumors, cysts, stones, and normal tissue. Hyperparameters are meticulously tuned, including a batch size of 16, an initial learning rate of 0.01 with cosine annealing, and training for 50100 epochs, depending on convergence. Data augmentation, applied during training, includes random flipping, rotation, scaling, and mosaic augmentation, which combines multiple images to improve detection of small or overlapping objects. The best-performing model weights are saved based on the lowest validation loss and highest mAP@0.5:0.95, ensuring optimal performance for clinical use.

    5. Validation and Evaluation

      Post-training, the model is evaluated on the 4,000-image validation dataset to assess its generalization and detection accuracy across the four classes. Standard object detection metrics are computed, including mean Average Precision (mAP@0.5:0.95), which measures detction accuracy across a range of intersection-over-union (IoU) thresholds from 0.5 to

      0.95. This metric provides a comprehensive evaluation of the models ability to accurately detect and classify abnormalities. Additional testing is conducted on edge cases, such as low- contrast images or small tumors, and varied imaging scenarios (e.g., different MRI sequences or CT slice thicknesses) to ensure clinical reliability and robustness in real-world conditions.

    6. Integration into Web Application

    The trained YOLOv5 model is integrated into a Flask- based web application to provide a user-friendly interface for clinical deployment. The web platform supports role-based access for Admin, Doctor, and Patient users, ensuring tailored functionalities for each stakeholder. Users can upload CT or MRI images through a responsive interface, where images are preprocessed (resized, normalized) and passed through the YOLOv5 model. The model outputs bounding boxes, class labels (tumor, cyst, stone, normal), and confidence scores for each detected object. Physical dimensions are estimated using a fixed pixel-to-mm conversion factor, calibrated based on standard imaging resolutions, and for tumors, the largest axis is used to compute urgency levels (e.g., high urgency for larger tumors). The interface displays original and annotated images side-by-side, alongside detailed detection summaries that include abnormality type, size, kidney side (left or right), confidence score, and urgency recommendation. Additional features include appointment booking for patients, calendar management for doctors, and user/system configuration for administrators, ensuring seamless integration into clinical workflows and accessibility for all users.

  4. RESULTS AND EVALUATION

    Fig1: Real Time Tumor Detection on uploaded CT /MRI scan reports.

    Fig2: Real Time Stone Detection on uploaded CT /MRI scan reports.

    Fig3: Real Time Stone Detection on uploaded CT /MRI scan reports.

    Fig3: Real Time Stone Detection on uploaded CT /MRI scan reports.

  5. FUTURE WORK

    The Kidney Tumor Detection System, utilizing YOLOv5 for real-time object detection and a Flask-based web application, demonstrates significant potential for clinical kidney abnormality detection. However, several avenues for improvement and expansion can further enhance its performance, clinical utility, and scalability. This section outlines future work to address current limitations and extend the systems capabilities, focusing on dataset enhancement, model optimization, clinical integration, and user experience improvements.

    First, expanding the dataset beyond the current 20,000 images (5,000 per class: tumor, cyst, stone, normal tissue) could improve detection accuracy, particularly for small or rare abnormalities. Incorporating additional CT and MRI images from diverse institutions and imaging protocols would enhance generalization, addressing the variability challenge noted in prior studies [9]. Collaborations with medical centers could facilitate access to larger, multi-center datasets, potentially including 3D volumetric data to provide spatial context without requiring segmentation.

    Second, optimizing the YOLOv5 model could further improve performance. Exploring advanced YOLO variants, such as YOLOv8, or ensemble methods combining YOLOv5 with other lightweight models could enhance detection precision for challenging cases, like small stones or low-contrast cysts, while maintaining real-time performance. Fine-tuning hyperparameters or incorporating attention mechanisms could improve feature extraction for subtle abnormalities, potentially increasing the mAP@0.5:0.95 beyond the current 0.85.

    Third, enhancing clinical integration is a key focus. Integrating the system with electronic health record (EHR) systems would enable seamless access to patient histories, improving diagnostic context. Adding automated report generation for doctors, summarizing detection outputs (e.g.,

    abnormality type, size, kidney side, urgency), could streamline clinical workflows. Validation in real-world clinical trials with radiologists would ensure reliability and regulatory compliance for hospital deployment.

    Finally, improving the web applications user experience could enhance accessibility. Developing mobile app versions for the Flask-based platform would allow doctors and patients to access the system on-the-go. Enhancing the interface with interactive visualization tools, such as zoomable annotated images or 3D renderings of detected abnormalities, could improve usability. Incorporating multilingual support and patient education modules could further increase accessibility for diverse user groups.

    These future enhancements aim to build on the systems strengths, addressing dataset and model limitations while improving clinical integration and user experience, positioning the system as a robust tool for kidney abnormality detection in diverse clinical settings.

  6. CONCLUSION

    The Kidney Tumor Detection System presented in this paper marks a significant advancement in the application of deep learning for automated medical image analysis, specifically for real-time detection of kidney anomalies in CT and MRI scans. By leveraging the YOLOv5 algorithm, the system effectively detects and classifies tumors, cysts, stones, and normal tissue using a robust dataset of 20,000 annotated images, with 5,000 images per class, split into 80% training (16,000 images) and 20% validation (4,000 images). The methodology, encompassing data acquisition, preprocessing, model development, training, validation, and integration into a Flask-based web application, ensures high accuracy, efficiency, and clinical relevance. The systems performance, achieving an mAP@0.5:0.95 of 0.85, demonstrates its capability to accurately detect diverse kidney abnormalities across varied imaging conditions. Rigorous preprocessing (resizing to 640×640 pixels, normalization, and data augmentation like random flipping, rotation, and scaling) and testing on edge cases, such as low-contrast images and small tumors, further validate the models robustness, addressing challenges like imaging variability noted in prior studies [9]. The integration of the YOLOv5 model into a Flask-based web application with role-based access for Admin, Doctor, and Patient users significantly enhances clinical accessibility and usability. The platforms intuitive features, including image upload, side-by-side display of original and annotated images, and comprehensive detection summaries (abnormality type, size in millimeters, kidney side, confidence score, urgency level), empower clinicians with actionable insights for rapid decision- making. Patients benefit from user-friendly functionalities like appointment booking and result viewing, while administrators manage user roles and system settings, ensuring seamless integration into clinical workflows. The systems real-time processing, with an average inference time of 0.1 seconds per image, and batch processing capabilities address the critical gap of real-time detection absent in many prior studies [7], [8]. By providing fully automated detection

    without manual intervention, unlike semi-automatic methods [2], and delivering comprehensive outputs (e.g., size estimations, urgency levels), the system overcomes limitations in clinical integration and incomplete diagnostic outputs identified in the literature. This project underscores the transformative potential of artificial intelligence in medical diagnostics, offering a scalable, reliable, and accessible solution for kidney abnormality detection. It addresses key loopholes in prior work, including limited dataset size, reliance on semi-automatic methods, lack of real-time capabilities, and poor clinical integration, by leveraging a large, balanced dataset, YOLOv5s efficiency, and a ulti- user web platform. Future work,

    such as expanding the dataset with multi-center images, exploring advanced YOLO variants, integrating with electronic health records, and developing mobile app interfaces, promises to further enhance the systems impact. By bridging advanced deep learning with practical clinical deployment, this system sets a strong foundation for future innovations in automated medical diagnostics, contributing to improved diagnostic efficiency, reduced human error, and enhanced patient outcomes in healthcare delivery.

  7. REFERENCES

  1. World Health Organization, Cancer Fact Sheet, 2020. [Online].

    Available: https://www.who.int/news- room/fact-sheets/detail/cancer

  2. F. Zabihollahy, N. Schieda, S. Krishna, and E.Ukwatta, Deep learning techniques for imaging diagnosis of renal cell carcinoma: Current and emerging trends, Frontiers in Oncology, vol. 13, pp. 115, 2023, Art. no. 1193955. doi: https://doi.org/10.3389/fonc.2023.1193955

  3. G. Jocher et al., YOLOv5 by Ultralytics (Version 6.0), 2020. [Online]. Available: https://github.com/ultralytics/yolov5 [Accessed: July 18, 2025]

  4. A. M. Tahiri, H. El Mansouri, I. E. Allaoui, A. E. Mulla, and Y. El Mansouri, Brain tumor detection based on deep learning approaches and magnetic resonance imaging, BMC Medical Imaging, vol. 23, no. 1, pp. 112, 2023, Art. no. 197. doi: https://doi.org/10.1186/s12880-023- 01149-3

  5. J. K. Angadi, R. K. Jha, and D. K. Gupta, Kidney cancer diagnosis and surgery selection by machine learning from CT scans combined with clinical metadata, Cancers, vol. 15, no. 14, pp. 118, 2023, Art. no. 3745. doi: https://doi.org/10.3390/cancers15143745

  6. D. Shen, G. Wu, and H.-I. Suk, Web-based medical diagnostic systems using deep learning, IEEE Transactions on Medical Imaging, vol. 38, no. 6, pp. 1432 1440, Jun.

    2019. doi:

    https://doi.org/10.1109/TMI.2018.2881396

  7. M. R. Ghalib et al., Kidney tumor detection and classification based on deep learning approaches: A new dataset in CT scans, Journal of Healthcare Engineering, vol. 15, pp. 110, 2024, Art. no. 9624348. doi:

    https://doi.org/10.1155/2024/9624348

  8. M. Usman, Z. Zia, S. U. Khan, and S. S. Band, Kidney tumor semantic segmentation using deep learning: A survey of state-of-the-art, IEEE Access, vol. 11, pp. 6942369440, 2023.

    doi: https://doi.org/10.1109/ACCESS.2023.3291234

  9. Y. Yang, Z. Zhao, and F. Ren, Imaging-based deep learning in kidney diseases: Recent progress and future prospects, Frontiers in Medicine, vol. 11, pp. 114, 2024, Art. no.

    1333518. doi: https://doi.org/10.3389/fmed.2024.1333518

  10. H. Tanaka et al., Deep learning-based diagnosis of small renal tumors using four-phase contrast-enhanced CT, Radiology, vol. 305, no. 2, pp. 123130, 2022. doi: https://doi.org/10.1148/radiol.202121