Brain Stroke Prediction Using Imaging Modalities and Machine Learning

Aniketh B Devadiga; Hareesh B

doi:10.17577/IJERTCONV14IS010078

Techprints 9.0 - 2026 (Volume 14 - Issue 01)

Open Access
Authors : Aniketh B Devadiga, Hareesh B
Paper ID : IJERTCONV14IS010078
Volume & Issue : Volume 14, Issue 01, Techprints 9.0
Published (First Online) : 01-03-2026
ISSN (Online) : 2278-0181
Publisher Name : IJERT
License: This work is licensed under a Creative Commons Attribution 4.0 International License

Aniketh B Devadiga

Department of Computer Applications St Joseph Engineering College

Mangalore, India

Hareesh B

Associate Professor Department of Computer Applications

St Joseph Engineering College Mangalore, India

Abstract – Brain stroke is a major medical emergency and the second leading cause of death worldwide. This research presents an integrated stroke detection application based on Convolutional Neural Networks (CNNs) built with TensorFlow. The CNN processes preprocessed brain scan images and was trained for 100 epochs, achieving 96.72% training accuracy and 95.99% test accuracy. A web-based interface built with React, Flask, and MySQL lets healthcare practitioners upload patient brain images and obtain real-time analysis. The dataset comprises 13,636 brain scan images, and the system aims to improve diagnostic accuracy, convenience, and speed while providing a non-invasive and scalable option.

Index Terms – Brain stroke, Convolutional Neural Networks, Medical imaging, Early detection, Deep learning, Binary classification, TensorFlow.

INTRODUCTION

Brain stroke affects nearly 15 million people each year and causes over 6.7 million deaths worldwide. It occurs when blood flow to the brain is interrupted, either by a blockage (ischemic stroke) or by bleeding in the brain (hemorrhagic stroke). Because stroke is the second leading cause of death worldwide, clinicians must diagnose it quickly under the principle that "time is brain": an estimated 1.9 million neurons are lost for every minute of delay. Treatment is most effective when initiated within 60 minutes of symptom onset. Current guidelines recommend conventional diagnostic methods, including the NIH Stroke Scale (NIHSS), CT scans, and MRI. These methods are slow and are subject to inter-observer variability and to delays in interpreting the images. Initial assessment and diagnostic accuracy also vary considerably under the standard care process, with suggested shortfalls of up to 25%. This study proposes an AI solution that uses convolutional neural networks (CNNs) built with TensorFlow for automated, rapid, and accurate stroke classification.
LITERATURE REVIEW
1. Traditional Methods of Stroke Detection
  
  Traditional stroke diagnosis combines clinical evaluation of symptoms with neuroimaging. Clinical evaluation typically relies on rapid assessment tools such as the FAST test and the NIH Stroke Scale (NIHSS), which assesses neurological deficits across multiple domains. Both tools have limited specificity and sensitivity. Neuroimaging is the gold standard for stroke detection. The initial imaging modality is usually non-contrast CT because of its availability and speed. CT can differentiate ischemic from hemorrhagic stroke moderately well, but it has limited sensitivity for acute ischemic stroke in the first few hours. MRI, specifically diffusion-weighted imaging (DWI), has good sensitivity for hyperacute stroke.
2. Medical Image Processing Using Deep Learning
  
  Deep learning has had a strong impact on medical imaging, and CNNs have been among the most successful methods for processing medical images. Unlike earlier image processing methods, CNNs learn features directly from raw image data, so no manual feature engineering is needed. Frameworks such as TensorFlow have made deep learning considerably more accessible for medical imaging applications.
3. Deep Learning Approaches in Stroke Identification
Convolutional Neural Networks (CNNs) have had great success in stroke identification, learning sophisticated patterns in brain imaging data that a human observer may not be able to identify. For example, Chen et al. (2019) used CNNs to achieve radiologist-level classification performance, with sensitivity and specificity above 94%. These findings support CNNs as a basis for more sophisticated stroke identification systems, especially when built on mature frameworks such as TensorFlow.
METHODOLOGY
1. Problem Statement
  
  The principal problem addressed in this research was the design of an automated brain stroke identification and classification system that supports healthcare professionals (HCPs) in making decisions during emergency diagnosis and treatment. The aim was to overcome the limitations of traditional stroke identification by using a CNN architecture built with TensorFlow to provide fast, reliable, and consistent identification of strokes.
2. Dataset Overview
  
  The dataset contains 13,636 high-resolution brain scan images, split into 9,923 training images and 3,713 test images and divided into two categories: normal and stroke-affected brain tissue. Each category is adequately represented and drawn from a variety of clinical settings. All images underwent quality assurance, including expert review, removal of corrupted images, and standardization of image formats. The 13,636 images support robust training and unbiased evaluation.
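The split sizes quoted above can be checked with a few lines of arithmetic. The sketch below is only an illustration of that check; the function name `split_summary` is hypothetical and not part of the original system.

```python
# Image counts as reported in the dataset overview.
TRAIN_IMAGES = 9_923
TEST_IMAGES = 3_713


def split_summary(n_train: int, n_test: int) -> dict:
    """Return the total image count and the share held by each split."""
    total = n_train + n_test
    return {
        "total": total,
        "train_fraction": n_train / total,
        "test_fraction": n_test / total,
    }


summary = split_summary(TRAIN_IMAGES, TEST_IMAGES)
print(f"total images: {summary['total']}")
print(f"train share:  {summary['train_fraction']:.1%}")
print(f"test share:   {summary['test_fraction']:.1%}")
```

With these counts the two splits add up to the stated 13,636 images, at roughly 73% for training and 27% for testing.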
3. Data Preprocessing
  
  The preprocessing pipeline standardizes the images so that they are suitable for model training. The major steps are resizing each image to 128 × 128 pixels so that all images share the same dimensions, normalizing pixel intensities to the range [0, 1] by dividing by 255, and keeping the images in three-channel RGB format to retain all color information. The pipeline loads each image, applies these transformations, and handles errors from corrupted files, establishing uniform conditions for model training and evaluation.
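The normalization and validation steps can be sketched in plain Python. This is a minimal illustration, not the authors' pipeline: it assumes the image has already been decoded and resized to 128 × 128 by an image library, represents it as nested lists, and uses hypothetical function names. Returning `None` for malformed input stands in for the corrupted-file handling mentioned above.

```python
TARGET_HEIGHT, TARGET_WIDTH, CHANNELS = 128, 128, 3


def has_expected_shape(image) -> bool:
    """Check that a nested-list image is 128 x 128 with 3 channels per pixel."""
    if len(image) != TARGET_HEIGHT:
        return False
    for row in image:
        if len(row) != TARGET_WIDTH:
            return False
        if any(len(pixel) != CHANNELS for pixel in row):
            return False
    return True


def normalize(image):
    """Scale 8-bit channel values (0..255) into the range [0, 1]."""
    return [[[value / 255.0 for value in pixel] for pixel in row] for row in image]


def preprocess(image):
    """Return the normalized image, or None when the input is malformed."""
    try:
        if not has_expected_shape(image):
            return None
        return normalize(image)
    except TypeError:
        # Stand-in for the corrupted-file handling mentioned in the text.
        return None


# Synthetic example: a uniform image whose pixels are all (0, 128, 255).
example = [[[0, 128, 255] for _ in range(TARGET_WIDTH)] for _ in range(TARGET_HEIGHT)]
processed = preprocess(example)
print(processed[0][0])
```

In practice the same division by 255 would usually be applied to a whole array at once; the nested-list form is used here only to keep the example free of dependencies.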
4. Model Architecture
  
  The proposed system uses a CNN architecture built in TensorFlow specifically for binary stroke classification. The architecture is a Sequential model that begins with a 3×3 convolutional layer (32 filters, ReLU activation), followed by 2×2 max pooling. A second 3×3 convolutional layer (64 filters, ReLU activation) is followed by another pooling layer. After feature extraction, the outputs are flattened and passed to a dense layer (128 neurons, ReLU activation) followed by dropout at a rate of 0.5. The final layer is a softmax output with 2 neurons for binary classification between normal and stroke-affected tissue. The model has 7.39 million parameters (about 28.20 MB), balancing learning capacity with efficiency while remaining feasible for clinical deployment.
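The reported parameter total can be reproduced from the layer description. The sketch below counts parameters by hand and also gives a Keras definition of the same layers. It assumes unpadded ("valid") convolutions with stride 1 and 2×2 pooling in both pooling layers, which the text does not state explicitly; the function names are illustrative, and `build_model` needs TensorFlow installed.

```python
def conv2d_params(kernel: int, in_channels: int, filters: int) -> int:
    """Weights plus one bias per filter for a square convolution kernel."""
    return kernel * kernel * in_channels * filters + filters


def dense_params(inputs: int, units: int) -> int:
    """Weights plus one bias per unit for a fully connected layer."""
    return inputs * units + units


def count_parameters() -> int:
    """Total trainable parameters of the described network."""
    conv1 = conv2d_params(3, 3, 32)    # RGB input -> 32 filters
    conv2 = conv2d_params(3, 32, 64)   # 32 channels -> 64 filters
    # Assumes unpadded convolutions: 128 -> 126 -> 63 -> 61 -> 30 per side.
    flattened = 30 * 30 * 64
    dense = dense_params(flattened, 128)
    output = dense_params(128, 2)
    return conv1 + conv2 + dense + output


def build_model():
    """Keras sketch of the described layers (requires TensorFlow)."""
    import tensorflow as tf  # imported here so the arithmetic runs without it

    layers = tf.keras.layers
    return tf.keras.Sequential([
        tf.keras.Input(shape=(128, 128, 3)),
        layers.Conv2D(32, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dropout(0.5),
        layers.Dense(2, activation="softmax"),
    ])


print(count_parameters())
```

Under these assumptions the hand count comes to 7,392,578, matching the figure reported for the model, and almost all of it sits in the first dense layer.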
5. Training Configuration
The model was trained with TensorFlow for 100 epochs. It takes RGB input images of size 128×128×3 in batches of 32. Training used the categorical cross-entropy loss and the Adam optimizer, with accuracy as the evaluation metric. Validation data was evaluated at every epoch to assess performance and monitor generalization using TensorFlow callbacks. Early stopping and learning rate scheduling were used to prevent overfitting and help the model converge.
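To make the loss concrete, the sketch below computes softmax and categorical cross-entropy by hand for a two-class output, then shows how the described training setup might be expressed in Keras. The patience and factor values in the callbacks are assumptions chosen for illustration, since the text does not give them, and `compile_with_callbacks` is a hypothetical helper that needs TensorFlow.

```python
import math

EPOCHS = 100      # from the training configuration
BATCH_SIZE = 32   # from the training configuration


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_cross_entropy(one_hot, probabilities, eps=1e-7):
    """Negative log-probability assigned to the true class."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(one_hot, probabilities))


def compile_with_callbacks(model):
    """Compile a Keras model and return callbacks (requires TensorFlow)."""
    import tensorflow as tf

    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return [
        # patience and factor are illustrative assumptions
        tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=10,
                                         restore_best_weights=True),
        tf.keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.5,
                                             patience=5),
    ]


# A confident correct prediction costs less than an undecided one.
loss_confident = categorical_cross_entropy([1, 0], [0.9, 0.1])
loss_unsure = categorical_cross_entropy([1, 0], [0.5, 0.5])
print(round(loss_confident, 4), round(loss_unsure, 4))
```

The returned callbacks would be passed to `model.fit(..., epochs=EPOCHS, batch_size=BATCH_SIZE, callbacks=...)` together with one-hot encoded labels, which is what categorical cross-entropy expects.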
SYSTEM IMPLEMENTATION
1. Web-Based Interface
  
  The web interface is designed for efficiency: physicians upload brain scan images in JPEG or PNG format directly to the server and receive stroke detection results in real time. The interface performs all preprocessing and normalization automatically and returns a classification with a confidence score to support swift clinical decision-making. The original scan is fetched from the database and displayed alongside the predicted class, clearly labeled as normal or stroke. The interface is intended to be user-friendly and accessible to medical professionals in clinical settings.
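A server accepting uploads would typically check the file type before running any preprocessing. The following is a minimal sketch of such a check for the JPEG and PNG formats named above; the helper name and the exact extension set are assumptions, and the actual Flask route is not shown in the paper.

```python
import os

# Extensions corresponding to the JPEG and PNG formats named in the text.
ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def is_allowed_upload(filename: str) -> bool:
    """Return True when the file name ends in an accepted image extension."""
    _, extension = os.path.splitext(filename)
    return extension.lower() in ALLOWED_EXTENSIONS


print(is_allowed_upload("scan_001.PNG"))
print(is_allowed_upload("scan_001.dcm"))
```

An extension check only filters obvious mistakes; a corrupted or mislabeled file would still need to be caught when the image is decoded.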
2. Model Persistence and Deployment
  
  The trained CNN is saved in the Keras HDF5 model format as stroke_model.h5 for efficient storage and clinical deployment. The saved file stores both the trained weights and the architecture, so that the deployed model produces the same predictions as the trained one on new data. At 28.20 MB the model is compact and runs standalone binary predictions, classifying each scan as stroke or normal. TensorFlow's model loading and inference functions make it straightforward to integrate the model into healthcare providers' existing infrastructure.
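The stated model size follows from the parameter count if each weight is stored as a 32-bit float, which is an assumption here. The sketch below checks that arithmetic and shows how saving and loading might look with Keras; the helper names are hypothetical and the save/load helpers need TensorFlow.

```python
MODEL_PATH = "stroke_model.h5"   # file name given in the text
PARAMETER_COUNT = 7_392_578      # parameter total reported for the model
BYTES_PER_FLOAT32 = 4            # assumes weights are stored as float32


def weights_size_mib(n_params: int, bytes_per_param: int = BYTES_PER_FLOAT32) -> float:
    """Size of the raw weights in mebibytes (1 MiB = 1024 * 1024 bytes)."""
    return n_params * bytes_per_param / (1024 ** 2)


def save_model(model, path: str = MODEL_PATH) -> None:
    """Persist architecture and weights in a single file."""
    model.save(path)


def load_model(path: str = MODEL_PATH):
    """Restore the saved model for inference (requires TensorFlow)."""
    import tensorflow as tf

    return tf.keras.models.load_model(path)


print(f"{weights_size_mib(PARAMETER_COUNT):.2f} MiB")
```

The file on disk also carries metadata such as the architecture description, so its actual size may differ slightly from the raw weight size.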
3. Real-Time Forecasting and Visualization
The prediction engine uses TensorFlow to analyze uploaded brain images in real time. Once an image is uploaded, it is resized to 128×128×3 and normalized, and the CNN extracts features from it to produce a prediction with a confidence score. The results are visualized with matplotlib, showing the original image with its predicted class and a normal/stroke label for easy interpretation by clinicians and for documentation in medical records.
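Turning the two softmax outputs into a label and a confidence score could look like the sketch below. The class order ("normal" at index 0, "stroke" at index 1) is an assumption, since the paper does not state how classes are indexed, and the function name is illustrative.

```python
# Assumed class order; the paper does not state the index of each class.
CLASS_NAMES = ("normal", "stroke")


def interpret_prediction(probabilities, class_names=CLASS_NAMES):
    """Return the predicted label and its probability as a confidence score."""
    if len(probabilities) != len(class_names):
        raise ValueError("expected one probability per class")
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return class_names[best], probabilities[best]


label, confidence = interpret_prediction([0.03, 0.97])
print(f"{label} ({confidence:.0%} confidence)")
```

The probabilities would come from the model's softmax layer for a single preprocessed image, and the resulting label is what the interface displays next to the scan.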
RESULTS AND DISCUSSION
1. Analyzing Model Performance
  
  The TensorFlow CNN achieved a training accuracy of 96.72%, a test accuracy of 95.99%, and a test loss of 0.4342 after 100 epochs.
  Fig. (a) Normal Brain Tissue (b) Stroke-Affected Tissue
    
  The model differentiated normal from stroke-affected tissue using 7,392,578 parameters. Training and test accuracies were closely aligned, suggesting little overfitting and effective dropout regularization.
2. Training and Validation Results
  
  The CNN model converged consistently throughout training. Training and validation accuracies improved together across the 100 epochs, ending at a training accuracy of 96.72% and a test accuracy of 95.99%. The gap of 0.73 percentage points indicates good generalization and little overfitting. The model learned from a training cohort of 9,923 images and was evaluated on a testing cohort of 3,713 images. The learning curves were very similar, showing that the model learned consistently regardless of the images it was processing. The stability of training is attributed in part to TensorFlow's efficient use of memory and of the available GPU.
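The gap quoted above is a simple difference between the reported metrics. The sketch below computes it for accuracy and applies the same subtraction to the losses, using the training loss of 0.1445 reported in the conclusion; the variable names are illustrative.

```python
# Metrics as reported in the paper.
TRAIN_ACCURACY = 96.72   # percent
TEST_ACCURACY = 95.99    # percent
TRAIN_LOSS = 0.1445
TEST_LOSS = 0.4342

# Accuracy gap in percentage points (training minus test).
accuracy_gap = TRAIN_ACCURACY - TEST_ACCURACY

# Loss gap (test minus training), since a higher loss is worse.
loss_gap = TEST_LOSS - TRAIN_LOSS

print(f"accuracy gap: {accuracy_gap:.2f} percentage points")
print(f"loss gap:     {loss_gap:.4f}")
```

Tracking both differences per epoch is one common way to watch for overfitting during training.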
3. Clinical Verification and Practical Implementation
  
  On the reserved test dataset of 3,713 images, the CNN system obtained 95.99% test accuracy, indicating high diagnostic accuracy for medical applications.
  
  The model performed reliably and consistently across multiple brain imaging modalities, classifying each brain scan as normal or stroke-affected. Its fast, real-time predictions allow uploaded brain scans to be classified swiftly, supporting time-critical medical decisions. The reliability and speed of the predictions, together with the TensorFlow implementation, make the model an attractive option for integration into existing clinical workflows or hospital information systems.
4. Deep Learning Architecture Efficiency
The CNN implementation in TensorFlow performed efficiently by automatically extracting hierarchical features from raw brain images. With its convolutional and pooling layers, the model's 7.39 million parameters provide a balance between depth of analysis and efficiency. Processing 128×128 pixel brain images, the model was able to identify subtle features associated with stroke. The 95.99% test accuracy and the training accuracy were close enough to indicate generalization without substantial overfitting, which is a critical consideration for clinical application of the CNN.
VALIDATION AND TESTING
1. Cross Validation and Data Set Evaluation
  
  The CNN model was validated with a planned train-test split: 9,923 images were used for training and the remaining 3,713 images were set aside for testing. This allowed the model's ability to classify varied brain images to be analyzed using TensorFlow evaluation metrics for training and test accuracy. Stroke and normal categories were represented by approximately the same number of images to balance the dataset and avoid class bias in both training and evaluation. The evaluation showed that the model classified stroke and normal samples correctly and generalized, with a training accuracy of 96.72% and a test accuracy of 95.99%. The consistency between training and test results supports the reliability and validity of the model's predictions on brain images from patients who may have suffered a stroke. Overall, the dataset was large and diverse enough to provide statistically reliable performance metrics for the CNN model's stroke classification.
2. Test Set Performance and Generalization
  
  The 3,713 brain scans in the test set were not used during training and served as the final model validation using TensorFlow's prediction functions. The test data went through the same preprocessing pipeline and had the same class proportions as the training set to ensure an unbiased performance evaluation. The CNN model achieved a test accuracy of 95.99% and a test loss of 0.4342, close to the training accuracy of 96.72%. This small difference suggests little overfitting and provides evidence that the model generalizes well to previously unseen data. The stable performance across imaging variations supports considering the TensorFlow-based CNN for real-world stroke detection applications.
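The reported test accuracy can be translated into an approximate number of correctly classified scans, and a rough uncertainty range can be attached to it. The sketch below does both; the interval uses a normal approximation to the binomial distribution, which is an assumption made here for illustration and is not an analysis from the paper.

```python
import math

TEST_IMAGES = 3_713
TEST_ACCURACY = 0.9599


def implied_correct(accuracy: float, n: int) -> int:
    """Number of correct predictions implied by an accuracy on n samples."""
    return round(accuracy * n)


def standard_error(accuracy: float, n: int) -> float:
    """Binomial standard error of an accuracy estimated from n samples."""
    return math.sqrt(accuracy * (1.0 - accuracy) / n)


correct = implied_correct(TEST_ACCURACY, TEST_IMAGES)
wrong = TEST_IMAGES - correct
se = standard_error(TEST_ACCURACY, TEST_IMAGES)

# Approximate 95% interval under the normal approximation.
low = TEST_ACCURACY - 1.96 * se
high = TEST_ACCURACY + 1.96 * se

print(f"about {correct} correct and {wrong} misclassified scans")
print(f"approximate 95% interval: {low:.1%} to {high:.1%}")
```

Under these assumptions the test accuracy corresponds to about 3,564 correct and 149 misclassified scans, with an interval of roughly 95.4% to 96.6%.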
3. Model Architecture Validation
The CNN architecture implemented in TensorFlow for binary stroke detection proved robust. It has a total of 7,392,578 trainable parameters across its convolutional, pooling, dense, and dropout layers and is tailored for 128×128 pixel images. A layer-by-layer analysis showed evidence of successful feature extraction, from low-level edges through to more complex representations of stroke. The dropout layer reduced overfitting, while the softmax output provided a predicted class together with a confidence score. Training and test metrics were closely aligned, which indicates effective learning, and the TensorFlow implementation supports scalability and clinical deployment.
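A layer-by-layer view of the feature maps can be traced with simple arithmetic. The sketch below assumes unpadded 3×3 convolutions with stride 1 and 2×2 max pooling, which the paper does not state explicitly but which is consistent with the reported parameter total; the function names are illustrative.

```python
def conv_valid(side: int, kernel: int = 3) -> int:
    """Output side length of an unpadded, stride-1 convolution."""
    return side - kernel + 1


def max_pool(side: int, window: int = 2) -> int:
    """Output side length of non-overlapping max pooling."""
    return side // window


def trace_feature_maps(side: int = 128):
    """List (layer, height, width, channels) from input to the last pooling layer."""
    trace = [("input", side, side, 3)]
    side = conv_valid(side)
    trace.append(("conv 3x3, 32 filters", side, side, 32))
    side = max_pool(side)
    trace.append(("max pool 2x2", side, side, 32))
    side = conv_valid(side)
    trace.append(("conv 3x3, 64 filters", side, side, 64))
    side = max_pool(side)
    trace.append(("max pool 2x2", side, side, 64))
    return trace


trace = trace_feature_maps()
for name, height, width, channels in trace:
    print(f"{name:<22} {height} x {width} x {channels}")

# The flatten layer turns the last feature map into one long vector.
_, height, width, channels = trace[-1]
flattened_features = height * width * channels
print("flattened features:", flattened_features)
```

The flattened vector of 57,600 values feeds the 128-neuron dense layer, which is why that layer holds the bulk of the model's parameters.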
LIMITATIONS AND CHALLENGES
1. Limitations of the Dataset and Scope
  
  Although 13,636 brain scan images is a substantial dataset, it is unlikely to capture the full range of stroke presentations across all demographic groups and clinical contexts. The binary classification approach offers a valuable diagnostic capability, but it only detects the presence of a stroke and cannot identify the stroke type or level of severity. Geographic or demographic gaps in representation may reduce the generalizability of stroke detection across diverse populations. The dataset may also miss rare or atypical stroke presentations, which could pose diagnostic challenges and reduce accuracy.
2. Technical and Computational Constraints
  
  The CNN has millions of parameters (7,392,578 parameters, 28.20 MB) that are compute-intensive to train, which may be a limitation in healthcare settings where resources are constrained. Training for 100 epochs on the imaging dataset requires GPU resources and time that are typically not available in healthcare settings. Explainability also remains difficult, as the underlying architecture can be challenging for clinicians to interpret. This study achieved high accuracy, but healthcare workers have no way to see which features the model considers, so clinical acceptance may be limited.
3. Challenges in Clinical Implementation
Regulatory approval is a significant barrier to clinical implementation and brings substantial documentation requirements. Extensive validation studies are needed to comply with regulations for AI diagnostic tools, which are now treated as medical devices. Integration with existing Hospital Information Systems (HIS) is another technical challenge, as hospitals store CT scans from different vendors in systems that are not built on the same architecture and are not designed to share or transfer data. Training healthcare professionals is a further challenge, as they need to learn the technology and make CNN-based diagnosis part of existing workflows. Liability for AI-based diagnosis will need to be addressed through agreed risk mitigation plans and decision trees that guide users in a clinical environment.
FUTURE WORK
1. Model Improvement and Expansion
  
  Future work will expand the brain scan dataset through inter-institutional partnerships, improving the generalizability of the current CNN model beyond its 95.99% test accuracy. More advanced architectures (e.g., ResNet, DenseNet, Vision Transformers) will also be explored in TensorFlow to improve stroke detection. Multi-class classification will model stroke type (ischemic vs. hemorrhagic) along with stroke severity for more informed clinical decisions. Explainable AI and attention-based mechanisms will be added so that clinicians can see how the CNN reaches its predictions, building trust and adoption.
2. System Scalability and Deployment
  
  The system will be deployed in a cloud-based environment using TensorFlow Serving, providing scalable stroke detection for thousands of concurrent users across licensed medical institutions. Mobile applications using TensorFlow Lite will support stroke diagnosis at the point of care, including in emergency services and remote healthcare environments. Edge deployment of the CNN will reduce latency and improve usability where offline detection is required, such as in low-resource healthcare settings. Security risks that stem from online deployment will be addressed with encryption and federated learning to protect patient brain images while maintaining diagnostic accuracy.
3. Complete Clinical Validation
The clinical efficacy of the CNN will be validated in a multi-center clinical trial involving at least 25 medical institutions and more than 10,000 stroke patients. Longitudinal studies will assess the CNN's impact on patient outcomes, health service costs, operational processes, and clinical workflow. Randomized controlled trials comparing the CNN with conventional radiological diagnosis will be conducted to support regulatory approval. International validation studies will also establish stroke detection performance across a variety of populations, brain imaging protocols, and healthcare systems.
CONCLUSION

This study introduced a CNN method for detecting brain strokes using TensorFlow. The deep learning system reached 96.72% training accuracy (training loss 0.1445) and 95.99% test accuracy (test loss 0.4342). The CNN architecture provides a solid, clinically transferable method for stroke detection through automatic, hierarchical feature learning. The web system built with React, Flask, and MySQL supports integration into clinical workflows, and the dataset of 13,636 brain images supports applicability across a patient population. The system addresses the urgent need for rapid and accurate identification of patients who are having a stroke, enabling improved outcomes through timely intervention, and offers a potential diagnostic tool that is accessible in emergency clinical environments. Although dataset diversity, computational burden, and technical challenges to implementation impose limitations, the CNN's 96.72% training accuracy and 95.99% test accuracy give grounds for optimism about improving stroke diagnosis. Future work on expanding the dataset, adopting more complex CNN architectures, pursuing regulatory approval, and conducting thorough clinical validation will help the system gain general acceptance and benefit stroke patients. In conclusion, the TensorFlow-based CNN system represents another step forward in managing stroke patients.
REFERENCES

World Health Organization, "Global Health Estimates: Leading Causes of Death," 2023. [Online]. Available: https://www.who.int/data/gho/data/themes/mortality-and-global-health-estimates
American Stroke Association, "Stroke Statistics," 2023. [Online]. Available: https://www.stroke.org/en/about-stroke/stroke-statistics
J. Chen, L. Wu, K. Zhang, et al., "Deep Learning for Automated Stroke Detection in CT Scans," Nature Medicine, vol. 25, no. 9, pp. 1492-1501, 2019.
M. Zhang, R. Liu, S. Wang, et al., "Convolutional Neural Networks for Stroke Diagnosis Using Neuroimaging Data," IEEE Transactions on Biomedical Engineering, vol. 67, no. 4, pp. 1028-1039, 2020.
K. Wang, B. Liang, C. Guo, et al., "Deep Learning Approaches for Stroke Lesion Segmentation," Medical Image Analysis, vol. 65, pp. 101-115, 2021.
