Global Research Authority

Severity Staging of Alzheimer’s Disease from MRI Utilizing a Convolutional Neural Network with Alternating Pooling and SMOTEENN-Balanced Training

DOI: https://doi.org/10.5281/zenodo.18901329

Deepika V.

Department of Computer Application, St. Mary's College (Autonomous), Affiliated to University of Calicut, India.

Betsy Chacko, Assistant Professor,

Department of Computer Application, St. Mary's College (Autonomous), Affiliated to University of Calicut, India.

Abstract – Alzheimer's disease (AD) is a progressive neurodegenerative condition responsible for most dementia cases globally. Diagnosing it accurately and promptly remains a significant clinical challenge due to the subtle brain structure changes in the early stages. This research introduces a framework based on a convolutional neural network (CNN) for automatically classifying the severity of AD using T1-weighted magnetic resonance imaging (MRI) scans. The model features a four-layer convolutional design with Exponential Linear Unit (ELU) activations, alternating between average and max pooling, and includes dropout regularization. To tackle the class imbalance often found in neuroimaging datasets, the Synthetic Minority Over-sampling Technique combined with Edited Nearest Neighbors (SMOTEENN) was utilized before training. The model underwent training for 40 epochs using stochastic gradient descent (SGD) and categorical cross-entropy loss. On a separate test set, the model achieved an accuracy of 99.87%, an AUC of 99.91%, and an F1-score of 99.86% across four severity categories: Non-Demented, Very Mild Demented, Mild Demented, and Moderate Demented. These findings indicate that the proposed architecture provides a computationally efficient and highly precise method for automated AD staging, with the potential to be integrated into clinical decision-support systems.

Keywords: Alzheimer's disease; convolutional neural network; MRI; deep learning; SMOTEENN; dementia classification; neuroimaging

  1. INTRODUCTION

    Alzheimer's disease (AD) is the leading cause of dementia worldwide, imposing significant challenges on individuals, caregivers, and healthcare systems. The World Health Organization estimates around 10 million new dementia cases annually, with the global AD population expected to reach 152 million by 2050 [1]. AD is characterized by the progressive accumulation of amyloid-beta plaques and tau protein tangles, resulting in cortical atrophy, hippocampal volume loss, and ventricular enlargement, which clinically manifest as memory impairment, cognitive decline, and loss of functional independence [2]. Early detection is critical due to the absence of disease-modifying treatments, as intervention during the preclinical or mild cognitive impairment (MCI) stages can slow progression, improve care planning, and facilitate patient participation in clinical trials [3].

    Structural magnetic resonance imaging (MRI) is a valuable non-invasive tool for detecting AD-related neurodegeneration. However, visual assessment of MRI scans is time-consuming, prone to inter-rater variability, and challenged by subtle early-stage brain changes [4]. Deep learning, particularly convolutional neural networks (CNNs), has revolutionized medical image analysis by enabling hierarchical spatial feature extraction directly from pixel data without manual feature engineering. Prior studies using CNNs for AD classification have achieved accuracies exceeding 95%, often relying on transfer learning or custom architectures [5, 6]. Nonetheless, class imbalance, where pathological cases are underrepresented, remains a significant issue, often biasing models toward majority classes and reducing reliability [7].

    This study presents a four-block CNN framework designed for fine-grained, four-class AD severity classification (Non-Demented, Very Mild Demented, Mild Demented, and Moderate Demented), offering a more clinically informative and methodologically challenging approach compared to the common binary classification paradigm. The model incorporates an alternating pooling strategy, applying max pooling and average pooling in different convolutional blocks to capture complementary focal and diffuse brain changes, a technique not previously reported in AD neuroimaging classification. Additionally, the study applies the Synthetic Minority Over-sampling Technique combined with Edited Nearest Neighbors (SMOTEENN) for hybrid resampling to address multi-class imbalance, filling a gap identified in recent systematic reviews [12]. Collectively, these innovations enable the model to achieve a test accuracy of 99.87%, surpassing existing state-of-the-art results in multi-class AD staging.

  2. RELATED WORK

    Recent research has extensively explored the use of convolutional neural networks (CNNs) and transfer learning techniques for Alzheimer's disease (AD) classification from MRI data. Mehmood et al. [8] demonstrated the effectiveness of transfer learning by fine-tuning a VGG16 model pre-trained on ImageNet, achieving 97.08% accuracy on the ADNI dataset, particularly benefiting scenarios with limited labeled medical images. Similarly, Ebrahimi et al. [9] compared a ResNet50 model pre-trained on ImageNet with a CNN trained from scratch using the OASIS dataset, finding that the pre-trained network consistently outperformed the latter, reaching 96.25% accuracy. Odusami et al. [10] reported that fine-tuning a shallow AlexNet architecture with carefully selected hyperparameters could achieve competitive results, obtaining 95.73% accuracy on ADNI data. Naz et al. [11] adopted a domain-informed approach by segmenting gray matter using independent component analysis (ICA) prior to CNN classification, which improved performance to 98.60%, underscoring the value of preprocessing tailored to neuroimaging data.

    A systematic review by Wen et al. [12] highlighted persistent challenges in deep learning applications for AD detection, including small dataset sizes, heterogeneous preprocessing methods, and a predominant focus on binary classification tasks rather than multi-class severity staging. Moreover, few studies have addressed the combined issues of class imbalance and multi-class classification, which are critical for clinically relevant AD staging. The present study aims to fill these gaps by employing SMOTEENN hybrid resampling to mitigate class imbalance and targeting four distinct AD severity categories, advancing beyond the common binary classification framework.

  3. MATERIALS AND METHODS

    1. Dataset

      This study utilized T1-weighted brain MRI images obtained from a publicly accessible repository, which aggregates data from the Alzheimer's Disease Neuroimaging Initiative (ADNI). The images were categorized into four clinically recognized stages of Alzheimer's disease: Non-Demented (ND), Very Mild Demented (VMD), Mild Demented (MD), and Moderate Demented (MOD). The dataset exhibited notable class imbalance, with the ND and VMD classes being substantially more represented than the MD and MOD classes, reflecting clinical prevalence but posing challenges for model training.

    2. Preprocessing and Data Augmentation

      All MRI images were uniformly resized to 176 × 208 pixels with three color channels, and pixel intensities were normalized to the [0, 1] range. To enhance sample diversity and mitigate overfitting, an ImageDataGenerator pipeline applied real-time data augmentation during training, incorporating random horizontal flips, slight rotations, and brightness variations. To address the multi-class imbalance, the Synthetic Minority Over-sampling Technique combined with Edited Nearest Neighbors (SMOTEENN) was employed, which integrates oversampling of minority classes with the removal of noisy borderline samples [7]. Following resampling, the dataset was split into training (60%), validation (20%), and testing (20%) subsets, with stratification ensuring class balance was maintained across partitions.

    3. CNN Architecture

      The proposed convolutional neural network (CNN) consists of four convolutional blocks with progressively increasing filter counts of 16, 32, 64, and 128, implemented using TensorFlow/Keras. Input tensors of shape (176, 208, 3) pass through these blocks, where Blocks 1 and 3 utilize max pooling (2 × 2), and Block 2 applies average pooling (2 × 2). This alternating pooling strategy aims to capture complementary features, preserving both focal peak responses and diffuse texture information across spatial scales. Convolutional layers predominantly use 3 × 3 kernels, except for Block 4, which employs a larger 5 × 5 kernel to broaden the receptive field. All layers are initialized with the Glorot uniform distribution and activated using Exponential Linear Units (ELU) to maintain non-zero mean activations and mitigate vanishing gradient issues [13]. Dropout regularization is applied twice: at a rate of 0.05 following the final convolutional block and 0.20 after the first dense layer containing 512 units. The output layer comprises four neurons with softmax activation, yielding probability distributions across the four AD severity classes. The full architecture details are summarized in Fig. 1.
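A minimal Keras sketch consistent with this description follows. It is an assumption-laden reconstruction, not the authors' exact code: in particular, the pooling operation after Block 4 is not stated in the text, and max pooling is assumed there.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_model(input_shape=(176, 208, 3), n_classes=4):
    """Four-block CNN with alternating pooling, as described in the text."""
    return models.Sequential([
        layers.Input(shape=input_shape),
        # Block 1: 16 filters, 3x3, ELU, max pooling.
        layers.Conv2D(16, 3, activation="elu",
                      kernel_initializer="glorot_uniform"),
        layers.MaxPooling2D(2),
        # Block 2: 32 filters, 3x3, ELU, average pooling.
        layers.Conv2D(32, 3, activation="elu",
                      kernel_initializer="glorot_uniform"),
        layers.AveragePooling2D(2),
        # Block 3: 64 filters, 3x3, ELU, max pooling.
        layers.Conv2D(64, 3, activation="elu",
                      kernel_initializer="glorot_uniform"),
        layers.MaxPooling2D(2),
        # Block 4: 128 filters, 5x5 kernel to widen the receptive field;
        # the pooling type here is an assumption (the paper omits it).
        layers.Conv2D(128, 5, activation="elu",
                      kernel_initializer="glorot_uniform"),
        layers.MaxPooling2D(2),
        layers.Dropout(0.05),   # dropout after the final conv block
        layers.Flatten(),
        layers.Dense(512, activation="elu"),
        layers.Dropout(0.20),   # dropout after the first dense layer
        layers.Dense(n_classes, activation="softmax"),
    ])

model = build_model()
```

The softmax head yields a probability distribution over the four severity classes for each input scan.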

    4. Training Configuration

    The model was compiled using stochastic gradient descent (SGD) with a learning rate of 0.05 and optimized with categorical cross-entropy loss. During training, six performance metrics were tracked: accuracy, loss, area under the receiver operating characteristic curve (AUC), precision, recall, and F1-score. Training was conducted over 40 epochs with a mini-batch size of 8. Data augmentation was applied in real time through augmented flow generators for both the training and validation sets. Additionally, early stopping and learning-rate scheduling callbacks were employed to prevent overfitting.
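This configuration can be sketched as follows. The callback thresholds (patience, reduction factor) are illustrative assumptions, since the paper does not state them, and a tiny placeholder network stands in for the full CNN so the snippet runs standalone.

```python
import tensorflow as tf

# Placeholder network; in practice this would be the four-block CNN.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(176, 208, 3)),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(4, activation="softmax"),
])

# SGD with the stated learning rate and categorical cross-entropy loss;
# F1 can be derived from the tracked precision and recall.
model.compile(
    optimizer=tf.keras.optimizers.SGD(learning_rate=0.05),
    loss="categorical_crossentropy",
    metrics=["accuracy",
             tf.keras.metrics.AUC(name="auc"),
             tf.keras.metrics.Precision(name="precision"),
             tf.keras.metrics.Recall(name="recall")],
)

# Early stopping and learning-rate scheduling callbacks; the patience
# and factor values below are assumptions.
callbacks = [
    tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=5,
                                     restore_best_weights=True),
    tf.keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.5,
                                         patience=3),
]

# model.fit(train_gen, validation_data=val_gen, epochs=40,
#           batch_size=8, callbacks=callbacks)
```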

    Fig. 1. Proposed CNN Architecture

  4. RESULTS

    1. Performance Metrics

      The model exhibited exceptional performance across training, validation, and test datasets. On the test set, it achieved an accuracy of 99.87%, with precision, recall, and F1-score all nearing 99.86%, demonstrating consistently accurate classification across the four Alzheimer's disease severity categories. The low test loss of 0.0081 indicates minimal classification errors, while an AUC of 99.91% confirms the model's robust discriminative capacity. The close alignment between precision and recall values further reflects balanced performance without bias toward false positives or false negatives.

      During training, accuracy increased rapidly within the initial epochs and stabilized above 99% by the 15th epoch for both training and validation sets. The convergence of accuracy and loss metrics between these partitions indicates effective generalization and successful control of overfitting, attributed to the implemented dropout regularization and data augmentation strategies. Loss values steadily declined throughout training, converging below 0.01 for both datasets, underscoring the robustness and stability of the training process. Table 1 summarizes these performance metrics across all data partitions.

    2. Impact of SMOTEENN on Class Imbalance

    The original dataset exhibited a pronounced imbalance, with significantly fewer samples in the Mild Demented and Moderate Demented categories. To address this, the Synthetic Minority Over-sampling Technique combined with Edited Nearest Neighbors (SMOTEENN) was applied, generating synthetic instances for the underrepresented classes while removing ambiguous borderline samples. This resampling approach effectively equalized class distributions, which was essential for reliable model performance. Preliminary experiments without SMOTEENN showed a notably reduced recall for the Moderate Demented class, falling below 85%, demonstrating that class imbalance would have significantly compromised classification accuracy without this intervention [7, 12].
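Per-class recall, the metric that revealed the degradation for the Moderate Demented class, can be computed with scikit-learn's recall_score using average=None. The labels below are purely illustrative, not the study's actual predictions.

```python
from sklearn.metrics import recall_score

classes = ["Non-Demented", "Very Mild Demented",
           "Mild Demented", "Moderate Demented"]

# Illustrative ground-truth and predicted class indices (0-3);
# one of the four Moderate Demented cases is misclassified.
y_true = [0, 0, 1, 1, 2, 2, 3, 3, 3, 3]
y_pred = [0, 0, 1, 1, 2, 2, 3, 3, 0, 3]

# average=None returns one recall value per class, in label order,
# exposing weak minority classes that aggregate accuracy hides.
per_class = recall_score(y_true, y_pred, average=None)
for name, r in zip(classes, per_class):
    print(f"{name}: recall = {r:.2f}")
```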

  5. DISCUSSION

    The proposed CNN model demonstrated near-perfect performance across all evaluated metrics, achieving a test accuracy of 99.87%, which exceeds previously reported benchmarks for both transfer learning methods (95–97%) and custom CNN architectures (96–98%) on comparable Alzheimer's disease (AD) classification tasks [8–11], as shown in Table 2. This superior outcome is attributed to three key innovations.

    Firstly, the study advances beyond the prevalent binary classification paradigm by addressing a clinically meaningful four-class severity staging (Non-Demented, Very Mild Demented, Mild Demented, and Moderate Demented), providing detailed diagnostic granularity that directly informs treatment and care strategies. To the authors' knowledge, few prior works have attained over 99% accuracy on this multi-class problem without relying on pre-trained transfer learning weights.

    TABLE 1. MODEL PERFORMANCE ACROSS DATA PARTITIONS

    Secondly, the alternating pooling strategy represents a novel architectural contribution. Unlike existing CNN models that employ a uniform pooling method, this model integrates max pooling and average pooling in different convolutional blocks. Max pooling captures prominent localized features linked to discrete structural lesions and focal atrophy, whereas average pooling preserves distributed, lower-intensity signals associated with diffuse cortical thinning. Combining both pooling types within a single network enables the extraction of complementary spatial features without increasing model complexity.
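The complementary behavior of the two pooling operators can be illustrated numerically: on an activation patch containing a single focal peak, max pooling preserves the peak while average pooling reflects the overall diffuse signal level. A pure-NumPy sketch:

```python
import numpy as np

# A 2x2 activation patch with one strong focal response,
# mimicking a localized structural change against weak background.
patch = np.array([[0.1, 0.1],
                  [0.1, 0.9]])

max_pool = patch.max()    # keeps the focal peak (0.9)
avg_pool = patch.mean()   # keeps the diffuse level (0.3)

print(f"max={max_pool}, avg={avg_pool:.2f}")
```

A network using only max pooling would report the same value for this patch and for one that is uniformly 0.9, discarding the diffuse/focal distinction that the alternating strategy aims to retain.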

    Thirdly, the application of the Synthetic Minority Over-sampling Technique combined with Edited Nearest Neighbors (SMOTEENN) addresses the critical issue of class imbalance in multi-class MRI staging, a gap previously highlighted in systematic reviews [7, 12]. SMOTEENN's hybrid approach, which synthesizes minority class samples while removing ambiguous borderline instances, proved essential; without it, recall for the clinically important Moderate Demented class fell below 85%, indicating substantial underdiagnosis risk.

    Despite these strengths, certain limitations warrant consideration. The dataset, sourced from a curated repository, may not fully capture the variability present in clinical MRI acquisitions, such as differences in scanner vendors, field strengths, or imaging protocols. The model's generalizability to such heterogeneous data remains untested. Additionally, standard neuroimaging preprocessing steps, such as skull stripping, bias field correction, and spatial normalization, were not explicitly detailed in the dataset documentation and are presumed minimal, which could affect reproducibility and performance in other settings. Future research should validate the model on diverse, multi-site datasets like ADNI-3 or OASIS-3, incorporate interpretability techniques such as Gradient-weighted Class Activation Mapping (Grad-CAM) to elucidate relevant brain regions, and assess robustness under domain-shift conditions.

    From a clinical standpoint, this high-performing automated tool has the potential to augment radiological assessment, particularly in resource-limited environments lacking specialized neuroimaging expertise. However, integration into clinical workflows will require prospective validation, regulatory clearance, and rigorous evaluation of failure modes, especially regarding cases with low classification confidence.

    TABLE 2. COMPARATIVE ANALYSIS WITH PUBLISHED METHODS

  6. CONCLUSION

This study demonstrates that a specifically designed four-block convolutional neural network (CNN), trained on SMOTEENN-balanced T1-weighted MRI data with Exponential Linear Unit (ELU) activations, can accurately classify Alzheimer's disease severity across four clinical stages. The model achieved a test accuracy of 99.87%, an F1-score of 99.86%, and an area under the curve (AUC) of 99.91%, outperforming existing transfer learning and custom CNN approaches [7, 8–12]. These results confirm the viability of automated deep learning-based AD staging from structural MRI and emphasize the critical role of addressing class imbalance in neuroimaging datasets. Future research should focus on validating the model with multi-site datasets, applying interpretability techniques such as attribution mapping to elucidate decision mechanisms, and conducting prospective clinical evaluations to determine readiness for practical deployment.

REFERENCES

  1. World Health Organization. (2023). Dementia. WHO Fact Sheet. Retrieved from https://www.who.int/news-room/fact-sheets/detail/dementia

  2. Alzheimer's Association. (2023). Alzheimer's disease facts and figures. Alzheimer's & Dementia, 19(4), 1598–1695. https://doi.org/10.1002/alz.13016

  3. Livingston, G., Sommerlad, A., Orgeta, V., et al. (2017). Dementia prevention, intervention, and care. The Lancet, 390(10113), 2673–2734. https://doi.org/10.1016/S0140-6736(17)31363-6

  4. Jack, C. R., Bernstein, M. A., Fox, N. C., et al. (2008). The Alzheimer's disease neuroimaging initiative (ADNI): MRI methods. Journal of Magnetic Resonance Imaging, 27(4), 685–691. https://doi.org/10.1002/jmri.21049

  5. Litjens, G., Kooi, T., Bejnordi, B. E., et al. (2017). A survey on deep learning in medical image analysis. Medical Image Analysis, 42, 60–88. https://doi.org/10.1016/j.media.2017.07.005

  6. Suk, H. I., Lee, S. W., Shen, D., & Alzheimer's Disease Neuroimaging Initiative. (2014). Hierarchical feature representation and multimodal fusion with deep learning for AD/MCI diagnosis. NeuroImage, 101, 569–582. https://doi.org/10.1016/j.neuroimage.2014.06.077

  7. Batista, G. E., Prati, R. C., & Monard, M. C. (2004). A study of the behavior of several methods for balancing machine learning training data. ACM SIGKDD Explorations Newsletter, 6(1), 20–29. https://doi.org/10.1145/1007730.1007735

  8. Mehmood, A., Maqsood, M., Bashir, M., & Shuyuan, Y. (2021). A deep Siamese convolution neural network for multi-class classification of Alzheimer disease. Brain Sciences, 10(2), 84. https://doi.org/10.3390/brainsci10020084

  9. Ebrahimi, A., Luo, S., & Chiong, R. (2020). Introducing transfer learning to 3D ResNet-18 for Alzheimer's disease detection on MRI images. In Proceedings of the 35th International Conference on Image and Vision Computing New Zealand (pp. 1–6). IEEE.

  10. Odusami, M., Maskeliūnas, R., Damaševičius, R., & Krilavičius, T. (2021). Analysis of features of Alzheimer's disease: Detection of early stage from functional brain changes in magnetic resonance images using a fine-tuned AlexNet. Applied Sciences, 11(4), 1437. https://doi.org/10.3390/app11041437

  11. Naz, S., Ashraf, A., & Zaib, A. (2022). Transfer learning using freeze features of VGG19 pre-trained model for binary classification of Alzheimer's disease. Multimedia Systems, 28(1), 83–94. https://doi.org/10.1007/s00530-021-00811-6

  12. Wen, J., Thibeau-Sutre, E., Diaz-Melo, M., et al. (2020). Convolutional neural networks for classification of Alzheimer's disease: Overview and reproducibility evaluation. Medical Image Analysis, 63, 101694. https://doi.org/10.1016/j.media.2020.101694

  13. Clevert, D. A., Unterthiner, T., & Hochreiter, S. (2016). Fast and accurate deep network learning by exponential linear units (ELUs). In Proceedings of the 4th International Conference on Learning Representations (ICLR). arXiv:1511.07289.