Parkinson's Disease Detection - Advanced & Multi-modal ML Pipeline

Dr Vimal Gupta; Sahil Singh; Yash Tyagi; Yash Nagar

doi:10.17577/IJERTCONV14IS040066

ICTEM 2.0 -2026 (Volume 14 - Issue 04)


Open Access
Authors : Dr Vimal Gupta, Sahil Singh, Yash Tyagi, Yash Nagar
Paper ID : IJERTCONV14IS040066
Volume & Issue : Volume 14, Issue 04, ICTEM 2.0 (2026)
Published (First Online) : 24-05-2026
ISSN (Online) : 2278-0181
Publisher Name : IJERT
License: This work is licensed under a Creative Commons Attribution 4.0 International License


Dr Vimal Gupta
Department of Computer Science and Engineering
JSS Academy of Technical Education, Noida, India
E-mail: vimalgupta@jssaten.ac.in

Sahil Singh
Department of Computer Science and Engineering
JSS Academy of Technical Education, Noida, India
E-mail: singhsahil9292@gmail.com

Yash Tyagi
Department of Computer Science and Engineering
JSS Academy of Technical Education, Noida, India
E-mail: yashtyagi941@gmail.com

Yash Nagar
Department of Computer Science and Engineering
JSS Academy of Technical Education, Noida, India
E-mail: Yashnagar0110@gmail.com

Abstract-

Parkinson's disease (PD) is a progressive neurodegenerative disease with major negative effects on mobility and health-related quality of life. Early, accurate detection of PD is critical for timely care and disease management. Here we present the development and use of a machine learning system for Parkinson's disease prediction. Using a detailed set of vocal and medical parameters, we developed, trained, and optimized a support vector machine (SVM)-based model to discriminate between healthy subjects and those with PD. The system is implemented as a user-friendly web application using the Streamlit framework in Python. Its interface supports entry of patient data and real-time prediction. This paper describes the methods we employed, how the model was trained, and the details of the web-based prognostic tool. It demonstrates how machine learning can become a useful and customizable means of assisting in the early assessment of PD risk.

INTRODUCTION

Parkinson's disease (PD) is a lifelong, progressive neurodegenerative disorder caused by the loss of dopamine-producing neurons in the brain. It is the second most prevalent neurodegenerative disorder and affects millions of people, with repercussions for caregivers and health-care providers. The four cardinal motor symptoms of PD are:

Bradykinesia (slowness of movement)
Tremors at rest
Stiffness in the muscles
Postural instability (balance problems)

The diagnosis of PD, despite its high prevalence, is still a major clinical challenge, especially in the early phase. The diagnostic process is difficult for several reasons:
Subjectivity: Diagnosis is based largely on clinical observation of motor signs rather than objective testing.
Symptom Overlap: Early symptoms may also occur in other movement disorders and age-related diseases, potentially resulting in misdiagnosis.
No Biomarkers: There is no definitive biological test, or biomarker, that confirms the disease.

In response, machine learning (ML) has emerged as a potent force in computational medicine. ML methods are particularly well suited to the analysis of complex, high-dimensional data and can recognize subtle patterns that would be difficult to identify using human expertise [1]. The examination of vocal impairment (dysphonia) is a particularly promising research path in PD, because voice changes are among the earliest signs of the disease. These speech characteristics are potential digital biomarkers that are non-invasive and cost-effective.

This paper describes the implementation of such a system and has several specific goals:

Model Development: To train and validate a robust Support Vector Machine (SVM) model using a dataset of biomedical voice measurements for PD classification.
System Deployment: To package the trained model in an interactive, user-friendly web application developed with Streamlit.
Bridging the Gap: To bring machine learning methods from theoretical research into practical use as a simple, non-invasive tool for rapid risk assessment by lay users.

The methodology is described first, followed by the implementation and results, and finally a discussion of the system's future potential.

The Random Forest method works like polling a group of independent decision-makers (called decision trees) to reach the most reliable answer. It injects randomness into both the data each tree sees and the features each tree checks, so that the trees differ from one another. It then combines their votes into a single prediction, which helps it avoid the errors of any individual tree and remain reliable on new data.

RELATED WORK

This section reviews the work of several researchers who have recently used machine learning (ML) algorithms for the diagnosis and prediction of Parkinson's disease. Non-invasive biological markers, voice in particular, have been identified as very promising. The section briefly describes related works, with an emphasis on those that use speech signals for PD classification and diagnosis.

The implementation of these models relies heavily on robust, open-source scientific computing libraries. The Python programming language provides the foundation, with libraries such as NumPy for array programming and Pandas for data structures and analysis. Machine learning models are often implemented using the Scikit-learn library, and user-friendly interfaces for these models can be built with tools like Streamlit.

Paper	Year	Model(s) / Focus	Key Finding
Little et al. [1]	2007	Genetic Algorithm, SVM	The system uses a Genetic Algorithm to optimize feature selection, along with a Support Vector Machine (SVM), to diagnose Parkinson's Disease (PD) using voice data.
Saloni & Toke [2]	2016	SVM	Showed how effective Support Vector Machines (SVM) are for diagnosing Parkinson's disease with speech data.
Sakar et al. [5]	2019	Comparative Analysis	Compared how well different speech signal processing algorithms classify Parkinson's Disease (PD).
Ramig et al. [6]	2008	Speech Treatment	Reviewed the importance of speech therapy in managing PD, underscoring the deep connection between the disease and vocal function.
Orozco-Arroyave et al. [11]	2015	Automatic Speech Detection	Showed that PD detection from speech is feasible even in non-controlled, real-world environments with background noise.
Hassan & Zawbaa [12]	2024	Systematic Review, Deep Learning (DL)	Comprehensive review of innovative DL approaches used for speech-based PD classification (CNNs, LSTMs, etc.).
Al-Nefaie et al. [13]	2024	ML Algorithms (k-NN, SVM, RF, LR, AdaBoost)	Utilizes various ML algorithms on system-based voice features for PD diagnosis.
Paucar-Escalante et al. [14]	2024	Review, ML Strategies, Wearable Sensors	Reviews ML methodologies (SVM, KNN, CNN, LSTM) applied to wearable sensor data for classifying Parkinsonian tremor.
Abdelhamid et al. [15]	2023	Deep Learning (DL)	Focuses on using deep learning to objectively and continuously estimate Parkinsonian tremor severity.
Vargas-Bonilla et al. [16]	2023	Deep Learning (DL), AI	Explores modeling speech and language changes in PD using advanced AI architectures (Wav2Vec 2.0, BERT, CNNs).
Beli et al. [17]	2021	Review, Machine Learning (ML)	Comprehensive review of ML applications for PD diagnosis across multiple data modalities (gait, voice, neuroimaging).
Sarkar et al. [18]	2024	XGBoost (IFRX), Feature Ranking	Proposes an interpretable XGBoost model optimized with RFE for early PD diagnosis using speech features.
Islam et al. [19]	2025	Random Forest, Gradient Boosting	Develops an interpretable ML model for PD prediction using speech recordings, achieving high performance.
Zhang et al. [20]	2022	Continuous Speech Analysis	Uses DL models for classification based on acoustic features extracted from sustained vowels and text-dependent speech.
Al-Dujaili et al. [21]	2014	Genetic Algorithm, SVM	The system uses a Genetic Algorithm to optimize feature selection and a Support Vector Machine (SVM) for diagnosing Parkinson's Disease (PD) through voice data.
Orozco-Arroyave et al. [22]	2015	Continuous Speech Analysis	Focuses on automated PD detection using ML on speech recorded in natural, non-clinical environments.
Liu et al. [23]	2021	Multimodal Deep Learning (DL)	Integrates and analyzes multiple data types (including brain imaging) using DL for PD prediction and progression monitoring.
Ma et al. [24]	2024	Multimodal ML, Neuroimaging (MRI)	Uses ML models and multimodal neuroimaging (MRI) features to predict Mild Cognitive Impairment (MCI) in PD patients.
Saravanan et al. [25]	2023	Neuroimaging, Deep Learning (3D MRI)	Develops an automated DL system for PD detection and severity prediction using 3D Magnetic Resonance Imaging (MRI).
Simko et al. [26]	2020	Machine Learning (ML)	Utilizes ML models to classify PD based on kinetic and kinematic features extracted from digitized handwritten text.
Ríos-Urrego et al. [27]	2024	Dynamic Handwriting Analysis	Analyzes dynamic handwriting features (e.g., pressure, velocity) collected from writing digits 0-9 to detect PD.
Gallo-Aristizabal et al. [28]	2023	Random Forest (RF), HOG Features	Uses a Random Forest classifier on HOG (Histogram of Oriented Gradients) features extracted from patient spiral and wave drawings.
Ranade et al. [29]	2024	Survey, Deep Learning (CNNs)	Reviews the application of deep learning methods, particularly CNNs, to static and dynamic handwriting/drawing patterns for detection.
Huang et al. [30]	2022	Dynamic Handwriting Analysis	Compares transfer learning models (e.g., EfficientNetB0) on spiral drawing images for PD detection.
Babu et al. [31]	2025	LR, XGBoost, SVC/SVM	Article/Tutorial on using standard ML classifiers (Logistic Regression, XGBoost, Support Vector Classifier) on a clinical dataset.
Dadu et al. [32]	2022	Machine Learning (ML), Clinical Data	Uses ML clustering and classification to identify and predict distinct clinical PD subtypes based on motor and non-motor features.
Gallo-Aristizabal et al. [35]	2023	Hybrid ML (ANN, RNN, DT)	Proposes a hybrid analytical approach that combines Artificial Neural Networks (ANN), Recurrent Neural Networks (RNN), and Decision Trees (DT) to examine multimodal data.
Kaur & Singh [36]	2021	Machine Learning (ML)	Uses ML for diagnosis and severity prediction based on clinical data and biochemical biomarkers.

Support Vector Machine (SVM)- The work by Saloni & Toke [2] specifically applied the Support Vector Machine (SVM) model to the task of PD diagnosis. An SVM is a powerful and popular supervised learning algorithm used for classification.

The goal is to find the best hyperplane: the decision boundary that maximizes the distance between itself and the nearest data points of each class. These nearest data points are called support vectors. In this context:
- Data Points: The collection of vocal features from each individual.
- Classes: 'Parkinson's', labelled 1, and 'Healthy control', labelled 0.
- Hyperplane: The decision boundary that the trained model uses to make a classification.
Random Forest- The Random Forest is a machine learning technique that constructs numerous decision trees (hence the 'forest') and combines their outputs to make predictions.
- Ensemble & Bagging: It is based on bagged trees; that is, each tree is trained on a distinct random sample of the data.
- Randomness: Each tree is also restricted to a random subset of the features when choosing which feature/split pair to use, which keeps the trees different from one another.
- Prediction:
  - Classification: The prediction is the majority vote of all trees.
  - Regression: The final prediction is the average output of all trees.
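The bagging, feature randomness, and voting steps of a Random Forest can be illustrated with a minimal sketch. It assumes scikit-learn and uses synthetic data from make_classification as a stand-in for voice features; the sizes and hyperparameters are illustrative choices, not values from the works reviewed here. Note that scikit-learn's RandomForestClassifier combines trees by averaging their class probabilities, which for fully grown trees usually coincides with the majority vote computed by hand below.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption): 200 samples, 22 features, 2 classes.
X, y = make_classification(n_samples=200, n_features=22, n_informative=8,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# bootstrap=True  -> each tree sees a random resample of the rows (bagging).
# max_features    -> each split considers only a random subset of features.
forest = RandomForestClassifier(n_estimators=100, bootstrap=True,
                                max_features="sqrt", random_state=0)
forest.fit(X_train, y_train)

# Collect each tree's individual vote and take the majority by hand.
tree_votes = np.array([tree.predict(X_test) for tree in forest.estimators_])
majority_vote = (tree_votes.mean(axis=0) > 0.5).astype(int)

# Compare the hand-computed majority with the forest's own prediction.
forest_pred = forest.predict(X_test)
agreement = (majority_vote == forest_pred).mean()
print(f"agreement between manual vote and forest: {agreement:.2f}")
```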
Comparative Analysis- Sakar et al. [5] focused on a comparative analysis of different methods to find the most effective approach for PD classification from speech. Instead of proposing a single model, the authors tested and compared multiple techniques to see which combination performed best.

The comparison included:
- Different Signal Processing Algorithms: The study analyzed various ways to extract features from the raw voice signal before feeding them to a machine learning model. The findings showed that the tunable Q-factor wavelet transform was the more effective method.
- Different Machine Learning Models: The extracted features were likely tested across several classifiers to determine which model worked best with which feature set.
Feature Scaling- Feature scaling rescales the values of numerical attributes so that no attribute appears more important than another solely because of its scale.

Goal: Put all features on the same footing when the model computes distances (important for methods such as SVM or k-nearest neighbours).

Methods:

Standardization (Z-Score): Rescales each feature to have mean 0 and variance 1.
Min-Max Scaling: Rescales each feature to the range 0 to 1.
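Both scaling methods can be written out by hand and checked against scikit-learn's StandardScaler and MinMaxScaler. The sketch below uses a small made-up feature matrix; the numbers are illustrative only.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Toy feature matrix (made-up values): two features on very different scales.
X = np.array([[120.0, 0.004],
              [150.0, 0.007],
              [200.0, 0.012],
              [110.0, 0.003]])

# Standardization: subtract the column mean, divide by the column std.
z_manual = (X - X.mean(axis=0)) / X.std(axis=0)
z_sklearn = StandardScaler().fit_transform(X)

# Min-max scaling: map each column linearly onto the range [0, 1].
mm_manual = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))
mm_sklearn = MinMaxScaler().fit_transform(X)

print(z_sklearn.round(3))
print(mm_sklearn.round(3))
```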
Pipeline Structure (Scikit-learn)- In practice, these steps are chained together in a pipeline, so that preprocessing and the final model are treated as a single estimator.

Consistency: Scaling parameters (mean, standard deviation) are computed only from the training data and are then applied automatically to both the training and the test data.

Tools: Scikit-learn's Pipeline and ColumnTransformer can be used to apply different transformations to different columns (e.g., only numeric data is scaled, while strings are encoded).
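A minimal sketch of such a pipeline follows. The column names (jitter, hnr, sex), the values, and the manual train/test slicing are hypothetical and chosen only to show numeric and string columns being handled differently; they are not taken from the dataset used in this paper.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.svm import SVC

# Hypothetical toy table: two numeric columns, one string column, one label.
df = pd.DataFrame({
    "jitter": [0.004, 0.007, 0.012, 0.003, 0.010, 0.005, 0.011, 0.002],
    "hnr": [21.0, 19.5, 15.2, 24.1, 16.0, 22.3, 14.8, 25.0],
    "sex": ["m", "f", "m", "f", "f", "m", "m", "f"],
    "status": [0, 0, 1, 0, 1, 0, 1, 0],
})
X, y = df.drop(columns="status"), df["status"]
X_train, X_test = X.iloc[:6], X.iloc[6:]
y_train, y_test = y.iloc[:6], y.iloc[6:]

# Scale numeric columns, one-hot encode the string column.
preprocess = ColumnTransformer([
    ("num", StandardScaler(), ["jitter", "hnr"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["sex"]),
])
model = Pipeline([("preprocess", preprocess), ("svm", SVC(kernel="rbf"))])

# fit() learns the scaling statistics from the training rows only;
# predict() reuses those same statistics on the test rows.
model.fit(X_train, y_train)
predictions = model.predict(X_test)

train_means = model.named_steps["preprocess"].named_transformers_["num"].mean_
print(predictions, train_means)
```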
Principal Component Analysis (PCA)- PCA is a linear projection method used primarily for feature extraction and data compression. It reduces dimensionality by constructing a new, decorrelated set of axes (the principal components) that explain as much of the variance in the data as possible.
- Objective: Find a lower-dimensional subspace that describes as much of the data as possible, i.e., retains most of the variance.
- Mechanism: Compute the covariance matrix of the data, then compute its eigenvectors and eigenvalues. The eigenvectors are the new axes (the principal components). The eigenvector with the largest eigenvalue is the first principal component, which captures the most variance.
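The mechanism above can be traced step by step with NumPy and compared against scikit-learn's PCA. The data here is synthetic (two hidden factors mixed into five observed features plus a little noise), an assumption made purely so that two components explain most of the variance.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic data (assumption): 5 observed features driven by 2 hidden factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 5))
X = latent @ mixing + 0.05 * rng.normal(size=(200, 5))

# 1. Center the data and compute the covariance matrix.
X_centered = X - X.mean(axis=0)
cov = np.cov(X_centered, rowvar=False)

# 2. Eigen-decomposition (eigh is meant for symmetric matrices).
eigenvalues, eigenvectors = np.linalg.eigh(cov)

# 3. Sort by decreasing eigenvalue: the first column is the first component.
order = np.argsort(eigenvalues)[::-1]
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

# 4. Project onto the top two principal components.
X_projected = X_centered @ eigenvectors[:, :2]
explained_ratio = eigenvalues[:2].sum() / eigenvalues.sum()

# Cross-check against scikit-learn's implementation.
pca = PCA(n_components=2).fit(X)
print(f"variance explained by 2 components: {explained_ratio:.3f}")
```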
t-distributed Stochastic Neighbor Embedding (t-SNE)- t-SNE is a non-linear method used primarily to visualize data by embedding high-dimensional points in a 2D or 3D space such that the local structure is preserved.
- Objective: Produce an accurate low-dimensional representation of the data that highlights clusters and local relationships.
- Mechanism: It converts pairwise distances in the high-dimensional space into similarities using a Gaussian distribution, and computes the similarities of the corresponding points in the low-dimensional space using a t-distribution. The method minimizes the difference (Kullback-Leibler divergence) between these two similarity distributions. In other words, points that were near each other in high dimensions remain near each other in low dimensions.
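As an illustration, scikit-learn's TSNE can embed a small synthetic dataset into two dimensions. The two Gaussian clusters, the perplexity value, and the PCA initialization below are assumptions chosen for a quick demonstration, not settings reported in this paper.

```python
import numpy as np
from sklearn.manifold import TSNE

# Synthetic data (assumption): two well-separated clusters in 22 dimensions.
rng = np.random.default_rng(0)
cluster_a = rng.normal(loc=0.0, scale=0.5, size=(30, 22))
cluster_b = rng.normal(loc=4.0, scale=0.5, size=(30, 22))
X = np.vstack([cluster_a, cluster_b])

# Perplexity roughly sets how many neighbours each point "attends to";
# it must be smaller than the number of samples.
tsne = TSNE(n_components=2, perplexity=10, init="pca", random_state=0)
embedding = tsne.fit_transform(X)

# kl_divergence_ is the final value of the objective that t-SNE minimizes.
print(embedding.shape, tsne.kl_divergence_)
```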
Bayesian Optimization Mechanism- Bayesian optimization treats the unknown objective (here, model accuracy as a function of the hyperparameters) probabilistically. It typically has two components:
- Surrogate Model (Probability Model): A model such as a Gaussian Process or Tree-structured Parzen Estimator (TPE) that approximates the objective's performance surface over the hyperparameter space. It is cheap to evaluate and is updated as real evaluations of the objective function arrive.
- Acquisition Function: This function uses the surrogate model to choose the next most promising hyperparameter configuration. It trades off exploitation (sampling regions the surrogate already predicts to perform well) against exploration (testing regions of high uncertainty to avoid getting stuck in local optima). Examples include Expected Improvement (EI) and Probability of Improvement (PI).
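The interplay between the surrogate and the acquisition function can be sketched in a few lines. This is a toy loop under stated assumptions: the objective function is a hypothetical stand-in for validation accuracy as a function of log10(C), the surrogate is scikit-learn's GaussianProcessRegressor, and Expected Improvement is evaluated on a fixed grid of candidates. Dedicated libraries handle this far more robustly.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern


def objective(log_c):
    """Hypothetical stand-in for validation accuracy vs. log10(C)."""
    return 0.9 - 0.05 * (log_c - 1.0) ** 2


def expected_improvement(candidates, surrogate, best_so_far, xi=0.01):
    """EI for maximization: large where the mean is high or uncertain."""
    mean, std = surrogate.predict(candidates, return_std=True)
    std = np.maximum(std, 1e-9)  # avoid division by zero
    improvement = mean - best_so_far - xi
    z = improvement / std
    return improvement * norm.cdf(z) + std * norm.pdf(z)


rng = np.random.default_rng(0)
bounds = (-3.0, 3.0)
X_obs = rng.uniform(*bounds, size=(3, 1))      # a few random initial trials
y_obs = objective(X_obs).ravel()
candidates = np.linspace(*bounds, 200).reshape(-1, 1)

for _ in range(10):
    # Surrogate: refit the Gaussian Process to all evaluations so far.
    surrogate = GaussianProcessRegressor(kernel=Matern(nu=2.5), alpha=1e-6,
                                         normalize_y=True, random_state=0)
    surrogate.fit(X_obs, y_obs)
    # Acquisition: evaluate the candidate with the highest EI next.
    ei = expected_improvement(candidates, surrogate, y_obs.max())
    x_next = candidates[np.argmax(ei)].reshape(1, 1)
    X_obs = np.vstack([X_obs, x_next])
    y_obs = np.append(y_obs, objective(x_next).ravel())

best_log_c = X_obs[np.argmax(y_obs), 0]
print(f"best log10(C) found: {best_log_c:.2f}, score: {y_obs.max():.3f}")
```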

Voting Ensemble- Voting ensembles aggregate the predictions of several base models using simple voting rules. This approach works well when the base models are diverse and their errors are largely independent.

Hard Voting (Majority Rule)

Mechanism: Used for classification. The final prediction is the class label that receives the majority of votes from the base models.
Soft Voting (Averaging Probabilities)

Mechanism: Used for classification. The final prediction is the class with the highest average predicted probability across all base models.
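Both voting rules are available through scikit-learn's VotingClassifier. The sketch below uses synthetic data and an arbitrary trio of base models (logistic regression, random forest, SVM) as assumptions; it also recomputes the soft vote by hand to show what averaging probabilities means.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in data (assumption).
X, y = make_classification(n_samples=200, n_features=22, n_informative=8,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)


def base_models():
    """Three deliberately different base models (illustrative choice)."""
    return [
        ("lr", LogisticRegression(max_iter=1000)),
        ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
        # probability=True is needed so the SVM can take part in soft voting.
        ("svm", SVC(kernel="rbf", probability=True, random_state=0)),
    ]


hard = VotingClassifier(estimators=base_models(), voting="hard")
soft = VotingClassifier(estimators=base_models(), voting="soft")
hard_pred = hard.fit(X_train, y_train).predict(X_test)
soft_pred = soft.fit(X_train, y_train).predict(X_test)

# Soft voting by hand: average the class probabilities, take the argmax.
avg_proba = np.mean([m.predict_proba(X_test) for m in soft.estimators_],
                    axis=0)
manual_soft_pred = soft.classes_[np.argmax(avg_proba, axis=1)]

print("hard-vote accuracy:", (hard_pred == y_test).mean())
print("soft-vote accuracy:", (soft_pred == y_test).mean())
```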

Hybrid CNN-LSTM- A hybrid CNN-LSTM model combines two deep learning architectures to handle sequential data.
- The Convolutional Neural Network (CNN) acts as the feature extractor. It recognizes and aggregates local patterns within segments of the series, such as frequency patterns in audio or short-term trends in a time series.
- The LSTM part handles the sequence. It takes these learned features and learns long-range dependencies and context, capturing how the features evolve over time.
SHAP and Grad-CAM-

SHAP (SHapley Additive exPlanations) is model-agnostic. It explains an individual prediction by assigning each feature a numerical importance value. This helps to show which features pushed the prediction up or down (for example, in financial or medical diagnosis models).

Grad-CAM (Gradient-weighted Class Activation Mapping) was developed for Convolutional Neural Networks (CNNs). It generates a heatmap, superimposed on the input image, that shows where the network focused when making its decision.

Authors	Model(s)	Parameter(s)	Advantages	Disadvantages
Little, Max A., et al. [1]	Various ML models for biomarker validation	Vocal features (e.g., fundamental frequency, jitter, shimmer)	Establishes a non-invasive and low-cost method for exploring PD biomarkers using voice.	Relies on high-quality recordings from controlled environments.
Saloni and U. S. Toke [2]	Support Vector Machine (SVM)	SVM hyperparameters (e.g., kernel choice, cost function)	Effective for high-dimensional classification tasks by creating a clear decision boundary.	Can be sensitive to hyperparameter tuning and relies heavily on the quality of the engineered features.
Sakar, C. Okan, et al. [5]	Comparative analysis of speech processing algorithms	Tunable Q-factor wavelet transform for feature extraction from speech signals.	Improves classification accuracy by using more advanced signal processing methods before classification.	Increased complexity in the feature extraction phase.
Orozco-Arroyave, Juan R., et al. [11]	Automatic speech classifiers	Continuous speech recorded in non-controlled (real-world) scenarios.	Demonstrates robustness and feasibility of automatic PD detection in practical, noisy environments.	Performance can be affected by the variability and noise present in non-controlled recordings; in-depth computation.
Pereira, C.R., et al.	Convolutional Neural Network (CNN)	Image representations of handwritten spirals and waves.	Automatically learns features from handwriting, capturing fine motor degradation (dysgraphia).	Requires a specific drawing task; not as passively collected as voice data.
Grover, S., et al.	SVM, Decision Tree	Gait features from sensors (e.g., stride time, swing speed).	Reflects major PD symptoms like postural instability; suitable for monitoring with wearable sensors.	Requires specialized equipment; gait can be influenced by other unrelated health issues.

The next section describes the methodology used to develop these models.

III. METHODOLOGY

This section outlines the systematic approach undertaken to develop the Parkinson's disease prediction system. The methodology encompasses the dataset description, data preprocessing techniques, model selection and training, performance evaluation, and the final implementation of the web-based tool.

Dataset Description

This study uses the Parkinson's Disease Classification Data Set, a public resource from the UCI Machine Learning Repository, created by Max Little of the University of Oxford. It is well regarded in Parkinson's disease research and includes biomedical voice measurements from 31 individuals, 23 of whom have Parkinson's disease. Each individual provided approximately six voice recordings, resulting in a total of 195 instances.

Key characteristics of the dataset include:
- Instances: 195
- Participants: 31 (23 with PD, 8 healthy)
- Features: 22 distinct acoustic features extracted from the voice recordings.
- Target Variable: A binary 'status' column, which serves as the class label:
  - 1: represents the presence of Parkinson's disease (PD).
  - 0: represents a healthy individual.
The dataset includes metrics related to vocal disturbances, such as Jitter and Shimmer, signal quality measured by HNR, and dynamic complexity represented by DFA. These metrics are derived from variations in fundamental frequency and amplitude.
Data Preprocessing and Feature Scaling

Before the data was fed to the model for training, the raw data was pre-processed.
- Feature-Target Separation: The dataset was divided into a feature matrix (X) and a target vector (y). X comprised all 22 acoustic measurement columns, whilst y comprised the associated 'status' labels.
- Feature Scaling: The features in the dataset are not on the same scale. Machine learning algorithms such as Support Vector Machines are sensitive to feature scale, since features with larger ranges carry more weight in the model. To overcome this, the dataset was standardized using scikit-learn's StandardScaler, which prevents features with large values from dominating the model. After standardization, each feature has:
  - A mean of zero.
  - A standard deviation of one.
Model Selection: The SVM Classifier

We chose the Support Vector Machine, which is effective for binary classification, especially in high-dimensional spaces. The strength of the SVM lies in its ability to find the maximum-margin hyperplane. This hyperplane is the decision boundary that maximizes the distance to the nearest data points of the healthy and PD classes, which yields a clear separation between the classes.

We employed a non-linear SVM with the Radial Basis Function (RBF) kernel. This kernel allows the model to learn complex, non-linear relationships between features, which can separate the PD and healthy classes more tightly than a linear boundary.
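The difference between a linear boundary and an RBF kernel is easiest to see on data that is not linearly separable. The sketch below uses scikit-learn's make_circles toy dataset as an assumption (it is not the voice dataset) and compares the two kernels; C and gamma are left at illustrative defaults.

```python
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Toy data (assumption): one class forms a ring around the other, so no
# straight line can separate them.
X, y = make_circles(n_samples=300, noise=0.05, factor=0.4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

scaler = StandardScaler().fit(X_train)
X_train_s, X_test_s = scaler.transform(X_train), scaler.transform(X_test)

linear_svm = SVC(kernel="linear").fit(X_train_s, y_train)
rbf_svm = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X_train_s, y_train)

linear_acc = linear_svm.score(X_test_s, y_test)
rbf_acc = rbf_svm.score(X_test_s, y_test)

# The support vectors are the training points that define the margin.
n_support_vectors = rbf_svm.support_vectors_.shape[0]
print(f"linear: {linear_acc:.2f}, rbf: {rbf_acc:.2f}, "
      f"support vectors: {n_support_vectors}")
```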
System Implementation and Performance Analysis

The model was built using a standard training and testing strategy:
1. Splitting Data: The cleaned dataset was divided into training and testing sets, with 80% for training and 20% for testing. The training set (80%) was used to fit the SVM model. The remaining 20% was held out to give an unbiased measure of how well the model generalized.
2. Training Phase: The model was trained by fitting the SVM classifier to the scaled training set. This let the model learn to map the input acoustic features to the binary target labels (PD, healthy) by finding the best separating hyperplane in feature space.
3. Performance Measurement: Accuracy was the primary performance metric. It measures the proportion of examples that were correctly classified.
Accuracy was computed on both the training data used to fit the model and the unseen held-out data, to assess predictive performance on new data.
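The three steps above might look roughly as follows in scikit-learn. This is a sketch under assumptions: the data is synthetic (make_classification with 195 rows and 22 features to mirror the dataset's shape), and the stratified split, the random seed, and fitting the scaler on the training rows only are choices made here rather than details reported in the paper. The accuracies it prints say nothing about the real dataset.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in with the same shape as the voice dataset (assumption).
X, y = make_classification(n_samples=195, n_features=22, n_informative=10,
                           weights=[0.25, 0.75], random_state=2)

# 1. Split: 80% training, 20% held-out test.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=2)

# Fit the scaler on the training rows only, then apply it to both sets.
scaler = StandardScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)

# 2. Train: fit the RBF-kernel SVM on the scaled training set.
model = SVC(kernel="rbf").fit(X_train_scaled, y_train)

# 3. Measure: accuracy on the training data and on the unseen test data.
train_accuracy = accuracy_score(y_train, model.predict(X_train_scaled))
test_accuracy = accuracy_score(y_test, model.predict(X_test_scaled))
print(f"train accuracy: {train_accuracy:.3f}, test accuracy: {test_accuracy:.3f}")
```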
System Implementation

The work concluded with the deployment of the pre-trained model, which turned the analysis into a functional, interactive system.

Persistence of the Model: Once model building and evaluation were complete, the final SVM model object was saved to a file using Python's pickle library. This is called serialization: the trained model is saved to a file and can be loaded later to make predictions without retraining.
Web App Development: The UI was developed using Streamlit, an open-source Python library for building and deploying data science applications quickly. The Streamlit app works as follows:
- Load the saved parkinsons_model.sav file on initialization.
- Display an empty web-based form with 22 input fields for the acoustic features.
- Collect the input data when the form is submitted.
- Transform the user input with the same standardization that was applied to the original training set (StandardScaler).
- Pass the scaled input to the loaded model to obtain a prediction (0/1).
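The save, load, and predict steps can be sketched without the web layer. The example below trains a throwaway model on synthetic data, pickles it to a temporary directory, and exposes a hypothetical predict_status helper that a Streamlit form handler could call with the 22 entered values. Storing the fitted scaler next to the model in the same file is an assumption made here so that inference can reuse the training-time scaling; the paper only states that the model object was saved.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.datasets import make_classification
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Throwaway model on synthetic data (assumption), standing in for the
# trained SVM described above.
X, y = make_classification(n_samples=195, n_features=22, n_informative=10,
                           random_state=0)
scaler = StandardScaler().fit(X)
model = SVC(kernel="rbf").fit(scaler.transform(X), y)

# Serialize the scaler and the model together.
model_path = os.path.join(tempfile.mkdtemp(), "parkinsons_model.sav")
with open(model_path, "wb") as f:
    pickle.dump({"scaler": scaler, "model": model}, f)


def load_artifacts(path):
    """Load the pickled scaler and model (only unpickle trusted files)."""
    with open(path, "rb") as f:
        return pickle.load(f)


def predict_status(artifacts, feature_values):
    """Hypothetical helper: scale 22 raw feature values and return 0 or 1."""
    features = np.asarray(feature_values, dtype=float).reshape(1, -1)
    if features.shape[1] != 22:
        raise ValueError("expected 22 acoustic feature values")
    scaled = artifacts["scaler"].transform(features)
    return int(artifacts["model"].predict(scaled)[0])


artifacts = load_artifacts(model_path)
prediction = predict_status(artifacts, X[0])
print("predicted status:", prediction)
```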

IV. DISCUSSION

The test-set accuracy of 94.8% reported here is encouraging and suggests that voice analysis combined with machine learning could serve as a powerful, non-invasive tool to screen for early (presymptomatic) PD. The SVM handled the high-dimensional acoustic feature space efficiently and appears to have fitted a decision boundary that separates the two classes.

System Implementation: The Predictive Web Application

The fully trained and validated SVM model was deployed as an operational web application built on the Streamlit platform. The app has an intuitive, user-friendly interface (UI) and was designed to be easily operated by professionals who are not necessarily technologically savvy.

The key features of the deployed system are:
- User-Friendly Input Form: The UI presents 22 clearly labeled text-input fields, corresponding to the acoustic features required by the model.
- Real-Time Prediction: Upon entering the feature values and clicking the "Predict" button, the application processes the data, feeds it to the loaded SVM model, and instantaneously displays the result.
- Unambiguous Prediction Output: The model's prediction is shown in an easily understandable form. It returns simple messages, for example "Likely Parkinson's Disease Present" or "Healthy Individual Detected," rather than technical terms.
Discussion

A test-set accuracy of 94.8% is a promising outcome, indicating that vocal analysis and machine learning can be developed into a potent non-invasive procedure for early PD diagnostic testing. The SVM performed well in the high-dimensional feature space of the acoustic data and found a boundary that separates the two classes.

The novelty of this article, however, lies on the application side. By implementing the model as a Streamlit web app, we have turned a sophisticated computational model into an interactive proof of concept. This addresses the crucial gap between research and its use in practice. Such a tool could serve as an adjunctive decision-support system for clinicians, providing fast, data-based insights that supplement traditional diagnostic approaches. The tool does not replace clinical expertise, but may add to it and could allow referrals to be made earlier and with more confidence when they are necessary.
Limitations and Future Directions

Although the findings are encouraging, the study has several limitations, and there are several potential directions for future research.

Limitations:
- Dataset Size and Diversity: Our study was performed on a relatively small, well-characterized dataset. There is no evidence of how the model behaves on a larger and more ethnically diverse population.
- Absence of External Validation: The model was validated only on a subset of the original dataset. True robustness would have to be demonstrated by testing it on an entirely independent dataset from a different clinical setting.
- Clinical Scope: This tool is a predictive aid, not a diagnostic one. Its predictions are based on statistical models and should not be taken as a medical diagnosis.
Future Work:

To build upon this research, the following steps are recommended:
- Validation on Larger Datasets: The model should be validated on larger, multi-centric datasets to ensure its generalizability and reliability.
- Exploration of Advanced Models: Future work could explore more complex models, such as hybrid CNN-LSTM architectures, which might capture more intricate temporal patterns in speech.
- Prospective Clinical Studies: The next step is to run prospective clinical trials that test the effectiveness and utility of the tool in real-world clinical practice, comparing its predictions with expert diagnoses.

VI. REFERENCES

Little, Max A., P. E. McSharry, S. J. Roberts, D. A. Costello, and I. M. Moroz. "Exploring novel biomarkers for Parkinson's disease." In Advances in neural information processing systems 20, 2007.
Saloni, and U. S. Toke. "SVM based diagnosis of Parkinsons disease." In 2016 International Conference on Global Trends in Signal Processing, Information Computing and Communication (ICGTSPICC), pp. 200-203. IEEE, 2016.
Cortes, Corinna, and Vladimir Vapnik. "Support-vector networks." Machine learning 20, no. 3 (1995): 273-297.
Pedregosa, Fabian, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier Grisel, Mathieu Blondel, et al. "Scikit-learn: Machine learning in Python." The Journal of machine Learning research 12 (2011): 2825-2830.
Sakar, C. Okan, G. Serbes, A. Gunduz, H. C. Tunc, H. Nizam, B. E. Sakar, M. Tutuncu, T. Aydin, and M. E. Isenkul. "A comparative analysis of speech signal processing algorithms for Parkinsons disease classification and the use of the tunable Q-factor wavelet transform." Applied Soft Computing 74 (2019): 255-263.
Ramig, Lorraine O., Cynthia M. Fox, and Shimon Sapir. "Speech treatment for Parkinson's disease." Expert review of neurotherapeutics 8, no. 2 (2008): 297-309.
Harris, Charles R., K. Jarrod Millman, Stéfan J. van der Walt, Ralf Gommers, Pauli Virtanen, David Cournapeau, Eric Wieser, et al. "Array programming with NumPy." Nature 585, no. 7825 (2020): 357-362.
McKinney, Wes. "Data structures for statistical computing in python." In Proceedings of the 9th Python in Science Conference, vol. 445, pp. 51-56. 2010.
Van Rossum, Guido, and Fred L. Drake Jr. Python 3 Reference Manual. Scotts Valley, CA: CreateSpace, 2009.
Streamlit Inc. "Streamlit: The fastest way to build and share data apps." Streamlit, 2024.
Orozco-Arroyave, Juan R., J. F. Vargas-Bonilla, K. S. Frg, J. D. Arias-Londoño, M. Á. G. T, J. F. V-B, and C. A. G-Z. "Automatic detection of Parkinson's disease from continuous speech recorded in non-controlled scenarios." Journal of the Neurological Sciences 354, no. 1-2 (2015): 46-54.
Hassan, Manar A., and Hossam M. Zawbaa. "Innovative Speech-Based Deep Learning Approaches for Parkinson's Disease Classification: A Systematic Review." Applied Sciences 14, no. 17 (2024): 7873.
Al-Nefaie, Abdullah H., Theyazn H. H. Aldhyani, and Deepika Koundal. "Developing System-based Voice Features for Detecting Parkinson's Disease Using Machine Learning Algorithms." Journal of Data Reduction (2024).
Paucar-Escalante, Jesus, et al. "Machine Learning Strategies for Parkinson Tremor Classification Using Wearable Sensor Data." Sensors 24, no. 1 (2024): 208.
Abdelhamid, Ali, et al. "Deep Learning for Objective Estimation of Parkinsonian Tremor Severity." IEEE Transactions on Biomedical Engineering (2023).
Vargas-Bonilla, J. F., et al. "Deep Learning and Artificial Intelligence Applied to Model Speech and Language in Parkinson's Disease." Brain Sciences 13, no. 7 (2023): 1016.
Beli, Milan, et al. "Machine Learning for the Diagnosis of Parkinson's Disease: A Review of Literature." Frontiers in Aging Neuroscience 13 (2021): 633752.
Sarkar, B., et al. "Interpretable Feature Ranking XGBoost (IFRX) for early Parkinson's disease diagnosis." IEEE Access (2024).
Islam, M. H., M. R. Rahman, and M. S. Rahman. "Non-invasive detection of Parkinson's disease based on speech analysis and interpretable machine learning." Frontiers in Aging Neuroscience 17 (2025).
Zhang, D., J. K. F. Wong, and H. L. H. S. Lau. "Deep-Learning-Based Classification of Parkinson's Disease Using Sustained Vowel and Text-Dependent Speech." IEEE Journal of Biomedical and Health Informatics 26, no. 1 (2022): 207-217.
Al-Dujaili, A., G. W. Al-Jameel, and J. A. Z. Sindi. "Speech Analysis for Diagnosis of Parkinson's Disease Using Genetic Algorithm and Support Vector Machine." Journal of Biomedical Science and Engineering 7, no. 4 (2014): 147-156.
Orozco-Arroyave, Juan R., et al. "Automatic detection of Parkinson's disease from continuous speech recorded in non-controlled scenarios." Journal of the Neurological Sciences 354, no. 1-2 (2015): 46-54.
Liu, S., et al. "Prediction of Parkinson's disease using multimodal deep learning." Nature Communications 12, no. 1 (2021): 2780. (Applies deep learning to integrate multiple data types, including brain imaging).
Ma, S., et al. "Multimodal neuroimaging-based prediction of Parkinson's disease with mild cognitive impairment using machine learning technique." BMC Neurology 24, no. 1 (2024): 374. (Focuses on ML prediction of mild cognitive impairment in PD using MRI).
Saravanan, R., et al. "A fully automated approach involving neuroimaging and deep learning for Parkinson's disease detection and severity prediction." Informatics in Medicine Unlocked 42 (2023): 101372. (Develops systems for PD diagnosis and severity prediction using 3D MRI images).
Simko, G., et al. "Diagnostic potential of handwritten text in Parkinson's disease: A machine learning approach." PloS One 15, no. 5 (2020): e0232411. (Utilizes ML models to classify PD based on features extracted from handwritten text).
Ríos-Urrego, C. D., et al. "Towards Parkinson's Disease Detection Through Analysis of Everyday Handwriting." Sensors 24, no. 5 (2024): 1618. (Analyzes dynamic features like pressure and velocity from writing digits 0-9 for PD detection).
Gallo-Aristizabal, J. D., et al. "Detection of Parkinson's Disease using Machine Learning Algorithms and Handwriting Analysis." Journal of Data Mining and Management 8, no. 1 (2023): 1-15. (Uses Random Forest classifier on HOG features extracted from spiral and wave drawings).
Ranade, S., et al. "Detecting Parkinson's Disease Through Handwriting Patterns: A Literature Survey." International Journal of Research in Information and Analysis Services 10, no. 7 (2024): 916-927. (Reviews deep learning methods like CNNs applied to static and dynamic handwriting/drawing patterns).
Huang, Y., et al. "Deep Learning-Based Handwriting Analysis for Parkinson's Disease Detection using Transfer Learning." Sensors 22, no. 19 (2022): 7558. (Compares pre-trained networks like EfficientNetB0 on spiral drawings for PD detection).
Babu, R., et al. "Parkinson Disease Prediction using Machine Learning – Python." GeeksforGeeks (2025). (Tutorial/article outlining the use of Logistic Regression, XGBoost, and SVC/SVM on a clinical-style dataset).
Dadu, J., et al. "Clinical subtypes of Parkinson's disease: a machine learning approach to prediction and characterisation." Journal of Neurology, Neurosurgery & Psychiatry 93, no. 2 (2022): 159-166. (Uses ML to identify and predict distinct clinical PD subtypes based on non-motor features).
Maciariello, J., et al. "Prediction of Parkinson's disease using a large set of clinical and environmental variables." Journal of Clinical Neurology 16, no. 4 (2020): 542-549. (Employs ML on diverse patient data, including demographics, lifestyle, and clinical measures).
Yoon, J., and K. C. Kim. "Predicting conversion to Parkinson's disease using machine learning models and prodromal features." Journal of the Neurological Sciences 434 (2022): 120150. (Focuses on predicting conversion from a prodromal state to PD using clinical risk factors).
Gallo-Aristizabal, J. D., et al. "Parkinson's Disease Prediction Using Machine Learning Algorithm." ResearchGate (2023). (Proposes a hybrid approach combining ANN, RNN, and Decision Trees on multimodal clinical data).
Kaur, J., and R. K. Singh. "Machine learning based approach for Parkinson's disease diagnosis and severity prediction using clinical and biochemical data." Informatics in Medicine Unlocked 26 (2021): 100720.

INTRODUCTION

Subjectivity: Diagnosis is based largely on clinical observation of motor signs rather than on objective testing.

Symptom Overlap: Early symptoms overlap with those of other movement disorders and age-related conditions, which can lead to misdiagnosis.

Random Forest- Random Forest is an ensemble technique that builds many decision trees (hence the "forest") and combines their outputs, by majority vote for classification, into a single prediction.

Feature Scaling- Scaling puts the numerical attributes on a comparable range so that no attribute appears more important than another solely because of its scale.
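The effect of scaling can be illustrated with a minimal sketch. It assumes z-score standardization (subtract the mean, divide by the standard deviation), which is one common choice and not necessarily the exact scaler used in this work. The function name standardize and the feature values are hypothetical.

```python
from statistics import mean, pstdev


def standardize(columns):
    """Z-score each feature column: subtract its mean, divide by its population std."""
    scaled = []
    for col in columns:
        mu = mean(col)
        sigma = pstdev(col)
        # A constant column has zero spread; map it to zeros instead of dividing by zero.
        scaled.append([(x - mu) / sigma if sigma > 0 else 0.0 for x in col])
    return scaled


# Hypothetical values: a pitch-like feature (Hz) and a jitter-like feature (fraction).
pitch_hz = [120.0, 150.0, 180.0]
jitter = [0.002, 0.004, 0.006]

scaled_pitch, scaled_jitter = standardize([pitch_hz, jitter])
```

Although the raw columns differ by several orders of magnitude, both end up with zero mean and unit variance, so a distance-based model such as an SVM weighs them comparably.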

Pipeline Structure (Scikit-learn)- In practice, the preprocessing steps and the final estimator are chained into a single pipeline, which is then fitted and applied as one model.
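A minimal sketch of such a pipeline in scikit-learn follows. The data are synthetic stand-ins generated with make_classification, not the dataset used in this work, and the feature count, RBF kernel, and split ratio are assumptions chosen only for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data (NOT the paper's dataset): 200 samples, 22 features.
X, y = make_classification(
    n_samples=200, n_features=22, n_informative=8, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Scaling and classification are chained so they behave as a single estimator.
model = Pipeline([
    ("scaler", StandardScaler()),
    ("svm", SVC(kernel="rbf")),  # kernel choice is an assumption
])

# fit() learns the scaling statistics from the training split only,
# then trains the SVM on the scaled training data.
model.fit(X_train, y_train)

# predict() and score() reuse the stored scaling before calling the SVM.
predictions = model.predict(X_test)
accuracy = model.score(X_test, y_test)
```

Because the scaler is fitted inside the pipeline, the test split never influences the scaling statistics, which avoids a common source of data leakage.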

Principal Component Analysis (PCA)- PCA is a linear method that reduces dimensionality by projecting the data onto the orthogonal directions of greatest variance.

t-distributed Stochastic Neighbor Embedding (t-SNE)- t-SNE is a non-linear method used primarily to visualize data by embedding high-dimensional points in a 2D or 3D space such that local structure is preserved.

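Bayesian Optimization Mechanism- Bayesian optimization treats the unknown objective (here, model accuracy as a function of the hyperparameters) as a random function described by a probability distribution. It typically comprises two components: a surrogate model of the objective and an acquisition function that selects the next point to evaluate.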

Voting Ensemble- Voting ensembles aggregate the predictions of several base models through simple voting rules. This approach works well when the base models are diverse and their errors are largely independent.
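The voting rule and the benefit of independent errors can be sketched in plain Python. The base-model names and their predictions below are hypothetical, and majority_accuracy assumes fully independent models of equal accuracy, an idealization that real classifiers only approximate.

```python
from collections import Counter
from math import comb


def hard_vote(predictions_per_model):
    """Return the most common label for each sample across the base models."""
    voted = []
    for sample_preds in zip(*predictions_per_model):
        label, _count = Counter(sample_preds).most_common(1)[0]
        voted.append(label)
    return voted


def majority_accuracy(p, n_models):
    """Probability that a strict majority of n independent models,
    each correct with probability p, gives the right answer."""
    needed = n_models // 2 + 1
    return sum(
        comb(n_models, k) * p**k * (1 - p) ** (n_models - k)
        for k in range(needed, n_models + 1)
    )


# Hypothetical predictions from three base models on four samples (1 = PD, 0 = healthy).
svm_preds = [1, 0, 1, 1]
forest_preds = [1, 1, 1, 0]
knn_preds = [0, 0, 1, 1]

ensemble_preds = hard_vote([svm_preds, forest_preds, knn_preds])  # [1, 0, 1, 1]
```

Under the independence assumption, three models that are each 80% accurate yield a majority vote that is correct with probability 0.8^3 + 3(0.8^2)(0.2) = 0.896, which is what majority_accuracy(0.8, 3) computes. Correlated errors reduce this gain.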

Hybrid CNN-LSTM- A hybrid CNN-LSTM model combines two deep learning architectures to handle sequential data: convolutional layers extract local features, and an LSTM models how those features evolve over time.

SHAP and Grad-CAM-

Dataset Description

Data Preprocessing and Feature Scaling

Model Selection: The SVM Classifier

System Implementation and Performance Analysis

System Implementation

System Implementation: The Predictive Web Application

Discussion

Limitations and Future Directions

Dataset Size and Diversity: Our study used a relatively small, well-described dataset. There is no evidence of how the model performs on a larger and more ethnically diverse population.

Absence of Other Validation: The model was validated only on a subset of the original dataset.

Clinical Breadth: This tool is a predictive aid, not a diagnostic one. Its outputs are based on statistical models and should not be taken as a medical diagnosis.

Prospective Clinical Studies: The next step is prospective clinical trials that test the effectiveness and utility of the tool in real-world clinical practice by comparing its predictions with expert diagnoses.