

- Open Access
- Authors : Joyassree Sen, Dr. Paresh Chandra Barman
- Paper ID : IJERTV14IS050076
- Volume & Issue : Volume 14, Issue 05 (May 2025)
- Published (First Online): 15-05-2025
- ISSN (Online) : 2278-0181
- Publisher Name : IJERT
- License:
This work is licensed under a Creative Commons Attribution 4.0 International License
A Hybrid Machine Learning Approach for Heart Disease Prediction: Integrating Random Forest and Multi-Layer Perceptron for Enhanced Diagnostic Accuracy
Joyassree Sen
Department of Computer Science and Engineering Islamic University, Kushtia, Bangladesh
Dr. Paresh Chandra Barman
Department of Information and Communication Technology Islamic University, Kushtia, Bangladesh
Abstract
Heart disease remains one of the leading causes of death globally, affecting millions of lives each year and placing a significant burden on healthcare systems. Early and reliable prediction of heart conditions is essential for preventing severe outcomes and enabling timely medical intervention. This study presents a hybrid machine learning approach that integrates a Random Forest (RF) classifier with a Multi-Layer Perceptron (MLP) meta-learner to enhance diagnostic accuracy. The RF model is first trained on standardized clinical features to produce probabilistic outputs, which are then used as inputs to the MLP for final prediction. The dataset, derived from a publicly available heart disease repository, was preprocessed with feature scaling and class balancing techniques to ensure fairness and model robustness. On the test set, the proposed model achieved an accuracy of 84%, with an F1-score of 0.86 for detecting heart disease and a recall of 0.91, indicating strong sensitivity. The average precision (AP) score of 0.9143 on the precision-recall curve confirms the model's effectiveness, particularly in identifying true positives in imbalanced data. These results suggest that the proposed ensemble-based approach can be a valuable tool in clinical decision support systems, helping healthcare professionals identify high-risk patients with greater confidence.
Keywords: Heart disease prediction, Meta-learning, Random Forest, Multi-Layer Perceptron, Ensemble learning, Machine learning, Clinical decision support, Precision-recall, Medical diagnosis, Class imbalance
-
INTRODUCTION
Heart disease is the leading cause of death worldwide; the World Health Organization reports approximately 18 million deaths annually. These deaths place substantial stress on healthcare institutions, primarily because preventive care facilities are scarce in low- and middle-income nations [1]. Diagnosing and predicting heart disease at an early stage is the most effective way to reduce fatal outcomes, because it allows prompt intervention.
The development of artificial intelligence (AI) and machine learning (ML) offers new opportunities to improve both the accuracy and speed of medical diagnoses, particularly in cardiology. Machine learning models can identify complex patterns in patient data and often achieve higher prediction accuracy than traditional statistical methods. The Random Forest (RF) classifier, an ensemble learning method, is known for its robustness, high accuracy, and resistance to overfitting, making it one of the most reliable machine learning techniques. An RF prediction combines the results of multiple decision trees, which makes its forecasts more stable and reliable [3]. The technique is widely used in medical research because it handles clinical records effectively, even though they often contain high-dimensional and noisy data [4]. However, RF can be limited in capturing some of the nonlinear relationships between features that are common in biological systems.
Multi-Layer Perceptrons (MLPs), a class of deep learning models, address this limitation. MLPs are feedforward artificial neural networks that learn hierarchical representations and have demonstrated a high capability to learn complex data structures. Such models have been applied successfully in various healthcare applications, including electrocardiogram (ECG) classification, diabetic retinopathy detection, and cancer diagnosis [5]. When trained appropriately, MLPs can detect patterns in patient records that standard models might miss. However, MLPs are data-intensive, which becomes a problem with challenging medical datasets in which disease cases make up a small percentage of the patient population.
Research into hybrid learning frameworks has emerged to combine the strengths of Random Forest (RF) and Multi-Layer Perceptron (MLP) models while minimizing the weaknesses each shows when used alone. In these frameworks, a base model is trained to produce intermediate outputs or probability values, which become the inputs of a meta-learner that performs the final classification. This meta-learning approach improves generalization by providing an organized way to fuse heterogeneous learning systems [6]. Heart disease prediction benefits from ensemble methods, which have been shown to outperform individual models in terms of sensitivity, specificity, and F1-score [7]. To deploy such models reliably in practice, the training process requires class balancing, feature transformation, and early stopping. ROC curves and precision-recall scores are essential evaluation tools, as they give a more reliable performance assessment where accuracy alone is misleading due to class imbalance [8].
This research develops a hybrid two-stage model that combines the probabilistic outputs of a Random Forest classifier with the learning capabilities of a Multi-Layer Perceptron. The RF model is trained on preprocessed heart disease data to obtain class probability distributions. The MLP then receives these probabilities as inputs and produces the final prediction. The combination takes advantage of both the tree-based decision rules of RF and the ability of neural networks to learn complex patterns. The hybrid technique achieves high recall and precision, highlighting its potential for use in clinical support systems that identify high-risk individuals early, before the onset of heart disease.
-
RELATED WORK
As advancements in healthcare data accessibility and computational power continue, sophisticated machine learning algorithms play an increasingly crucial role in assessing the risk of cardiac disease. These systems improve the analysis of medical data, enabling earlier detection and more accurate risk assessments. Current medical diagnostic procedures often require invasive processes and a high degree of professional expertise. This motivates the search for data-based solutions that help clinicians obtain accurate and timely risk assessments.
Supervised learning algorithms have been applied in many studies to forecast heart disease. Logistic regression and decision trees are widely used in predictive models because they provide interpretable results. However, both techniques struggle when dealing with complex, multidimensional clinical data patterns [9]. Support Vector Machines (SVM) and k-Nearest Neighbors (KNN) are effective in predicting heart disease when appropriate feature tuning methods are applied [10].
Multiple studies have used Artificial Neural Networks (ANNs) for their ability to learn complex relationships, which makes them effective in predicting heart disease. Detrano and colleagues reported positive diagnostic outcomes on the Cleveland Heart Disease dataset when evaluating neural network performance [11]. However, standalone ANNs struggle to perform stably on small or unbalanced datasets because of overfitting and sensitivity to the data.
Ensemble methods have emerged as powerful alternatives, combining multiple weak or strong learners to improve generalization. Gradient boosting machines (GBMs) and random forests (RFs) have demonstrated robustness in clinical prediction tasks, offering high accuracy and resilience to overfitting [12]. Bagging and boosting strategies, when applied to medical data, have also proven effective in stabilizing predictions and handling data variance.
More recently, hybrid and meta-learning approaches have been explored to combine the strengths of different algorithms. For instance, research by Uddin et al. presented a hybrid architecture combining RF and neural networks to leverage both interpretability and deep feature learning [13]. Such architectures are particularly suited for medical diagnostics, where performance and explainability are critical.
Furthermore, the challenge of class imbalance, which is common in medical datasets where positive cases are often underrepresented, has been addressed using various techniques, including resampling, cost-sensitive learning, and synthetic data generation methods such as SMOTE [14]. These strategies ensure that predictive models are not biased toward the majority class, thus improving sensitivity and recall in identifying patients at risk.
Overall, the literature indicates that no single model consistently outperforms others across all types of datasets. Instead, combining complementary algorithms through ensemble or hybrid learning often yields better results, especially when dealing with heterogeneous and imbalanced clinical data. This highlights the need for context-specific model selection based on the characteristics of the data.
-
PROPOSED METHODOLOGY
In this study, we designed a hybrid machine learning model that combines the strengths of two well-known techniques: Random Forests and Multi-Layer Perceptrons (MLPs) to enhance heart disease prediction. The core concept is to leverage the Random Forest model to capture essential decision patterns from the data, and then pass its output to an MLP, which serves as a more flexible decision-making model. This section outlines the process of data preparation, model development, and performance evaluation.
-
Dataset and Preprocessing
To build our heart disease prediction model, we worked with a dataset containing 303 patient records, each representing a set of medical and personal health attributes. After a quick quality check, we noticed that six records had missing values. Rather than filling in those gaps with assumptions, we removed them entirely to ensure the dataset stayed clean and reliable. That left us with 297 complete and usable records for our analysis.
Since our goal was to predict whether or not a patient has heart disease, we treated this as a binary classification problem. We assigned a value of 1 for patients with heart disease and 0 for those without. Before training our machine learning models, we standardized all numerical features to ensure consistency and improve model performance. This ensured that each feature contributed equally during training, preventing any single variable from dominating due to its scale.
TABLE 1. Detailed information on the UCI dataset attributes.
| Sr. No. | Attribute | Code | Description |
|---|---|---|---|
| 1 | Age | Age | Patient's age in years |
| 2 | Sex | Sex | Biological sex (0 = female; 1 = male) |
| 3 | Chest Pain | Cp | Type of chest pain (1 = typical angina, 2 = atypical angina, 3 = non-anginal pain, 4 = asymptomatic) |
| 4 | Rest Blood Pressure | Trestbps | Resting systolic blood pressure (in mm Hg) |
| 5 | Serum Cholesterol | Chol | Serum cholesterol level (in mg/dl) |
| 6 | Fasting Blood Sugar | Fbs | Blood sugar > 120 mg/dl (0 = false; 1 = true) |
| 7 | Rest ECG | Restecg | Electrocardiographic results (0 = normal, 1 = ST-T abnormality, 2 = LV hypertrophy) |
| 8 | Max Heart Rate | Thalach | Maximum heart rate achieved |
| 9 | Exercise-Induced Angina | Exang | Angina induced by exercise (0 = no; 1 = yes) |
| 10 | ST Depression | Oldpeak | Depression in ST-segment relative to the rest |
| 11 | Slope | Slope | Slope of peak ST segment (1 = upsloping, 2 = flat, 3 = downsloping) |
| 12 | No. of Vessels | Ca | Number of major vessels coloured by fluoroscopy (0–3) |
| 13 | Thalassemia | Thal | Blood disorder type (3 = normal, 6 = fixed defect, 7 = reversible defect) |
| 14 | Heart Disease Status | Class | Diagnostic outcome (0 = no disease, 1–4 = increasing severity of risk) |
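The preprocessing described in this subsection (dropping incomplete records, collapsing the class attribute to a binary label, and standardizing the features) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `preprocess`, the NaN encoding of missing values, and the tiny three-column array are assumptions made only for the example.

```python
import numpy as np

def preprocess(raw):
    """Drop incomplete rows, binarize the class label, z-score the features.

    `raw` is a 2-D float array whose last column is the class (0-4) and
    whose missing entries are encoded as NaN (an assumption of this sketch).
    """
    complete = raw[~np.isnan(raw).any(axis=1)]   # remove rows with missing values
    X = complete[:, :-1]
    y = (complete[:, -1] > 0).astype(int)        # 0 stays 0, severity 1-4 becomes 1
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0                          # guard against constant columns
    X_scaled = (X - mean) / std                  # z-score standardization
    return X_scaled, y

# Made-up rows with columns (age, chol, class); values are illustrative only.
raw = np.array([
    [63.0, 233.0, 0.0],
    [67.0, np.nan, 2.0],   # incomplete record, dropped
    [41.0, 204.0, 0.0],
    [56.0, 236.0, 3.0],
])
X_scaled, y = preprocess(raw)
print(X_scaled.shape, y.tolist())
```

After scaling, every feature column has zero mean and unit variance, so no attribute dominates training merely because of its units.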
-
Proposed Hybrid Model Architecture
To leverage the complementary strengths of ensemble learning and deep neural networks, we implemented a two-stage stacked model. The architecture integrates a Random Forest classifier as the base learner and a Multi-Layer Perceptron (MLP) as the meta-learner. This hybrid framework is designed to extract robust feature interactions via Random Forest and then refine classification decisions using a non-linear neural architecture.
-
Base Learner: Random Forest Classifier
The Random Forest classifier is an ensemble learning algorithm that constructs many decision trees during training and outputs the mode of the classes or mean prediction for classification and regression tasks, respectively. In our implementation, the Random Forest model was configured with 300 trees (n_estimators = 300) and a maximum depth of 12 (max_depth = 12). These settings were chosen to balance model complexity and prevent overfitting while maintaining predictive power.
To address class imbalance, we applied class weighting based on inverse frequency. The class weight $w_j$ for class $j$ is computed as:
$$w_j = \frac{n}{k \, n_j} \tag{1}$$
where $n$ is the total number of samples, $k$ is the number of classes, and $n_j$ is the number of instances of class $j$. Instead of final class labels, we extracted class probabilities from the trained Random Forest, denoted as:
$$P(y = c \mid x) = \frac{1}{T} \sum_{t=1}^{T} h_t(x) \tag{2}$$
where $T$ is the number of trees and $h_t(x)$ is the prediction from the $t$-th tree (its estimated probability of class $c$). These probabilities serve as input features for the subsequent MLP classifier.
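As a small worked illustration of the two quantities described above, the sketch below computes inverse-frequency class weights, w_j = n / (k * n_j), and averages per-tree probability vectors into a forest-level probability. The helper names, the label counts, and the four tree outputs are hypothetical values chosen for the example; they are not taken from the study.

```python
import numpy as np

def balanced_class_weights(y):
    """w_j = n / (k * n_j): inverse-frequency weight for each class j."""
    classes, counts = np.unique(y, return_counts=True)
    n, k = len(y), len(classes)
    return {int(c): n / (k * n_j) for c, n_j in zip(classes, counts)}

def forest_probability(tree_outputs):
    """Average the per-tree class-probability vectors over the T trees."""
    return np.mean(tree_outputs, axis=0)

# Hypothetical label counts, chosen only for illustration.
y = np.array([0] * 160 + [1] * 137)
weights = balanced_class_weights(y)

# Hypothetical outputs of T = 4 trees for one patient: [P(no disease), P(disease)].
tree_outputs = np.array([[0.2, 0.8], [0.0, 1.0], [0.4, 0.6], [0.2, 0.8]])
p = forest_probability(tree_outputs)
print(weights, p)
```

The minority class receives a weight above 1 and the majority class a weight below 1, so errors on the rarer class cost more during training.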
-
Meta-Learner: Multi-Layer Perceptron (MLP)
The meta-learner is a fully connected neural network comprising three hidden layers with ReLU activation functions. Each hidden layer is followed by dropout (rate = 0.3) to reduce overfitting. The final output layer consists of a single neuron with a sigmoid activation function, producing a probability $\hat{y} \in [0, 1]$ of heart disease presence.
The forward pass through the MLP can be described as:
$$h_0 = z, \qquad h_l = \mathrm{ReLU}(W_l h_{l-1} + b_l), \quad l = 1, 2, 3, \qquad \hat{y} = \sigma(W_4 h_3 + b_4) \tag{3}$$
Here, $z$ represents the input vector (probabilities from the Random Forest), $W_l$ and $b_l$ are the weight matrices and bias vectors of layer $l$, respectively, and $\sigma(\cdot)$ denotes the sigmoid activation function.
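The forward pass described above can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions: the weights are randomly initialised rather than trained, the layer widths follow the [128, 64, 32] configuration listed in Algorithm 1, the two input values are hypothetical RF probabilities, and dropout is left out because it is only active during training.

```python
import numpy as np

def relu(a):
    return np.maximum(a, 0.0)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def mlp_forward(z, params):
    """Inference-time forward pass: ReLU hidden layers, sigmoid output.

    Rows of `z` are samples, so each layer computes h @ W + b (the
    row-vector form of W h + b).
    """
    h = z
    for W, b in params[:-1]:
        h = relu(h @ W + b)
    W_out, b_out = params[-1]
    return sigmoid(h @ W_out + b_out).ravel()

# Randomly initialised weights stand in for trained ones (illustration only).
rng = np.random.default_rng(0)
sizes = [2, 128, 64, 32, 1]   # 2 RF probabilities in, 1 probability out
params = [(rng.normal(0.0, 0.1, (m, n)), np.zeros(n))
          for m, n in zip(sizes[:-1], sizes[1:])]

z = np.array([[0.2, 0.8], [0.9, 0.1]])   # hypothetical RF outputs for two patients
y_hat = mlp_forward(z, params)
print(y_hat)
```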
Algorithm 1: Hybrid Heart Disease Prediction Using Random Forest and MLP
1: Load dataset D from source
2: Remove rows in D containing missing values
3: Extract features X and labels y from D
4: Standardize X using Z-score normalization
5: Split X and y into X_train, X_val, X_test and y_train, y_val, y_test using stratified sampling
6: Train Random Forest model RF on X_train and y_train with class_weight='balanced'
7: for each X_set in {X_train, X_val, X_test} do
8:     Generate meta-features by RF.predict_proba(X_set)
9: end for
10: Define the MLP model with architecture:
    - Input layer: size = number of RF output classes
    - Hidden layers: [128, 64, 32] with ReLU and Dropout(0.3)
    - Output layer: 1 neuron with Sigmoid activation
11: Compile MLP using:
    - Loss = Binary Crossentropy
    - Optimizer = Adam (learning_rate = 0.0001)
12: Compute class weights from y_train for imbalance correction
13: Train MLP on RF meta-features from X_train with y_train, validate on X_val
    - Use EarlyStopping and ModelCheckpoint callbacks
14: Predict probabilities y_pred_prob on X_test meta-features using trained MLP
15: Compute ROC curve and Youden's J statistic: J = TPR - FPR
16: Identify optimal threshold τ* where J is maximized
17: Convert y_pred_prob to final labels y_pred using: if y_pred_prob > τ* then label = 1 else label = 0
18: Evaluate predictions using:
    - Classification Report
    - Confusion Matrix
    - ROC-AUC Curve
    - Precision-Recall Curve
Figure 1: Diagram of the proposed hybrid model
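The two-stage flow of Algorithm 1 (steps 5 to 14) can be approximated with scikit-learn as in the sketch below. Several things here are assumptions rather than the authors' setup: the data are synthetic (generated with `make_classification`), the 80/20 split ratio is arbitrary, and scikit-learn's `MLPClassifier` is swapped in for the Keras-style network implied by the EarlyStopping and ModelCheckpoint callbacks. `MLPClassifier` has no dropout layers, and its early stopping holds out part of the training data internally instead of using a separate validation set.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a 297-record, 13-feature dataset (not the real data).
X, y = make_classification(n_samples=297, n_features=13, n_informative=6,
                           random_state=0)
X = StandardScaler().fit_transform(X)

# Stratified split; the 80/20 ratio is an assumption of this sketch.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# Stage 1: Random Forest base learner with the settings reported in the text.
rf = RandomForestClassifier(n_estimators=300, max_depth=12,
                            class_weight="balanced", random_state=0)
rf.fit(X_train, y_train)

# Meta-features: class probabilities produced by the forest.
meta_train = rf.predict_proba(X_train)
meta_test = rf.predict_proba(X_test)

# Stage 2: MLP meta-learner trained on the RF probabilities.
mlp = MLPClassifier(hidden_layer_sizes=(128, 64, 32), activation="relu",
                    solver="adam", learning_rate_init=1e-4,
                    early_stopping=True, max_iter=500, random_state=0)
mlp.fit(meta_train, y_train)

# Probability of the positive class for each test record.
y_pred_prob = mlp.predict_proba(meta_test)[:, 1]
print(meta_train.shape, y_pred_prob[:5])
```

Because the forest has already seen the training records, its probabilities on them tend to be overconfident; generating the training meta-features from out-of-fold predictions is a common alternative in stacking.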
3.3. Evaluation Metrics
The performance of the proposed model was evaluated using four fundamental classification outcomes: True Positives (TP), True Negatives (TN), False Positives (FP), and False Negatives (FN). Based on these, we computed five key quantitative metrics commonly used in medical classification studies: precision, sensitivity (also known as recall), F1 score, specificity, and accuracy. These metrics were selected due to their relevance in binary classification problems and their widespread use in closely related research. Together, they provide a comprehensive view of the model's ability to correctly identify heart disease cases while minimizing diagnostic errors.
Accuracy measures the overall correctness of the model's predictions, defined as the proportion of all correctly predicted instances (both positive and negative) among the total predictions:
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{4}$$
Precision evaluates the reliability of positive predictions. It quantifies how many of the cases predicted as having heart disease are correct:
$$\text{Precision} = \frac{TP}{TP + FP} \tag{5}$$
Also known as recall or the true positive rate, sensitivity indicates how effectively the model identifies actual cases of heart disease:
$$\text{Recall} = \frac{TP}{TP + FN} \tag{6}$$
Specificity measures the model's ability to correctly identify patients without heart disease, focusing on the true negatives:
$$\text{Specificity} = \frac{TN}{TN + FP} \tag{7}$$
The F1 score is the harmonic mean of precision and sensitivity. It provides a balanced measure when there is a trade-off between false positives and false negatives:
$$\text{F1-score} = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} \tag{8}$$
-
RESULTS AND DISCUSSION
The proposed hybrid model, which combines a Random Forest (RF) classifier as the base learner and a Multi-Layer Perceptron (MLP) as the meta-learner, was evaluated on the heart disease dataset using a range of performance metrics. The results demonstrate the model's effectiveness in binary classification tasks, even under class imbalance conditions, and highlight its potential for use in clinical decision support.
-
Performance Metrics
Table 2 provides a snapshot of the model's performance, highlighting its ability to make reliable predictions. With an overall accuracy of 84%, the model demonstrates a balanced approach in classifying healthy individuals and those with heart disease. Notably, it achieves a recall of 0.91 for the 'Disease' class, meaning it identifies most actual cases of heart disease. At the same time, the model achieved a precision of 0.88 for the 'No Disease' class, indicating a high level of confidence in correctly identifying healthy individuals. This balance between sensitivity and precision makes the model a dependable tool in heart disease detection.
TABLE 2. Performance report of our proposed approach.
| Class | Precision | Recall | F1-Score |
|---|---|---|---|
| No Disease | 0.88 | 0.75 | 0.81 |
| Disease | 0.81 | 0.91 | 0.86 |
| Accuracy | | | 0.84 |
| Macro Avg | 0.84 | 0.83 | 0.83 |
| Weighted Avg | 0.84 | 0.84 | 0.83 |
-
Graphical Analysis of Model Performance
To further validate the performance of our proposed hybrid model, we examined several visual evaluation metrics, including the confusion matrix, precision-recall curve, and ROC curve. These visualizations provide deeper insights into the model's classification behaviour and its reliability in a real-world healthcare setting.
Figure 2: Confusion matrix of the proposed model
The confusion matrix (Figure 2) illustrates how well the model distinguishes between individuals with and without heart disease. Out of 61 total cases, the model correctly identified 30 out of 33 patients with heart disease and 21 out of 28 healthy individuals. This reflects a high recall for the 'Disease' class, showing the model's strong ability to detect actual positive cases, a crucial aspect in clinical diagnosis.
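The counts reported for the confusion matrix can be plugged directly into the metric definitions of the Evaluation Metrics section as a consistency check against Table 2. The short sketch below does this, treating 'Disease' as the positive class, so TP = 30, FN = 3, TN = 21, and FP = 7; the function name is illustrative.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall, specificity and F1 from the four counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "specificity": tn / (tn + fp),
        "f1": 2 * precision * recall / (precision + recall),
    }

# 30 of 33 diseased and 21 of 28 healthy patients were classified correctly.
m = classification_metrics(tp=30, tn=21, fp=7, fn=3)
for name, value in m.items():
    print(f"{name}: {value:.2f}")
```

Rounded to two decimals, this gives an accuracy of 0.84, precision of 0.81, recall of 0.91, and F1 of 0.86 for the 'Disease' class, matching Table 2; the specificity of 0.75 equals the recall reported for the 'No Disease' class.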
Figure 3: Precision-Recall Curve of the proposed model.
In terms of precision and recall trade-offs, the precision-recall curve (Figure 3) shows that the model maintains high precision across various levels of recall, with an average precision (AP) score of 0.91. This indicates that the model successfully captures more true positive cases without greatly sacrificing precision, which is a valuable trait for any diagnostic tool.
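For readers unfamiliar with the average precision score, the sketch below shows one common way it is computed: as the sum over ranked predictions of the increase in recall times the precision at that rank. The labels and scores are hypothetical, the function is a simplified illustration that assumes no tied scores, and it is not the evaluation code used in the study.

```python
import numpy as np

def average_precision(y_true, scores):
    """AP = sum_n (R_n - R_{n-1}) * P_n over the ranked predictions.

    Assumes the scores contain no ties (a simplification of this sketch).
    """
    order = np.argsort(-np.asarray(scores))          # highest score first
    y_sorted = np.asarray(y_true)[order]
    tp_cum = np.cumsum(y_sorted)                     # true positives at each rank
    precision = tp_cum / np.arange(1, len(y_sorted) + 1)
    recall = tp_cum / y_sorted.sum()
    recall_prev = np.concatenate(([0.0], recall[:-1]))
    return float(np.sum((recall - recall_prev) * precision))

# Hypothetical labels and predicted probabilities, for illustration only.
y_true = [1, 0, 1, 1, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.2]
ap = average_precision(y_true, scores)
print(round(ap, 4))
```

Only ranks that contain a positive case contribute, so the score rewards models that place true cases of heart disease near the top of the ranking.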
Figure 4: ROC curve of the proposed model.
Moreover, the ROC curve (Figure 4) reveals the model's ability to distinguish between the two classes across different threshold settings. With an AUC (Area Under Curve) of 0.90, the model demonstrates excellent discriminatory power, meaning it performs well in minimizing both false positives and false negatives.
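Algorithm 1 selects the decision threshold by maximizing Youden's J statistic (J = TPR - FPR) along the ROC curve. The sketch below illustrates that selection on hypothetical labels and probabilities. It is an assumption-laden illustration: candidate thresholds are taken from the observed scores, and a record is labelled positive when its probability is at or above the threshold, which differs slightly from the strict inequality written in the algorithm.

```python
import numpy as np

def youden_threshold(y_true, y_prob):
    """Return the threshold maximising J = TPR - FPR, and the J value."""
    y_true = np.asarray(y_true)
    y_prob = np.asarray(y_prob)
    n_pos = (y_true == 1).sum()
    n_neg = (y_true == 0).sum()
    best_t, best_j = None, -np.inf
    for t in np.unique(y_prob):                 # candidate thresholds
        pred = y_prob >= t                      # predict 1 at or above t
        tpr = (pred & (y_true == 1)).sum() / n_pos
        fpr = (pred & (y_true == 0)).sum() / n_neg
        if tpr - fpr > best_j:
            best_t, best_j = float(t), float(tpr - fpr)
    return best_t, best_j

# Hypothetical labels and MLP output probabilities, for illustration only.
y_true = [0, 0, 1, 0, 1, 1, 1, 0]
y_prob = [0.10, 0.30, 0.35, 0.40, 0.60, 0.70, 0.80, 0.55]
threshold, j = youden_threshold(y_true, y_prob)
y_pred = (np.asarray(y_prob) >= threshold).astype(int)
print(threshold, j, y_pred.tolist())
```

Choosing the threshold this way weights sensitivity and specificity equally; a clinical deployment that cares more about missed diagnoses could pick a lower threshold instead.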
Together, these visual metrics reinforce the reliability of our model, especially in sensitive clinical contexts where both overdiagnosis and missed diagnoses carry serious consequences. The combination of high recall for the 'Disease' class and substantial overall precision indicates that this hybrid model could be a supportive tool for early detection and risk assessment in heart disease.
-
Comparative Analysis
To better understand how our proposed hybrid model performs, we compared its results with those from earlier studies on heart disease prediction. Table 3 shows the accuracy achieved by different techniques reported in the literature. This comparison indicates the advantages of our approach, showing that it outperforms many traditional machine learning models commonly used in similar studies.
TABLE 3. Accuracy comparison of our model with existing techniques.
| Authors | Technique | Accuracy (%) |
|---|---|---|
| Detrano et al. [11] | Logistic Regression | 78 |
| Nayak et al. [16] | Decision Tree | 75 |
| Anooj et al. [17] | Random Forest | 80 |
| Palaniappan et al. [18] | SVM | 81 |
| Islam et al. [19] | CNN-Based Model | 82 |
| Proposed Model | Hybrid (RF + MLP) | 84 |
The results in Table 3 show that our hybrid model achieves higher accuracy than the traditional machine learning techniques listed and holds its own against more advanced models such as CNN-based approaches. By combining the strengths of Random Forest (RF) and Multi-Layer Perceptron (MLP), we develop a system that balances transparency with strong predictive performance. RF helps capture complex patterns in structured data, while MLP excels at identifying nonlinear relationships, making for a well-rounded approach. This synergy enhances overall accuracy while maintaining a strong balance between precision and recall, which is crucial for effective heart disease prediction in real-world healthcare settings.
-
CONCLUSION
This study presents a hybrid approach to predicting heart disease, combining the strengths of Random Forest and Multi-Layer Perceptron models. By balancing performance with interpretability, our method achieves 84% accuracy, indicating its potential as a reliable tool for heart disease detection. It surpasses the traditional machine learning techniques we compared against while balancing precision and recall, so that its predictions remain relevant for real-world healthcare applications. Through comparative analysis, we have shown that our model stands strong against established methods in situations where accuracy and reliability are crucial. In medical settings, a model that provides transparent and explainable predictions can help healthcare professionals make better-informed decisions for their patients. That said, no model is perfect. There is room for improvement, and we recognize the need to refine the model's ability to adapt to diverse datasets and to increase its generalization power. Looking ahead, we plan to incorporate additional factors such as genetic and environmental influences and to explore more advanced techniques to enhance accuracy further.
At its core, this research contributes to the expanding field of predictive healthcare models, promoting early detection, reducing misdiagnoses, and ultimately improving patient outcomes. By pushing the boundaries of hybrid machine learning, we hope to pave the way for even more effective tools in the fight against heart disease.
REFERENCES
-
World Health Organization, "Cardiovascular diseases (CVDs)," WHO, 2023. [Online]. Available: https://www.who.int/news-room/fact-sheets/detail/cardiovascular-diseases-(cvds)
-
A. Esteva et al., "A guide to deep learning in healthcare," Nature Medicine, vol. 25, pp. 24–29, Jan. 2019, doi: 10.1038/s41591-018-0316-z.
-
L. Breiman, "Random forests," Machine Learning, vol. 45, no. 1, pp. 5–32, Oct. 2001, doi: 10.1023/A:1010933404324.
-
F. Jiang et al., "Artificial intelligence in healthcare: Past, present and future," Stroke and Vascular Neurology, vol. 2, no. 4, pp. 230–243, 2017, doi: 10.1136/svn-2017-000101.
-
G. Litjens et al., "A survey on deep learning in medical image analysis," Medical Image Analysis, vol. 42, pp. 60–88, Dec. 2017, doi: 10.1016/j.media.2017.07.005.
-
S. Sagi and L. Rokach, "Ensemble learning: A survey," Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, vol. 8, no. 4, e1249, 2018, doi: 10.1002/widm.1249.
-
A. Gudadhe, S. Wankhade, and H. Dongre, "Decision support system for heart disease based on support vector machine and artificial neural network," in 2010 International Conference on Computer and Communication Technology (ICCCT), Sep. 2010, pp. 741–745, doi: 10.1109/ICCCT.2010.5640374.
-
H. He and E. A. Garcia, "Learning from imbalanced data," IEEE Transactions on Knowledge and Data Engineering, vol. 21, no. 9, pp. 1263–1284, Sep. 2009, doi: 10.1109/TKDE.2008.239.
-
D. L. Reilly, L. N. Cooper, and C. Elbaum, "A neural model for category learning," Biological Cybernetics, vol. 45, no. 1, pp. 35–41, 1982, doi: 10.1007/BF00335247.
-
M. A. Jabbar, B. L. Deekshatulu, and P. Chandra, "Heart disease prediction using lazy associative classification," Procedia Computer Science, vol. 57, pp. 376–381, 2015, doi: 10.1016/j.procs.2015.07.360.
-
R. Detrano et al., "International application of a new probability algorithm for the diagnosis of coronary artery disease," American Journal of Cardiology, vol. 64, no. 5, pp. 304–310, 1989, doi: 10.1016/0002-9149(89)90524-9.
-
A. Abdar et al., "A new machine learning method for classification of heart diseases," Computer Methods and Programs in Biomedicine, vol. 192, p. 105261, May 2020, doi: 10.1016/j.cmpb.2020.105261.
-
S. Uddin et al., "A hybrid machine learning model for predicting heart disease using data mining techniques," Computer Methods and Programs in Biomedicine, vol. 191, p. 105361, Apr. 2020, doi: 10.1016/j.cmpb.2020.105361.
-
C. Fernández et al., "SMOTE for learning from imbalanced data: Progress and challenges," Neurocomputing, vol. 356, pp. 202–221, Sep. 2019, doi: 10.1016/j.neucom.2019.04.045.
-
A. R. Nayak, R. Nayak, D. Sinha, and A. Sharma, "Heart disease prediction using machine learning techniques," Materials Today: Proceedings, vol. 33, pp. 2846–2851, 2020, doi: 10.1016/j.matpr.2020.08.847.
-
P. K. Anooj, "Clinical decision support system: Risk level prediction of heart disease using weighted fuzzy rules," Journal of King Saud University – Computer and Information Sciences, vol. 24, no. 1, pp. 27–40, 2012, doi: 10.1016/j.jksuci.2011.09.002.
-
S. Palaniappan and R. Awang, "Intelligent heart disease prediction system using data mining techniques," International Journal of Computer Science and Network Security (IJCSNS), vol. 8, no. 8, pp. 343–350, 2008.
-
M. T. Islam, M. S. Hossain, G. Muhammad, and A. Alelaiwi, "Deep learning-based cardiac disease detection using ECG images," Healthcare, vol. 9, no. 5, p. 527, 2021, doi: 10.3390/healthcare9050527.