DOI : https://doi.org/10.5281/zenodo.18506606
- Open Access

- Authors : Shessymol Jaymon, Al Muhammed Fajrudheen, Adharsh Anil, Vinayak R R, Sukanya M V, Dr. Shani Raj
- Paper ID : IJERTV15IS010619
- Volume & Issue : Volume 15, Issue 01 , January – 2026
- DOI : 10.17577/IJERTV15IS010619
- Published (First Online): 06-02-2026
- ISSN (Online) : 2278-0181
- Publisher Name : IJERT
- License:
This work is licensed under a Creative Commons Attribution 4.0 International License
RestSure AI: An Intelligent Web-Based System for Sleep Disorder Detection Using Machine Learning
Shessymol Jaymon
B.Tech Student, Dept. of Computer Science and Engineering, College of Engineering, Kottarakkara Kottarakkara, India
Vinayak R R
B.Tech Student, Dept. of Computer Science and Engineering, College of Engineering, Kottarakkara Kottarakkara, India
Al Muhammed Fajrudheen
B.Tech Student, Dept. of Computer Science and Engineering, College of Engineering, Kottarakkara Kottarakkara, India
Sukanya M V
Assistant Professor, Dept. of Computer Science and Engineering, College of Engineering, Kottarakkara Kottarakkara, India
Adharsh Anil
B.Tech Student, Dept. of Computer Science and Engineering, College of Engineering, Kottarakkara Kottarakkara, India
Dr. Shani Raj
Associate Professor, Dept. of Computer Science and Engineering, College of Engineering, Kottarakkara Kottarakkara, India
Abstract – Sleep disorders, including insomnia and sleep apnea, have a profound impact on physical health, cognitive functioning, and overall quality of life. However, they are often difficult to detect early because diagnosis typically relies on clinical evaluations and polysomnography, which can be costly and not widely available. This paper introduces RestSure AI, a web-based intelligent platform designed to predict prevalent sleep disorders using lifestyle and physiological information gathered through structured questionnaires. The system combines a Django-powered web interface with a machine learning backend based on a soft voting ensemble classifier. User-provided data, such as sleep duration, stress levels, physical activity, BMI, and vital signs, are processed and analyzed to classify sleep disorder types. Experimental results show that the proposed system yields accurate predictions and provides interpretable health guidance, making it a practical tool for initial sleep health screening.
Index Terms – Sleep Disorder Detection, Machine Learning, Ensemble Learning, Django, Health Informatics
- Introduction
Sleep is essential for sustaining both physical and mental well-being. Conditions like insomnia and sleep apnea are becoming more common, driven by contemporary lifestyle factors such as stress, lack of physical activity, and inconsistent sleep patterns. Conventional diagnostic approaches often depend on clinical assessments and specialized devices, which restrict broad access to timely detection.
With recent progress in machine learning, it is now possible to perform predictive analysis using non-invasive data sources. This study focuses on developing and deploying a scalable, easy-to-use web application that utilizes machine learning models to forecast sleep disorders from everyday lifestyle habits and health-related metrics.
- Related Work
Recent progress in artificial intelligence has played a substantial role in advancing sleep monitoring and the detection of sleep disorders. A comprehensive review from December 2022 examined how consumer-grade sleep technologies, such as wearable devices and smartphone-based sensors, are being combined with AI methods for sleep stage classification. The review emphasized the increasing importance of inexpensive, non-clinical data sources and showcased how machine learning and deep learning models can be effectively applied in realistic, everyday sleep tracking contexts. At the same time, it identified important limitations, including lower clinical accuracy compared to polysomnography (PSG) and inconsistent performance across various consumer devices.
A separate research direction has concentrated on classifying sleep disorders using structured lifestyle and health data. For example, a study released in August 2023 used the Random Forest algorithm on the Sleep Health and Lifestyle dataset to identify conditions such as insomnia and sleep apnea. The researchers showed that Random Forest models work particularly well for tabular health datasets because they can model complex, non-linear relationships and offer interpretable measures of feature importance. However, the study also found that the model's performance was strongly influenced by the quality of the dataset, and its ability to generalize to clinical settings was still constrained.
To better manage uncertainty and ambiguity in medical diagnosis, a new neutrosophic-based machine learning framework was proposed in January 2025. This method integrated fuzzy reasoning with conventional classifiers to strengthen decision support for predicting sleep disorders. The research demonstrated greater robustness when dealing with vague or incomplete datasets. Nonetheless, the intricate nature of the neutrosophic logic framework led to significant computational overhead and made the system harder to interpret for users without specialized expertise.
In addition, ensemble integration-focused methods have been investigated to boost the accuracy of sleep disorder detection. A January 2025 study concentrated on improving standard machine learning models through sophisticated feature engineering and hyperparameter tuning, such as grid search. These refined models produced higher classification accuracy than their baseline counterparts. Yet, longer training durations and an elevated risk of overfitting emerged as key limitations of these ensemble model-centric strategies.
Ensemble learning has gained traction as an effective strategy for boosting predictive performance in sleep disorder detection. A study published in January 2025 introduced a multi-layer ensemble framework integrated with advanced data balancing methods, including the Synthetic Minority Over-sampling Technique (SMOTE). The findings showed notable gains in prediction accuracy, especially when dealing with imbalanced datasets. Nonetheless, this ensemble design led to higher computational demands and diminished overall model interpretability.
Building on ensemble-oriented methods, another study from January 2025 utilized stacked classifiers alongside ensemble balancing techniques to better capture minority classes. The proposed approach achieved higher sensitivity for rare sleep disorders and delivered strong classification results across multiple evaluation metrics. However, the increased architectural complexity of the stacked ensemble required more extensive training data and greater computational power, which constrained its scalability.
In July 2025, a systematic review offered an extensive assessment of AI-based approaches for sleep stage classification and sleep disorder detection. It encompassed diverse techniques, including traditional machine learning, deep learning, and hybrid models applied to both PSG and wearable sensor data. Although the review successfully highlighted prevailing research directions and existing gaps, it did not provide experimental validation and was limited by the coverage of the databases used for literature selection.
Deep learning-based techniques have also been extensively explored for sleep stage classification. In a study from January 2023, convolutional neural networks (CNNs) and long short-term memory (LSTM) networks were applied to multi-modal physiological signals, including EEG, ECG, and EMG. This method achieved high accuracy by automatically extracting intricate temporal and spatial patterns. Nevertheless, its dependence on large annotated datasets and substantial computational power restricted its use in resource-limited settings.
In another investigation from April 2023, several machine learning algorithms were compared for predicting the severity of obstructive sleep apnea syndrome. The study underscored the strong potential of machine learning models for early diagnosis based on clinical and physiological variables. However, it also reported that the models' performance was highly dependent on the specific dataset and that they were not readily applicable in real-time clinical environments.
Finally, a 2024 study examined traditional machine learning methods, such as Support Vector Machines, Decision Trees, and Random Forests, for classifying different sleep disorders using health and lifestyle information. The authors highlighted that these models are straightforward, interpretable, and relatively easy to deploy. Despite these advantages, they showed lower accuracy than deep learning models and encountered scalability issues when applied to larger, more complex datasets.
TABLE I
Summary of Related Work in Sleep Disorder Prediction
| Ref | Year | Dataset Used | Technique Applied | Key Focus | Performance Level |
|-----|------|--------------|-------------------|-----------|-------------------|
| 1 | 2022 | Consumer sleep device data | ML and AI models | Sleep classification | Medium |
| 2 | 2023 | Sleep Health & Lifestyle data | Random Forest | Disorder classification | High |
| 3 | 2025 | Clinical sleep data | Neutrosophic ML | Decision support | High |
| 4 | 2025 | Lifestyle and clinical data | Ensemble ML models | Diagnosis improvement | High |
| 5 | 2025 | Imbalanced sleep dataset | Ensemble learning | Disorder detection | Very High |
| 6 | 2025 | Balanced sleep dataset | Multi-layer ensemble | Accuracy improvement | Very High |
| 7 | 2025 | PSG and wearable data | AI-based review | Survey study | - |
| 8 | 2023 | EEG, ECG, EMG signals | Deep Learning | Sleep stage classification | Very High |
| 9 | 2023 | Clinical parameters | Multiple ML models | OSA severity estimation | High |
| 10 | 2024 | Health and lifestyle data | ML algorithms | Disorder classification | Medium-High |

- Methodology
This section describes the materials, dataset, experimental setup, machine learning algorithms, and evaluation metrics employed in the proposed RestSure AI system for sleep disorder classification.
- Materials and Methods
The proposed methodology integrates data-driven machine learning techniques with a web-based health assessment platform to predict common sleep disorders. The system processes lifestyle, physiological, and demographic parameters collected through structured questionnaires and applies supervised learning models to classify sleep disorders into predefined categories.
All experiments were conducted using the Python programming language and relevant scientific computing libraries. The system workflow consists of data preprocessing, feature encoding, model training, evaluation, and deployment within a Django-based web application for real-time inference.
- Real Sleep Health and Lifestyle Dataset
The proposed system employs the Sleep Health and Lifestyle Extended Dataset from Kaggle, which offers detailed information on lifestyle, health, and sleep gathered from a heterogeneous group of individuals. This dataset combines self-reported lifestyle indicators with physiological data that are linked to sleep habits and sleep quality. It includes variables such as age, gender, occupation, total sleep time, perceived sleep quality, physical activity level, stress level, body mass index (BMI), blood pressure, heart rate, and daily step count. This extended dataset expands on existing health records by adding a wider array of demographic and behavioral attributes, enabling more comprehensive modeling of sleep health trends and classification of sleep disorders. In contrast to strictly clinical datasets, it captures everyday lifestyle factors that affect sleep outcomes, making it well-suited for non-invasive predictive modeling.
The dataset is preprocessed before model training. This procedure entails encoding categorical attributes, converting continuous measurements into standardised formats, and eliminating missing and inconsistent entries. To preserve numerical stability during training, continuous characteristics like daily steps, heart rate, and sleep duration are normalised. Categorical variables like gender and occupation are transformed with either label encoding or one-hot encoding, depending on their cardinality.
Sleep disorder labels, such as No Sleep Disorder, Insomnia, and Sleep Apnoea, make up the target variable. When dividing the dataset into training and evaluation subsets, stratified sampling is used to guarantee balanced representation. Class proportions are maintained by this partitioning technique, which is essential for generating accurate performance metrics in multi-class classification tasks. All things considered, the Sleep Health and Lifestyle Extended Dataset offers a wealth of behavioural, physiological, and demographic information that facilitates efficient supervised learning for the identification of sleep disorders. The development of models that generalise well beyond controlled or clinical environments is supported by its real-world scope.
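The preprocessing and stratified partitioning described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the column names, the toy rows, and the choice to fit the scaler on the training split only are all assumptions made for the example.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Made-up rows; column names are hypothetical stand-ins for the dataset fields.
df = pd.DataFrame({
    "gender": ["Male", "Female"] * 10,
    "sleep_duration": [6.1, 7.4, 5.9, 8.0, 6.5, 7.2, 5.5, 7.8, 6.0, 7.0] * 2,
    "heart_rate": [72, 68, 80, 65, 75, 70, 85, 66, 78, 69] * 2,
    "daily_steps": [4000, 8000, 3000, 10000, 5000, 7000, 3500, 9000, 4200, 7500] * 2,
    "sleep_disorder": (["No Sleep Disorder"] * 5 + ["Insomnia"] * 3 + ["Sleep Apnoea"] * 2) * 2,
})
num_cols = ["sleep_duration", "heart_rate", "daily_steps"]

# Drop incomplete rows, then encode the categorical feature and the target.
df = df.dropna().astype({c: float for c in num_cols})
df["gender"] = LabelEncoder().fit_transform(df["gender"])
target_encoder = LabelEncoder()
y = target_encoder.fit_transform(df["sleep_disorder"])
X = df.drop(columns="sleep_disorder")

# 70:30 split; stratify=y keeps the class proportions in both subsets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=42
)

# Fit the scaler on the training split only, then apply it to both splits.
scaler = StandardScaler().fit(X_train[num_cols])
X_train_num = scaler.transform(X_train[num_cols])
X_test_num = scaler.transform(X_test[num_cols])
```

Fitting the scaler on the training split alone is a common precaution against leaking test-set statistics into training; the paper does not state which variant was used.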
- Experimental Design
This study uses a two-phase experimental framework to evaluate the effectiveness of machine learning algorithms for the classification of sleep disorders using lifestyle and physiological data. Prior to applying optimisation and feature refinement techniques to enhance predictive performance, the primary objective of this design is to analyse baseline model behaviour.
During the first stage, the dataset is split 70:30 into training and testing subsets. The training subset is used to fit a number of machine learning classifiers without the need for ensemble classification or hyperparameter optimisation. During this stage, which serves as a baseline evaluation, the models can learn directly from the original feature space. Performance is assessed using an unseen testing subset to investigate generalisation capacity and identify intrinsic constraints such as noise sensitivity, feature redundancy, and parameter dependency.
Using an optimised learning strategy, the second phase aims to improve classification performance. To guarantee consistency and numerical stability, the dataset is preprocessed before training. A Voting Classifier combined with machine learning classifiers is used for ensemble classification and parameter tuning. An ideal subset of features and matching model parameters that maximise classification performance are found using the Voting Classifier.
The dataset is again split 70:30 into training and testing subsets during this optimised phase. By combining feature subsets and classifier parameter configurations, the Voting Classifier-based optimisation process iteratively assesses potential solutions. In order to ensure fair evaluation across various sleep disorder classes, a fitness function is defined based on classification accuracy and F1-score.
The Voting Classifier-selected feature subsets are used to train the optimised classifiers, which are then assessed on the testing data to gauge gains over baseline performance. This experimental approach makes it possible to compare non-ensemble and Voting Classifier-enhanced models, offering insights into how ensemble classification and parameter tuning affect the accuracy of sleep disorder classification. The baseline learning phase and the Voting Classifier ensemble learning phase are highlighted in Fig. 1, which depicts the overall workflow of the suggested experimental design. This methodical approach guarantees a thorough assessment of both raw and optimised machine learning models, assisting in the creation of a precise and comprehensible sleep disorder detection system.
- Performance Metrics
To evaluate the effectiveness of the proposed sleep disorder classification models, several standard performance metrics were employed. These metrics provide a comprehensive assessment of classification accuracy, class-wise prediction quality, and robustness, particularly in multi-class and imbalanced dataset scenarios.
Let TP, TN, FP, and FN denote True Positives, True Negatives, False Positives, and False Negatives, respectively.
- Accuracy: Accuracy measures the proportion of correctly classified instances among the total number of samples and is defined as:
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{1}$$
Although accuracy provides an overall measure of model performance, it may be misleading when class imbalance is present.
- Precision: Precision evaluates the correctness of positive predictions by measuring the proportion of true positives among all predicted positives:
$$\text{Precision} = \frac{TP}{TP + FP} \tag{2}$$
A higher precision value indicates a lower false positive rate, which is critical in reducing incorrect sleep disorder diagnoses.
- Recall (Sensitivity): Recall, also known as sensitivity, measures the model's ability to correctly identify actual positive instances:
$$\text{Recall} = \frac{TP}{TP + FN} \tag{3}$$
High recall ensures that a larger proportion of sleep disorder cases are correctly detected.
- F1-Score: The F1-score represents the harmonic mean of precision and recall and provides a balanced evaluation metric:
$$\text{F1-score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{4}$$
This metric is particularly effective for evaluating classification performance on imbalanced datasets.
- Specificity: Specificity measures the model's ability to correctly identify negative instances:
$$\text{Specificity} = \frac{TN}{TN + FP} \tag{5}$$
A high specificity value indicates that individuals without sleep disorders are correctly classified.
- Confusion Matrix: The confusion matrix is a tabular representation that summarizes prediction outcomes. For binary classification, it consists of four elements: TP, TN, FP, and FN. In multi-class classification, the matrix is extended to represent all classes and allows detailed per-class performance analysis.
- Macro-Averaged Metrics: For multi-class sleep disorder classification, macro-averaging is employed to ensure equal contribution from each class. The macro-averaged metrics are defined as:
$$\text{Macro Precision} = \frac{1}{C} \sum_{i=1}^{C} \text{Precision}_i$$
$$\text{Macro F1-score} = \frac{1}{C} \sum_{i=1}^{C} \text{F1-score}_i$$
where C denotes the total number of classes and the subscript i marks the metric computed for class i.
- Classification Algorithms
To categorize sleep disorders using health, lifestyle, and physiological attributes, multiple supervised machine learning classifiers were implemented in this study. The selected algorithms were chosen based on their classification capability, interpretability, and suitability for structured tabular data. Each model was trained using preprocessed features and evaluated under a multiclass sleep disorder prediction framework.
- Logistic Regression: Logistic Regression (LR) is a probabilistic classification technique widely used for baseline performance evaluation. The model estimates the likelihood of class membership by applying a logistic function to a linear combination of input features. For multiclass sleep disorder prediction, a one-versus-rest strategy is employed. Owing to its low computational complexity and ease of interpretation, LR provides an effective benchmark for comparing more complex models.
- Decision Tree: Decision Tree (DT) classifiers construct a hierarchical structure by recursively splitting the dataset based on feature values that maximize information gain. Each internal node represents a decision condition, while leaf nodes correspond to predicted classes. In the proposed sleep disorder prediction system, DTs enable transparent rule-based learning and effectively capture nonlinear relationships among health and lifestyle parameters. However, excessive tree growth may lead to overfitting, which is mitigated through depth constraints.
- Random Forest: Random Forest (RF) is an ensemble learning technique that integrates multiple decision trees to enhance predictive performance and generalization capability. Each tree is trained on a bootstrapped subset of the data with randomly selected feature subsets. The final classification is obtained through majority voting. RF demonstrates high robustness to noise, feature interactions, and class imbalance, making it well-suited for sleep disorder classification tasks.
- K-Nearest Neighbors: K-Nearest Neighbors (KNN) is an instance-based learning algorithm that assigns class labels based on the dominant class among the k closest samples in the feature space. Distance measures such as Euclidean distance are used to evaluate similarity. The choice of k is determined experimentally to balance bias and variance. KNN is effective in identifying local data patterns but is sensitive to feature scaling and computationally intensive for large datasets.
- Support Vector Machine: Support Vector Machine (SVM) is a margin-based classifier that constructs an optimal decision boundary by maximizing the separation between classes. Kernel functions, including linear and radial basis function (RBF) kernels, enable SVM to model complex nonlinear relationships. In this project, SVM effectively captures intricate dependencies between sleep-related attributes and disorder classes, offering strong generalization performance.
- Feature Importance
Feature importance analysis was conducted to identify the most influential parameters contributing to sleep disorder prediction. Tree-based models, particularly the Random Forest classifier, were used to compute feature importance scores based on impurity reduction.
Results indicated that sleep duration, stress level, quality of sleep, physical activity level, BMI category, and daily steps were among the most significant predictors. This analysis enhances model interpretability and provides valuable insights into lifestyle factors affecting sleep health.
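Impurity-based importance scores of this kind can be read from a fitted scikit-learn Random Forest through its `feature_importances_` attribute. The sketch below uses synthetic data from `make_classification`, so the resulting ranking is illustrative only and says nothing about the real dataset; the feature names are placeholders borrowed from the text.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Placeholder names; the synthetic columns have no real link to these attributes.
feature_names = [
    "sleep_duration", "stress_level", "sleep_quality",
    "activity_level", "bmi_category", "daily_steps",
]

# Synthetic three-class problem standing in for the sleep dataset.
X, y = make_classification(
    n_samples=300, n_features=6, n_informative=4, n_redundant=0,
    n_classes=3, n_clusters_per_class=1, random_state=0,
)

forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Mean decrease in impurity per feature, normalised to sum to 1.
ranking = sorted(
    zip(feature_names, forest.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, score in ranking:
    print(f"{name:16s} {score:.3f}")
```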
- Correlation Coefficient Analysis
Correlation analysis was performed using the Pearson correlation coefficient to examine the relationships between input features and the target variable. This analysis helped identify redundant features and multicollinearity within the dataset.
Strong correlations were observed between sleep duration and sleep quality, as well as between physical activity and daily step count. Features exhibiting high correlation were carefully evaluated to prevent redundancy and improve model efficiency.
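For reference, the Pearson coefficient behind this analysis can be computed directly from its definition. The sample values below are invented for illustration, and the 0.9 redundancy threshold is an assumption, not a figure taken from the study.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation: covariance divided by the product of the spreads."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

# Invented readings: minutes of activity per day and daily step count.
activity_minutes = [30, 45, 60, 75, 90]
daily_steps = [3000, 5000, 7000, 8000, 10000]

r = pearson(activity_minutes, daily_steps)
REDUNDANCY_THRESHOLD = 0.9  # assumed cut-off for flagging a feature pair
if abs(r) > REDUNDANCY_THRESHOLD:
    print(f"r = {r:.3f}: the pair may carry redundant information")
```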
- Voting Classifier Ensemble
To enhance prediction reliability, both hard and soft voting ensemble strategies were implemented. The ensemble combines heterogeneous base learners to leverage their complementary strengths.
Hard Voting: The final class label is determined by majority voting among base classifiers.
Soft Voting: The predicted class probabilities from individual classifiers are averaged, and the class with the highest aggregated probability is selected.
Soft voting is particularly effective in healthcare datasets where class boundaries overlap and probabilistic confidence is essential.
Hard voting was used for comparative analysis, while soft voting was selected as the final deployment model due to superior probabilistic performance.
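The difference between the two strategies shows up when one confident classifier disagrees with two hesitant ones. The following sketch uses invented probability vectors for a single questionnaire response; the helper names are hypothetical.

```python
from collections import Counter

CLASSES = ["No Sleep Disorder", "Insomnia", "Sleep Apnea"]

def hard_vote(probas):
    """Each classifier casts one vote for its top class; the majority wins.
    Ties go to the class that was voted for first."""
    votes = [max(range(len(p)), key=p.__getitem__) for p in probas]
    return Counter(votes).most_common(1)[0][0]

def soft_vote(probas):
    """Average the class probabilities, then pick the highest mean."""
    n = len(probas)
    mean = [sum(p[k] for p in probas) / n for k in range(len(probas[0]))]
    return max(range(len(mean)), key=mean.__getitem__)

# Invented outputs of three base classifiers for one input.
probas = [
    [0.40, 0.60, 0.00],  # leans towards Insomnia
    [0.45, 0.55, 0.00],  # leans towards Insomnia
    [0.90, 0.05, 0.05],  # strongly favours No Sleep Disorder
]

print("hard:", CLASSES[hard_vote(probas)])
print("soft:", CLASSES[soft_vote(probas)])
```

Here hard voting follows the two narrow majorities, while soft voting lets the one high-confidence prediction outweigh them, which is the behaviour the text attributes to probabilistic confidence.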
- Design and Implementation
This section describes the architectural design and implementation details of the proposed sleep disorder detection system. The framework integrates data preprocessing, machine learning classification, a Voting Classifier ensemble, and a web-based deployment platform to enable real-time sleep disorder assessment.
- System Architecture: The overall system architecture follows a modular pipeline consisting of data acquisition, preprocessing, classification, and deployment. The design ensures scalability, interpretability, and real-time usability. The system architecture figure illustrates the high-level workflow of the proposed system.
The architecture is composed of the following main layers:
- Data Layer: Responsible for dataset ingestion and preprocessing.
- Learning Layer: Performs classification using machine learning algorithms.
- Application Layer: Provides a web-based interface for user interaction and prediction delivery.
- Data Preprocessing and Encoding: The Sleep Health and Lifestyle Extended dataset was preprocessed to ensure data quality and consistency. Missing values were removed, and categorical attributes such as gender, occupation, and sleep disorder labels were encoded using label encoding techniques. Continuous features were normalized using standard scaling to ensure uniform feature contribution during training.
BMI values were converted into categorical classes (underweight, normal, overweight, obese) to align with clinical interpretation. The final feature vector consisted of demographic, lifestyle, and physiological attributes.
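The text does not list the cut-offs used for the four BMI classes. A minimal sketch, assuming the standard WHO adult thresholds of 18.5, 25, and 30 kg/m², could look like this:

```python
def bmi_category(bmi: float) -> str:
    """Map a BMI value (kg/m^2) to a class using WHO adult cut-offs (assumed)."""
    if bmi < 18.5:
        return "underweight"
    if bmi < 25.0:
        return "normal"
    if bmi < 30.0:
        return "overweight"
    return "obese"

for value in (17.9, 22.4, 27.0, 31.5):
    print(value, "->", bmi_category(value))
```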
- Model Design: Multiple machine learning algorithms were employed to classify sleep disorders, including Logistic Regression, Decision Tree, Random Forest, and an ensemble-based soft voting classifier. The ensemble model combined predictions from individual classifiers to improve generalization and reduce model bias.
Each model was trained using a supervised learning approach, with sleep disorder category as the target variable. The ensemble classifier served as the primary prediction model due to its superior performance.
- Voting Classifier Framework: A hard voting ensemble strategy was adopted in this work to enhance the robustness and predictive stability of the sleep disorder classification system. In hard voting, each base classifier independently predicts a class label for a given input instance, and the final output is determined by majority consensus among all classifiers. This decision-level fusion mechanism is particularly effective for healthcare applications, as it reduces the influence of isolated misclassifications and promotes consistent predictions across diverse patient profiles.
To construct a heterogeneous ensemble, five supervised machine learning algorithms were selected as base learners: Logistic Regression (LR), Decision Tree (DT), Random Forest (RF), K-Nearest Neighbors (KNN), and Support Vector Machine (SVM). These classifiers were chosen to represent diverse learning paradigms, including linear models, instance-based learning, and tree-based reasoning. The heterogeneity of the ensemble ensures that different aspects of the underlying data distribution are captured, thereby improving generalization capability.
Logistic Regression contributes to the ensemble through its linear decision boundary and probabilistic modeling framework. It effectively captures global trends between lifestyle variables (such as sleep duration, stress level, and physical activity) and sleep disorder classes. Although LR has limited capacity to model complex nonlinear relationships, its stable and interpretable predictions provide a reliable baseline that strengthens the ensemble's consensus mechanism.
Decision Tree and Random Forest classifiers enable the ensemble to model nonlinear interactions and hierarchical feature dependencies. The Decision Tree learns interpretable decision rules by recursively splitting the feature space based on information gain, making it suitable for capturing threshold-based clinical patterns. Random Forest extends this capability by aggregating multiple decision trees trained on bootstrapped data samples and random feature subsets, thereby reducing overfitting and variance. Within the hard voting ensemble, Random Forest contributes robust predictions that are resilient to noise and class imbalance.
The K-Nearest Neighbors classifier adds a local, instance-based perspective to the ensemble. By assigning class labels based on the majority class among the nearest data points in the feature space, KNN effectively captures neighborhood-level similarities between individuals with comparable lifestyle and physiological characteristics. While KNN is sensitive to feature scaling and computationally expensive for large datasets, its inclusion enhances the ensemble's ability to model localized patterns that may be overlooked by global classifiers.
Support Vector Machine further strengthens the ensemble through its margin-based optimization framework. By constructing an optimal separating hyperplane in a high-dimensional feature space, SVM is capable of handling complex class boundaries, particularly when nonlinear kernel functions are employed. Its strong generalization performance complements the probabilistic and tree-based models in the ensemble, ensuring balanced decision-making across diverse sleep disorder classes.
Overall, the hard voting ensemble integrates linear, nonlinear, probabilistic, instance-based, and margin-based classifiers into a unified framework. The majority voting mechanism ensures that the final prediction reflects collective agreement among heterogeneous learners, reducing individual model bias and variance. This ensemble strategy significantly improves classification reliability and stability, making it well-suited for real-world sleep disorder detection in a web-based healthcare application.
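A five-learner ensemble of this shape can be assembled with scikit-learn's `VotingClassifier`. The sketch below is one possible arrangement, not the authors' configuration: the data are synthetic, and every hyperparameter (tree depth, `n_neighbors`, kernel, number of trees) is an assumption chosen only to make the example run.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic three-class data standing in for the questionnaire features.
X, y = make_classification(
    n_samples=400, n_features=10, n_informative=6, n_classes=3, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=0
)

def base_learners():
    """Fresh copies of the five base learners named in the text.
    Scale-sensitive models (LR, KNN, SVM) are wrapped with a scaler."""
    return [
        ("lr", make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))),
        ("dt", DecisionTreeClassifier(max_depth=5, random_state=0)),
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
        ("knn", make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))),
        # probability=True is required so SVC can take part in soft voting.
        ("svm", make_pipeline(StandardScaler(), SVC(kernel="rbf", probability=True, random_state=0))),
    ]

hard = VotingClassifier(estimators=base_learners(), voting="hard").fit(X_train, y_train)
soft = VotingClassifier(estimators=base_learners(), voting="soft").fit(X_train, y_train)

print("hard voting accuracy:", hard.score(X_test, y_test))
print("soft voting accuracy:", soft.score(X_test, y_test))
```

Only the soft-voting variant exposes `predict_proba`, which matters if the web application needs to display a confidence value next to the predicted class.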
- Experimental Implementation: The implementation was carried out using Python and popular machine learning libraries. The dataset was divided into training and testing subsets using a 70:30 ratio. Five-fold cross-validation was applied to the training data to evaluate model stability.
Two experimental configurations were implemented:
- Baseline Model: Trained without feature selection or optimization.
- Ensemble Model: Trained using Voting Classifier-selected features and ensemble parameters.
Performance metrics were computed for both configurations to analyze improvement.
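The metrics defined in the Performance Metrics subsection can all be derived from a multi-class confusion matrix by treating each class in turn as the positive class. The sketch below does this in plain Python; the matrix counts are invented and do not correspond to any result reported here.

```python
def accuracy(cm):
    """Fraction of samples on the diagonal of the confusion matrix."""
    return sum(cm[k][k] for k in range(len(cm))) / sum(sum(row) for row in cm)

def per_class_metrics(cm):
    """cm[i][j] = number of samples of true class i predicted as class j."""
    n_classes = len(cm)
    total = sum(sum(row) for row in cm)
    results = []
    for k in range(n_classes):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                              # class k missed
        fp = sum(cm[i][k] for i in range(n_classes)) - tp  # wrongly called k
        tn = total - tp - fn - fp
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        specificity = tn / (tn + fp) if tn + fp else 0.0
        results.append({"precision": precision, "recall": recall,
                        "f1": f1, "specificity": specificity})
    return results

def macro_average(metrics, key):
    """Unweighted mean over classes, so every class counts equally."""
    return sum(m[key] for m in metrics) / len(metrics)

# Invented counts; rows are true classes, columns are predicted classes.
cm = [
    [50, 3, 2],   # No Sleep Disorder
    [4, 20, 1],   # Insomnia
    [3, 2, 15],   # Sleep Apnea
]
metrics = per_class_metrics(cm)
print("accuracy:", accuracy(cm))
print("macro precision:", round(macro_average(metrics, "precision"), 3))
print("macro F1:", round(macro_average(metrics, "f1"), 3))
```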
- Web-Based Deployment: The ensemble model was serialized and deployed within a Django-based web application. A multi-step questionnaire interface was designed to collect user inputs related to sleep habits, lifestyle, and health parameters.
User responses were processed in real time, transformed into model-compatible feature vectors, and passed to the trained classifier. The system generated instant predictions along with personalized sleep health recommendations, enhancing user engagement and practical applicability.
- System Deployment and Integration
The ensemble machine learning model was serialized using standard Python model persistence techniques and integrated into a Django-based web application. This deployment architecture enables real-time interaction between the user interface and the trained classification model.
User data collected through the multi-step questionnaire interface are processed dynamically within the application layer. Input features undergo the same preprocessing and encoding steps used during model training to ensure consistency. The processed feature vector is then passed to the deployed model to generate predictions in real time.
The system outputs the predicted sleep disorder category along with personalized health recommendations based on the classification results. This seamless integration of machine learning inference within a web framework ensures scalability, accessibility, and responsiveness, making the proposed system suitable for real-world sleep health assessment applications.
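The serialize-then-serve flow can be sketched without any web framework. In the example below, `pickle` is assumed as the persistence mechanism (the paper only says "standard Python model persistence techniques"), a small Logistic Regression pipeline stands in for the actual ensemble, and the feature names, training rows, and `predict_from_answers` helper are all hypothetical. A Django view would call such a helper with the validated form data.

```python
import pickle

from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

FEATURES = ["sleep_duration", "stress_level", "heart_rate"]  # hypothetical subset
LABELS = ["No Sleep Disorder", "Insomnia", "Sleep Apnea"]

# Toy training rows; a real deployment would load the trained ensemble instead.
X = [[8.0, 3, 65], [7.5, 4, 68], [7.8, 2, 66],
     [5.0, 8, 75], [5.5, 9, 78], [4.8, 8, 74],
     [6.0, 6, 88], [6.2, 5, 90], [5.9, 6, 86]]
y = [0, 0, 0, 1, 1, 1, 2, 2, 2]

# Bundling the scaler with the model keeps training and inference
# preprocessing identical, as the text requires.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)).fit(X, y)
blob = pickle.dumps(model)  # would normally be written to a file once

def predict_from_answers(answers: dict, serialized: bytes = blob) -> str:
    """Turn questionnaire answers into a feature row and return a label."""
    restored = pickle.loads(serialized)
    row = [[float(answers[name]) for name in FEATURES]]
    return LABELS[int(restored.predict(row)[0])]

print(predict_from_answers({"sleep_duration": "5.2", "stress_level": "8", "heart_rate": "76"}))
```

Pickled models should only be loaded from trusted storage, since unpickling can execute arbitrary code.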
- Results and Discussion
The experimental results demonstrate that ensemble learning significantly enhances sleep disorder classification accuracy compared to individual classifiers. Among baseline models, Random Forest achieved the highest accuracy, confirming its capability to model nonlinear interactions in lifestyle and physiological data.
The hard voting ensemble improved classification stability by aggregating discrete predictions from heterogeneous classifiers. However, the soft voting ensemble delivered superior performance across all evaluation metrics, achieving an accuracy of 94.6% and an F1-score of 0.94.
The probabilistic nature of soft voting enabled better handling of overlapping feature distributions and class imbalance, which are common in lifestyle-based sleep datasets. Cross-validation results further confirmed the robustness and generalization capability of the ensemble models, with reduced variance compared to standalone classifiers.
Overall, the findings validate the effectiveness of voting-based ensemble learning as a reliable and computationally efficient approach for non-invasive sleep disorder detection.
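The difference between hard and soft voting can be made concrete with a small example. The sketch below uses made-up class probabilities from three hypothetical base classifiers for a single sample; the class labels are also assumptions. It shows a case where the two strategies disagree: two classifiers narrowly prefer one class, while the third is highly confident in another.

```python
from collections import Counter

CLASSES = ["None", "Insomnia", "Sleep Apnea"]  # assumed label set

# Illustrative (made-up) class probabilities for one sample, one row per classifier.
probas = [
    [0.55, 0.40, 0.05],  # classifier A narrowly prefers "None"
    [0.52, 0.45, 0.03],  # classifier B narrowly prefers "None"
    [0.05, 0.90, 0.05],  # classifier C is confident in "Insomnia"
]


def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=values.__getitem__)


def hard_vote(probas):
    """Each classifier casts one vote for its top class; the majority wins."""
    votes = [CLASSES[argmax(p)] for p in probas]
    return Counter(votes).most_common(1)[0][0]


def soft_vote(probas):
    """Average the class probabilities across classifiers, then take the top class."""
    n = len(probas)
    mean = [sum(p[k] for p in probas) / n for k in range(len(CLASSES))]
    return CLASSES[argmax(mean)], mean


print("hard:", hard_vote(probas))
print("soft:", soft_vote(probas)[0])
```

Hard voting discards how confident each classifier is, whereas soft voting lets a confident prediction outweigh two marginal ones. Libraries such as scikit-learn expose both strategies through `VotingClassifier` with `voting="hard"` or `voting="soft"`; the source does not state which implementation was used.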
- Experimental Setup
The dataset was divided into 70% training data and 30% testing data. In addition, five-fold cross-validation was applied to the training set to assess model stability and reduce sampling bias. All experiments were conducted using identical preprocessing steps to ensure fairness across classifiers.
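The splitting protocol above can be sketched in plain Python. The sample count of 100 and the random seed are placeholders rather than values from the study, and the sketch uses a simple shuffled split; whether the authors stratified by class is not stated.

```python
import random


def train_test_split_indices(n_samples, test_fraction=0.30, seed=42):
    """Shuffle sample indices and hold out `test_fraction` of them for testing."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]


def k_fold(indices, k=5):
    """Yield (train, validation) index lists; fold sizes differ by at most one."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


# 100 is a placeholder sample count, not the size of the study's dataset.
train_idx, test_idx = train_test_split_indices(100)
splits = list(k_fold(train_idx, k=5))

print(len(train_idx), "training samples,", len(test_idx), "test samples")
for fold_no, (train, validation) in enumerate(splits, start=1):
    print(f"fold {fold_no}: {len(train)} train / {len(validation)} validation")
```

Cross-validation runs only on the 70% training portion, so the 30% test set stays untouched until the final evaluation.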
- Baseline Classification Performance
Table II presents the baseline performance of individual machine learning algorithms without feature selection or parameter optimization. The Random Forest classifier achieved the highest baseline accuracy, indicating its ability to model non-linear relationships in lifestyle and physiological data.
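The four reported metrics can be computed from true and predicted labels as sketched below. The labels are made-up examples, and macro averaging over classes is an assumption: the source does not state how precision, recall, and F1-score were averaged across the sleep disorder classes.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall, and F1 (averaging is an assumption)."""
    pairs = list(zip(y_true, y_pred))
    labels = sorted(set(y_true) | set(y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    precisions, recalls, f1_scores = [], [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1_scores.append(f1)

    n = len(labels)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1_scores) / n,
    }


# Made-up labels for illustration only.
y_true = ["None", "None", "Insomnia", "Insomnia", "Sleep Apnea", "Sleep Apnea"]
y_pred = ["None", "Insomnia", "Insomnia", "Insomnia", "Sleep Apnea", "None"]

metrics = classification_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With imbalanced classes, macro averaging weights every class equally, while weighted averaging follows class frequencies, so the choice affects the reported figures.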
TABLE II
Baseline Performance of Individual Classifiers
| Classifier | Accuracy (%) | Precision | Recall | F1-score |
| --- | --- | --- | --- | --- |
| Logistic Regression | 82.6 | 0.82 | 0.81 | 0.81 |
| Decision Tree | 84.9 | 0.84 | 0.83 | 0.83 |
| Random Forest | 88.4 | 0.88 | 0.87 | 0.87 |
| KNN | 83.7 | 0.83 | 0.82 | 0.82 |
| SVM | 86.1 | 0.86 | 0.85 | 0.85 |

TABLE III
Hard Voting Ensemble Performance
| Model | Accuracy (%) | Precision | Recall | F1-score |
| --- | --- | --- | --- | --- |
| Hard Voting Ensemble | 91.3 | 0.91 | 0.90 | 0.90 |

TABLE IV
Soft Voting Ensemble Performance
| Model | Accuracy (%) | Precision | Recall | F1-score |
| --- | --- | --- | --- | --- |
| Soft Voting Ensemble | 94.6 | 0.94 | 0.94 | 0.94 |

TABLE V
Five-Fold Cross-Validation Accuracy
| Model | Mean Accuracy (%) | Std. Dev |
| --- | --- | --- |
| Logistic Regression | 81.9 | 1.7 |
| Decision Tree | 84.3 | 1.5 |
| Random Forest | 87.8 | 1.2 |
| Hard Voting Ensemble | 90.9 | 0.9 |
| Soft Voting Ensemble | 93.8 | 0.7 |

- Five-Fold Cross-Validation Results
The training dataset was subjected to five-fold cross-validation in order to assess model consistency. Table V shows that the ensemble models achieved higher mean accuracy and lower standard deviation than the individual classifiers.
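A mean and standard deviation like those reported for cross-validation can be derived from per-fold accuracies as sketched below. The fold values and model names are made up for illustration and are not the study's results. The sample standard deviation (n - 1 denominator) is an assumption, since the source does not say which estimator was used.

```python
from statistics import mean, stdev

# Made-up per-fold accuracies in percent, for illustration only.
fold_accuracy = {
    "baseline": [80.0, 82.0, 84.0, 86.0, 88.0],
    "ensemble": [90.0, 91.0, 92.0, 93.0, 94.0],
}

# Summarize each model as (mean accuracy, sample standard deviation).
summary = {name: (mean(scores), stdev(scores)) for name, scores in fold_accuracy.items()}

for name, (avg, spread) in summary.items():
    print(f"{name}: mean = {avg:.1f}%, std. dev = {spread:.2f}")
```

A lower standard deviation across folds indicates that a model's accuracy depends less on which samples fall into the training and validation partitions.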
- Voting Classifier-Based Ensemble Model Performance
The voting classifier demonstrated superior performance compared to all individual base learners, confirming the effectiveness of ensemble learning for sleep disorder classification. By combining Logistic Regression, Decision Tree, and Random Forest classifiers through a hard voting strategy, the model achieved a baseline accuracy of 0.912, outperforming standalone classifiers by a clear margin. This improvement highlights the ensemble's ability to integrate complementary decision boundaries and reduce individual model bias and variance, resulting in more stable predictions across different sleep disorder classes.
After processing, the voting classifier attained an accuracy of 0.946 on the testing dataset, with corresponding improvements in precision, recall, and F1-score. Performance consistency across training, cross-validation, and testing phases indicates strong generalization capability and minimal overfitting. The voting classifier effectively handled overlapping feature distributions and class imbalance inherent in lifestyle-based sleep datasets, making it a reliable and efficient model for real-world sleep disorder detection applications. As shown in Table IV, the voting classifier ensemble model attained the highest accuracy and F1-score.
- Comparative Analysis
A comparison of baseline and ensemble models shows that Voting Classifier-based optimisation greatly increases classification accuracy and stability. Feature dimensionality reduction removed unnecessary variables while retaining important predictors such as sleep duration, stress level, physical activity, and BMI category.
- Statistical Test Analysis
Classification performance improved consistently across folds in a paired comparison between the baseline and Voting Classifier ensemble models. The observed accuracy gain of roughly 4.1% shows how well evolutionary optimisation addresses feature redundancy and classifier parameter sensitivity.
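One common way to run a paired comparison across folds is a paired t-test on per-fold accuracies. The source does not name the test it used, so the choice of test here is an assumption, and the fold accuracies are made up for illustration. The sketch computes only the t statistic, which can then be compared with a critical value from a t-table.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(baseline_scores, ensemble_scores):
    """t statistic for the mean per-fold difference (ensemble minus baseline)."""
    if len(baseline_scores) != len(ensemble_scores):
        raise ValueError("both models must be scored on the same folds")
    diffs = [e - b for b, e in zip(baseline_scores, ensemble_scores)]
    standard_error = stdev(diffs) / sqrt(len(diffs))
    return mean(diffs) / standard_error


# Made-up per-fold accuracies in percent, for illustration only.
baseline = [80.0, 82.0, 84.0, 86.0, 88.0]
ensemble = [90.0, 91.0, 92.0, 93.0, 94.0]

t_stat = paired_t_statistic(baseline, ensemble)
mean_gain = mean(e - b for b, e in zip(baseline, ensemble))
print(f"mean gain = {mean_gain:.1f} points, t = {t_stat:.3f}, df = {len(baseline) - 1}")
```

With five folds the test has four degrees of freedom, for which the two-sided 5% critical value is about 2.776. Because folds share training data, the fold scores are not fully independent, so such a test is best read as indicative rather than exact.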
- Discussion
The findings verify that, using lifestyle and health data, ensemble learning in conjunction with Voting Classifier-based optimisation yields better performance for sleep disorder detection. The ensemble model maintains computational efficiency appropriate for web-based deployment while striking a balance between accuracy and interpretability.
- Conclusion
This study presented RestSure AI, a web-based sleep disorder detection system leveraging ensemble voting classifiers and lifestyle-based health data. By integrating multiple supervised learning algorithms through hard and soft voting strategies, the proposed framework achieved robust and accurate classification performance.
Experimental evaluation demonstrated that the soft voting ensemble consistently outperformed individual classifiers, achieving a maximum accuracy of 94.6%. The Django-based deployment enables real-time prediction and personalized health feedback, making the system suitable for practical, large-scale sleep health screening.
Future work will focus on incorporating wearable sensor data and extending the model to additional sleep-related conditions to further enhance clinical relevance.
