DOI : 10.17577/IJERTV15IS043381
- Open Access

- Authors : Hariharan A T, Heden Jones A, Helvin Rosen A, Shivashankar S
- Paper ID : IJERTV15IS043381
- Volume & Issue : Volume 15, Issue 04 , April – 2026
- Published (First Online): 05-05-2026
- ISSN (Online) : 2278-0181
- Publisher Name : IJERT
- License:
This work is licensed under a Creative Commons Attribution 4.0 International License
A Multi-Module Digital Well-Being and Addiction Risk Management System
Hariharan A T
B.Tech Information Technology, Sri Krishna College of Engineering and Technology Coimbatore India
Helvin Rosen A
B.Tech Information Technology, Sri Krishna College of Engineering and Technology Coimbatore India
Heden Jones A
B.Tech Information Technology, Sri Krishna College of Engineering and Technology Coimbatore India
Shivashankar S
Assistant Professor, Information Technology Sri Krishna College of Engineering and Technology, Coimbatore India
Abstract – This project focuses on the growing issue of smartphone addiction and its impact on mental health, productivity, and social relationships. Traditional methods such as self-reports and basic activity tracking often fail to capture the deeper factors behind technology dependence. To address this, the study proposes a predictive framework using a bidirectional long short-term memory (Bi-LSTM) network optimized with the Grey Wolf Optimizer (GWO). The model analyzes behavioral and psychological signals such as screen time, communication habits, sleep patterns, and emotional responses to predict addiction risk. Feature engineering is used to identify which behaviors most influence the results, making the predictions easier to interpret. With accurate and automated risk classification, the system helps healthcare professionals, educators, and digital wellness providers take timely action. The project also emphasizes data privacy and ethical use, promoting healthier device habits and supporting the United Nations goal of well-being and healthy living.
Keywords – Machine Learning, Bi-LSTM, GWO, Digital Well-being, Addiction Prediction
-
INTRODUCTION
The widespread adoption of mobile phones and smart gadgets has profoundly transformed how people communicate, obtain information, and participate in both work and recreational pursuits. These modern tools have introduced an age of remarkable interconnection, adaptability, and ease, enabling instant communication across distances and simplifying daily routines. Yet the same features that make these tools indispensable (constant connectivity, interactive platforms, and 24/7 accessibility) have also given rise to serious concerns about well-being and behavioral dependence. Increasing research evidence indicates that excessive and uncontrolled smartphone usage can result in various adverse consequences. Frequent users often experience heightened levels of anxiety, stress, sleep disturbances, and reduced attention span, all of which can harm psychological health. As people drift away from in-person interactions and community participation, social bonds tend to weaken, causing loneliness and diminished emotional support. Continuous exposure to distractions, multitasking, and compulsive device checking can also undermine performance in academic and professional environments by impairing focus and achievement.
These problems are intensified by the addictive nature of many apps, which exploit behavioral triggers and feedback loops to extend user engagement beyond healthy limits. To address this growing issue, effective prevention strategies and assessment mechanisms are urgently required. Recent findings highlight that while conventional tools like surveys and clinical assessments remain helpful, they often lack scalability and objectivity. Therefore, data-driven automated systems capable of continuous tracking, accurate risk identification, and timely intervention are crucial. In an increasingly interconnected world, building such solutions is vital to preserve technological well-being, empower users, and guide informed policymaking.
-
RELATED WORKS
Historically, identifying smartphone addiction has largely depended on subjective questionnaires and self-report scales, which, while useful, often struggle with reliability and scalability issues. As technology has progressed, researchers have shifted towards using real behavioral data gathered from device interactions and usage patterns. These new methods employ advanced computational techniques and artificial intelligence to create systems that can recognize risky or excessive use in a more objective, automated way.
Machine learning methods, especially convolutional neural networks, have helped to analyze the subtle trends in how people use their devices. Hybrid models that utilize a mix of ensemble learning and tailored feature selection are able to handle the variety and complexity of behavioral data, resulting in models that are both more accurate and resilient. However, understanding the sequence and timing of user interactions, such as habitual checking of notifications or consecutive app usage, remained a challenge for earlier models.
The adoption of deep learning approaches based on recurrent networks, like LSTM and Bi-LSTM architectures, has enabled researchers to follow time-linked behavioral changes and identify patterns that evolve gradually. As a further boost, metaheuristic optimization algorithms have been incorporated to refine neural network structures, speeding up model training and enhancing prediction precision. This combination of advanced algorithms and
optimization has led to systems that not only predict addiction risk with greater accuracy but also adapt quickly to new usage patterns, making digital well-being solutions more dynamic, reliable, and scalable.
With the shift toward data-driven technology, research in smartphone addiction is no longer limited by subjective reporting or static survey responses. Instead, modern approaches harness real-world behavioral metrics, such as application engagement, frequency of device interaction, and sensor-based context, to capture the nuances of digital habits. This wealth of data provides a foundation for predictive models that can spot emerging patterns linked to addictive use, often before a user becomes aware of potential risks.
In the proposed system, dropout regularization limits overfitting to the training data and enhances the model's ability to adapt to unfamiliar user inputs.
By focusing on meaningful global behaviors instead of memorizing irrelevant noise, the Bi-LSTM architecture maintains high predictive performance even when exposed to new, real-world activity patterns. Additionally, the system uses metaheuristic optimization algorithms, particularly the Grey Wolf Optimizer (GWO), to automatically adjust essential hyperparameters such as the number of LSTM cells, dropout probability, and learning rate. This optimization process also aids in selecting key behavioral metrics that strongly contribute to addiction risk prediction.
-
METHODOLOGY
-
System Architecture
The foundation of the proposed digital well-being platform is a multi-module architecture that organizes system functions into four distinct yet interlinked components: data preprocessing, feature engineering, risk prediction, and intervention recommendation.
The architecture improves both scalability (which makes it easier to adapt or expand to larger datasets) and maintainability (which makes it easier to upgrade as new algorithms or operational requirements emerge).
B. Preprocessing of Data
The system takes in raw behavioral logs from smartphone and app use. These logs frequently contain noise, missing entries, and temporal inconsistencies. The following preprocessing steps are applied:
-
Normalization of numerical inputs to conventional scales to keep the model stable during training.
-
Outlier detection to identify and manage extreme or abnormal usage events that could distort analysis.
-
Temporal aggregation transforms event-level records into daily, weekly, or session-based statistics, providing structured insight into user patterns.
-
Advanced imputation techniques (such as regression models or nearest-neighbor strategies) recover missing values, preserving the completeness and analytical value of the behavioral data.
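The aggregation, imputation, and outlier-handling steps can be illustrated with a minimal standard-library sketch. It is not the authors' pipeline: the function names are hypothetical, median imputation stands in for the regression or nearest-neighbor imputation named in the text, and the interquartile-range (IQR) clipping rule is an assumed choice of outlier handling.

```python
from collections import defaultdict
from statistics import median, quantiles


def daily_totals(events):
    """Temporal aggregation: sum event-level minutes into per-day totals.

    `events` is an iterable of (day, minutes) pairs (hypothetical format).
    """
    totals = defaultdict(float)
    for day, minutes in events:
        totals[day] += minutes
    return dict(totals)


def impute_median(values):
    """Fill missing entries (None) with the median of the observed values.

    Simplified stand-in for regression or nearest-neighbor imputation.
    """
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def clip_outliers_iqr(values, k=1.5):
    """Clip values lying outside [Q1 - k*IQR, Q3 + k*IQR] (assumed rule)."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [min(max(v, low), high) for v in values]
```

Clipping keeps every record in the sequence, which matters for a sequence model; dropping outlier rows instead would leave gaps in the time series.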
-
Bi-LSTM Network Design
At the foundation of the system is a Bi-LSTM neural network, selected for its ability to process time-series data in both forward and backward directions. This dual processing structure lets the model interpret each time step using both the preceding and the subsequent context within an observed sequence, allowing it to detect subtle transitions and recurring usage patterns more effectively than sequence models that read the data in one direction only.
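For reference, the standard bidirectional recurrent formulation is sketched below. The symbols are assumptions introduced here for illustration: $x_t$ is the feature vector at time step $t$, $T$ is the sequence length, and the sigmoid output layer with parameters $w$ and $b$ is one common choice for binary risk classification, not a detail confirmed by the paper.

```latex
\begin{aligned}
\overrightarrow{h}_t &= \mathrm{LSTM}_{\mathrm{fwd}}\!\left(x_t,\ \overrightarrow{h}_{t-1}\right), \qquad t = 1, \dots, T \\
\overleftarrow{h}_t &= \mathrm{LSTM}_{\mathrm{bwd}}\!\left(x_t,\ \overleftarrow{h}_{t+1}\right), \qquad t = T, \dots, 1 \\
h_t &= \left[\,\overrightarrow{h}_t \,;\, \overleftarrow{h}_t\,\right] \\
\hat{y} &= \sigma\!\left(w^{\top} \left[\,\overrightarrow{h}_T \,;\, \overleftarrow{h}_1\,\right] + b\right)
\end{aligned}
```

The concatenated state $h_t$ is what gives each time step access to context from both directions.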
The model is designed to uncover multi-level behavioral representations, enabling it to recognize both general patterns and fine-grained variations in smartphone interaction. To improve stability and reliability, dropout is applied during the training stage. This process randomly disables a fraction of the network's units, reducing the risk of overfitting to the training data.
-
Automated Behavioral Scoring Mechanism
An integral module continuously analyzes user digital habits, generating a behavioral risk score that reflects deviations from established healthy patterns. This technique:
-
Monitors smartphone usage over both short-term (daily/weekly) and long-term (monthly/seasonal) intervals.
-
Applies statistical comparison and anomaly detection to flag behavior that significantly diverges from population norms or the user’s baseline.
-
Dynamically adjusts risk thresholds based on trends in the aggregate data, enabling adaptive detection that keeps working when usage patterns or demographics change. The resulting risk management system is more responsive and context-aware, and it eliminates the need for manual calibration and hardcoded limits.
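One way to read the scoring mechanism above is as a deviation score measured against a user's own baseline, compared with a threshold that moves with the population. The sketch below rests on assumptions: the z-score formulation, the `mean + k * std` threshold, and all function names are illustrative and are not taken from the paper.

```python
from statistics import mean, stdev


def risk_score(baseline, recent):
    """Deviation of recent mean usage from the user's baseline, in baseline
    standard deviations (a z-score). Returns 0.0 for a constant baseline."""
    mu, sigma = mean(baseline), stdev(baseline)
    if sigma == 0:
        return 0.0
    return (mean(recent) - mu) / sigma


def adaptive_threshold(population_scores, k=2.0):
    """Threshold that follows the population: mean + k * std of current scores.

    Recomputing it on fresh scores replaces a hardcoded limit."""
    return mean(population_scores) + k * stdev(population_scores)


def is_flagged(score, threshold):
    """Flag a user whose score exceeds the current adaptive threshold."""
    return score > threshold
```

For example, a user whose daily screen time rises from about 100 minutes to about 155 minutes receives a large positive score and is flagged when the rest of the population sits near zero.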
-
-
-
EXPERIMENTAL RESULTS
-
Dataset Description
The system was evaluated on a large-scale dataset sourced from Kaggle, consisting of behavioral logs for more than 10,000 anonymized users over a six-month observation period. This dataset is considerably larger than those used in earlier versions, which typically consisted of 1,000 to 2,000 samples, enabling deeper validation and better model generalizability. By expanding the user pool to 10,000, the study captures a diverse range of smartphone interaction patterns and demographic profiles, increasing the reliability of the predictive analytics.
Each user record includes detailed app usage statistics, comprehensive screen time metrics, session segmentation, and granular interaction event frequencies. The dataset integrates multiple behavioral dimensions such as daily and weekly usage, app category preferences, peak usage intervals, and interaction counts (e.g., unlocks, notifications, app launches). Advanced feature engineering further enriches the dataset by deriving ratios (like social versus educational usage), volatility indicators, and composite metrics for digital well-being assessment.
-
Performance Metrics
-
Accuracy
Accuracy checks the overall percentage of correct classifications.
$$\text{Accuracy} = \frac{1221 + 773}{1221 + 250 + 156 + 773} = \frac{1994}{2400} \approx 0.83 \ (83\%)$$
-
Precision
-
Precision for Normal: Of all predicted Normal cases, what fraction were actually Normal?
$$\text{Precision}_{\text{Normal}} = \frac{1221}{1221 + 156} = \frac{1221}{1377} \approx 0.887$$
-
Precision for Addicted: Of all predicted Addicted cases, what fraction were actually Addicted?
$$\text{Precision}_{\text{Addicted}} = \frac{773}{250 + 773} = \frac{773}{1023} \approx 0.756$$
-
Recall (Sensitivity)
-
Recall for Normal: Of all actually Normal cases, what fraction were classified correctly?
$$\text{Recall}_{\text{Normal}} = \frac{1221}{1221 + 250} = \frac{1221}{1471} \approx 0.830$$
-
Recall for Addicted: Of all actually Addicted cases, what fraction were detected?
$$\text{Recall}_{\text{Addicted}} = \frac{773}{773 + 156} = \frac{773}{929} \approx 0.832$$
-
F1-Score
-
F1-score Normal:
$$F1_{\text{Normal}} = 2 \times \frac{0.887 \times 0.830}{0.887 + 0.830} \approx 0.858$$
-
F1-score Addicted:
$$F1_{\text{Addicted}} = 2 \times \frac{0.756 \times 0.832}{0.756 + 0.832} \approx 0.792$$
-
Specificity
-
Specificity Normal: Treating Normal as the positive class, the true negative rate is the fraction of actually Addicted cases that were not classified as Normal.
$$\text{Specificity}_{\text{Normal}} = \frac{773}{773 + 156} = \frac{773}{929} \approx 0.832$$
-
Computational Efficiency
The optimized system is engineered for high-throughput, real-time behavioral analytics and achieves an average processing response time of 150 milliseconds per inference cycle. This low latency, measured from data input to risk score output, ensures that digital well-being assessments and intervention triggers can be generated promptly, even as new behavioral data streams in. Such rapid response is particularly critical for continuous monitoring applications, where timely feedback and prompt detection of risky patterns are essential for effective intervention.
Multiple optimization strategies contribute to the system's responsiveness, including parallelized data pipelines, effective memory management, and simplified neural network architectures that minimize computational overhead without sacrificing predictive accuracy. These design choices enable seamless integration into mobile and desktop environments, facilitating deployment in both individual-user and large-scale population health settings.
In addition, the platform is able to support features like real-time notifications, adaptive user prompts, and on-the-fly model recalibration without causing any noticeable latency for the user. This is made possible by the average response time of 150 milliseconds. The system's performance scalability makes it suitable for a wide range of digital health scenarios, such as monitoring in schools and workplaces, clinical digital therapeutics, and consumer-facing digital well-being applications.
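The per-class figures reported under Performance Metrics all follow from four confusion-matrix counts (1221, 250, 156, and 773), so they can be recomputed directly. The sketch below is a convenience check with hypothetical variable and function names. Note that F1 computed from the raw counts for the Normal class rounds to 0.857, while computing it from the already-rounded precision and recall gives 0.858.

```python
def per_class_metrics(tp, fp, fn):
    """Precision, recall, and F1 for one class treated as the positive class."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


# Counts taken from the reported results (first word = actual class).
normal_as_normal = 1221      # actual Normal, predicted Normal
normal_as_addicted = 250     # actual Normal, predicted Addicted
addicted_as_normal = 156     # actual Addicted, predicted Normal
addicted_as_addicted = 773   # actual Addicted, predicted Addicted

total = (normal_as_normal + normal_as_addicted
         + addicted_as_normal + addicted_as_addicted)
accuracy = (normal_as_normal + addicted_as_addicted) / total

normal = per_class_metrics(tp=normal_as_normal,
                           fp=addicted_as_normal,
                           fn=normal_as_addicted)
addicted = per_class_metrics(tp=addicted_as_addicted,
                             fp=normal_as_addicted,
                             fn=addicted_as_normal)
```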
-
Robustness to Data Noise
To assess the system's resilience to imperfect input data, a common obstacle in real-world monitoring environments, the model was subjected to controlled experiments in which artificial noise, data gaps, and extreme outlier events were systematically injected into the test dataset. The noise simulation included the introduction of anomalous values (to imitate spurious app events or device malfunctions) and the random deletion of data segments (representing sensor dropouts or incomplete logging). According to the evaluation, the combination of the data preprocessing pipeline and the Bi-LSTM neural network is crucial for maintaining predictive performance under these conditions. Thanks to multi-level imputation routines (including statistical estimation and sequence-aware imputation), normalization strategies, and robust outlier filtering, the model consistently maintained
-
Cross-User Generalization
To evaluate the model's adaptability, cross-user generalization tests were performed. The model was trained on data from a specific user group (defined by, e.g., age, location, or usage profile) and validated on a distinct cohort with varied digital habits and demographics. Despite these differences, the model maintained strong performance, with recall and F1-scores consistently above 0.89. This suggests that the Bi-LSTM architecture and preprocessing pipeline capture behavioral patterns that transfer across groups, avoiding overfitting to any single group.
-
Data Pre-processing
-
Data Cleaning
The first phase emphasizes detecting and resolving faults or discrepancies present within the behavioral dataset. During this stage, duplicate records are removed, missing entries are addressed, irrelevant attributes are discarded, and incorrect data points are rectified to uphold the accuracy and reliability of the dataset. Outlier identification methods are applied to eliminate extreme values that might distort or bias subsequent analysis.
-
Data Transformation
The cleaned data is then transformed into formats that are suitable for analysis. Categorical variables are encoded using techniques such as label encoding or one-hot encoding, while timestamps and session logs are aggregated into relevant intervals (e.g., daily or weekly usage). To boost the predictive power of the model, derived features, such as ratios of social app use to educational app use, gaming intensity, and total leisure calculations, are designed to provide more in-depth insights.
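A minimal sketch of two of these transformations, one-hot encoding and a social-to-educational usage ratio, is given below. The function names and the smoothing constant `eps` are assumptions made for illustration; the paper does not specify how the ratio handles users with zero educational usage.

```python
def one_hot(values):
    """One-hot encode a list of category labels.

    Returns (rows, categories), where categories fixes the column order."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return rows, categories


def usage_ratio(social_minutes, educational_minutes, eps=1.0):
    """Ratio of social to educational app use.

    eps (an assumed smoothing constant) avoids division by zero."""
    return social_minutes / (educational_minutes + eps)
```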
-
Data Normalization
To ensure consistency and model stability, numerical features are rescaled through normalization procedures. Common techniques such as Z-score standardization (subtracting mean and dividing by standard deviation) or min-max scaling are applied, allowing the model to treat all input features on a comparable scale. This step is critical when combining metrics with different units or ranges, reducing bias and speeding up model convergence.
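Both rescaling techniques named above are short enough to state directly. The sketch below uses only the standard library; the function names are illustrative, and the handling of constant features (returning zeros) is an assumed convention.

```python
from statistics import mean, pstdev


def z_score(values):
    """Z-score standardization: subtract the mean, divide by the std."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def min_max(values):
    """Min-max scaling of values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]
```

In practice the mean, standard deviation, minimum, and maximum are usually estimated on the training portion only and then reused for the test portion, so that no test-set statistics leak into training.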
-
Data Splitting
-
Following the completion of cleaning, transformation, and normalization, the processed dataset is divided into training and testing portions. Stratified division is generally employed so that the ratio between normal and addicted user categories is the same in both sets. Keeping the class ratio consistent supports fair validation and results in more trustworthy and consistent performance metrics.
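Stratified division can be sketched as splitting each class separately and then pooling the results. The function name, the 20% test fraction, and the fixed seed below are assumptions for illustration; the paper does not state the split ratio.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_indices, test_indices) with per-class proportions kept."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for index, label in enumerate(labels):
        by_label[label].append(index)

    train_idx, test_idx = [], []
    for indices in by_label.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return train_idx, test_idx
```

With 60 normal and 40 addicted users and a 20% test fraction, the test set receives 12 normal and 8 addicted users, preserving the 60:40 ratio.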
G. Real-Time Prediction Throughput
Testing revealed that the platform can manage and analyze behavioral data at a rate exceeding 500 records per second using an ordinary workstation. This throughput allows for smooth, real-time operation and ensures compatibility with live monitoring infrastructures.
The system's ability to rapidly process large volumes of input makes it well-suited for extensive digital health programs, offering dependable scalability for future growth and high-demand environments.
H. End-to-End Workflow of the Smartphone Addiction Risk Prediction System
Phase 1: Initial data handling
The process starts by cleaning large sets of raw device usage information, such as those contained in files like the teen phone addiction dataset CSV. The data is carefully transformed to highlight valuable characteristics, for example the relation between behavioral trends and sleep amounts, or to pinpoint different addiction categories. Non-numeric entries are systematically converted to numbers, and normalization aligns feature values for more consistent learning. The records are then split into two distinct groups for model building and validation and reshaped to match the dimensions required by LSTM algorithms.
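The reshaping step refers to the three-dimensional input that LSTM layers expect: samples, time steps, and features. One common way to obtain it from a per-user sequence is a sliding window, sketched below. Whether the authors use overlapping windows is not stated, so the windowing scheme and the function name are assumptions.

```python
def make_windows(series, timesteps):
    """Slice a sequence of feature vectors into overlapping windows.

    The result is a nested list shaped (samples, timesteps, features)."""
    n_samples = len(series) - timesteps + 1
    return [series[i:i + timesteps] for i in range(n_samples)]
```

For a series of four daily feature vectors and `timesteps=3`, this yields two windows of three days each.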
Phase 2: Building and refining the model
In this step, parameters of the deep learning model are optimized using the Grey Wolf Optimizer, which efficiently explores settings to maximize predictive success. The fitness function is tailored to guide the search within defined boundaries. The network design integrates multiple Bi-LSTM layers and supporting practices such as learning rate adjustments, batch normalization, and L2 regularization. Protective training measures, including early stopping and the saving of high-quality model checkpoints, help the final solution avoid learning noise and perform reliably on new data.
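The Grey Wolf Optimizer itself is compact: candidate solutions ("wolves") move toward the three best solutions found in the current iteration (alpha, beta, and delta), with an exploration coefficient that shrinks from 2 to 0. The sketch below follows the standard continuous formulation. In the paper's setting the fitness function would be something like the validation loss of a Bi-LSTM trained with a given hyperparameter vector; here the caller supplies any function to minimize. The function name, default population size, and best-so-far tracking are illustrative choices, not the authors' code.

```python
import random


def grey_wolf_optimize(fitness, bounds, n_wolves=20, n_iters=100, seed=0):
    """Minimize `fitness` over box `bounds` with the Grey Wolf Optimizer.

    bounds is a list of (low, high) pairs, one per dimension.
    Requires n_wolves >= 3 so that alpha, beta, and delta exist."""
    rng = random.Random(seed)
    wolves = [[rng.uniform(lo, hi) for lo, hi in bounds]
              for _ in range(n_wolves)]
    best = min(wolves, key=fitness)

    for it in range(n_iters):
        # The three fittest wolves lead the pack in this iteration.
        alpha, beta, delta = sorted(wolves, key=fitness)[:3]
        a = 2.0 - 2.0 * it / n_iters  # shrinks linearly from 2 toward 0

        moved = []
        for wolf in wolves:
            position = []
            for d, (lo, hi) in enumerate(bounds):
                pulls = []
                for leader in (alpha, beta, delta):
                    A = 2.0 * a * rng.random() - a
                    C = 2.0 * rng.random()
                    distance = abs(C * leader[d] - wolf[d])
                    pulls.append(leader[d] - A * distance)
                # Average the three pulls and clamp to the search bounds.
                position.append(min(max(sum(pulls) / 3.0, lo), hi))
            moved.append(position)
        wolves = moved
        best = min(wolves + [best], key=fitness)

    return best
```

For hyperparameter search, each dimension would encode one setting (for example number of LSTM units, dropout probability, and learning rate), with integer-valued settings rounded inside the fitness function.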
Phase 3: Evaluation and deployment (steps 13-15)
After the model is trained, accuracy, confusion matrix results, and related metrics are reviewed to validate its prediction quality. Any versions that meet reliability standards are saved for practical use. For deployment, the system operates on fresh user data, providing real-time assessments and classifying individuals as either 0 (low risk) or 1 (high risk) for smartphone addiction. This streamlined process ensures quick automated screening and allows for prompt support where needed.
-
-
DISCUSSION
The proposed multi-module system effectively predicts smartphone addiction risk in real-world scenarios. Using a bidirectional LSTM, it captures temporal patterns in user behavior, such as usage sequences and escalation cycles. A metaheuristic optimizer (Grey Wolf) fine-tunes hyperparameters and feature selection, yielding better performance than baseline models.
A distinguishing strength of the platform is its modular design, which supports seamless deployment and operational flexibility across a wide range of environments, including research labs, clinical settings, schools, and consumer health platforms. The architecture enables the independent replacement or enhancement of individual modules, such as upgrading the predictive engine or refining feature engineering routines, without disruption of the full pipeline. This modularity also facilitates continuous system improvement through iterative updates and integration of new data sources, features, and modeling algorithms as digital health technology evolves.
Limitations of the current system primarily stem from its dependency on the quality and integrity of behavioral data. Incomplete or noisy data can adversely impact prediction accuracy, despite advanced preprocessing and imputation strategies. Furthermore, the continuous monitoring and analysis of behavior raise important privacy concerns.
Sensitive personal information, even when anonymized, could be vulnerable to misuse or unauthorized access, which necessitates strict data governance and ethical considerations. To address these limitations, future work will focus on the incorporation of federated learning techniques, enabling collaborative model training across decentralized devices without transferring raw behavioral data to a central server. This approach preserves user privacy by keeping data local, while still allowing the system to learn from a heterogeneous population. Additionally, enhanced privacy-preserving mechanisms, such as differential privacy, homomorphic encryption, and secure multi-party computation, will be explored to safeguard user information and comply with regulatory requirements.
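To make the federated learning idea concrete, the sketch below shows the aggregation step of federated averaging: each device trains locally and sends only its model weights, which the server combines in proportion to the amount of local data. This illustrates the general technique rather than any part of the current system; the function name and the flat-list weight representation are assumptions.

```python
def federated_average(client_weights, client_sizes):
    """Weighted average of per-client weight vectors.

    client_weights: list of equal-length lists of floats, one per client.
    client_sizes: number of local training samples held by each client."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[d] * n for w, n in zip(client_weights, client_sizes)) / total
        for d in range(dim)
    ]
```

Only the weight vectors leave the devices; the raw behavioral logs stay local.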
Overall, the system lays a robust foundation for scalable, transparent, and adaptive smartphone addiction risk management. Its demonstrated accuracy, flexibility, and roadmap for privacy enhancement position it as a promising asset for digital well-being initiatives in both research and applied settings.
