DOI : https://doi.org/10.5281/zenodo.18901348
- Open Access

- Authors : Ayush Singh, Harsh Tiwari, Prof. Ajay Kr. Srivastava
- Paper ID : IJERTV15IS020843
- Volume & Issue : Volume 15, Issue 02, February 2026
- Published (First Online): 07-03-2026
- ISSN (Online) : 2278-0181
- Publisher Name : IJERT
- License:
This work is licensed under a Creative Commons Attribution 4.0 International License
Online Mental Health Support Platform: An Intelligent Web-Based System for Emotion-Aware Digital Mental Healthcare
Ayush Singh
Department of Information Technology Shri Ramswaroop Memorial College of Engineering and Management (SRMCEM) Lucknow, India
Harsh Tiwari
Department of Information Technology Shri Ramswaroop Memorial College of Engineering and Management (SRMCEM) Lucknow, India
Prof. Ajay Kr. Srivastava
Department of Information Technology Shri Ramswaroop Memorial College of Engineering and Management (SRMCEM) Lucknow, India
Abstract – Mental health disorders are becoming increasingly common in today's fast-moving and digitally connected world. While awareness has improved, many individuals still struggle to access timely psychological support due to social stigma, financial limitations, and geographical constraints. This paper presents the design and development of an intelligent Online Mental Health Support Platform that integrates artificial intelligence techniques for emotion detection using speech and facial analysis. The system aims to identify emotional states in real time and provide appropriate self-help resources or therapist recommendations based on user needs. By combining web technologies with machine learning models, the proposed solution offers a confidential, accessible, and scalable mental wellness framework. The platform demonstrates how emerging AI technologies can support early emotional intervention and promote proactive mental healthcare.
Keywords – mental health support, emotion detection, speech analysis, facial recognition, artificial intelligence, digital therapy, personalized recommendation
- INTRODUCTION
Mental health is a foundational component of overall well-being, influencing how individuals think, behave, and respond to life's challenges. In recent years, rising academic competition, workplace demands, social pressures, and digital overload have contributed to increased levels of stress, anxiety, and emotional fatigue.
This paper focuses on the design and implementation of an AI-enabled Online Mental Health Support Platform capable of analyzing emotional states in real time using speech and facial cues. The system leverages consumer-grade hardware, including a standard webcam and microphone, along with open-source machine learning frameworks to deliver accessible and cost-effective mental health assistance. The primary objective is to reduce barriers associated with traditional therapy models by providing an intelligent, confidential, and user-friendly digital alternative. By integrating emotion recognition with personalized recommendations and therapist connectivity, the proposed platform promotes inclusive and technology-driven mental healthcare suitable for students, working professionals, and individuals in underserved regions.
- LITERATURE REVIEW
Various research studies and commercial platforms have explored digital solutions for mental health support. Online therapy platforms provide remote counselling services and improve accessibility; however, they depend mainly on scheduled human interaction and lack automated emotional assessment features. In addition, subscription-based models may restrict access for many users.
At the same time, research in affective computing has shown that emotions can be identified through speech and facial expressions. Speech emotion recognition systems analyze features such as pitch, tone, and energy levels, while facial expression recognition systems use computer vision techniques to interpret facial landmarks and micro-expressions. Early implementations using traditional machine learning methods demonstrated feasibility but were often sensitive to background noise, lighting conditions, and user variability.
Recent advancements in deep learning have improved emotion classification accuracy and enabled real-time processing. However, most existing systems remain standalone research prototypes and are not integrated into complete mental health support platforms that combine emotion detection with personalized recommendations and therapist connectivity.
The proposed research addresses this gap by developing a unified web-based platform that integrates speech and facial emotion recognition with intelligent resource suggestions and therapist locator services, ensuring practical usability and accessibility.
- METHODOLOGY
The proposed system follows a modular workflow that processes user inputs, detects emotional states, and provides personalized support. Each stage is designed to ensure real-time performance, accuracy, and secure data handling.
- System Overview
The platform collects speech and facial inputs using a microphone and webcam. Audio signals and facial expressions are analyzed using machine learning models to detect emotional states. Based on the identified emotion, the system provides personalized recommendations or suggests professional therapist consultation. The design focuses on affordability, accessibility, and reliable performance using standard consumer devices.
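As a concrete illustration of the input stage, the following minimal sketch captures webcam frames and locates faces for downstream expression analysis using OpenCV's bundled Haar cascade. The paper does not publish its code, so this is an assumption-level example rather than the platform's actual implementation:

```python
# Illustrative webcam capture and face detection for the input stage.
# A sketch using OpenCV's bundled Haar cascade, not the authors' code.
import cv2

face_detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

cap = cv2.VideoCapture(0)                 # default consumer webcam
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = face_detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    for (x, y, w, h) in faces:
        # Each detected face crop would be passed to the expression classifier
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.imshow("Emotion monitor (press q to quit)", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()
```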
- Major Components
- User Authentication Module manages secure login and user profiles
- Speech Emotion Recognition Module analyzes voice input
- Facial Expression Recognition Module detects facial expressions
- Emotion Classification Module combines results for final prediction
- Recommendation Engine suggests relevant self-help resources
- Therapist Locator Module provides professional contact information
- Database Module stores user data securely
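The components above can be composed behind a thin orchestration layer. The following Python skeleton is purely illustrative; all class and method names are assumptions, not the platform's published code, and the authentication, therapist locator, and database modules are omitted for brevity:

```python
# Illustrative module skeleton; all names are assumptions, not the
# platform's published code.
from dataclasses import dataclass

@dataclass
class EmotionResult:
    label: str         # e.g. "happy", "sad", "angry", "neutral"
    confidence: float  # prediction confidence in [0, 1]

class SpeechEmotionRecognizer:
    def predict(self, audio_path: str) -> EmotionResult:
        ...  # analyze pitch, tone, and energy features of the voice input

class FacialExpressionRecognizer:
    def predict(self, frame) -> EmotionResult:
        ...  # interpret facial landmarks and expressions in a webcam frame

class EmotionClassifier:
    def combine(self, speech: EmotionResult, face: EmotionResult) -> EmotionResult:
        ...  # fuse both modalities into the final prediction

class RecommendationEngine:
    RESOURCES = {
        "sad": "guided journaling and mood-lifting exercises",
        "angry": "breathing and grounding techniques",
        "happy": "habit-reinforcement suggestions",
        "neutral": "general wellness content",
    }
    def suggest(self, emotion: str) -> str:
        # Map the detected emotion label to a self-help resource category
        return self.RESOURCES.get(emotion, "general wellness content")
```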
- Processing Pipeline
- Input Acquisition: Speech and facial data are captured.
- Pre-processing: Audio and image data are cleaned and normalized.
- Feature Extraction: Important voice and facial features are identified.
- Emotion Classification: Machine learning models predict emotional state.
- Recommendation Output: Personalized suggestions or therapist details are displayed.
- Data Storage: Emotional history is securely maintained for future reference.
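A minimal sketch of the audio side of this pipeline is shown below, assuming the Librosa library listed in the references. The feature choices (MFCCs, RMS energy, YIN pitch) mirror the pitch, tone, and energy features described above, but the exact function is an illustration, not the authors' implementation:

```python
# Illustrative pre-processing and feature extraction for one audio clip.
import numpy as np
import librosa

def extract_audio_features(path: str, sr: int = 16000) -> np.ndarray:
    y, sr = librosa.load(path, sr=sr, mono=True)   # input acquisition
    y, _ = librosa.effects.trim(y, top_db=25)      # pre-processing: trim silence
    y = y / (np.max(np.abs(y)) + 1e-8)             # normalize amplitude

    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)  # tone/timbre
    rms = librosa.feature.rms(y=y)                      # energy
    f0 = librosa.yin(y, fmin=80, fmax=400, sr=sr)       # pitch contour

    # Summarize each feature track into one fixed-length vector that a
    # downstream classifier can map to an emotion label
    return np.concatenate([
        mfcc.mean(axis=1), mfcc.std(axis=1),
        [rms.mean(), rms.std(), np.nanmean(f0), np.nanstd(f0)],
    ])
```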
- Calibration Procedure
A short initialization step may collect neutral speech and facial samples to improve personalization and prediction accuracy.
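A hedged sketch of how such neutral-baseline calibration might work: the neutral samples define a per-user mean and spread, and live features are z-normalized against them so that predictions measure deviation from that user's own neutral state. Class and method names are assumptions:

```python
# Illustrative baseline calibration; names are assumptions.
import numpy as np

class BaselineCalibrator:
    def __init__(self):
        self.mu = None
        self.sigma = None

    def fit(self, neutral_feature_vectors: np.ndarray) -> None:
        # neutral_feature_vectors: shape (n_samples, n_features),
        # collected during the short initialization step
        self.mu = neutral_feature_vectors.mean(axis=0)
        self.sigma = neutral_feature_vectors.std(axis=0) + 1e-8

    def transform(self, features: np.ndarray) -> np.ndarray:
        # Express live features as deviation from this user's neutral state
        return (features - self.mu) / self.sigma
```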
- System Flow Representation
Fig. 1. Overall system architecture of the proposed Online Mental Health Support Platform
- Interaction Zones and State-Based Controls
To enhance user experience and prevent unintended actions during emotional assessment sessions, the proposed platform incorporates structured interaction zones with state-based control mechanisms. These controls ensure that user inputs such as audio recording, facial scanning, and recommendation activation occur only when intentionally triggered.
A dedicated Session Control Zone is implemented within the interface to manage the activation and deactivation of emotion detection modules. When the user selects this zone and maintains interaction for a short confirmation duration, real-time emotion monitoring begins. Similarly, holding the control again pauses the detection process. This approach prevents accidental recordings and gives users full control over when analysis is active.
In addition, a Resource Interaction Zone is integrated to manage scrolling and content engagement within the recommendation panel. When users navigate through mental wellness resources or therapist listings, scrolling actions are enabled only within defined content areas. This ensures that emotional data processing does not interfere with browsing activities.
To reduce accidental triggers, the system applies short confirmation delays rather than instant activation. This threshold-based interaction design improves operational stability, minimizes unintended submissions, and enhances user comfort during prolonged usage.
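This confirmation-delay behavior can be captured by a small state machine: an action fires only after sustained interaction, and releasing early cancels it. The sketch below is illustrative; the hold duration and all names are assumptions:

```python
# Illustrative hold-to-confirm control for the Session Control Zone.
import time

class HoldToConfirm:
    def __init__(self, hold_seconds: float = 1.5):
        self.hold_seconds = hold_seconds
        self._pressed_at = None

    def update(self, pressed: bool) -> bool:
        """Call every UI frame; returns True once when the hold completes."""
        now = time.monotonic()
        if not pressed:
            self._pressed_at = None          # releasing early cancels
            return False
        if self._pressed_at is None:
            self._pressed_at = now           # hold just started
            return False
        if now - self._pressed_at >= self.hold_seconds:
            self._pressed_at = None          # fire once, then reset
            return True
        return False
```

A session controller would call update() with the current press state each frame and toggle emotion monitoring whenever it returns True; the same logic covers the pause gesture described above.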
Fig. 2. Screen-space interaction design for controlled emotional assessment within the proposed AI-based Mental Health Support Platform
- RESULTS AND DISCUSSION
The proposed AI-based Mental Health Support Platform was tested under standard indoor conditions using a webcam and microphone. The system successfully performed real-time emotion detection with stable response time and smooth interface behavior.
The facial emotion recognition module showed reliable performance under normal lighting, accurately identifying major emotional states such as sadness, happiness, anger, and neutral expressions. Performance slightly decreased in low-light conditions or when facial visibility was reduced.
Similarly, the speech emotion recognition module performed effectively in quiet environments by analyzing tone, pitch, and intensity patterns. Background noise moderately affected prediction confidence, highlighting the importance of clear audio input.
The integration of facial and speech analysis improved overall reliability. When one input source was less stable, the other supported the final prediction, resulting in balanced and consistent emotional assessment. The screen-space interaction design enhanced usability by allowing users to start, pause, and navigate sessions intentionally. This reduced accidental actions and improved user comfort during prolonged use.
Although environmental factors and subtle emotional variations remain challenges, the system demonstrates practical feasibility for real-time emotional monitoring within a digital mental health support framework.
- Figures and Tables
TABLE I
REPRESENTATIVE SYSTEM PERFORMANCE METRICS UNDER DIFFERENT LIGHTING AND SPEAKING CONDITIONS

| Lighting/Speaking Condition | Average Detection Accuracy (%) | Facial/Speech Expression Confidence | Average Response Delay (ms) |
| Low Lighting | 70-80 | Medium-High | 90-110 |
| Low Speaking | 65-80 | Medium-High | 95-120 |
| Strong Lighting (light behind user) | 90-100 | Medium | 90-130 |
| Natural Speaking | 85-95 | Low-Medium | 45-55 |
| Giving Feedback | 85-90 | Medium-High | 110-150 |

a. Metrics based on prototype observations using a standard webcam and audio processing
As shown in Table I, system performance decreases in dim lighting conditions owing to the difficulty in detecting facial expressions.
Fig. 3. Illustration of normalized feature-to-emotion mapping, showing how calibrated facial and speech features are mapped to a final emotional classification score
Fig. 3 illustrates the process of mapping extracted emotional features to a final emotional classification score. During system calibration, baseline facial and speech samples are recorded. These reference values are used to normalize real-time inputs, ensuring consistent emotion prediction across different users and environmental conditions.
Equation: The overall emotional confidence score is computed using weighted multimodal fusion:

$E_{final} = w_f \cdot P_{face} + w_s \cdot P_{speech}, \qquad w_f + w_s = 1$

where $P_{face}$ and $P_{speech}$ are the facial and speech model predictions and $w_f$, $w_s$ are reliability-based weights.
This equation represents the proportional contribution of facial and speech predictions to the final emotional classification. The weights are adjusted based on input reliability. For example, if audio quality is low, the system increases the contribution of facial analysis, and vice versa.
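A minimal sketch of this reliability-weighted fusion, consistent with the equation above; how the reliability scores are obtained (e.g., face-detection confidence, an audio signal-to-noise estimate) is an assumption of the sketch:

```python
# Illustrative reliability-weighted fusion matching the equation above.
import numpy as np

def fuse(face_probs: np.ndarray, speech_probs: np.ndarray,
         face_reliability: float, speech_reliability: float) -> np.ndarray:
    """face_probs/speech_probs: per-emotion probability vectors summing to 1.
    Reliability scores lie in [0, 1]."""
    total = face_reliability + speech_reliability + 1e-8
    w_f = face_reliability / total      # weights renormalized so w_f + w_s = 1
    w_s = speech_reliability / total
    return w_f * face_probs + w_s * speech_probs

# Example: noisy audio shifts weight toward the facial channel
face = np.array([0.7, 0.1, 0.1, 0.1])      # happy, sad, angry, neutral
speech = np.array([0.3, 0.4, 0.2, 0.1])
print(fuse(face, speech, face_reliability=0.9, speech_reliability=0.3))
```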
- CONCLUSION AND FUTURE WORK
CONCLUSION:
This research presented the design and implementation of an AI-based Online Mental Health Support Platform that integrates facial expression analysis and speech emotion recognition within a unified system. The platform demonstrates that real-time emotional assessment can be achieved using standard consumer hardware without requiring specialized equipment.
By combining multimodal emotion detection with structured interaction control, the system improves prediction reliability while maintaining user comfort and privacy. The experimental evaluation confirms that the platform performs effectively under normal indoor conditions and provides stable, responsive feedback.
Overall, the proposed system highlights the practical feasibility of integrating artificial intelligence into digital mental health support, offering accessible and technology- driven emotional monitoring.
FUTURE WORK:
Although the system demonstrates promising performance, several improvements can be explored in future research.
First, incorporating advanced deep learning architectures and larger, more diverse datasets may improve the recognition of subtle emotional variations.
Second, expanding the platform into a mobile-based application could enhance accessibility and allow continuous monitoring in everyday environments.
Third, integration with chatbot-based conversational support and adaptive recommendation systems may provide more personalized mental wellness guidance.
Finally, implementing stronger privacy-preserving mechanisms and encrypted cloud-based storage would further enhance user trust and data security. With these advancements, the platform can evolve into a more intelligent, scalable, and comprehensive digital mental health solution.
- REFERENCES
- World Health Organization (WHO). Mental Health: Strengthening Our Response. Retrieved from: https://www.who.int/
- American Psychological Association (APA). Mental Health Resources and Research. Retrieved from: https://www.apa.org/
- OpenCV Documentation. Open Source Computer Vision Library. Retrieved from: https://docs.opencv.org/
- TensorFlow Documentation. Machine Learning Framework. Retrieved from: https://www.tensorflow.org/
- PyTorch Documentation. Deep Learning Framework. Retrieved from: https://pytorch.org/
- MongoDB Documentation. NoSQL Database for Web Applications. Retrieved from: https://www.mongodb.com/docs/
- Librosa Library. Python Package for Speech and Audio Analysis. Retrieved from: https://librosa.org/
- BetterHelp. Online Counseling Platform. Retrieved from: https://www.betterhelp.com/
- Talkspace. Online Therapy Platform. Retrieved from: https://www.talkspace.com/
- S. Poria, E. Cambria, R. Bajpai, and A. Hussain. A Review of Affective Computing: From Speech and Facial Emotion Recognition to Depression Detection. IEEE Transactions on Affective Computing, 2018.
- K. Schuller and D. Schuller. Machine Learning for Speech Emotion Recognition: Advances and Future Directions. Springer, 2021.
- J. H. L. Hansen and M. Li. Emotion Recognition Using Speech and Language Information. IEEE Signal Processing Magazine, 2017.
