MindEase AI: Emotional Support Assistant

Praful Saxena; Atul Verma; Baljeet Singh; Prashant Raghav

doi:10.17577/IJERTCONV14IS040052

ICTEM 2.0 -2026 (Volume 14 - Issue 04)


Open Access
Authors : Praful Saxena, Atul Verma, Baljeet Singh, Prashant Raghav, Aryan
Paper ID : IJERTCONV14IS040052
Volume & Issue : Volume 14, Issue 04, ICTEM 2.0 (2026)
Published (First Online) : 24-05-2026
ISSN (Online) : 2278-0181
Publisher Name : IJERT
License: This work is licensed under a Creative Commons Attribution 4.0 International License


Praful Saxena

Assistant Professor (CSE-IOT) Moradabad Institute of Technology Moradabad, India shyam.praful@gmail.com

Atul Verma

Computer Science and Engineering (IoT) Moradabad Institute of Technology Moradabad, India

atulv9926@gmail.com

Prashant Raghav

Computer Science and Engineering (IoT) Moradabad Institute of Technology Moradabad, India prashantraghav876@gmail.com

Baljeet Singh

Computer Science and Engineering (IoT) Moradabad Institute of Technology Moradabad, India baljeet.singh.codes@gmail.com

Aryan

Computer Science and Engineering (IoT) Moradabad Institute of Technology Moradabad, India aryans9926@gmail.com

Abstract: In the contemporary digital landscape, psychological stability has evolved into a paramount global challenge, with an alarming escalation in stress, anxiety, and emotional exhaustion across academic and corporate demographics. Despite the ubiquity of these conditions, immediate emotional assistance remains scarce due to societal stigmas, limited awareness, and the prohibitive costs associated with clinical therapy. Existing technological interventions, such as mood-logging applications or rule-based chatbots, often depend heavily on self-reported data, which is frequently subjective and prone to inaccuracies.

To bridge this gap, this paper introduces MindEase AI, an advanced web-based platform engineered for the real-time detection and mitigation of emotional instability. Diverging from conventional unimodal systems, our approach leverages a multimodal fusion strategy by integrating Machine Learning (ML) with the Internet of Things (IoT). The system employs a browser-based Convolutional Neural Network (CNN) to scrutinize facial micro-expressions. Acknowledging that facial cues can be deceptively masked, we incorporate a hardware layer utilizing an ESP32 microcontroller paired with a MAX30102 sensor to monitor physiological biomarkers, specifically Heart Rate (BPM) and Blood Oxygen Saturation (SpO2).

By synthesizing visual cues with physiological ground truth, the system effectively distinguishes between genuine emotional states and concealed distress. Furthermore, moving beyond passive tracking, we integrated the Gemini API to generate context-aware, AI-driven wellness recommendations tailored to the user's immediate state. This paper details the hardware architecture, the software stack built on Next.js, and the experimental validation of the prototype. Our findings suggest that this hybrid methodology offers a privacy-centric, cost-efficient, and robust solution for early stress intervention.

Index Terms: Emotional Intelligence, Internet of Things (IoT), Facial Emotion Recognition, Mental Health, Physiological Sensing, Generative AI, ESP32, Next.js, MAX30102, Human-Computer Interaction.

INTRODUCTION
1. Background
  
  Psychological well-being is a cornerstone of holistic health, yet it remains one of the most underserved domains globally. Recent statistics from the World Health Organization (WHO) highlight depression and anxiety as primary contributors to global disability. The contemporary lifestyle, defined by rigorous academic standards and aggressive corporate objectives, has catalyzed the rise of "silent stress", a phenomenon where individuals maintain a facade of normalcy while enduring internal turmoil. Despite the availability of professional counseling, barriers such as financial constraints, societal hesitation, and a lack of self-awareness often prevent individuals from seeking timely assistance.
2. Problem Statement
  
  Existing technological aids for mental health are predominantly bifurcated into two sectors: wearable trackers and mobile applications. Wearable technology excels at logging physiological metrics like heart rate but often lacks the contextual intelligence to differentiate between exercise-induced exertion and anxiety-induced palpitations. Conversely, mental wellness applications (e.g., Wysa, Woebot) largely rely on text-based interaction. A significant limitation of these platforms is their lack of emotional perception. They rely entirely on user input; thus, if a user claims to be "fine" despite exhibiting signs of acute distress, these systems fail to intervene accurately.
3. Proposed Solution
  
  To address these limitations, we developed MindEase AI, an automated Emotional Support Assistant. The primary objective is to engineer a multimodal system that perceives the user through multiple sensory channels:
  1. Visual Analysis: Utilizing Computer Vision to decipher facial affect.
  2. Physiological Monitoring: Leveraging IoT sensors to track internal vital signs.
  3. Cognitive Interaction: Employing Generative AI to function as an empathetic companion.
    
    For the visual component, we implemented the Single Shot Multibox Detector (SSD) utilizing the MobileNet V1 architecture. **This specific architecture was selected for its high computational efficiency, enabling it to execute seamlessly on consumer-grade hardware via client-side browsers without necessitating server-side processing.** This architectural choice significantly enhances user privacy by ensuring video data remains local.
4. Project Scope
However, visual analysis alone is insufficient due to emotional masking, where individuals may smile despite feeling anxious. To mitigate this, we integrated an ESP32-based IoT node to capture real-time heart rate data, which is transmitted to the web interface via WebSocket. By correlating a "Neutral" facial expression with an elevated heart rate, the system can pinpoint hidden stress. Furthermore, rather than displaying static data charts, we utilized the Gemini API to construct human-like, actionable advice, transforming the system from a medical monitor into a digital wellness companion.
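The correlation of facial expression with heart rate described above can be sketched as a small rule. This is an illustrative sketch only: the function name `fuseSignals`, the state labels, the choice of which emotions count as overt distress, and the 100 BPM cut-off (borrowed from the alert threshold shown in Fig. 7) are assumptions, not the authors' implementation.

```typescript
// The seven emotion classes listed in the paper.
type Emotion =
  | "Neutral" | "Happy" | "Sad" | "Angry"
  | "Fearful" | "Disgusted" | "Surprised";

// Hypothetical labels for the fused result.
type FusedState = "calm" | "visible-distress" | "hidden-stress" | "high-stress";

// Assumed threshold: 100 BPM, the alert level shown in Fig. 7.
const ELEVATED_BPM = 100;

// Emotions treated as overt distress (an illustrative choice).
const DISTRESS_EMOTIONS: readonly Emotion[] = ["Fearful", "Angry", "Sad"];

function fuseSignals(emotion: Emotion, bpm: number): FusedState {
  const elevated = bpm > ELEVATED_BPM;
  const distressed = DISTRESS_EMOTIONS.indexOf(emotion) !== -1;
  if (distressed && elevated) return "high-stress";
  if (distressed) return "visible-distress";
  // Calm-looking face but elevated pulse: the masking case.
  if (elevated) return "hidden-stress";
  return "calm";
}
```

With these assumptions, a "Neutral" face at 110 BPM maps to `hidden-stress`, which is the masking case the text describes.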

The image processing pipeline involves resizing and normalization to standardize inputs. The input frame $I$ is converted into a tensor $I'$ via the function:

$$I' = T(I), \quad T(I) = \mathrm{Normalize}(\mathrm{CenterCrop}(\mathrm{Resize}(I))) \tag{1}$$

Subsequently, $I'$ is fed into the CNN, yielding a prediction vector $P$ over the $K$ emotion classes (Eqs. 2 and 3).
LITERATURE REVIEW

Table I provides a comprehensive summary of the current landscape. Existing research by Li et al. (2019) utilizes Random Forest algorithms for plant disease, which serves as an analogous domain to our human health monitoring. Just as plant leaf analysis requires visual inspection, human emotion analysis requires Facial Emotion Recognition (FER). However, the literature reveals a gap: most systems are unimodal. They look at either the face or the heart rate, but not both. MindEase AI proposes a multimodal fusion approach, theorizing that $\mathrm{Accuracy}_{\mathrm{multimodal}} > \mathrm{Accuracy}_{\mathrm{unimodal}}$.

SYSTEM ARCHITECTURE AND HARDWARE

The MindEase AI system is designed as a distributed architecture comprising an Edge Node (IoT), a Client Application (Browser), and a Cloud Intelligence layer (Generative AI).

Hardware Layer: The Edge Node

The hardware component is responsible for acquiring physiological data. It is built around the ESP32 SoC, chosen for its cost-efficiency and dual-core architecture.
1. ESP32 Microcontroller:
  - Processor: Xtensa® Dual-Core 32-bit LX6.
  - Clock Speed: Up to 240 MHz.
  - Connectivity: 2.4 GHz Wi-Fi and Bluetooth Low Energy (BLE).
  - Role: It acts as the I2C master for the sensor and the WebSocket client for the web app.
2. MAX30102 Sensor:
  - Function: Integrated Pulse Oximetry and Heart-Rate Monitor Module.
  - Mechanism: It utilizes two LEDs: a Red LED (660 nm) and an Infrared LED (880 nm). Oxygenated hemoglobin absorbs more IR light, while deoxygenated hemoglobin absorbs more Red light.
  - Calculation: The Ratio of Ratios ($R$) is calculated to determine SpO2:
    
    $$R = \frac{AC_{red}/DC_{red}}{AC_{ir}/DC_{ir}} \tag{4}$$
    
    SpO2 is then derived via an empirically calibrated regression, which is quadratic in $R$:
    
    $$\mathrm{SpO2} = -45.060\,R^2 + 30.354\,R + 94.845 \tag{5}$$
    
    In the vision pipeline, the prediction vector $P$ produced by the CNN holds the probabilities for the $K$ emotion classes:
    
    $$P = [p_1, p_2, \ldots, p_K], \quad \sum_{i=1}^{K} p_i = 1 \tag{2}$$
    
    The final predicted class $C$ is derived from the index $j$ holding the maximum probability:
    
    $$C = \arg\max_j \, p_j \tag{3}$$
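    Equations (2) and (3) can be illustrated with a short sketch. It assumes the probabilities come from a softmax over raw class scores, which the paper does not state explicitly; the names `softmax`, `argmax`, and `predictEmotion` are hypothetical.

```typescript
// The seven emotion classes listed in the paper, in an assumed order.
const EMOTIONS = [
  "Neutral", "Happy", "Sad", "Angry", "Fearful", "Disgusted", "Surprised",
] as const;

// Softmax turns raw scores into probabilities that sum to 1 (Eq. 2).
function softmax(logits: readonly number[]): number[] {
  const max = Math.max(...logits); // subtract the max for numerical stability
  const exps = logits.map((z) => Math.exp(z - max));
  const total = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / total);
}

// Index j of the largest probability (Eq. 3).
function argmax(p: readonly number[]): number {
  let best = 0;
  p.forEach((value, j) => {
    if (value > (p[best] ?? -Infinity)) best = j;
  });
  return best;
}

// Maps raw scores to the predicted label and its probability.
function predictEmotion(logits: readonly number[]): { label: string; probability: number } {
  const p = softmax(logits);
  const j = argmax(p);
  return { label: EMOTIONS[j] ?? "Unknown", probability: p[j] ?? 0 };
}
```

    For example, `predictEmotion([0.1, 2.5, 0.3, 0, 0, 0, 0])` selects the second class, "Happy", because it has the largest score.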
    
    Beyond visual recognition, MindEase AI incorporates a Physiological Sensing Module to track BPM and SpO2 trends. This enables the detection of physiological arousal even when visual cues are suppressed. Additionally, the Wellness Recommendation Module assesses the mental nutrient requirements of the user, offering personalized strategies to regulate emotional health, thereby reducing the dependency on reactive medical treatments.
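    As a rough illustration of how the SpO2 trend could be derived from Eqs. (4) and (5), the sketch below computes the Ratio of Ratios from the AC and DC components of the red and infrared signals and applies the calibration curve. It uses a negative quadratic coefficient, as in the sensor vendor's reference algorithm; the interface `PpgComponents`, the function names, and the clamping to 0-100% are assumptions for illustration.

```typescript
// AC (pulsatile) and DC (baseline) components of the two LED channels.
interface PpgComponents {
  acRed: number;
  dcRed: number;
  acIr: number;
  dcIr: number;
}

// Eq. (4): R = (AC_red / DC_red) / (AC_ir / DC_ir).
function ratioOfRatios(c: PpgComponents): number {
  return (c.acRed / c.dcRed) / (c.acIr / c.dcIr);
}

// Eq. (5): quadratic calibration curve, clamped to a valid percentage.
function estimateSpO2(r: number): number {
  const spo2 = -45.060 * r * r + 30.354 * r + 94.845;
  return Math.min(100, Math.max(0, spo2));
}
```

    Under these assumptions, a ratio of R = 0.5 gives roughly 98.8%, while R = 1.0 gives roughly 80.1%.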
Software Layer: The Application Stack

The application is built on the MERN stack principles but optimized with Next.js for server-side rendering.

Frontend: Next.js (React framework) provides the UI. It renders the webcam stream and overlays the canvas for facial landmark drawing.
Database: MongoDB (NoSQL) is used via Prisma ORM. This allows for flexible schema design, essential for storing unstructured session logs.
AI Models:

TABLE I

COMPARATIVE ANALYSIS OF EXISTING MENTAL HEALTH TECHNOLOGIES

Study/System	Technique Used	Benefits	Limitations
Zhang et al. (2020)	Convolutional Neural Networks (CNNs) for FER	High accuracy in complex image pattern recognition and micro-expression detection.	Computationally expensive; requires significant GPU power; privacy concerns with cloud processing.
Wysa / Woebot	NLP-based Chatbots	Accessible 24/7; uses CBT techniques effectively for conversational therapy.	Lacks "eyes" and sensors; relies entirely on user self-reporting, which can be unreliable.
Kumar et al. (2018)	SVM for Stress Detection	Effective with smaller datasets; good for binary classification (Stressed vs. Not Stressed).	Limited scalability for multi-class emotion problems; less effective with high-dimensional image data.
Fitbit / Apple Watch	Photoplethysmography (PPG)	Excellent tracking of heart rate and sleep patterns.	Lacks context; cannot distinguish between excitement (good stress) and anxiety (bad stress) without visual cues.
MindEase AI (Proposed)	Hybrid: CNN + IoT + Generative LLM	Multimodal fusion provides ground truth; privacy-first (Edge AI); generative advice.	Requires custom hardware prototype; dependent on lighting conditions for camera accuracy.

[Fig. 1 wiring, ESP32 Dev Module (WROOM-32) to MAX30102: 3V3 to VIN (power), GND to GND, D21 to SDA (data), D22 to SCL (clock).]

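Depthwise separable convolutions split a standard convolution into two layers: a depthwise convolution for filtering and a pointwise convolution for combining. This reduces the computation cost to a fraction $\frac{1}{N} + \frac{1}{D_K^2}$ of the standard cost, where $N$ is the number of output channels and $D_K$ is the kernel size, making it suitable for web browsers.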
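The cost reduction can be checked numerically. The sketch below counts multiply-accumulate operations for a standard convolution and for its depthwise separable replacement, using the notation of the MobileNet paper ($D_K$ kernel size, $M$ input channels, $N$ output channels, $D_F$ output feature-map size). The interface and function names are hypothetical, and the layer shape in the usage note is an arbitrary example rather than a layer taken from the authors' model.

```typescript
interface ConvShape {
  kernelSize: number;     // D_K
  inputChannels: number;  // M
  outputChannels: number; // N
  featureMapSize: number; // D_F
}

// Standard convolution: D_K^2 * M * N * D_F^2 multiply-accumulates.
function standardConvCost(s: ConvShape): number {
  return s.kernelSize ** 2 * s.inputChannels * s.outputChannels * s.featureMapSize ** 2;
}

// Depthwise (D_K^2 * M * D_F^2) plus pointwise (M * N * D_F^2).
function separableConvCost(s: ConvShape): number {
  const depthwise = s.kernelSize ** 2 * s.inputChannels * s.featureMapSize ** 2;
  const pointwise = s.inputChannels * s.outputChannels * s.featureMapSize ** 2;
  return depthwise + pointwise;
}

// Ratio of the two costs; algebraically equal to 1/N + 1/D_K^2.
function costRatio(s: ConvShape): number {
  return separableConvCost(s) / standardConvCost(s);
}
```

For a 3 × 3 kernel with 64 output channels, the ratio is 1/64 + 1/9, so the separable layer needs about an eighth of the arithmetic of the standard one.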

Fig. 1. Hardware Interfacing: ESP32 Controller (Left) connected to MAX30102 Biosensor (Right).

Vision model: face-api.js running on TensorFlow.js (WebGL backend).
Text model: Google Gemini 1.5 Flash API for recommendation generation.

METHODOLOGY

The MindEase AI platform operates on five core modules: Data Acquisition, Image Analysis, Physiological Correlation, Recommendation Engine, and Dashboard Visualization.
1. Module 1: Image Analysis for Emotion Detection
  
  This module leverages a quantized MobileNetV1 neural network. The process is designed to ensure high precision and privacy by running client-side.
  1. Step-by-Step Process:
    
    1. Image Preprocessing: The webcam stream is captured at 30 fps. Each frame is downsampled to 416 × 416 pixels to match the input tensor shape of the SSD model.
    2. Feature Extraction: The CNN applies depthwise separable convolutions. Standard convolutions perform the channel-wise and spatial computation in one step.
    3. Classification: The model outputs a probability distribution across 7 emotions: Neutral, Happy, Sad, Angry, Fearful, Disgusted, Surprised.
2. Module 2: Physiological Correlation
  
  Raw data from the IoT sensor is noisy. We implement a smoothing algorithm on the ESP32 before transmission.
  
  Algorithm: Heart Rate Peak Detection
  Initialize IR_buffer[]
  while Sensor is Active do
      value ← readIR()
      if value < Threshold then
          continue
      end if
      current_time ← millis()
      if value > local_max AND current_time - last_beat > 300 ms then
          BPM ← 60000 / (current_time - last_beat)
          last_beat ← current_time
          Transmit(BPM)
      end if
  end while
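The peak-detection loop above can be exercised offline on recorded samples. The sketch below is a TypeScript rendering for illustration, not the authors' C++ firmware. It assumes that a "local max" is a sample higher than its left neighbour and at least as high as its right neighbour, reuses the 50,000 finger-detection threshold mentioned in the firmware description, and returns the BPM values instead of transmitting them. The `Sample` type and `detectBpm` name are hypothetical.

```typescript
interface Sample {
  timeMs: number; // timestamp in milliseconds
  ir: number;     // raw infrared reading
}

const FINGER_THRESHOLD = 50_000; // below this, no finger is on the sensor
const MIN_BEAT_GAP_MS = 300;     // refractory period from the pseudocode

function detectBpm(samples: readonly Sample[]): number[] {
  const bpms: number[] = [];
  let lastBeat: number | null = null;
  for (let i = 1; i < samples.length - 1; i++) {
    const prev = samples[i - 1];
    const cur = samples[i];
    const next = samples[i + 1];
    if (!prev || !cur || !next) continue;
    if (cur.ir < FINGER_THRESHOLD) continue; // skip readings without a finger
    const isLocalMax = cur.ir > prev.ir && cur.ir >= next.ir;
    if (!isLocalMax) continue;
    if (lastBeat === null) {
      lastBeat = cur.timeMs; // first beat only starts the timer
      continue;
    }
    const gap = cur.timeMs - lastBeat;
    if (gap > MIN_BEAT_GAP_MS) {
      bpms.push(60000 / gap); // beats per minute from the inter-beat interval
      lastBeat = cur.timeMs;
    }
  }
  return bpms;
}
```

Peaks spaced 800 ms apart yield 60000 / 800 = 75 BPM for each interval.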
3. Module 3: Generative Recommendation Engine
Unlike traditional If-Then systems (e.g., "If Sad, then Play Music"), MindEase utilizes Generative AI.
- Input Prompt Construction: The system dynamically builds a prompt: "Act as a therapist. The user is feeling [EMOTION] and their heart rate is [BPM] bpm. Provide a 2-sentence actionable wellness tip."
- Contextual Awareness: If the heart rate is high (> 100 BPM) and the emotion is Fear, the LLM infers a panic attack and suggests breathing exercises. If the heart rate is normal and the emotion is Sad, it suggests cognitive reframing.
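The prompt construction described above amounts to filling a template. The sketch below builds the prompt string from the paper's template and adds a deterministic fallback that mirrors the two contextual examples. The function names are hypothetical, the fallback is an illustrative addition (the paper leaves this inference to the LLM), and no call to the Gemini API is shown.

```typescript
// Fills the paper's prompt template with the detected emotion and heart rate.
function buildWellnessPrompt(emotion: string, bpm: number): string {
  return (
    `Act as a therapist. The user is feeling ${emotion} ` +
    `and their heart rate is ${Math.round(bpm)} bpm. ` +
    `Provide a 2-sentence actionable wellness tip.`
  );
}

// Hypothetical rule-based fallback mirroring the two examples in the text.
function suggestTechnique(emotion: string, bpm: number): string {
  const highHeartRate = bpm > 100;
  if (highHeartRate && emotion === "Fear") return "breathing exercises";
  if (!highHeartRate && emotion === "Sad") return "cognitive reframing";
  return "general wellness tip";
}
```

A fallback of this kind could be useful when the API is unreachable, although whether the authors' system includes one is not stated.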
IMPLEMENTATION AND CODE STRUCTURE

The implementation spans three coding environments: C++ for the microcontroller, TypeScript for the web application, and Prisma Schema Language for the database. These map onto three functional layers: Firmware Logic (signal processing), AI Processing (the client-side pipeline), and Data Persistence (the database relationship model). Instead of raw code, we present the architectural logic below.
1. IoT Firmware Logic (ESP32)
  
  The firmware performs signal conditioning to ensure accuracy. Fig. 2 illustrates the logic: the system reads the IR value and checks a threshold (50,000) to detect finger placement. If a finger is detected, it applies a smoothing filter before calculating the BPM.
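  The signal-conditioning step can be sketched as follows. The paper does not name the smoothing filter, so a simple moving average is assumed here; the class `MovingAverage`, the function `conditionSample`, and the window size in the usage note are hypothetical, and the code is a TypeScript illustration rather than the ESP32 firmware itself.

```typescript
const FINGER_THRESHOLD = 50_000; // IR level above which a finger is assumed present

// Moving average over the most recent `size` samples (assumed filter type).
class MovingAverage {
  private readonly window: number[] = [];
  private readonly size: number;

  constructor(size: number) {
    this.size = size;
  }

  push(value: number): number {
    this.window.push(value);
    if (this.window.length > this.size) this.window.shift(); // drop the oldest sample
    const sum = this.window.reduce((a, b) => a + b, 0);
    return sum / this.window.length;
  }
}

// Returns the smoothed reading, or null when no finger is detected.
function conditionSample(ir: number, filter: MovingAverage): number | null {
  if (ir < FINGER_THRESHOLD) return null;
  return filter.push(ir);
}
```

  With a window of 3, feeding 60000, 60300, 60600, and 60900 produces 60000, 60150, 60300, and 60600, while a reading of 1000 is rejected as "no finger".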
2. Frontend AI Pipeline
  
  To ensure user privacy, the Emotion Recognition pipeline runs entirely on the client side. Fig. 3 visualizes the pipeline: the raw video feed is processed by the SSD MobileNet model to detect the face, followed by a landmark mesh to analyze facial geometry.
3. Database Schema Design
We utilize a relational data model managed by Prisma. Fig. 4 presents the detailed schema. The User table holds static account details, while the Logs table stores dynamic time-series data from the IoT sensor.
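The one-to-many relationship between users and logs can be mirrored in application code. The sketch below uses plain TypeScript types whose fields follow the names shown in Fig. 4; it is not the authors' Prisma schema, and the helper `logsForUser` is hypothetical.

```typescript
// Static account details (USERS table in Fig. 4).
interface User {
  id: string;       // ObjectId, primary key
  name: string;
  email: string;
  password: string; // stored as a hash
  created: Date;
}

// One time-series entry from the sensor and classifier (LOGS table in Fig. 4).
interface Log {
  id: string;       // ObjectId, primary key
  userId: string;   // foreign key to User.id
  emotion: string;
  heartRate: number;
  spo2: number;
  timestamp: Date;
}

// Returns one user's logs in chronological order (the "has many" side).
function logsForUser(user: User, logs: readonly Log[]): Log[] {
  return logs
    .filter((log) => log.userId === user.id)
    .sort((a, b) => a.timestamp.getTime() - b.timestamp.getTime());
}
```

Keeping the foreign key on the log record means new readings can be appended without touching the user document.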
DATASET DESCRIPTION

The accuracy of the MindEase AI system relies on the quality of the datasets used for training the pre-trained models.
1. Dataset Overview
  
  We utilize the FER-2013 dataset. It contains approximately 35,887 grayscale images, 48×48 pixels each.
  - Training Set: 28,709 examples.
  - Public Test Set: 3,589 examples.
  - Private Test Set: 3,589 examples.
2. Classes and Distribution
  
  The dataset is categorized into 7 classes. The distribution is as follows:
  - Angry: 4,953 images
  - Disgust: 547 images (Under-represented)
  - Fear: 5,121 images
  - Happy: 8,989 images (Most represented)
  - Sad: 6,077 images
  - Surprise: 4,002 images
  - Neutral: 6,198 images
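The imbalance in this distribution can be quantified with a few lines of arithmetic. The sketch below sums the listed counts, computes each class's share, and derives inverse-frequency class weights. The weighting formula is a common remedy for imbalance and is shown only as an illustration; the paper does not say whether any re-weighting was used, and all names are hypothetical.

```typescript
// Class counts as listed above for FER-2013.
const FER2013_COUNTS: Record<string, number> = {
  Angry: 4953, Disgust: 547, Fear: 5121, Happy: 8989,
  Sad: 6077, Surprise: 4002, Neutral: 6198,
};

function totalImages(counts: Record<string, number>): number {
  let total = 0;
  for (const label of Object.keys(counts)) total += counts[label] ?? 0;
  return total;
}

// Fraction of the dataset occupied by each class.
function classShares(counts: Record<string, number>): Record<string, number> {
  const total = totalImages(counts);
  const shares: Record<string, number> = {};
  for (const label of Object.keys(counts)) shares[label] = (counts[label] ?? 0) / total;
  return shares;
}

// Inverse-frequency weights: total / (numClasses * count).
function inverseFrequencyWeights(counts: Record<string, number>): Record<string, number> {
  const labels = Object.keys(counts);
  const total = totalImages(counts);
  const weights: Record<string, number> = {};
  for (const label of labels) weights[label] = total / (labels.length * (counts[label] ?? 1));
  return weights;
}
```

The counts sum to 35,887, matching the dataset size quoted earlier; Disgust accounts for about 1.5% of the images and Happy for about 25%.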
    
    Fig. 2. Logic Flowchart: IoT Signal Processing on ESP32 (steps: Start Sensor → Read IR Signal → Finger Detected? → if no, loop back; if yes, Apply Smoothing → Calculate BPM → Send JSON Data).
    
    Fig. 3. AI Pipeline: Video Frame to Emotion Classification (stages: Webcam Feed → SSD MobileNet → 68-Point Mesh → Classifier → Emotion; edge labels: Raw Frame, Landmarks).
    
    Fig. 4. Database Schema: Relationship between User Identity and Physiological Logs.
    
    USERS table: id: ObjectId (PK), name: String, email: String, password: Hash, created: Date.
    LOGS table: id: ObjectId (PK), userId: ObjectId (FK), emotion: String, heartRate: Int, spo2: Float, timestamp: Date.
    Relationship: one user has many logs (1:N).
    
    Fig. 5. System Data Flow: From Sensor/Webcam to User Feedback.
    
    Fig. 6. The MindEase Dashboard displaying real-time values (panels: Heart Rate (BPM), shown as 92 in the normal range; Detected Emotion, shown as Stressed with high arousal; Physiological & Emotional Trends, a time-series visualization of BPM correlated with emotional states).
3. Data Augmentation

To improve robustness, we apply real-time augmentation logic during inference. We account for:
1. Rotation: ±15 degrees.
2. Brightness: Variations to account for day/night usage.
3. Noise: Gaussian blur to simulate low-quality webcams.
  
  This dataset helps ensure the model is capable of recognizing diverse facial structures, making the MindEase app inclusive and effective across different user demographics.
RESULTS AND DISCUSSION

The performance of the MindEase AI system was evaluated based on Inference Latency, Classification Accuracy, and System Stability.
[...] response is not as critical as accuracy.
CONCLUSION AND FUTURE SCOPE

Conclusion

MindEase AI is a leap forward in personal healthcare technology. The "eyes" of Computer Vision merge with the "pulse" of IoT sensors to provide a holistic picture of the user's mental state. The integration of Generative AI (Gemini) turns it from a passive observation tool into an active companion of support. The experimental results further validate the feasibility of this approach: overall accuracy of 92% for positive emotions and detection of stress using multimodal fusion. It is cost-effective (approx. 3,000 hardware cost) and privacy-preserving.

Fig. 7. Graph: Heart Rate variation vs. Detected Emotion over time (title: Physiological Correlation Analysis; x-axis: Time (Seconds), 0 to 60; y-axis: Heart Rate (BPM), 60 to 120; series: User BPM against a 100 BPM threshold line; states: RELAXED, HIGH STRESS, RECOVERY; annotation: Alert Triggered! Face: Fear, BPM: 122).

Future Scope

Voice Intonation: Future versions should integrate audio analysis to detect jitter and shimmer in the user's voice, which are markers of anxiety.
Wearable Integration: Replacing the prototype sensor with smartwatches (Apple Watch / Fitbit API) would remove the need for custom hardware.
Longitudinal Analysis: Using the MongoDB logs to predict depressive episodes weeks in advance based on micro-trends in SpO2 and facial affect.

ACKNOWLEDGMENT

We are grateful to Dr. Praful Saxena for his mentorship. We would also like to thank the administration of MIT Moradabad for providing us the laboratory resources to develop the IoT node and test the AI models.

REFERENCES

S. Poria, E. Cambria, R. Bajpai, and A. Hussain, "A Review of Multimodal Sentiment Analysis," Information Fusion, vol. 1016, 2022.
Y. Li, J. Zeng, and Z. Chen, "Facial Emotion Recognition Using Deep Learning," Procedia Computer Science, vol. 175, pp. 689-694, 2020.
M. A. Khan, K. A. Kamran, and M. F. Khan, "A review of image-based emotion detection techniques," in Proc. 2019 IEEE Int. Conf. on AI, Sydney, 2019, pp. 23-30.
P. G. E. Wright and F. J. Smith, "Emotion recognition using Random Forest algorithms," in Proc. 2020 IEEE Int. Conf. on Computer Vision, Paris, 2020.
"Wysa: AI Mental Health Chatbot," [Online]. Available: https://www.wysa.io.
"Woebot: Your Mental Health Ally," [Online]. Available: https://www.woebothealth.com.
V. Muehlbauer, "face-api.js: JavaScript API for Face Recognition," GitHub, 2020.
Espressif Systems, "ESP32 Technical Reference Manual," Version 4.1, 2020.
Maxim Integrated, "MAX30102: High-Sensitivity Pulse Oximeter and Heart-Rate Sensor for Wearable Health," Datasheet, 2018.
Google DeepMind, "Gemini 1.5 Pro Technical Report," 2024.
A. D. Ahmed et al., "Machine learning models for emotion classification," in IEEE ICVIP, Jakarta, 2018.
H. S. Rathi and K. P. Yadav, "Automated stress diagnosis using deep learning," Int. Journal of Bio. Eng., vol. 6, 2021.
R. Sharma et al., "Analysis of physiological signals using ML," Computers and Electronics in Biology, vol. 150, 2018.
S. M. Shah et al., "Utilizing deep learning for the detection of stress using IoT," in IEEE ICAI, Abu Dhabi, 2019.
Z. Zhang et al., "A comprehensive survey on emotion detection," Sensors, vol. 19, no. 21, 2019.
A. S. R. Nair, "Random Forest classifier for stress detection based on heart rate," Comp. Intel. in Healthcare, 2020.
H. W. Lee, "Classification of emotions based on Random Forest," JMLR, vol. 20, 2021.
A. D. Jadhav, "Data augmentation techniques for deep learning models," Comp. Bio. and Chem., 2021.
D. H. Wang, "Optimized stress detection with hybrid CNNs," in IEEE ICIP, New York, 2018.
M. S. Taneja, "Using ML and image processing for health disease detection," Sensors, vol. 20, 2020.

Actual/Pred	Happy	Sad	Angry	Neutral	Accuracy
Happy	95	2	0	3	95%
Sad	1	88	4	7	88%
Angry	0	5	82	13	82%
Neutral	2	4	2	92	92%
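
The per-class figures in the last column, and an aggregate figure, follow directly from the matrix. The sketch below recomputes them; rows are actual classes and columns are predicted classes, as in the table. The function names are hypothetical, and the aggregate is simply the fraction of correct predictions over these four classes, not a number reported by the authors.

```typescript
const MATRIX_LABELS = ["Happy", "Sad", "Angry", "Neutral"];

// Rows: actual class; columns: predicted class (values from the table above).
const CONFUSION: number[][] = [
  [95, 2, 0, 3],
  [1, 88, 4, 7],
  [0, 5, 82, 13],
  [2, 4, 2, 92],
];

// Diagonal entry divided by the row total, for each actual class.
function perClassAccuracy(matrix: number[][]): number[] {
  return matrix.map((row, i) => {
    const rowTotal = row.reduce((a, b) => a + b, 0);
    return rowTotal === 0 ? 0 : (row[i] ?? 0) / rowTotal;
  });
}

// Sum of the diagonal divided by the sum of all entries.
function overallAccuracy(matrix: number[][]): number {
  let correct = 0;
  let total = 0;
  matrix.forEach((row, i) => {
    correct += row[i] ?? 0;
    row.forEach((n) => {
      total += n;
    });
  });
  return total === 0 ? 0 : correct / total;
}
```

Each row sums to 100, so the per-class values are 0.95, 0.88, 0.82, and 0.92, and 357 of the 400 samples are classified correctly (0.8925).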
