Intelligent Virtual Interview Assistant with Instant Performance Analysis

Prof. M. D. Ingle; Pournima Mali; Jagruti More; Anand Magar; Kunal Patil

doi:10.17577/IJERTV15IS061076

Volume 15, Issue 06 (June 2026)


Open Access
Authors : Prof. M. D. Ingle, Pournima Mali, Jagruti More, Anand Magar, Kunal Patil
Paper ID : IJERTV15IS061076
Volume & Issue : Volume 15, Issue 06 , June – 2026
Published (First Online): 25-06-2026
ISSN (Online) : 2278-0181
Publisher Name : IJERT
License: This work is licensed under a Creative Commons Attribution 4.0 International License


Prof. M. D. Ingle

Department of Computer Engineering, JSPMS Jaywantrao Sawant College of Engineering, Pune, India

Pournima Mali

Department of Computer Engineering, JSPMS Jaywantrao Sawant College of Engineering, Pune, India

Jagruti More

Department of Computer Engineering, JSPMS Jaywantrao Sawant College of Engineering, Pune, India

Anand Magar

Department of Computer Engineering, JSPMS Jaywantrao Sawant College of Engineering, Pune, India

Kunal Patil

Department of Computer Engineering, JSPMS Jaywantrao Sawant College of Engineering, Pune, India

Abstract – The recruitment process increasingly relies on artificial intelligence to improve efficiency, scalability, and fairness. This paper presents an AI-Powered Virtual Interviewer with Real-Time Feedback, an intelligent interview platform that evaluates candidates through speech, text, and facial-expression analysis. The system integrates speech recognition, natural language processing, computer vision, and large language models to conduct adaptive interviews and provide instant performance feedback.

Keywords – Artificial Intelligence, Virtual Interviewer, Real-Time Feedback, Natural Language Processing, Speech Analysis, Facial Expression Recognition, Large Language Models, Recruitment Automation.

‌INTRODUCTION

In today’s competitive job market, interview performance plays a crucial role in determining career opportunities for students and job seekers. While candidates often possess the required technical knowledge, many struggle to communicate their skills effectively, manage nervousness, and respond confidently during interviews. Regular interview practice is essential for improving these skills; however, access to professional guidance and mock interview sessions is often limited by time, cost, and availability constraints.

Recent advancements in Artificial Intelligence (AI), Natural Language Processing (NLP), Speech Recognition, and Computer Vision have created new opportunities for developing intelligent learning systems that can simulate real-world interview environments. These technologies enable automated analysis of spoken responses, communication patterns, facial expressions, and overall interview performance, providing users with valuable insights into their strengths and areas for improvement.

This paper presents an AI-Powered Virtual Interview Practice System with Real-Time Feedback, designed to help users prepare for interviews through interactive mock interview sessions. The system generates interview questions, evaluates responses, analyzes communication skills, and provides instant personalized feedback. By creating a realistic and engaging practice environment, the platform enables users to improve their confidence, communication abilities, and interview readiness.

The proposed system combines speech-to-text conversion, large language models, facial expression analysis, and performance analytics to deliver a comprehensive interview preparation experience. Unlike traditional mock interview methods, the system offers continuous practice opportunities, objective evaluation, and actionable recommendations, making interview preparation more accessible, efficient, and effective for learners.

The primary objective of this work is to develop an intelligent interview practice platform that supports self-assessment, skill enhancement, and continuous improvement through AI-driven feedback and performance tracking.
1. For Candidates:
  - No software installation is needed; the system runs entirely in-browser via a standard webcam and microphone
  - Simple 3-step flow: register, start the interview, receive the feedback report
  - Supports multiple languages and works on low-bandwidth connections (minimum 2 Mbps)
  - Feedback is displayed in plain language, without technical jargon, with timestamps pointing to specific moments in the interview.
2. Easy Interview Practice
  - Users can begin a mock interview with a single click. The AI interviewer automatically generates questions and guides users throughout the interview process. No manual configuration or specialized training is required.
3. Self-Learning Support
  - The platform encourages self-learning by providing personalized recommendations and improvement tips. Users can practice repeatedly and monitor their communication development through continuous feedback and assessment.

‌SOFTWARE AND HARDWARE REQUIREMENTS

Software Specifications

Component	Specification
Operating System	Windows 10/11 or Ubuntu Linux
Programming Language	Python 3.10+
Frontend	HTML, CSS, JavaScript, Streamlit
Backend Framework	Flask / FastAPI
Database	MongoDB / SQLite
ML Frameworks	TensorFlow, PyTorch, Scikit-learn
NLP Libraries	NLTK, SpaCy, Hugging Face
CV Libraries	OpenCV, MediaPipe
Speech Libraries	SpeechRecognition, Librosa, PyAudio
Version Control	Git / GitHub

‌Hardware Specifications

Component	Specification
Processor	Intel Core i5 (8th Gen+) or AMD Ryzen 5
RAM	Minimum 8 GB (16 GB Recommended)
Storage	256 GB SSD or higher
GPU	NVIDIA GTX 1650+ (for DL inference)
Webcam	HD Camera for facial analysis
Microphone	Noise-cancelling microphone
Internet	Stable broadband connection

‌SYSTEM DESIGN AND PERFORMANCE EVALUATION
1. The development of the AI-Powered Virtual Interviewer with Real-Time Feedback began with identifying the need for an accessible platform that helps users improve their communication and interview skills through regular practice. Many students and job-seekers face difficulties in expressing their thoughts confidently during interviews due to a lack of preparation and constructive feedback. To address this challenge, the project was designed to simulate a real interview environment where users can practice answering questions and receive immediate feedback on their performance.
2. ‌Units
  
  The AI-Powered Virtual Interviewer with Real-Time Feedback uses various measurement units to evaluate user performance and system effectiveness. Interview duration is measured in seconds (s) and minutes (min) to track the time taken by users to answer questions. Speech processing components analyze speaking rate in words per minute (WPM), which helps determine the user’s fluency and pace of communication.
  
  Performance metrics such as communication quality, grammar accuracy, vocabulary usage, confidence level, and overall interview performance are represented as percentage scores (%). Speech-to-text accuracy is also measured in percentage form to evaluate the effectiveness of voice recognition. System response time, which indicates the time required to process a user’s response and generate feedback, is measured in milliseconds (ms). Data storage and processing requirements are expressed in megabytes (MB) and gigabytes (GB) depending on the amount of interview data maintained by the system.
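The speaking-rate unit described above can be illustrated with a short calculation. The following is a minimal sketch, assuming the transcript and the answer duration in seconds are already available; the function name and the sample sentence are hypothetical and are not taken from the system itself.

```python
def words_per_minute(transcript: str, duration_seconds: float) -> float:
    """Speaking rate in WPM from a transcript and the answer duration in seconds."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    # Whitespace tokenization is an assumption; a real system may count words differently.
    word_count = len(transcript.split())
    return word_count * 60.0 / duration_seconds


# Hypothetical sample: a 9-word answer spoken in 4.5 seconds.
rate = words_per_minute("I have three years of experience in backend development", 4.5)
print(rate)
```

Here the duration is converted from seconds to minutes inside the function, so callers can pass the raw timing recorded for each answer.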
3. ‌Equations
  
  The overall performance of a user during a mock interview can be calculated by combining multiple communication-related parameters such as fluency, grammar, confidence, and vocabulary usage. The overall communication score is computed as:
  
  $$CS = \frac{W_F \times F + W_G \times G + W_C \times C + W_V \times V}{W_F + W_G + W_C + W_V} \tag{1}$$
  
  where,
  
  $CS$ = Communication Score
  
  $F$ = Fluency Score
  
  $G$ = Grammar Score
  
  $C$ = Confidence Score
  
  $V$ = Vocabulary Score
  
  $W_F$ = Weight assigned to Fluency
  
  $W_G$ = Weight assigned to Grammar
  
  $W_C$ = Weight assigned to Confidence
  
  $W_V$ = Weight assigned to Vocabulary
  
  ‌Equation (1) calculates the weighted average of all communication parameters to generate a final communication score for the user. The resulting score is expressed as a percentage and is used to provide real-time feedback and performance analysis during interview-practice sessions.
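A worked example may make the weighted average concrete. The sketch below assumes component scores are already expressed as percentages; the function name, the dictionary keys, and in particular the weight values are hypothetical, since the paper does not state which weights the system uses.

```python
def communication_score(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of component scores, following the form of Equation (1)."""
    total_weight = sum(weights[name] for name in scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(weights[name] * value for name, value in scores.items())
    return weighted_sum / total_weight


# Hypothetical example values (assumptions, not figures from the paper).
scores = {"fluency": 80.0, "grammar": 70.0, "confidence": 60.0, "vocabulary": 90.0}
weights = {"fluency": 3.0, "grammar": 2.0, "confidence": 3.0, "vocabulary": 2.0}
cs = communication_score(scores, weights)
print(cs)  # (240 + 140 + 180 + 180) / 10 = 74.0
```

Because the sum of the weights appears in the denominator, the weights do not need to add up to 1, and with equal weights the score reduces to the plain mean of the four components.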
4. Some Common Mistakes
  
  During interview-practice sessions, users often make communication-related mistakes that lower their overall performance. One of the most common is speaking too quickly or too slowly, which reduces the clarity and effectiveness of communication.
  - Excessive use of filler words such as “um,” “uh,” “like,” and “you know” is another frequent issue; it makes responses appear less confident and less professional. Research in automated interview analysis has shown that speech patterns and verbal fluency significantly influence communication effectiveness and interviewer perception [2].
  - Another common mistake is providing irrelevant or incomplete answers that fail to address the interview question properly. Many users struggle to organize their thoughts logically, resulting in responses that lack structure and coherence. Intelligent interview-training systems can identify such deficiencies and provide personalized recommendations for improvement [1], [4].
  - Grammar mistakes, limited vocabulary, and improper sentence construction are also frequently observed during interview practice sessions. Natural Language Processing techniques and transformer-based language models can effectively analyze textual responses and assess linguistic quality, helping users improve their communication skills [7], [8], [10].
  - Poor pronunciation and unclear speech delivery can further impact communication quality. Modern speech-recognition systems are capable of converting spoken responses into text and identifying speech-related issues with high accuracy, enabling real-time performance evaluation and feedback generation [6].
  - Lack of confidence is another major challenge faced by users during interviews. Long pauses, hesitation, nervousness, and inconsistent speaking patterns often reduce overall performance. Conversational coaching systems and AI-based interview trainers have demonstrated the ability to detect such behavioral indicators and assist users in developing greater confidence through repeated practice and constructive feedback [3], [4].
  - The proposed AI-Powered Virtual Interviewer with Real-Time Feedback automatically detects these common mistakes and provides personalized suggestions for improvement. By evaluating fluency, grammar, vocabulary usage, response relevance, and confidence, the system helps users strengthen their communication abilities and become better prepared for real-world interview situations [1], [2], [3].
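One of the mistakes listed above, overuse of filler words, lends itself to a simple illustration. The following is a minimal sketch that counts fillers in a transcript with regular expressions; it is not the detection method used by the proposed system, and the function names and sample sentence are hypothetical. Note that a plain word match also counts legitimate uses of "like" as a verb, which a real system would need to disambiguate.

```python
import re

# Filler list taken from the examples in the text; extend as needed.
FILLERS = ("you know", "um", "uh", "like")


def count_fillers(transcript: str) -> dict[str, int]:
    """Count whole-word occurrences of each filler phrase, case-insensitively."""
    text = transcript.lower()
    counts = {}
    for filler in FILLERS:
        pattern = r"\b" + re.escape(filler) + r"\b"
        counts[filler] = len(re.findall(pattern, text))
    return counts


def filler_ratio(transcript: str) -> float:
    """Fraction of spoken words that belong to a filler phrase."""
    words = re.findall(r"[a-z']+", transcript.lower())
    if not words:
        return 0.0
    filler_words = sum(n * len(f.split()) for f, n in count_fillers(transcript).items())
    return filler_words / len(words)


# Hypothetical sample answer.
sample = "Um, I think, like, you know, I work well in teams, uh, most of the time."
counts = count_fillers(sample)
ratio = filler_ratio(sample)
print(counts, ratio)
```

A ratio of this kind could feed into a fluency or confidence component of the communication score, although the paper does not specify how those components are derived.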

TECHNOLOGIES AND ARCHITECTURE

Technology Stack

TABLE I. AI VIRTUAL INTERVIEWER TECHNOLOGY STACK

Layer	Technology	Version	Function
User Interface	React.js / Flutter Web	React 18	Candidate-facing interview portal
Authentication	OAuth 2.0 / JWT	OAuth 2.0	Secure session management
Speech-to-Text	OpenAI Whisper	Whisper v3	Real-time speech transcription
Question Engine	GPT-4o (fine-tuned)	GPT-4o	Dynamic question generation
Answer Evaluation	BERT + Rubric Scorer	BERT-large	Semantic answer quality scoring
Sentiment Analysis	RoBERTa / DeepFace	RoBERTa-base	Facial & textual sentiment
Backend	FastAPI (Python)	Python 3.11	API orchestration
Database	PostgreSQL + Redis	PG 16 / Redis 7	Session storage & caching
Feedback Engine	GPT-4o + custom prompts	GPT-4o	Structured post-interview report
Deployment	Docker + AWS Lambda	Docker 24	Scalable cloud deployment

Architecture Diagram

Fig. 1 illustrates the architecture of the AI-Powered Virtual Interviewer with Real-Time Feedback. The system generates interview questions based on the selected domain and conducts an interactive interview session. User responses are analyzed using communication analysis, sentiment detection, grammar checking, and pronunciation evaluation modules. Based on the analysis, the system generates follow-up questions and provides real-time feedback. Finally, the analytics and reporting module creates performance reports that help users improve their communication skills and interview readiness.
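The flow described for Fig. 1 (ask a question, collect the response, run the analysis modules, generate a follow-up, then report) can be sketched as a small orchestration loop. This is an illustrative skeleton only: the class names, the callable interfaces, and the stub question generator and analyzer are all assumptions standing in for the language-model and analysis components, and none of them come from the system's actual code.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class TurnResult:
    question: str
    answer: str
    analysis: dict[str, float]


@dataclass
class InterviewSession:
    domain: str
    # Generates the first question or a follow-up from the history so far.
    ask: Callable[[str, list[TurnResult]], str]
    # Analysis modules: name -> scoring function applied to the answer text.
    analyzers: dict[str, Callable[[str], float]]
    turns: list[TurnResult] = field(default_factory=list)

    def run_turn(self, get_answer: Callable[[str], str]) -> TurnResult:
        question = self.ask(self.domain, self.turns)
        answer = get_answer(question)
        analysis = {name: fn(answer) for name, fn in self.analyzers.items()}
        turn = TurnResult(question, answer, analysis)
        self.turns.append(turn)
        return turn

    def report(self) -> dict[str, float]:
        """Average each analysis score across all turns."""
        if not self.turns:
            return {}
        return {
            name: sum(t.analysis[name] for t in self.turns) / len(self.turns)
            for name in self.analyzers
        }


def stub_ask(domain: str, history: list[TurnResult]) -> str:
    # Placeholder for the question engine (hypothetical).
    if not history:
        return f"Tell me about your experience with {domain}."
    return f"Can you expand on your previous answer about {domain}?"


session = InterviewSession(
    domain="Python",
    ask=stub_ask,
    analyzers={"length": lambda answer: float(len(answer.split()))},
)
session.run_turn(lambda q: "I built REST services")
session.run_turn(lambda q: "Mostly with FastAPI and PostgreSQL")
summary = session.report()
print(summary)
```

Keeping the analyzers as a dictionary of independent functions mirrors the modular structure described in the text, so a grammar or sentiment module could be added without changing the loop.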

Fig. 1. Architecture Diagram

COMPARATIVE ANALYSIS OF EXISTING METHODS

TABLE II. Comparative Study of Existing AI Interview Systems

Author / Source	Year	Key Advantage	Limitation
Vanderbilt RASL Lab	2023	Multimodal interaction; specialized training	Limited to specific groups; no real-time feedback
Lee et al.	2024	Hybrid AI improves accuracy	No stress/non-verbal analysis
Zhang et al. (SimInterview)	2025	Multilingual; scalable	No emotional/non-verbal feedback
Kumar & Sharma	2025	Feasibility demonstrated	Small sample; limited conversation flow

VIII. RESULTS AND DISCUSSION

Dashboard Interface

The dashboard interface serves as the central control panel of the proposed system. It provides a consolidated view of interview activities, candidate statistics, analysis summaries, and system operations. It enables administrators and users to monitor interview sessions and access different functionalities efficiently. Key metrics displayed include total interviews, completion rate, in-progress sessions, and average confidence score.

Profile Management Interface

The profile management interface handles candidate and administrator profile information. It allows users to manage personal details, interview history, account information, and performance records. The interface maintains organized candidate records and supports efficient retrieval of interview-related data. A performance radar chart displays multi-dimensional evaluation across technical, communication, consistency, structure, and confidence axes.

Interview Configuration Interface

The interview configuration interface allows administrators to manage interview settings and customize parameters. This module supports configuration of question domains (Java, Python, JavaScript, C++, Data Structures, Machine Learning, etc.), difficulty levels (Easy, Medium, Hard), experience levels (Fresher, 1-2 yrs, 3+ yrs), and time windows. The system generates a mission preview describing the interview style and coaching focus.

Question Analysis Interface

The question analysis interface evaluates candidate responses on a per-question basis, displaying strengths and weaknesses for each answer. Scores are given out of 10, with an option for deep analysis. The interface presents question-wise breakdowns covering answer structure, contextual relevance, and technical depth, enabling candidates to identify specific areas for improvement.
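A question-wise breakdown covering answer structure, contextual relevance, and technical depth could be represented as a small record per question. The following is a minimal sketch: the class name, the field names, the unweighted mean used for the overall score, and the sample values are assumptions for illustration and are not taken from the system's implementation.

```python
from dataclasses import dataclass


@dataclass
class QuestionBreakdown:
    question: str
    structure: float   # answer structure, out of 10
    relevance: float   # contextual relevance, out of 10
    depth: float       # technical depth, out of 10

    def overall(self) -> float:
        # Unweighted mean of the three dimensions (an assumption, not the paper's rule).
        return (self.structure + self.relevance + self.depth) / 3

    def weakest_dimension(self) -> str:
        dims = {
            "structure": self.structure,
            "relevance": self.relevance,
            "depth": self.depth,
        }
        return min(dims, key=dims.get)


# Hypothetical scores for one answer.
item = QuestionBreakdown("Explain list vs. tuple in Python.", 6.0, 9.0, 3.0)
print(item.overall(), item.weakest_dimension())
```

Reporting the weakest dimension alongside the score is one way to point a candidate at a specific area for improvement.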
1. ‌System Performance
The implementation results demonstrate that the proposed system successfully performs automated interview management and intelligent candidate evaluation. The integration of AI-based technologies such as speech processing, facial emotion recognition, and NLP improves the overall efficiency and reliability of interview analysis. The modular architecture supports scalability and future enhancements.

‌ACKNOWLEDGMENT

We sincerely thank our project guide, Prof. M. D. Ingle, for his valuable guidance, support, and encouragement throughout the development of the project AI-Powered Virtual Interviewer with Real-Time Feedback.

We are grateful to our Project Coordinator, Prof. Pooja Barve, and Head of Department, Prof. S. B. Chaudhari, for their continuous support and motivation. We also thank the faculty members of the Department of Computer Engineering for their guidance and assistance.
‌REFERENCES

[1] L. Hemamou, G. Felhi, V. Vandenbussche, J.-C. Martin, and C. Clavel, "HireNet: A hierarchical attention model for the automatic analysis of asynchronous video job interviews," Proc. AAAI Conf. Artificial Intelligence, vol. 33, no. 1, pp. 661-668, 2019.
[2] I. Naim, M. I. Tanveer, D. Gildea, and M. E. Hoque, "Automated analysis and prediction of job interview performance," IEEE Trans. Affective Computing, vol. 9, no. 2, pp. 191-204, Apr.-Jun. 2018.
[3] M. E. Hoque, M. Courgeon, J.-C. Martin, B. Mutlu, and R. W. Picard, "MACH: My automated conversation coach," in Proc. ACM Int. Joint Conf. Pervasive and Ubiquitous Computing, 2013, pp. 697-706.
[4] L. Chen and K. Jokinen, "Spoken dialogue systems for job interview training," in Proc. Workshop Spoken Dialogue Systems Technology, 2011, pp. 1-10.
[5] Y. Guo, Z. Zhang, and S. Zhao, "Automated interview question generation using retrieval-augmented large language models," arXiv preprint arXiv:2312.04345, 2023.
[6] A. Radford, J. W. Kim, T. Xu, G. Brockman, C. McLeavey, and I. Sutskever, "Robust speech recognition via large-scale weak supervision," in Proc. Int. Conf. Machine Learning, vol. 202, 2023, pp. 28492-28518.
[7] J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, "BERT: Pre-training of deep bidirectional transformers for language understanding," in Proc. NAACL-HLT, 2019, pp. 4171-4186.
[8] Y. Liu et al., "RoBERTa: A robustly optimized BERT pretraining approach," arXiv preprint arXiv:1907.11692, 2019.