Technological Advances in Email Interface Design for the Visually Impaired

doi:https://doi.org/10.5281/zenodo.19978612

Volume 15, Issue 04 (April 2026)


Open Access
Authors : Sarth Atul Petkar, Dr. Suvarna Patil, Aditya Nalawade, Sneha Kanawade, Chintan Rokade, Akash Kenche
Paper ID : IJERTV15IS042640
Volume & Issue : Volume 15, Issue 04 , April – 2026
Published (First Online): 02-05-2026
ISSN (Online) : 2278-0181
Publisher Name : IJERT
License: This work is licensed under a Creative Commons Attribution 4.0 International License


Technological Advances in Email Interface Design for the Visually Impaired

A Comprehensive Review of Voice-Driven and Accessible Communication Systems

Mr. Sarth Atul Petkar,

Department of Artificial Intelligence and Data Science, Dr. D. Y. Patil Institute of Engineering, Management and Research, Akurdi, Pune, India

Mrs. Sneha Kanawade

Professor, Department of Artificial Intelligence and Data Science, Dr. D. Y. Patil Institute of Engineering, Management and Research, Akurdi, Pune, India

Mr. Aditya Nalawade

Department of Artificial Intelligence and Data Science, Dr. D. Y. Patil Institute of Engineering, Management and Research, Akurdi, Pune, India

Dr. Suvarna Patil

Professor, Department of Artificial Intelligence and Data Science, Dr. D. Y. Patil Institute of Engineering, Management and Research, Akurdi, Pune, India

Mr. Chintan Rokade

Department of Artificial Intelligence and Data Science, Dr. D. Y. Patil Institute of Engineering, Management and Research, Akurdi, Pune, India

Mr. Akash Kenche

Department of Artificial Intelligence and Data Science, Dr. D. Y. Patil Institute of Engineering, Management and Research, Akurdi, Pune, India

Abstract – Electronic mail is a fundamental communication tool, yet traditional interfaces present significant accessibility barriers for visually impaired users. This review paper examines the evolution of accessible email designs, specifically focusing on systems that utilize voice-driven interaction and audio-guided navigation rather than gesture-based controls. By analyzing current methodologies such as speech recognition, text-to-speech systems, and interface state preservation, the study highlights how these technologies reduce cognitive load and enhance user independence. Key findings indicate that modular interface organization and unambiguous audio feedback are essential for effective email management in blind-friendly applications. Furthermore, the paper discusses the role of these technological advances in promoting digital inclusion and equal opportunity, aligning with global sustainable development goals. The review concludes by identifying critical research gaps, such as the need for better noise robustness and privacy protections in voice-based systems, to guide future innovations in inclusive design.

Keywords – Voice-Controlled Email, Speech Recognition, Visually Impaired, Blind-Friendly Interface, Assistive Technology, Digital Inclusion.

INTRODUCTION

Electronic mail has become an indispensable communication tool in modern society, yet traditional email interfaces pose significant accessibility challenges for blind users. Most interfaces are designed primarily for sighted users, leaving visually impaired users dependent on assistive technologies such as screen readers and speech recognition tools. Although recent advancements in voice-based interaction and adaptive user interfaces have improved accessibility, many usability and design challenges persist.

This review examines accessible email interfaces created specifically for blind users, with an emphasis on systems that improve usability through tactile and auditory feedback instead of visual or gesture-based controls. The study stresses the value of user-centered design and highlights the technologies that enable autonomous email use, including Speech-to-Text (STT), Text-to-Speech (TTS), Optical Character Recognition (OCR), and facial recognition.

Literature Review

This section examines recent research designed to improve email accessibility for blind and visually impaired individuals. A range of voice-activated and audio-based interfaces have been studied to facilitate email management, composition, and reading without the use of gesture controls. Common themes include the integration of speech recognition and text-to-speech (TTS) technologies, user-centric designs, and the challenges of error management in voice command interpretation.

Comparative Analysis Table

TABLE I

Reference No.	Dataset Used	Key Features/Key Findings	Models/Algorithms Used	Evaluation Parameters Used	Research Gaps/Limitations
1.	Diverse Speech Dataset	Voice-controlled email using STT, TTS, face recognition, and voice commands; 95% accuracy in quiet settings.	RNN with Attention, Tacotron 2, Haar Cascade, OpenCV.	STT, TTS satisfaction, and face recognition accuracy.	Low noise robustness, weak face detection in poor light, limited email and language support.
2.	Google Speech Recognition	Voice-controlled email for blind users; handles compose, read, send via speech; improves accessibility.	Speech-to-Text (Google Speech API), Text-to-Speech (pyttsx3), SMTP for mail.	System accuracy, response time, and user satisfaction.	Limited noise handling, single-language support, lacks strong security.
3.	Android Speech Recognizer and Google API	Voice email app with mail, call, and security via PIR sensor.	Recognizer Intent, TTS Engine, PIR + Arduino.	Voice accuracy, mail functions, sensor response.	Noise issues, bulky sensor, single-language Android use.
4.	Hey Mycroft Dataset	Real-time object detection, text-to-speech, and obstacle alert.	YOLOv3 / SSD / Faster R-CNN, PyTesseract OCR, TTS engine.	Detection accuracy, distance accuracy, audio output correctness.	No GPS, limited outdoor testing, lacks advanced hazard sensing.
5.	Google Speech API Dataset, OCR Dataset	Voice + touch control app for reading, weather, mail, etc.	STT, TTS, OCR (Android-based).	Speech accuracy, OCR accuracy, response time.	No object detection, limited AI, needs TensorFlow upgrade.
6.	Google Web Speech API Dataset, Custom Face Dataset	Voice email, face login, no keyboard/mouse, Face recognition login, speech-based compose and read mail.	Google STT, Pyttsx3 TTS, OpenCV Face Recognition.	Speech recognition accuracy, face authentication success, usability.	Limited to email tasks, internet required, no offline mode.
7.	None (system-based study)	Voice-controlled, keyboard-free email; ASR & TTS; easy for blind users.	Automatic Speech Recognition (ASR), Text-to-Speech (TTS), implemented with HTML, JavaScript, and PHP.	Qualitative user-friendliness, accessibility, reduced cognitive load.	ASR accuracy drops in noisy areas, language dependency, works mainly on desktop systems.
8.	None (system-based).	Voice-based email; no keyboard; works on Android/PC; uses STT & TTS.	Speech-to-Text, Text-to-Speech, Word Recognition (Java, Android, MySQL).	Easier and more accessible than normal GUI.	Noise affects accuracy; language dependent; limited functions.
9.	None explicitly mentioned	Auditory interface using Google Speech/TTS for visually impaired	Speech-to-Text (STT), Text-to-Speech (TTS), Optical Character Recognition (OCR).	Not specified.	Needs object identification (ML/TensorFlow) and a reminder feature.

10.	Evaluation used live trials with blindfolded participants.	Novel voice-operated email platform for visually impaired; demonstrated high recognition accuracy and ease of use.	Speech-to-Text (STT) (Google API), Text-to-Speech (TTS) (gTTS).	Voice recognition success rate, task completion time, number of retries.	Accuracy is influenced by ambient background noise.

Research Gaps Identified

Based on the extensive literature review and comparative analysis, several critical gaps were identified in the existing systems designed for visually impaired users:
- Lack of Complete Mobile Autonomy: Most existing solutions are either desktop-bound (requiring PC hardware) or web-based. Web-based solutions present a significant paradox: a visually impaired user must rely on complex, third-party screen readers just to open the browser and navigate to the web application before the voice features can even be used.
- One-Way Communication Constraints: A majority of the mobile prototypes developed in recent years focus exclusively on the SMTP protocol. While they allow users to dictate and send emails, they completely lack IMAP integration. Consequently, users cannot fetch, navigate, or listen to their incoming mail (Inbox/Trash), rendering the solution incomplete for daily communication.
- Security Vulnerabilities: Earlier systems often require users to input their primary account passwords directly into the application, which triggers security blocks by modern email providers (e.g., Google) and exposes the user to data breaches.
How the Proposed System Bridges the Gap:

This research directly addresses these gaps by proposing a native Android application that is fully self-contained. It requires zero visual navigation to launch or operate. By integrating both SMTP and IMAP via the JavaMail API, it provides a two-way communication loop (sending and reading). Furthermore, the implementation of OAuth-compliant App Passwords resolves the security vulnerabilities present in older models, resulting in a robust, secure, and truly hands-free mobile experience.
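The two-way loop described above can be illustrated with a minimal sketch. It uses Python's standard library rather than the JavaMail API named in the paper, purely to keep the example short. The function names are hypothetical, the host names and ports are Gmail's published endpoints, and the credential is assumed to be a provider-issued app password rather than the primary account password.

```python
import imaplib
import smtplib
from email.message import EmailMessage

# Gmail's published endpoints; another provider would need its own values.
SMTP_HOST, SMTP_PORT = "smtp.gmail.com", 465
IMAP_HOST, IMAP_PORT = "imap.gmail.com", 993


def build_message(sender: str, recipient: str, subject: str, body: str) -> EmailMessage:
    """Assemble a plain-text message from dictated fields (no network access)."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send_message(msg: EmailMessage, user: str, app_password: str) -> None:
    """Outgoing half of the loop: submit the message over SMTP with TLS."""
    with smtplib.SMTP_SSL(SMTP_HOST, SMTP_PORT) as smtp:
        smtp.login(user, app_password)  # app password, not the primary password
        smtp.send_message(msg)


def count_unread(user: str, app_password: str) -> int:
    """Incoming half of the loop: count unread inbox messages over IMAP."""
    with imaplib.IMAP4_SSL(IMAP_HOST, IMAP_PORT) as imap:
        imap.login(user, app_password)
        imap.select("INBOX", readonly=True)  # read-only: do not mark as seen
        status, data = imap.search(None, "UNSEEN")
        return len(data[0].split()) if status == "OK" else 0
```

Only `build_message` runs without a network connection; the other two functions need a real account and are shown to make the send and read halves concrete.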

Methodology of Review
1. Literature Collection:
  
  To guarantee thorough domain coverage, research papers and articles were gathered from reliable academic databases like Scopus, Web of Science, IEEE Xplore, ScienceDirect, and PubMed.
2. Selection Criteria:
  
  Studies that offered theoretical frameworks, prototype designs, or empirical assessments of email systems that are accessible to the blind were included.
  
  Studies that mostly used gesture-based controls were excluded in favor of voice, audio, and tactile interaction models.
3. Categorization of Literature:
  
  The selected papers were classified thematically into the following categories:
  - Interface Design Principles: Focusing on layout and navigation logic.
  - Assistive Technology Utilization: Examining the integration of STT, TTS, and sensors.
  - User Experience Evaluations: Reviewing empirical data on task completion and user satisfaction.
  - System Architectures: Analysing the underlying software and hardware frameworks.
4. Comparative Analysis:
  
  To compile datasets, algorithms, evaluation metrics, and limitations from various significant studies, a tabular comparative analysis was created.
5. Proposed Methodological Framework:
  
  The review proposes a modular, state-based interface architecture that integrates:
  - Auditory and haptic feedback for better navigation.
  - Customizable shortcut schemes for ease of control.
  - Real-time Text-to-Speech (TTS) and reliable voice input systems.
  - Simplified and consistent layouts to reduce cognitive load.
Fig. 1: Methodological Framework for the Evaluation of Accessible Email Interfaces.
CRITICAL ANALYSIS AND DISCUSSION

A thematic analysis of the gathered literature, guided by the framework in Fig. 1, reveals several critical insights into the current state of accessible email technology.
- Evaluation of Technology Stacks
  
  Most existing systems rely heavily on cloud-based Speech-to-Text (STT) and Text-to-Speech (TTS) engines [1], [5]. While these provide high accuracy, the critical analysis suggests a major dependency on stable internet connectivity. Systems identified in [3] and [8] lack “Edge AI” capabilities, meaning they become non-functional in offline scenarios, which is a significant barrier for users in developing regions.
- Complexity of Interaction Flows
  
  The “Interaction Flow” analysis highlights that most current designs utilize a linear navigation model. This forced linearity increases cognitive load, as users must listen to entire audio menus to reach a specific function. This review identifies a need for State-Based Transitions where a user can jump between “Inbox” and “Compose” states using global voice shortcuts, a feature currently under-represented in the literature [13], [16].
- Gap between Design Principles and User Goals
  
  While “User Goals” such as reading and searching are well covered, “Design Principles” like Multimodal Interaction are often neglected. Most systems focus solely on voice [4], [12]. However, relying exclusively on audio feedback can lead to privacy issues in public spaces. The integration of Haptic Feedback (vibrations) to signal notifications or errors, as proposed in the methodology, has yet to become standard in mainstream assistive email clients.
- Security and Identity Verification
  
  A significant finding in this analysis is the trade-off between accessibility and security. Simplified interfaces often bypass complex multi-factor authentication (MFA) to remain user-friendly for the blind, yet this leaves users vulnerable [2]. Integrating facial recognition or biometric voice-print verification is essential for modern secure communication, yet few reviewed papers provide a robust framework for this.
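The state-based transitions discussed under "Complexity of Interaction Flows" can be sketched as a small state machine. This is an illustrative Python sketch, not the authors' implementation: the shortcut phrases, the state names other than "Inbox" and "Compose", and the `Navigator` class are assumptions. The point it shows is that a global shortcut jumps straight to the target state without walking a linear audio menu, while an unfinished draft survives the jump.

```python
from dataclasses import dataclass

# Spoken shortcuts that are valid from every state (hypothetical phrases).
GLOBAL_SHORTCUTS = {
    "go home": "home",
    "open inbox": "inbox",
    "compose": "compose",
}


@dataclass
class Navigator:
    """Tracks the current interface state and any unfinished draft."""

    state: str = "home"
    draft: str = ""  # kept across jumps so dictated text is not lost

    def handle(self, utterance: str) -> str:
        """Jump directly to the requested state and return the audio prompt."""
        target = GLOBAL_SHORTCUTS.get(utterance.strip().lower())
        if target is None:
            return f"Command not recognized. You are in {self.state}."
        self.state = target
        return f"Now in {self.state}."

    def dictate(self, text: str) -> None:
        """Append dictated text to the draft; only accepted while composing."""
        if self.state != "compose":
            raise ValueError("dictation is only accepted in the compose state")
        self.draft = f"{self.draft} {text}".strip()
```

A user who says "compose", dictates a sentence, checks the inbox, and then says "compose" again returns to the same draft, which is one way to realize the interface state preservation mentioned in the abstract.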
PROPOSED SYSTEM ARCHITECTURE

To address the limitations identified in the critical analysis, specifically the lack of multimodal feedback, insecure authentication, and linear navigation, a novel, native Android-based architecture is proposed. The system is entirely voice-driven and operates without requiring a graphical interface. As illustrated in the system architecture diagram, the framework is divided into four interdependent layers.
1. Input Processing Layer
  
  This layer utilizes a Multimodal Trigger System. Primary input is captured through a Speech-to-Text (STT) engine enhanced with a local noise-reduction filter. Unlike existing systems that rely solely on cloud processing [1], this architecture proposes a Hybrid STT model that handles basic navigation commands locally to ensure responsiveness even with intermittent connectivity.
2. Logic and Management Layer (The Core)
  
  The central processing unit of the architecture manages the “State Transitions” identified in Fig. 1.
  
  - Command Interpreter: Parses the user’s spoken command and maps it to specific email functions (Compose, Delete, Read).
  - Security Module: Implements biometric voice-print or facial recognition before accessing the user’s inbox, solving the privacy gap noted in Section IV.
  - Context Manager: Keeps track of the user’s current position within an email thread to provide contextual “Help” prompts.
3. Output and Feedback Layer
  
  This layer focuses on reducing auditory clutter. Instead of long, verbose text-to-speech (TTS) readouts, the system employs Logical Screen Segmentation. The architecture uses “Audio Cues” (short distinct beeps) to signify the start or end of an email, while a high-fidelity TTS engine reads the content. Additionally, Haptic Feedback (tactile vibrations) is utilized to confirm successful actions (like “Email Sent”), ensuring the user receives confirmation without needing to listen to a full audio prompt.
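The Command Interpreter and the hybrid local/cloud split from the Input Processing Layer can be sketched together. The following Python sketch is an assumption-laden illustration: the command vocabulary beyond Compose, Delete, and Read, the 0.75 similarity cutoff, and the function names are hypothetical, and the fuzzy matching via `difflib` is a stand-in for whatever matching the actual system uses. It shows short navigation commands being resolved on the device, with anything else deferred to cloud dictation.

```python
import difflib
from typing import Optional

# Hypothetical command vocabulary; the paper lists Compose, Delete, and Read.
COMMANDS = {
    "compose": "COMPOSE",
    "delete": "DELETE",
    "read": "READ",
    "read inbox": "READ",
}


def interpret(transcript: str, cutoff: float = 0.75) -> Optional[str]:
    """Return the email function for a transcript, or None if nothing is close."""
    phrase = " ".join(transcript.lower().split())  # normalize case and spacing
    if phrase in COMMANDS:
        return COMMANDS[phrase]
    # Tolerate small misrecognitions such as "compos" or "reed inbox".
    close = difflib.get_close_matches(phrase, list(COMMANDS), n=1, cutoff=cutoff)
    return COMMANDS[close[0]] if close else None


def route(transcript: str) -> str:
    """Handle known commands locally; defer everything else to cloud dictation."""
    action = interpret(transcript)
    return f"local:{action}" if action else "cloud:dictation"
```

Keeping the vocabulary small is what makes local handling plausible: a handful of fixed phrases can be matched without a network round trip, whereas free-form message bodies still benefit from a full STT engine.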
Results

The implementation of the Voice-Based Email System for the Blind has resulted in a fully functional, high-accuracy assistive communication tool. The primary outcome is a fully eyes-free mobile environment in which visually impaired users can manage their digital correspondence independently. By integrating the Google Speech SDK and the JavaMail API, the system achieves a command recognition accuracy of over 90% in quiet conditions, so that verbal instructions like “Compose,” “Read Inbox,” or “Delete” are parsed and executed reliably.

The application provides a seamless bridge to real-world communication by establishing secure, encrypted connections to Gmail servers via SMTP and IMAP protocols, allowing for real-time email synchronization.
1. Authentication Success Screen
  
  Verification of the initial login status, confirming a successful IMAP/SMTP handshake with the Gmail server.
  
  Fig. 2: Authentication Success Screen.
2. Interactive Voice Home Interface
  
  The main navigation screen showing the minimalist design and the large touch target for activating the voice assistant.
  
  Fig. 3: Interactive Voice Home Interface.
3. Auditory Inbox Reading
  
  Demonstration of the system fetching unread email headers from the Inbox using the IMAP protocol and reading them aloud.
  
  Fig. 4: Inbox Reading.
4. Voice Guided Compose Interface
  
  A view of the mail composition module where the Recipient, Subject, and Body have been populated via Voice-to-Text conversion.
  
  Fig. 5: Voice guided Compose Interface.
5. Verification of Dictated Content
  
  The final confirmation loop where the app reads the composed message back to the user for vocal validation before sending.
  
  Fig. 6: Composed mail via Guided Voice.
6. Received Mail from Inbox
  
  Visual confirmation of successful email delivery, complete with an auditory prompt confirming the SMTP request was executed.
  
  Fig. 7: Received Mail in Inbox.
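The auditory inbox reading step can be illustrated by turning raw message headers into a short sentence for the TTS engine. This Python sketch is not taken from the paper: the `spoken_summary` function and its wording are hypothetical, and it assumes the header bytes were already fetched over IMAP (for example with a header-only fetch of the From and Subject fields).

```python
from email import message_from_bytes
from email.header import decode_header, make_header
from email.utils import parseaddr


def spoken_summary(raw_headers: bytes, position: int) -> str:
    """Turn one message's raw headers into a short sentence for the TTS engine."""
    msg = message_from_bytes(raw_headers)
    # Decode RFC 2047 encoded words so non-ASCII subjects are read correctly.
    subject = str(make_header(decode_header(msg.get("Subject", "no subject"))))
    name, address = parseaddr(msg.get("From", ""))
    # Prefer the display name; fall back to the address, then to a placeholder.
    sender = name or address or "unknown sender"
    return f"Message {position}, from {sender}, subject: {subject}."
```

Reading only the sender and subject keeps each announcement short, which fits the goal of reducing auditory clutter; the full body would be read only when the user asks for it.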

Performance Evaluation

Experimental Setup

Environmental Condition	Metric	Traditional Screen Reader + GUI	Proposed Voice-First System	Improvement
Quiet Room (< 30 dB)	Command Accuracy	92.5%	96.8%	+4.3%
Quiet Room (< 30 dB)	Avg. Task Initiation Latency	3500 ms	1200 ms	-2300 ms
Public Space (70 dB)	Command Accuracy	68.4%	89.2%	+20.8%
Public Space (70 dB)	Avg. Task Initiation Latency	4200 ms	1800 ms	-2400 ms

Evaluation Metrics

Email Task	Steps Required (Proposed)	Time Taken (Traditional GUI)	Time Taken (Proposed Voice App)
Reading an Email	2	25 seconds	14 seconds
Composing & Sending	4	55 seconds	38 seconds
Deleting Mail (Trash)	2	40 seconds	18 seconds
Searching Inbox	3	35 seconds	22 seconds
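
The task times in the table above can be restated as relative savings. The short Python sketch below only performs that arithmetic on the reported figures; the helper name `reduction_percent` is an illustrative choice, and no new measurements are introduced.

```python
# Task times in seconds, copied from the table above:
# (traditional GUI, proposed voice app).
TASK_TIMES = {
    "Reading an Email": (25, 14),
    "Composing & Sending": (55, 38),
    "Deleting Mail (Trash)": (40, 18),
    "Searching Inbox": (35, 22),
}


def reduction_percent(before: float, after: float) -> float:
    """Relative time saved, as a percentage of the traditional-GUI time."""
    return round(100 * (before - after) / before, 1)


for task, (gui, voice) in TASK_TIMES.items():
    print(f"{task}: {gui - voice} s saved ({reduction_percent(gui, voice)}%)")
```

On the reported figures, the relative saving ranges from about 31% for composing and sending to 55% for deleting mail.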

Results and Analysis

Metric	Quiet Environment (<30 dB)	Noisy Environment (70 dB)
System Command Accuracy	96.8%	89.2%
Avg. JavaMail SMTP Latency	2.1 seconds	2.4 seconds
Avg. JavaMail IMAP Fetch Latency	2.8 seconds	3.2 seconds

Conclusion and Future Scope
1. Conclusion
  
  Accessible email interfaces are crucial for removing obstacles to communication and allowing people with visual impairments to participate freely in the digital world. Significant advancements in assistive technology, including voice recognition, text-to-speech systems, and tactile feedback mechanisms, were highlighted in this review. These technologies collectively improve the usability and inclusivity of email systems for blind users.
  
  The study also emphasized that future progress depends on multimodal and adaptive systems that integrate context-aware navigation, AI-driven personalization, and predictive support. Because these systems can dynamically adjust to individual user preferences, learning styles, and environmental conditions, email interaction becomes more efficient and natural for visually impaired users.
  
  From a societal standpoint, the advancement of accessible email technologies directly supports equal opportunity, digital inclusion, and the Sustainable Development Goals (SDGs) pertaining to lifelong learning and accessibility. Designing truly inclusive communication systems will require fostering cooperation between researchers, developers, accessibility specialists, and visually impaired communities.
2. Future Scope
  - Putting more focus on creating multimodal input systems that seamlessly integrate keypad, voice, audio, and tactile interactions could increase user satisfaction and robustness.
  - There is still a dearth of research on adaptive interfaces tailored to different preferences and cognitive styles, which calls for more investigation.
  - Promising but mainly unexplored opportunities exist in the integration of AI and machine learning for context-aware navigation, automatic error correction, and predictive assistance.
  - Another significant research gap is addressing privacy and security issues unique to email systems that use audio and voice.
References:

[1]. A. Khan and S. Khusro, “Tetra Mail: A Usable Email Client for Blind People,” Universal Access in the Information Society, 2020.

[2]. Uxpa Journal, “Usability Evaluation of Email Applications by Blind Users,” 2014.

[3]. IJARCCE, “A Review on Voice-Based Email System for Visually Impaired,” 2024.

[4]. IRJMETS, “Email System for Blind Using Voice Technology,” 2025.

[5]. IJCR, “Voice-Based E-mail System for Visually Challenged,” 2025.

[6]. W3C, “Introduction to Web Accessibility,” 2024.

[7]. S. Saha, A. K. Singh, and M. Sharma, “Voice Controlled Email System for Visually Impaired Users utilizing Google Speech-to-Text,” IEEE International Conference on Smart Technologies, pp. 112-117, 2022.

[8]. Oracle Corporation, “JavaMail API Design Specification Version 1.6,” Oracle Developer Documentation, 2020. [Online]. Available: https://javaee.github.io/javamail/

[9]. M. Rahman and T. Ahmed, “Evaluating the Efficacy of SMTP and IMAP in Mobile Environments,” International Journal of Computer Applications, vol. 182, no. 43, pp. 11-17, 2022.

[10]. IEEE Xplore, “Recent Advances in Assistive Technologies for Visually Impaired,” 2022.

[11]. Wiley Online Library, “AI and Machine Learning in Assistive Technologies,” 2024.

[12]. ScienceDirect, “Emerging Methods in Accessible Interface Design,” 2023.

[13]. ACM Digital Library, “Multimodal Interaction Systems for Visually Impaired Users,” 2023.

[14]. SpringerLink, “Long-term User Adaptation in Assistive Technologies,” 2024.

