Voice Assisted Text Reading System For Blind And Visually Impaired Persons

DOI : 10.17577/IJERTCONV11IS05039


Santosh Herur 1
Dept. of E&C, Jain Institute of Technology, Davanagere, Karnataka, India.
santoshherur@jitd.in

Bhoomika G R 2
Dept. of E&C, Jain Institute of Technology, Davanagere, Karnataka, India.
bhoomika814789@gmail.com

Arpitha K R 3
Dept. of E&C, Jain Institute of Technology, Davanagere, Karnataka, India.
arpithakr202@gmail.com

Shubhashini K H 4
Dept. of E&C, Jain Institute of Technology, Davanagere, Karnataka, India.
shubhashinikh@gmail.com

Syeda Shaista Zainab P 5
Dept. of E&C, Jain Institute of Technology, Davanagere, Karnataka, India.
syedashaista1128@gmail.com


With the improvement in people's living standards, we have become so materialistic that we have forgotten how hard life is for blind and visually impaired persons. The major problem faced by blind and visually impaired people today is that they are unable to read text. This compels them to rely on others for everyday activities such as reading newspapers, letters sent by post, and books. The problem may reduce their confidence, as they cannot manage on their own. This project is about designing a device that helps blind and visually impaired persons with text recognition. This is achieved by developing a module that converts text into speech and speaks it out through the headphone or speaker provided. The image is captured using the Raspberry Pi camera, the text is extracted by the built-in program, and the extracted text is then recognized as words and spoken out through the headphone or speaker.

Keywords: Raspberry Pi 3, OCR (optical character recognition), voice output.


    Globally, an estimated 285 million individuals are affected by visual impairment, of whom 39 million are blind and 246 million have low vision. Roughly 90% of individuals with visual impairments belong to the low-income demographic, and 82% of people with blindness are elderly. Global efforts have substantially reduced the number of individuals affected by eye-related conditions. It is estimated that 80% of visual impairments can be prevented or treated. India has the highest number of individuals with vision loss, with 15 million out of the 37 million cases worldwide. Despite numerous attempts by researchers, permanent solutions for visual impairment have not yet been found. To aid those who prefer to be self-reliant rather than dependent on others, we created an assistive device for individuals with visual impairments. The design is intended to help such individuals access text and act more independently.

    There are also many millions of deaf and non-verbal persons in the world. They encounter a wide variety of communication issues throughout the course of the day. Sign language is the linguistic system used to facilitate communication between impaired and non-impaired individuals. It depends on sign patterns, comparable to body language and arm gestures, to convey meaning. The main concern for those who are deaf or have difficulty speaking is being able to interact with other members of society. However, it is not feasible for everyone to learn sign language in order to understand what is being conveyed through gestures. As a result, communication barriers exist between deaf or mute persons and others: mute persons can readily communicate in sign language, whereas most other people cannot. We used the small computer known as the Raspberry Pi to help those who are visually and verbally impaired overcome these challenges. The device provides results for blind, deaf, and mute persons. Blind individuals can use the Tesseract software to extract text from images so that it can be read aloud. For deaf individuals, what another person says is displayed as a text message. Individuals who are unable to speak communicate through written text rather than sign language, and this text is voiced through eSpeak. This provides an essential means to address the challenges faced by these millions of individuals.


    The goal is to offer a solution that enables easy document access for individuals who are blind or visually impaired. The objectives are to construct a system that makes blind and visually impaired persons self-reliant, and to provide an efficient and unique capability for accessing private documents, reading newspapers, and similar material.


    The motivation is an interest in creating a device that can help blind and visually impaired persons read. The device makes it easy to access documents and read newspapers, and it makes users independent, since they can read private documents by themselves.


    [1] Ani R, Effy Maria, J Jameema Joyce, and Sakkaravarthy V have proposed smart spectacles for individuals who are blind, which can detect text and convert it into a voice output. This technology can assist visually impaired persons in accessing printed text in an audible form. The implementation primarily targets the Raspberry Pi, which serves as an interface between the camera, sensors, and image processing results, while also controlling peripheral units such as keyboards and USB devices.

    [2] Joao Guerreiro and Daniel Goncalves describe how portable digital imaging devices have made significant progress in many domains and propose the use of screen readers to assist visually impaired users in accessing digital information. They introduce concurrent speech, which allows visually impaired individuals to listen to several pieces of information at once, grasp the overall message, and identify the relevant ones for further attention.

    [3] J. Liang, D. Doermann, and H. Li suggest that advanced technology can greatly assist visually impaired persons in exploring their environment using mobile computer vision. The authors present a survey of application domains, technical challenges, and solutions for document analysis from digital devices and imaging processes. They also highlight some sample applications under development and provide ideas for future feasibility.

    [4] Alexandre Trilla and Francesc Alias discuss the improvement of the text-to-speech (TTS) method for input text reading and speech conversion. The article presents research on the analysis of input features for speech synthesis.

    [5] In an IEEE robotics conference paper, S. Mascaro and H. H. Asada present initial experiments on the measurement of finger posture and shear force.

    [6] Pitrelli J. and Bakis R. present the IBM expressive text-to-speech synthesis system for American English, in an article published in IEEE Transactions on Audio, Speech, and Language Processing.


    Fig. 1. Block Diagram of Text Reader System

    The device is composed of three main parts: the input, the processing unit, and the output. The smart reader receives its input through a camera, which captures an image of the printed text. A push button is used to trigger the camera capture and to play or pause the audio. The Raspberry Pi is the central component of the processing unit, and the speaker or headphone serves as the output component. First, the camera captures an image of the printed text, and the text is extracted from it using OCR on the Raspberry Pi. The extracted text is then converted into a speech signal using text-to-speech conversion, and the resulting audio is played through the speaker or headphone.

    The design uses a fully integrated system: the camera digitizes the printed text, and the captured image is then processed by an OCR (Optical Character Recognition) software module.

    The OCR stage identifies the sequence of characters and determines the current reading line. OpenCV (Open-source Computer Vision) libraries are employed for tasks such as capturing the image of the text and character recognition.
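    The capture, OCR, and speech steps described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' code: it assumes the `opencv-python` and `pytesseract` packages and the `espeak` command-line tool are installed, and the function names, the camera index, and the Otsu thresholding step are illustrative choices that do not come from the paper.

```python
import re
import subprocess


def capture_image(camera_index=0):
    """Grab one frame from a USB webcam and return it as a BGR image array."""
    import cv2  # imported lazily so the pure-Python helpers work without OpenCV

    camera = cv2.VideoCapture(camera_index)
    try:
        ok, frame = camera.read()
    finally:
        camera.release()
    if not ok:
        raise RuntimeError("camera did not return a frame")
    return frame


def extract_text(frame):
    """Binarize the frame and run Tesseract OCR on it."""
    import cv2
    import pytesseract

    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Otsu thresholding (an assumed preprocessing step) separates print from page.
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return pytesseract.image_to_string(binary, lang="eng")


def clean_text(raw):
    """Collapse the line breaks and repeated spaces that OCR output contains."""
    return re.sub(r"\s+", " ", raw).strip()


def speak(text):
    """Send the text to the eSpeak command-line synthesizer."""
    subprocess.run(["espeak", text], check=True)


def read_page(capture=capture_image, ocr=extract_text, tts=speak):
    """Run capture -> OCR -> speech once and return the text that was spoken."""
    text = clean_text(ocr(capture()))
    if text:  # stay silent when OCR finds nothing
        tts(text)
    return text
```

    Passing the three stages as arguments keeps the pipeline testable without a camera or speaker; on the device, `read_page()` with its defaults would run the full chain.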

    Fig. 2. Block Diagram of Text to Voice

    For non-verbal persons, the process is as follows: they type their thoughts as text, which is converted into an audio signal. The resulting audio message is delivered through the speaker. A keyboard is interfaced to the Raspberry Pi, and the typed data is converted into voice.
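    The keyboard-to-voice path can be illustrated with a short Python sketch. It assumes the `espeak` command-line tool is available; the helper names and the default speed and voice are hypothetical and are not taken from the paper. The `-s` (words per minute) and `-v` (voice) options are standard eSpeak flags.

```python
import subprocess


def build_espeak_command(text, words_per_minute=150, voice="en"):
    """Return the argument list for one eSpeak call (-s speed, -v voice)."""
    return ["espeak", "-s", str(words_per_minute), "-v", voice, text]


def speak_typed_lines(lines, run=subprocess.run):
    """Speak each non-empty typed line and return the messages that were spoken."""
    spoken = []
    for line in lines:
        message = line.strip()
        if not message:
            continue  # ignore blank lines from stray Enter presses
        run(build_espeak_command(message), check=True)
        spoken.append(message)
    return spoken
```

    On the device, calling `speak_typed_lines(sys.stdin)` would voice each line as it is typed on the attached keyboard.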

    Fig. 3. Block Diagram of Symbol to Voice

    Here, each finger is assigned a particular text or word; when a finger is recognized, the corresponding voice output is delivered through the speaker.
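    The finger-to-word assignment amounts to a lookup table. The sketch below shows one possible form in Python; the finger labels and the phrases are invented for illustration, since the paper does not list the actual assignments or describe how a finger is recognized.

```python
# Hypothetical assignment of fingers to spoken phrases (not from the paper).
FINGER_PHRASES = {
    "thumb": "Yes",
    "index": "No",
    "middle": "I need water",
    "ring": "I need help",
    "little": "Thank you",
}


def phrase_for(finger, phrases=FINGER_PHRASES):
    """Return the phrase assigned to a recognized finger, or None if unassigned."""
    return phrases.get(finger.strip().lower())
```

    The returned phrase would then be handed to the text-to-speech stage; an unrecognized label yields `None`, so nothing is spoken.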



    Hardware components:

    • Logitech C270 HD Webcam

    • Terabyte HDMI to VGA Cable

    • Generic Buena Sol Tactile Momentary Push Button Switch

    • Mini Digital USB 2.0 Speaker with 3.5 mm Audio

    Software components:

    • Raspbian OS

    • VNC Viewer

    • eSpeak

    • OpenCV (Open-Source Computer Vision)


    Fig. 4. Experimental Setup

    1. Text to speech (TTS) through (SW0)

    2. Text to speech using camera (TTSC) through (SW1)

    3. Gesture control via (SW3)
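    The three switch assignments listed above can be expressed as a small dispatch table. This is a sketch only: the switch names follow the list, but the mode identifiers and the rule that exactly one switch must be pressed are assumptions, and reading the physical switches from the GPIO pins is left out.

```python
# Switch names follow the list above; the mode identifiers are illustrative.
MODES = {
    "SW0": "text_to_speech",          # typed text is spoken
    "SW1": "camera_text_to_speech",   # printed text is captured, recognized, and spoken
    "SW3": "gesture_control",         # a recognized finger selects a phrase
}


def select_mode(pressed):
    """Return the mode for the single pressed switch, or None if ambiguous."""
    active = [switch for switch in MODES if switch in pressed]
    if len(active) != 1:
        return None  # nothing pressed, or several switches pressed at once
    return MODES[active[0]]
```

    For example, `select_mode({"SW1"})` selects the camera-based reader, while pressing no switch or two switches together selects nothing.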



    • Converting written communication messages, warnings, and traffic directions to voice format is a useful method that allows them to be accessed by individuals who have difficulty reading.

    • This technology is capable of transforming sign boards and other text into speech.

    • Visually impaired people can live an independent life. They can carry out their work without relying on others.

    • People who are visually impaired can readily interact with the application thanks to the graphical user interface (GUI).

    • A text-to-speech tool is also useful for practicing the pronunciation of written words.

    • It can also be used by sighted people who understand a language but cannot read it.

    • The system is both easily operable and portable.


      Fig. 5. Display about Necessary Conversion

      The aim is to produce, as a single small gadget, a prototype model for blind, mute, and deaf persons. The project offers a distinctive way for these persons to manage their daily tasks independently. The project is implemented in Python, which is among the most user-friendly languages for the Raspberry Pi. The device is compact and easy to manage. The system is equipped with three switches, each with a different function, and the appropriate switch is selected for the required conversion. Once the conversion has been chosen, the output is produced through the speaker or headphone.

    • Range of reading distance is 30-40cm only.

    • Minimum Character font size should be 14pt.

    • The text line can only tilt up to a maximum of 4-5 degrees from the vertical.
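    These operating limits can be written down as a simple pre-capture check. The sketch below is illustrative: it takes the upper end of the stated 4-5 degree range as the tilt limit, and it assumes the distance, font size, and tilt are supplied by the caller, since the paper does not describe how they would be measured.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CaptureLimits:
    """Operating limits from the list above (tilt uses the upper stated bound)."""
    min_distance_cm: float = 30.0
    max_distance_cm: float = 40.0
    min_font_pt: float = 14.0
    max_tilt_deg: float = 5.0


def limit_violations(distance_cm, font_pt, tilt_deg, limits=CaptureLimits()):
    """Return the names of the limits that a capture setup violates."""
    problems = []
    if not limits.min_distance_cm <= distance_cm <= limits.max_distance_cm:
        problems.append("distance")
    if font_pt < limits.min_font_pt:
        problems.append("font size")
    if abs(tilt_deg) > limits.max_tilt_deg:
        problems.append("tilt")
    return problems
```

    An empty list means the setup is within the stated limits; otherwise the returned names could be spoken to the user as a prompt to reposition the page.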


    A compact device has been designed as a prototype to assist individuals who are visually impaired, deaf, or mute. The design aims to help visually and hearing impaired individuals manage their daily lives more independently. One major advantage is that it is portable and lightweight. Sign language is an effective communication method among hearing- and speech-impaired individuals, and the proposed system seeks to improve the lifestyle of such individuals. An Advanced Speech Communication System for Deaf People was developed and evaluated in a real-world scenario, with the ultimate goal of enabling long-term communication.


    This design can be implemented with more advanced technology and a simpler programming language to make it less complex. The complexity can be reduced further with a smaller device, which would be more helpful to such people in today's electronic world.


[1] Ani R, Effy Maria, J Jameema Joyce, Sakkaravarthy V, Smart Specs: Voice assisted Text Reading system for Visually Impaired Persons Using TTS method, IEEE International Conference on Innovations in Green Energy and Healthcare Technologies.

[2] Joao Guerreiro and Daniel Goncalves. Text-to-Speech: Evaluating the Perception of Concurrent Speech by Blind People, International journal of computer technology.

[3] J. Liang, D. Doermann, and H. Li. Camera-based analysis of text and documents: a survey, International Journal on Document Analysis and Recognition.

[4] Alexandre Trilla and Francesc Alias. (2013). Sentence Based Sentiment Analysis for Expressive Text-to-Speech, IEEE Transactions on Audio, Speech, and Language Processing, Vol. 21, Issue. 2. pp. 223-233.

[5] S. Mascaro and H. H. Asada. Finger posture and shear force measurement: Initial experimentation, in Proc. IEEE Int. conf. robot.

[6] Pitrelli J. and Bakis R. The IBM expressive text-to-speech synthesis system for American English, IEEE Trans. Audio, Speech, Lang. Process.

[7] Kwon et al. "A study on the Development of Voice-Enabled Text Reading System for Visually Impaired Persons." (2017)

[8] Thakur et al. "Design and Development of an Assistive Text Reading System for the Blind and Visually Impaired." (2018)

[9] Vijayarani, S., & Saranya, P. (2018). Voice-Assisted Reading System for Visually Impaired. In Proceedings of the International Conference on Advances in Computing, Communications and Informatics (pp. 1323-1326). IEEE.

[10] "Design and Evaluation of a Voice-Activated Text Reader for Visually Impaired Individuals." (2019)

[11] Sharma et al. "Text to Speech for Visually Impaired: A Survey." (2019)

[12] Singh et al. "Voice-based Text Reader for Visually Impaired People." (2019)

[13] Mishra, A., Dubey, A., & Shukla, R. K. (2019). Development of Voice Assisted Reading System for Visually Impaired People. In Proceedings of the International Conference on Machine Learning, Big Data, Cloud and Parallel Computing (pp. 65-71). Springer.

[14] Xie et al. "Assistive Technology for Blind and Visually Impaired People: An Overview of Text-to-Speech Technology." (2020)

[15] Bokhari, M. B., Iqbal, M., Anjum, F., & Rauf, F. (2020). Design and Implementation of Voice-Operated Text Reading System for Visually Impaired. In Proceedings of the International Conference on Advanced Communication Technologies and Networking (pp. 435-442). Springer.

[16] Bhattacharya, A., Ghosh, S., & Saha, S. K. (2020). A Review of Speech Synthesis Techniques for Text-to-Speech Conversion in Assistive Technology. International Journal of Speech Technology, 23(2), 263-281.