Design & Development of Learning Assistance device for Visually Impaired People

Maagha SN

Dept of TCE, GSSSIETW, Mysuru

Srusti HK

Dept of TCE, GSSSIETW, Mysuru

Bilvashree V

Dept of TCE, GSSSIETW, Mysuru

Shireen Mariyam

Dept of TCE, GSSSIETW, Mysuru

Basavanna. M

Asst Professor, Dept of TCE, GSSSIETW, Mysuru

Abstract- This paper presents the integration of a complete text read-out system designed for visually challenged people. The system consists of a webcam interfaced with a Raspberry Pi, which captures a page of printed text. OCR (Optical Character Recognition) software installed on the Raspberry Pi scans the page into a digital document, which then undergoes skew correction and segmentation before feature extraction is performed for classification. After classification, the text is read out by a text-to-speech conversion unit (TTS engine) installed on the Raspberry Pi. The output is delivered in the user's preferred language. The simulation is only a first step in image processing: the image is converted into text by the OCR software, and the text is then converted into speech by the TTS engine. The device has useful applications in libraries, auditoriums and offices, where instructions and notices have to be read, and also in the assisted filling of application forms. It provides output in multiple languages and helps visually impaired children learn the basics.

Keywords-Optical Character Recognition (OCR), Text to Speech Synthesis (TTS)


Of the 7.7 billion people on our planet, 285 million are visually impaired. Of these, 39 million are totally blind, that is, have no vision at all, and 246 million have mild or severe visual impairment. About 65% of all visually impaired people and 82% of all blind people are aged 50 years or older. It has been predicted that by the year 2020 these numbers will rise to 75 million blind people and 200 million people with visual impairment. As reading is of prime importance in everyday life, visually impaired people face many difficulties. Our device helps the visually impaired by assisting them in learning. There have been many advances in this area to help visually impaired people read without much difficulty. In many of these approaches, the image is captured against a simple background, for example with the test inputs printed on a plain white sheet. Such images are easily converted into text without any preprocessing, but this approach is not useful in a real-time system. In methods that segment characters for recognition, the characters are read out as individual letters rather than as complete words, which gives an undesirable audio output to the user.

In this paper, we have designed a device that is able to detect text against any complex background and read it effectively. By detecting a region enclosed by four points, we assume that this is the required region containing the text. The new image obtained then undergoes edge detection, and a boundary is drawn around the letters, which gives them more definition. The image is then processed by the OCR and TTS stages to give audio output.
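The idea of isolating the region that contains the text can be illustrated with a minimal sketch. It assumes the page has already been reduced to a binary image (1 for ink, 0 for background) stored as a list of rows. The function names `text_bounding_box` and `crop` are hypothetical and are not taken from the implemented system, which would more likely rely on a computer vision library for this step.

```python
from typing import List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (top, left, bottom, right), all inclusive


def text_bounding_box(binary: List[List[int]]) -> Optional[Box]:
    """Return the smallest rectangle enclosing every ink pixel (value 1)."""
    # Rows that contain at least one ink pixel.
    rows = [r for r, row in enumerate(binary) if any(row)]
    if not rows:
        return None  # blank page: there is nothing to read
    # Column indices of all ink pixels.
    cols = [c for row in binary for c, value in enumerate(row) if value]
    return rows[0], min(cols), rows[-1], max(cols)


def crop(binary: List[List[int]], box: Box) -> List[List[int]]:
    """Cut the rectangle described by `box` out of the image."""
    top, left, bottom, right = box
    return [row[left:right + 1] for row in binary[top:bottom + 1]]
```

Passing only the cropped region to the recognition stage keeps background clutter outside the rectangle away from the OCR.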


    D.Velmurugan, M.S.Sonam, S.Umamaheswari, S.Parthasarathy and K.R.Arun [1] proposed an image-to-speech conversion system using Raspberry Pi. In this paper, simulation results were successfully verified and the hardware output was tested using different samples. The algorithm successfully processed the image and read it out clearly. The paper appeared in the International Journal of Engineering Science and Computing, volume 6, March 2016. This is an economical as well as efficient device for visually impaired people. The algorithms applied to various images are effective and perform the conversion. The device is helpful to society and is compact.

    J.N.Balaramakrishna and Ms. J. Geetha [2]. This paper appeared in the International Journal of Engineering, Technology, September 2017. The system enables the visually impaired to not feel at a disadvantage when it comes to reading text not written in Braille. The image pre-processing stage extracts the text region from the complex background to give a good-quality input to the OCR. The text output of the OCR is sent to the TTS engine, which produces the output as speech.

    Shubham Machale, Aadityaa Yedshikar, Pratik Sahastrabuddhe and Shashikant Suranje [3]. This paper appeared in the International Journal of Engineering, August 2018. After examining the development trend of current mobile healthcare technology, the article presents another mobile healthcare model based on distributed computing. The mobile application can be accessed and data can be shared across devices using the cloud. The device lets users hear the images being read in their desired language.

    Nagaraja L, Nagarjun R S, Nishanth M Anand, Nithin D, Veena S Murthy [4]. The proposed paper covers text extraction from an image and conversion of the text to speech, a method that enables blind people to read the text. The aim is to develop a model that helps blind people recognise objects in the real world, where the text on an object is extracted and converted into speech. This is done using Raspberry Pi. The paper appeared in the International Journal of Computer Applications (0975 8887), National Conference on Power Systems and Industrial Automation (NCPSIA 2015). Portability is the main aim; it is achieved by providing battery backup and can be implemented as a future enhancement. Portability allows the user to use the device at any time and carry it anywhere.

    Poonam.S.Shetake, S.A.Patil and P. Madhav [5] discussed character recognition and speech synthesis techniques that are useful for text-to-speech conversion. As stated, TTS is divided into two parts: character recognition and speech synthesis. To obtain the best speech synthesis, the database of the system should be large, and there is scope to enlarge the database of the proposed system. Methods to solve these problems are also summarised with the help of various papers. The paper appeared in the International Journal of Industrial Electronics and Electrical Engineering, August 2014.

    Ashanta Waghmare, Bhagyashri Sawale, Mohini Pangare, Urmila Shinde and Ashish Jadhav [6] proposed a system based on the observation that text and speech are the main means of communication. Vision is required to access information in a document or in text form. Visually impaired people face many difficulties while accessing printed documents or text. Giving them access to different content and updating their knowledge through a specific arrangement is the main scope of this project. In the proposed system, a web camera is used to capture the image. The captured image is then converted into text with the help of OCR (Optical Character Recognition) software, and a TTS engine is used to convert the text to speech.

    1. Venkateswarlu, D. B. K. Kamesh, J. K. R. Sastry and Radhika Rani [7] presented a method that is innovative, efficient and cost-effective in real time, and that enables the user to hear the contents of text images instead of reading them. It combines Optical Character Recognition (OCR) and a Text to Speech Synthesizer (TTS) on the Raspberry Pi. The system helps visually impaired people interact with computers through a vocal interface. Text extraction from colour images is a challenging task for a computer. The text-to-speech conversion method scans and reads numbers and English alphabets in the text or image using the OCR technique and converts them to speech.

      Ms.Athira Panicker, Ms.Anupama Pandey and Ms.Vrunal Patil [8] proposed a system in which the program may not work properly for images without definite edges. It works well for image or document text that has a distinct edge. It does not work properly for objects with fancy fonts, transparent text, text that is too small, blurred text, or non-planar surfaces. The labelling algorithm needs to be improved; a better labelling method for components could improve the detection of characters. This could give better results for circular text, which tends to be dismissed as noise because of the grouping of the letters. The paper appeared in the International Journal of Advanced Research in Computer Engineering and Technology (IJARCET), Volume 5, Issue 10, October 2016.

      Anush Gorl, Akash Sehrawat, Ankush Patil, Prashanth Chougule and Supriya Khatavkar [9]. Title – "Raspberry Pi Based Reader for Blind People" proposed a text reader that uses OCR technology. It was published in June 2018 in the International Research Journal of Engineering and Technology (IRJET). It can read written or printed text. The main limitation is that it can read only a single language. It uses a Linux-based operating system. Text-to-speech conversion is done by the Raspberry Pi, which uses the Tesseract library with the help of the Python language. The image captured by the Pi camera is enhanced using the OpenCV library. The result is heard as audio output.

      Priya, Amandeep Kaur Gahier and Mamoon Rashid [10]. Title – "Conversion of text to speech in the Punjabi language". They use a method consisting of a pre-processor, morphological analysis, contextual analysis, syntactic-prosodic analysis, a letter-to-sound module and a prosody generator. The main drawback is the lengthy procedure, which has 13 steps to be followed. It also involves various speech technologies. It detects and reads the Punjabi language only; if the image is not clear, it does not recognise the text and gives no output. It uses a neural network for classification in simulation. It gives output only when the speech file matches the result.

      Asha . Hagargund, Sharsha Thota, Mitadru and Eram Shaik [11]. Title – "Image to speech conversion for the visually impaired". It was published in June 2017 in IJLRET. The device is intended to help visually impaired people. Its basic framework is an embedded system that captures an image using the Pi camera. It does not read or capture images. It also uses OCR and a TTS engine to produce speech output, and it needs battery backup for the system. A bilateral filter is used for noise removal. It uses Raspbian as the operating system, which is free and optimised for the Pi hardware. The output is obtained through the Raspberry Pi audio jack using speakers or headphones.

      Vemula Sindhu, D. Vara Prasad and Md. Shokat Ali [12]. Title – "System is to identify the object of the blind people through the android system and MATLAB". It was published in October 2016 in IJPRES. The system has a model that captures information that helps blind people. They use an algorithm to compare images based on the edges and pixels of the captured image. They have issues regarding the user interface and the robustness of the algorithm in reading text from complex and varied backgrounds or different objects. The system also reads text printed on a cylindrical surface and on a dark surface.

      Mallappa D. Gurav, Shruti S. Salimath, Shruti B. Hatti, Vijayalaxmi I. Byakod [13]. Title "A reading aid for the blind people using OCR and OpenCV". It was published in May 2017 by ISRET. The system helps blind people read hand-held objects and the text on documents. To read text against a complex background, they proposed an algorithm, namely novel text localisation based on edge distribution and stroke orientation. They use the global structural feature of text at every pixel. For blind users, off-the-shelf OCR is used for word recognition on the text region, and the result is converted to audio form. The image is converted into data, and the data detected in the image is displayed in the status bar. The data obtained is pronounced using the Flite library through earphones.

      Aaron James S, Sanjana S and Monisha M. [14]. Title "OCR based automatic book reader for the visually impaired using Raspberry PI". It was published in January 2016 by IJIRCCE. It uses Python as the main programming language, with support for BBC BASIC. At each pixel, the corresponding feature maps estimate the global structural feature. The camera starts streaming when the Raspberry Pi board is powered. To present the image to the board, the object whose label is to be read is placed in front of the camera and the capture button is clicked. The text is read and the output is in audio form.

      Satish Kumar G.A.E, Munindhar.M. [15]. Title – "The Intelligent Assistive System for Visually Disabled Persons". It was published in November 2019 by IJRTE. The device can be designed into a single goggle so that wearing it is comfortable. For impaired people, it involves a new technology, a neuron-based visual system that can directly communicate with damaged neurons in the brain. The main goal of the paper is to add memory to store the converted files so that they can be retrieved later. This will be useful for visually impaired people in coping with the outside environment.


    In the existing system, object detection is done with the help of sensors. It is difficult to identify text with a complex background, because individual factors, namely accuracy, resolution, range, control interface and environmental conditions, need to be considered, and the cost is high. These are major factors to be considered. The system also provides speech in only a single language.

    Limitations of Existing System

    The existing system provides output in only a single language, and it cannot provide speech output for text with a complex background.

    Problem Statement.

    About 90% of the world's visually impaired people live in developing countries. The existing system is significantly slower. Daily and current information is not readily available in Braille. Without vision, it is very challenging for a visually impaired person to learn things. Visually impaired people have to depend on another person for their learning. The existing system guides the visually impaired in only a single language.

    Aim and objectives

    The aim of the project is to design an automatic text reader that works in multiple languages and with complex backgrounds, with push buttons that help children learn basics such as the alphabet. The objectives are as follows:

      • Image capturing through streaming.

      • Process the image and detect the object.

      • Compute the size, distance and direction.

      • Capture the image of the text and predict the characters.

      • Multiple libraries and TTS creation.


    Visual impairment makes life fairly hard for people who suffer from this health problem, but the use of technology can help with some day-to-day tasks. This work focuses on the development of a photo-to-speech application for the visually impaired. The project is called automatic text/document reader for the visually impaired, and its ultimate purpose is to assist the visually impaired while learning.

    To achieve this, Optical Character Recognition (OCR) and Text to Speech Synthesis (TTS) frameworks are integrated on the Raspberry Pi. The output is provided in multiple languages.
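Supporting several languages means that the OCR stage and the TTS stage must agree on which language is active. The sketch below shows one possible way to keep the two settings together. The three languages listed are assumptions chosen for illustration, since the paper does not enumerate the supported languages; the three-letter codes follow the naming used by Tesseract language packs and the two-letter codes are ISO 639-1 codes of the kind many TTS engines accept. `LanguageConfig` and `select_language` are hypothetical names.

```python
from typing import NamedTuple


class LanguageConfig(NamedTuple):
    ocr_code: str  # language pack name for the OCR engine
    tts_code: str  # language code for the TTS engine


# Assumed language set, for illustration only.
LANGUAGES = {
    "english": LanguageConfig("eng", "en"),
    "hindi": LanguageConfig("hin", "hi"),
    "kannada": LanguageConfig("kan", "kn"),
}


def select_language(name: str, default: str = "english") -> LanguageConfig:
    """Look up the OCR/TTS codes for a language, falling back to the default."""
    return LANGUAGES.get(name.strip().lower(), LANGUAGES[default])
```

Keeping both codes in one record avoids the situation where text recognised in one language is spoken with the voice of another.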

    A. Methodology

    The device consists of two parts, an image processing module and a voice processing module. Optical character recognition (OCR) is the process that converts scanned or printed text images into text format for further processing. Our project presents a basic approach for text extraction and its conversion into speech.
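The two-module structure can be sketched as a small pipeline in which each stage is passed in as a function. This is only an illustration under stated assumptions: `read_aloud` and its parameter names are hypothetical, and on the actual device the three callables would wrap the webcam capture, the OCR engine and the TTS engine respectively.

```python
from typing import Any, Callable


def read_aloud(capture: Callable[[], Any],
               recognise: Callable[[Any], str],
               speak: Callable[[str], None]) -> str:
    """Capture an image, recognise its text and speak it; return the text."""
    image = capture()                  # image processing module: acquire page
    raw_text = recognise(image)        # image processing module: OCR
    text = " ".join(raw_text.split())  # collapse line breaks and extra spaces
    if text:                           # stay silent when nothing was recognised
        speak(text)                    # voice processing module: TTS
    return text
```

A usage example with stand-in stages:

```python
spoken = []
read_aloud(lambda: "page image", lambda img: " Hello\n world ", spoken.append)
# spoken now holds the single cleaned-up sentence that would be sent to the TTS engine
```

Injecting the stages keeps the control flow testable on a desktop machine without a camera or speaker attached.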

    Initially the Raspberry Pi module is tested. A text-to-speech (TTS) system produces a more natural voice that closely matches the human voice. Email and messaging are examples of applications of speech synthesis. The first step of speech synthesis is for the user to speak a word into a microphone; the speech is then converted into digital format by an analog-to-digital converter and stored in memory.

    Speech synthesis is the artificial production of the human voice. A computer system used for this purpose is known as a speech synthesizer, and it can be implemented in software or hardware products. A text-to-speech (TTS) system converts normal language text into speech. Synthesized speech can be created by concatenating pieces of recorded speech that are stored in a database.
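Concatenating stored recordings can be illustrated with a toy sketch. Here each recorded unit is represented as a short list of integer samples; the database contents, the `gap` parameter and the function name `synthesise` are all hypothetical and serve only to show the principle, not the TTS engine used on the device.

```python
from typing import Dict, List


def synthesise(units: List[str],
               database: Dict[str, List[int]],
               gap: int = 2) -> List[int]:
    """Join the recorded samples of each unit, separated by short silences."""
    samples: List[int] = []
    for index, unit in enumerate(units):
        if unit not in database:
            raise KeyError(f"no recording stored for unit {unit!r}")
        if index:
            samples.extend([0] * gap)  # silence between consecutive units
        samples.extend(database[unit])
    return samples
```

The sketch also shows why a larger database matters: any unit that has no stored recording cannot be spoken at all.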

    The quality of a speech synthesizer is judged by its naturalness, or similarity to the human voice, and by its ability to be understood clearly. Our project summarises the published literature on Text to Speech (TTS) and discusses the efforts made in each paper. The system will be especially helpful for illiterate and visually impaired people to hear and understand text in different languages. Children can also use the push buttons to learn the basics.

    Raspberry Pi is a small, low-cost computer that can be used with a monitor, keyboard and mouse to become an efficient, full-fledged computer. The Raspberry Pi uses software that is either free or open source, which also makes it cost-effective. It uses an SD card for storage, and its small size gives the benefit of portability. As part of the software development, open-source computer vision libraries are used for image processing. The Raspberry Pi is also used for real-time image/video processing, IoT-based applications and robotics applications. Raspbian is the official operating system available for Raspberry Pi applications and is efficiently optimised for use with the Raspberry Pi.

    Optical Character Recognition (OCR)

    Optical character recognition is the electronic conversion of images of written and printed text into machine-encoded text. It reads text from documents and photographs. Early versions of OCR had to be trained on each character and worked on only one font. Advanced versions handle most fonts and also accept digital image file formats as input. It is one of the important tools that give blind or visually impaired people access to printed information. First, it scans the image; then it recognises the characters in the content and converts the recognised information into an electronic file that can be used to speak the text.

    Text to Speech Synthesis (TTS)

    Visually impaired people cannot see the text in pictures, newspapers, textbooks and so on. This makes them unable to learn the information contained in those sources, which hampers their learning. Hence, a device is needed to overcome this problem. TTS helps visually impaired people by converting text into speech so that they can clearly receive the information printed in pictures, papers, notes and so on.


      Fig.1 Hardware diagram of the proposed system

      Fig 2: Connection diagram

      We have proposed a wearable device with a camera that captures the text; the text in the captured image is first detected and extracted using the OpenCV library. The OCR method is then used, which converts images of written or printed text into machine-encoded text. After the image is captured, the OCR output is checked for errors using a post-processing algorithm. The recognised text is then converted into speech signals and read out through headphones or a speaker [16]. The use of multiple libraries allows the device to be used in multiple languages. The software used in this system is written in Python. The whole system is implemented on the Raspberry Pi 3 model.
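The paper does not specify which post-processing algorithm is applied to the OCR output. As one possible illustration, the sketch below corrects recognised words against a small vocabulary using fuzzy matching from the Python standard library (`difflib.get_close_matches`). The vocabulary, the `cutoff` value and the function name `correct_words` are assumptions made for this example.

```python
import difflib
from typing import Iterable


def correct_words(text: str, vocabulary: Iterable[str], cutoff: float = 0.75) -> str:
    """Replace each word with its closest vocabulary entry, if one is close enough."""
    vocab = list(vocabulary)
    corrected = []
    for word in text.split():
        # Best candidate whose similarity ratio is at least `cutoff`.
        match = difflib.get_close_matches(word.lower(), vocab, n=1, cutoff=cutoff)
        # Unknown words are left exactly as the OCR produced them.
        corrected.append(match[0] if match else word)
    return " ".join(corrected)
```

A typical OCR confusion such as the digit 1 in place of the letter l is repaired this way, while words with no close match are passed through unchanged rather than guessed.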


      Fig.3 Proposed system workflow

      The system is implemented using the Raspberry Pi 3 model as its backbone. The major feature of this project is the use of this module for teaching blind students, i.e., teaching alphabets and alphabet words in English or other languages. Using push buttons, students can click and listen to the respective alphabets any number of times, which helps them learn the language effectively and easily. The steps of character recognition are as follows. The webcam captures the image, which can then be read. Pre-processing is done in the next step: the colour image is converted into grayscale, and the grayscale image is converted into a binary image. The characters are separated and the image is resized. Templates that can be matched are loaded, and the background is removed. Edge detection is done in the last stage of character recognition; the text file is then opened and written so that the output is stored in text form.
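The colour-to-grayscale and grayscale-to-binary steps of the pre-processing stage can be sketched in plain Python. The luma weights used here are the common ITU-R BT.601 values and the fixed threshold of 128 is an assumption; the implemented system may use different weights or an adaptive threshold from an image processing library. The function names are hypothetical.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (red, green, blue), each 0..255


def to_grayscale(rgb_image: List[List[Pixel]]) -> List[List[int]]:
    """Convert an RGB image to grayscale using BT.601 luma weights."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in rgb_image
    ]


def to_binary(gray_image: List[List[int]], threshold: int = 128) -> List[List[int]]:
    """Mark dark pixels (ink) as 1 and light pixels (paper) as 0."""
    return [[1 if value < threshold else 0 for value in row] for row in gray_image]
```

After this step every pixel is either ink or background, which is the form that character separation and template matching operate on.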


    This system helps visually impaired people read documents automatically. They only need to be trained to use the system. The system reads multiple languages. Blind children should be trained to use the push-button mechanism.
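The push-button learning mode can be illustrated with a minimal sketch that maps each button to a letter and an example word, producing the phrase that would be passed to the TTS engine. The number of buttons (one per English letter), the example words and the function names are assumptions for illustration; the paper does not describe the actual button layout.

```python
import string
from typing import Dict, Iterable, Tuple


def build_button_map(words: Iterable[str]) -> Dict[int, Tuple[str, str]]:
    """Map button index 0..25 to (letter, example word or empty string)."""
    by_letter = {word[0].upper(): word for word in words if word}
    return {
        index: (letter, by_letter.get(letter, ""))
        for index, letter in enumerate(string.ascii_uppercase)
    }


def phrase_for_button(button_map: Dict[int, Tuple[str, str]], index: int) -> str:
    """Return the phrase to be spoken when the given button is pressed."""
    letter, word = button_map[index]
    return f"{letter} for {word}" if word else letter
```

Because the mapping is plain data, the same buttons can be reused for another language by supplying a different alphabet and word list.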

    Fig.4 : Project experiment setup

    Fig.5: Capturing text in the document

    Fig.5 shows the text captured so that it can be recognised and converted to voice output.

    Fig.6: Display of recognized text as output

    Fig.6 shows the output of the captured image at the Python shell. With the help of this system, the literacy rate of visually impaired people can be increased, because the system is low-cost and user-friendly. The recognised text is converted to speech output at the speaker.

    Advantages of proposed system

      • Reduces the number of books that need to be converted to Braille.

      • Improves learning skills

      • Quick response

      • Easy to use.

      • Portable.

      • Cost beneficial.

      • Clarity of speech.


This system enables the visually impaired to not feel at a disadvantage when it comes to reading text not written in Braille. The image pre-processing part extracts the required text region from the complex background to give a good-quality input to the OCR. The output of the OCR is text, which is sent to the TTS engine to produce speech output in multiple languages. The designed system offers portability, as a battery may be used to power it. It can be used in blind schools and colleges, thereby helping to increase the literacy rate. It can also serve as an application of artificial intelligence. It is helpful for illiterate people and helps reduce the illiteracy rate. A mobile application based on the same principle can also be developed.


