An Assistive Reading System for Visually Impaired and Blinds using OCR and TTS Techniques on LabVIEW

DOI : 10.17577/IJERTCONV7IS10073

Darshini Raj

Dept. of TCE, Mysuru, Karnataka, India

Hiba Fathima Tazeen

Dept. of TCE, Mysuru, Karnataka, India

Varsha Ravish

Dept. of TCE, Mysuru, Karnataka, India

Anupama H.N

Dept. of TCE, Mysuru, Karnataka, India

Nagashree R.N

Dept. of TCE, GSSSIETW, Mysuru, Karnataka, India

Abstract: Extracting knowledge simply by listening to sounds is a distinctive ability. Although text is a medium of communication, speech is a more powerful one. Optical character recognition has become one of the most successful technologies in the field of pattern recognition and artificial intelligence. To improve access to textual information, an assistive system is used that reads the text from handwritten and scanned documents and converts the textual information to speech. The speech signals produced can be saved and reproduced for later use. The main objective of this paper is to develop a cost-effective and user-friendly speech synthesis system based on optical character recognition. This paper integrates text recognition and a speech synthesizer, implemented using Laboratory Virtual Instrument Engineering Workbench (LabVIEW 2017 version).

Keywords: Optical character recognition, Text to Speech, Image acquisition, LabVIEW.

  1. INTRODUCTION

    In day-to-day life, text is present everywhere, either in documents or in natural scenes, where a sighted person can read it. Having machines imitate human activities such as reading and writing is an old dream that has now become reality. Unfortunately, blind and visually impaired persons have difficulty reading such information, and this restricts their mobility in unconstrained environments. Optical character recognition (OCR) technology identifies characters automatically through an optical mechanism. It converts typed or printed text in scanned documents, newspapers and magazines into machine-encoded text.

    Over time, many approaches have been put forward for OCR-based speech synthesis. One OCR method is presented using OCR technology and a Windows phone with a higher-quality camera. Interpreting text from real-world pictures is a challenging problem because of changing environmental factors, although it becomes easier when a high-quality open-source OCR engine is used [1]. An Orange Pi based system reads the text present in an image to help blind people: when the OCR output is given to the Orange Pi, it determines the image content and produces the output as an audio signal [2]. An assistive system has been proposed for visually impaired and blind people; it reads textual information on paper and generates the corresponding voice output using OCR and a text-to-speech (TTS) synthesizer [3]. An improved text-detection approach aims to develop a camera-based text reading system for people who have trouble reading; the authors built a working model with pan-tilt-zoom capability and evaluated a new text-detection method for image areas composed of small characters [4]. Text detection from natural scene images [5] proposes a system that reads the text encountered in natural scenes and aims to support visually impaired people; camera-based document analysis becomes a real possibility with the higher resolution and wide availability of digital cameras.

    In the proposed system, OCR technology, which is one of a family of techniques for automatic recognition, and a text-to-speech (TTS) synthesizer are used. OCR-based speech synthesis produces human speech artificially. Synthesis here means producing speech waveforms through text-to-speech conversion in LabVIEW, which provides a more powerful medium of communication than text because blind people can also respond to sound. The proposed methodology aims to develop an efficient, cost-effective and user-friendly application so that people with blindness can interact with their environment as a sighted person does.

  2. METHODOLOGY

    The key function of any assistive reading system is text information extraction, which is a central part of OCR. The methodology proposed in this paper uses an assistive system that reads text from scanned documents and then converts the textual information to speech.

    Figure 1: Conversion of text to speech process

    1. Optical character recognition

      OCR is an acronym for optical character recognition, a technique that performs automatic identification of characters through an optical mechanism. It converts printed or typed text captured by a scanner into machine-editable text. The OCR-based system consists of the following processes.

      • Image Acquisition

      • Image Pre-processing (Binarization)

      • Image Segmentation

      • Matching and Recognition

      1. Image acquisition

        Image acquisition is the process in which an image of the text to be read is captured using a USB camera. The flap of the camera is kept open during acquisition in order to get a uniform white background. The image is obtained using the code generated in LabVIEW.

      2. Image Binarization

        Binarization is also called image pre-processing. It converts the grey-scale image, whose pixel values range from 0 to 255, into a binary image whose pixel values are 0 or 1. Once acquisition is done, the program creates a temporary memory location for an image that contains typed or handwritten characters, with the image type set to 8 bits per pixel. It then takes the session input, a unique reference to the camera obtained from the IMAQ open camera function, together with the reference to the image as an image input, and outputs the image filled with the captured pixel data.
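        The mapping from grey levels to a binary image can be illustrated outside LabVIEW with a minimal Python sketch. This is not the paper's implementation: the function name `binarize`, the fixed threshold of 128, and the choice that dark pixels become 1 (ink) are all assumptions made for illustration, since the paper does not state its threshold or polarity.

```python
def binarize(gray, threshold=128):
    """Map each 8-bit grey level (0-255) to 0 or 1.

    Assumption: pixels darker than `threshold` are ink (1) and all
    other pixels are background (0). The threshold value is hypothetical.
    """
    return [[1 if pixel < threshold else 0 for pixel in row] for row in gray]


# A tiny 3x4 grey-scale patch: dark strokes on a light page.
gray_patch = [
    [250, 30, 245, 240],
    [248, 25, 20, 238],
    [252, 35, 247, 241],
]

binary_patch = binarize(gray_patch)
for row in binary_patch:
    print(row)
```

        A fixed global threshold is the simplest choice and suits the uniform white background described for the acquisition step; unevenly lit images would typically need an adaptive threshold instead.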

      3. Image Segmentation

        Image segmentation is the process of separating a given image into multiple segments. It aims to change the representation of the image into something that is easier to analyze. This process reads the text in the image, identifies all objects in the image based on the properties that are set, and then compares each object with every character in the character set file.
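        One simple way to separate a binarized text line into character objects is a vertical projection: columns without ink are treated as gaps between characters. The Python sketch below shows that idea only; it is a swapped-in technique for illustration, not the object detection that the LabVIEW OCR functions perform, and `segment_columns` is a hypothetical name.

```python
def segment_columns(binary):
    """Return (start, end) column ranges, end exclusive, one per glyph.

    Assumption: `binary` is a list of equal-length rows holding 0
    (background) or 1 (ink), and glyphs are separated by at least one
    empty column.
    """
    width = len(binary[0])
    # A column belongs to a glyph if any row has ink in it.
    has_ink = [any(row[x] for row in binary) for x in range(width)]

    segments, start = [], None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x                    # a glyph begins here
        elif not ink and start is not None:
            segments.append((start, x))  # the glyph ended before column x
            start = None
    if start is not None:                # glyph touching the right edge
        segments.append((start, width))
    return segments


# Three glyphs separated by blank columns.
binary_line = [
    [1, 1, 0, 0, 1, 0, 1, 1],
    [1, 0, 0, 0, 1, 0, 1, 0],
    [1, 1, 0, 0, 1, 0, 1, 1],
]

segments = segment_columns(binary_line)
# Cut each glyph out of the line so it can be matched on its own.
glyphs = [[row[a:b] for row in binary_line] for a, b in segments]
print(segments)
```

        Projection-based splitting fails when characters touch or overlap, which is one reason practical OCR engines rely on connected-object analysis with configurable size and spacing properties.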

      4. Matching and Recognition

        Matching and recognition is the process in which the correlation between the stored templates and each segmented character is obtained. For each object, the character that most closely matches the object is selected. The substitution character is used for any object that does not match any of the trained characters; the Substitution Character property specifies which character that is. Here the IMAQ OCR read function is used to match the characters and return them in the read string indicator.
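        The selection rule, best-matching trained character or else the substitution character, can be sketched in Python as follows. This is an illustrative stand-in rather than the IMAQ OCR algorithm: the score used here is the plain fraction of agreeing pixels instead of a true correlation, glyphs and templates are assumed to have the same size, and the names `match_score`, `recognize`, `TEMPLATES`, the 0.8 acceptance score and the "?" substitution character are all hypothetical.

```python
def match_score(glyph, template):
    """Fraction of pixels on which glyph and template agree (0.0 to 1.0)."""
    total = sum(len(row) for row in template)
    same = sum(
        g == t
        for g_row, t_row in zip(glyph, template)
        for g, t in zip(g_row, t_row)
    )
    return same / total


def recognize(glyph, templates, min_score=0.8, substitution="?"):
    """Pick the closest trained character, or the substitution character."""
    best_char, best_score = substitution, 0.0
    for char, template in templates.items():
        score = match_score(glyph, template)
        if score > best_score:
            best_char, best_score = char, score
    # Objects that match no trained character well enough are substituted.
    return best_char if best_score >= min_score else substitution


# A toy "character set file": 3x3 binary templates for three letters.
TEMPLATES = {
    "I": [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
    "L": [[1, 0, 0], [1, 0, 0], [1, 1, 1]],
    "T": [[1, 1, 1], [0, 1, 0], [0, 1, 0]],
}

noisy_i = [[0, 1, 0], [0, 1, 0], [0, 1, 1]]  # an "I" with one flipped pixel
unknown = [[1, 0, 1], [0, 1, 0], [1, 0, 1]]  # resembles no trained letter

print(recognize(noisy_i, TEMPLATES), recognize(unknown, TEMPLATES))
```

        The acceptance threshold trades missed characters against wrong ones: a lower value substitutes less often but accepts poorer matches.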

    2. Text to speech

    The text extracted by optical character recognition can be read automatically by a text-to-speech synthesizer. A computer system used to produce artificial human speech is called a speech synthesizer. A TTS system converts normal language text into speech. In LabVIEW, .NET objects are used for this: a constructor node creates an instance of the system speech synthesizer using its initialization parameters. The output of the constructor node is given to an invoke node, which converts text in string format to speech.

  3. RESULTS

    The proposed system has been developed using LabVIEW 2017; it reads the text and converts it into speech. Because a database approach is used, the accuracy level is high.

    This process consists of two steps:

      • OCR

      • Conversion of text to speech

    1. OCR

      In this step an image is given as input, as shown in figure 2, where image acquisition takes place. The acquired image is then binarized and segmented, as shown in figure 3.

    2. Conversion of text to speech

    In this step the text is converted to speech, as shown in figure 4. Before conversion to speech, the acquired text is matched with the trained characters stored in the database.

  4. CONCLUSION

This paper presents an OCR-based speech synthesis system that produces effective speech output in wave file format, which is very useful for blind people. The system has been implemented using LabVIEW 2017. The methodology is carried out in two processes, OCR and speech synthesis. In OCR, printed or written documents are scanned and the image is acquired using IMAQ Vision in LabVIEW; the acquired characters then undergo segmentation and template matching in LabVIEW. The resulting text output is then converted into speech. The system is limited in interpreting handwritten text: it recognizes only the handwritten characters that are stored in the database. The methodology developed is user friendly and cost effective, and the system is flexible enough to accommodate modifications when needed.

REFERENCES

  1. Kanchana V and Abdul Shabeer H, OCR Based Speech Synthesis System Using LabVIEW, Volume 5: PG Scholar/VLSI Design, Department of Electronics and Communication Engineering, Salem, (2014).

  2. Akshay A, Amrrith N P, Dwishanth P and Rekha V, A Survey On Text To Speech Conversion, Volume 5: Department of Computer Science and Engineering, Easwari Engineering College, Chennai, (2018) Mar- Apr.

  3. Akshay Sharma, Abhishek Srivastava, Adhar Vashishth, An Assistive Reading System for Visually Impaired Using OCR and TTS, PEC University of Technology, Chandigarh, (2014) June.

  4. Mauritius Seeger and Christopher Dance, Binarizing Camera Images for OCR, (2001) 13 September.

  5. K. Karthik EL, Ramkumar, I.S. Nivethitha, D. Vaishnavi, Tools Identification System with Audio Indication Using OCR in LabVIEW, IJSRD, vol. 3, Issue 01, (2015).

  6. S. M. A. Hague et al., Automatic detection and translation of Bengali text on road sign for visually impaired, Daffodil Int. Univ. J. of Sc. and Tech, vol. 2, (2007).

  7. B.B.Chaudhuri and U.Pal, A complete printed Bangla OCR system, Pat. Rec., vol. 31, pp. 531-549, (1997).

  8. Dr. Tulio Sulbaran, Web Based Speech to Text and Text To speech Application for Provide Online Academic text for Blind Students, (2004) June 3.

  9. K. Matsuo et al., Extraction of character string from scene image by binarizing local target area, Trans. Of The Ins. of Elec. Eng. of Japan, vol. 122-C(2), pp. 232-241, (2002).

  10. Nobuo Ezaki, Kimiyasu Kiyota, Bui Truong Minh, Marius Bulacu and Schomaker, Improved Text-Detection Methods For A Camera-Based Text Reading System For Blind Person, Toba National College Of Maritime Technology, Japan, Kumamoto National College Of Technology, Japan, Tokyo Institute Of Technology, Tokyo, Japan, (2005).

  11. Nobuo Ezaki, Marius Bulacu, Lambert Schomaker, Text Detection from Natural Scene Images Towards A System For Visually Impaired Person, Toba National College Of Maritime Technology, Japan, 17th International Conference On Pattern Recognition.

  12. Kaczmirek, Lars, Wolff, Klaus G, Survey Design For Visually Impaired and Blind People, Universal access in HCI, part 1, (2007).

  13. T. Dutoit, High quality text to speech synthesis: A Comparison Of Four Candidate Algorithms, IEEE International conference on Acoustics, speech and signal processing, (1994), pp. 19-22.

  14. Erin Brady, Meredith Ringel Morris, Yu Zhong, Samuel White and Jeffrey P. Bigham, Visual Challenges in the Everyday Lives of Blind People, CHI 2013, (2013) April 27-May 2.

  15. Yamaguchi and M. Maruyama, Character Extraction from Natural Scene Images by Hierarchical Classifiers, In Proceedings of the International Conference on Pattern Recognition, (2004), pp. 687-690.
