Implementation of Text to Speech Conversion

Chaw Su Thu Thu; Theingi Zin

doi:10.17577/IJERTV3IS030548

Volume 03, Issue 03 (March 2014)


Open Access
Published (First Online): 22-03-2014
ISSN (Online) : 2278-0181
Publisher Name : IJERT
License: This work is licensed under a Creative Commons Attribution 4.0 International License


Chaw Su Thu Thu¹, Theingi Zin²

¹Department of Electronic Engineering, Mandalay Technological University, Mandalay

²Department of Electronic Engineering, Mandalay Technological University, Mandalay

Abstract- Text-to-speech (TTS) conversion is a computer-based system that can read any text aloud, whether the text was entered directly into the computer by an operator or scanned and submitted to an Optical Character Recognition (OCR) system. Many existing systems convert normal language text into speech. The main aims of this paper are to study Optical Character Recognition together with speech synthesis technology and to develop a cost-effective, user-friendly image-to-speech conversion system using MATLAB. In this work, the OCR system is implemented for the recognition of the capital English characters A to Z and the numerals 0 to 9. Characters are recognized one at a time. Each recognized character is saved as text in a Notepad file. The resulting text-to-speech conversion system accepts text either from an image or from direct keyboard input and converts that text into speech using MATLAB.

INTRODUCTION

Speech synthesis is the artificial production of human speech. A computer system used for this purpose is called a speech synthesizer, and can be implemented in software or hardware [2]. A text-to-speech (TTS) system converts normal language text into speech; other systems render symbolic linguistic representations like phonetic transcriptions into speech.

Text-to-speech (TTS) conversion transforms linguistic information stored as data or text into speech. It is now widely used in audio reading devices for blind people [6]. In the last few years, however, the use of text-to-speech conversion technology has grown far beyond the disabled community to become a major adjunct to the rapidly growing use of digital voice storage for voice mail and voice response systems. Speech synthesis technology has also already been developed for various languages.

The Speech Application Programming Interface or SAPI is an API developed by Microsoft to allow the use of speech recognition and speech synthesis within Windows applications.
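
As a hedged illustration of how an application might call SAPI, the sketch below uses Python rather than the MATLAB used in this work. It assumes a Windows machine with the third-party pywin32 package installed. The `SAPI.SpVoice` COM object and its `Speak` method belong to SAPI; the helper names `make_sapi_voice` and `speak` are hypothetical.

```python
def make_sapi_voice():
    """Create a SAPI voice object (assumes Windows with pywin32 installed)."""
    # Imported lazily so that the module can still be loaded on other systems.
    import win32com.client
    return win32com.client.Dispatch("SAPI.SpVoice")


def speak(text, voice=None):
    """Speak `text` through `voice`; create a SAPI voice if none is passed.

    Returns False without speaking when the text is empty or whitespace.
    """
    if not text.strip():
        return False
    if voice is None:
        voice = make_sapi_voice()
    voice.Speak(text)  # with default flags the call returns after speaking
    return True
```

On a suitable Windows machine, `speak("Hello, how are you?")` would read the sentence through the default system voice. Passing the voice object as a parameter keeps the function usable with a stand-in object where SAPI is not available.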

PROPOSED ALGORITHM

In this work, there are two main parts:
- Optical Character Recognition System for Paper Text
- Text to Speech Conversion

OCR. The character image is mapped to a higher level by extracting special characteristics and patterns of the image in the feature extraction phase.
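
The text does not state which features are extracted. One common choice, shown here purely as an assumption, is zoning: the binary character image is divided into a grid, and the fraction of foreground pixels in each cell becomes one feature. The sketch is in Python rather than MATLAB, and `zoning_features` with its `rows` and `cols` parameters is a hypothetical helper.

```python
def zoning_features(binary_image, rows=2, cols=2):
    """Return the foreground-pixel density of each zone in a rows x cols grid.

    binary_image is a list of equal-length lists holding 0 (background)
    or 1 (ink). Zones are visited row by row, left to right.
    """
    height, width = len(binary_image), len(binary_image[0])
    features = []
    for r in range(rows):
        top, bottom = r * height // rows, (r + 1) * height // rows
        for c in range(cols):
            left, right = c * width // cols, (c + 1) * width // cols
            cells = [binary_image[y][x]
                     for y in range(top, bottom)
                     for x in range(left, right)]
            # An empty zone (image smaller than the grid) contributes 0.0.
            features.append(sum(cells) / len(cells) if cells else 0.0)
    return features


sample = [[1, 1, 0, 0],
          [1, 1, 0, 0],
          [0, 0, 0, 1],
          [0, 0, 1, 1]]
print(zoning_features(sample))  # prints [1.0, 0.0, 0.0, 0.75]
```

A finer grid gives a longer feature vector that captures more of the character's shape, at the cost of being more sensitive to small shifts in the cropped image.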

The classifier is then trained with the extracted features for the classification task. The classification stage identifies each input character image by considering the detected features. Template matching and neural networks are used as classifiers.
Text-to-speech conversion for e-text typed directly into the computer is also carried out by the above steps.
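
Template matching, one of the two classifiers named above, can be sketched as follows. This is a Python illustration, not the authors' MATLAB code. It assumes that every character is a binary image of the same size and scores similarity as the fraction of matching pixels; the toy 3x3 templates and the function names are hypothetical.

```python
def match_score(image, template):
    """Fraction of pixel positions where image and template agree."""
    total = agree = 0
    for image_row, template_row in zip(image, template):
        for pixel, reference in zip(image_row, template_row):
            total += 1
            agree += pixel == reference
    return agree / total


def classify(image, templates):
    """Return the label of the template that best matches the image."""
    return max(templates, key=lambda label: match_score(image, templates[label]))


# Toy 3x3 templates for illustration only; real templates would be
# full-size glyph images for A to Z and 0 to 9.
TEMPLATES = {
    "T": [[1, 1, 1],
          [0, 1, 0],
          [0, 1, 0]],
    "L": [[1, 0, 0],
          [1, 0, 0],
          [1, 1, 1]],
}

noisy_t = [[1, 1, 1],
           [0, 1, 0],
           [0, 0, 0]]  # a "T" with one missing pixel

print(classify(noisy_t, TEMPLATES))  # prints T
```

Pixel agreement is only one possible similarity measure; a correlation coefficient between the image and each template is another common choice.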
SIMULATION RESULTS

In this work, the OCR system is implemented for the recognition of the capital English characters A to Z and the numerals 0 to 9. Characters are recognized one at a time. Each recognized character is saved as text in a Notepad file. The program has two parts: the first produces the text output for an input image and then converts that text into speech; in the second, e-text is entered directly into the computer and then converted into speech.

First, an input image of bold Times New Roman characters at font size 12 is taken and converted into text. As shown in Figure 2, character A is cropped from the image and its features are extracted. It is then converted to text, which is saved in a Notepad file and spoken at the same time. Similarly, the test results for character T are illustrated in Figure 3. The recognized character can be displayed in the command window and saved in a Notepad file, as shown in Figure 4.

Figure 2. (a) Character A converted into text (b) Sound wave for character A

Figure 3. (a) Character T converted into text (b) Sound wave for character T

Figure 4. (a) Output text in command window (b) Saved text in Notepad (characters A and T)

Numerals are also successfully converted into text and then into speech, as shown in Figure 5.

Figure 5. (a) Number 5 converted into text (b) Sound wave for number 5

Characters in another font are also taken and successfully converted into text and then into speech, as shown in Figures 6 and 7.

Figure 6. (a) Character M converted into text (b) Sound wave for character M

Figure 7. (a) Number 2 converted into text (b) Sound wave for number 2

As illustrated in Figure 8, e-text typed directly into the computer from the keyboard is also converted into speech successfully.

Figure 8. (a) E-text input (b) Sound wave for "Hello, How are you?"
CONCLUSION

In this work, an image is converted into text and that text is then converted into speech using MATLAB. E-text is also converted into speech successfully. With this approach, text from a Word document, Web page or e-book can be read aloud as synthesized speech through a computer's speakers. For image-to-text conversion, the image is first converted into a grayscale image. The grayscale image is converted into a binary image by thresholding, and the binary image is then converted into text by MATLAB. The Microsoft Win32 SAPI library is used to build the speech-enabled application; it retrieves the voice and audio output information available on the computer. In this work, only one character can be converted into text at a time. As a further extension, the OCR system can be developed to convert images of words or sentences into text.
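
The grayscale and thresholding steps summarized above can be sketched in a few lines. This is a plain Python illustration rather than the MATLAB used in this work. The luminance weights are the standard ones for RGB-to-gray conversion; the fixed threshold of 128 and the assumption of dark ink on a light background are illustrative choices, since the paper does not state how the threshold is selected. The function names are hypothetical.

```python
def rgb_to_gray(rgb_image):
    """Convert rows of (R, G, B) tuples (0-255) to gray levels (0-255)."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
            for row in rgb_image]


def to_binary(gray_image, threshold=128):
    """Mark pixels darker than the threshold as ink (1), the rest as 0.

    The threshold value is an assumption made for this sketch.
    """
    return [[1 if level < threshold else 0 for level in row]
            for row in gray_image]


rgb = [[(0, 0, 0), (255, 255, 255)],
       [(255, 0, 0), (0, 0, 255)]]
gray = rgb_to_gray(rgb)
print(gray)             # prints [[0, 255], [76, 29]]
print(to_binary(gray))  # prints [[1, 0], [1, 1]]
```

The resulting binary image is the kind of input that a character cropping, feature extraction and classification stage would then work on.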

REFERENCES

[1] Ainsworth, W., "A system for converting English text into speech," IEEE Transactions on Audio and Electroacoustics, vol. 21, no. 3, pp. 288-290, Jun. 1973.
[2] Fushikida, K.; Mitome, Y.; Inoue, Y., "A Text to Speech Synthesizer for the Personal Computer," IEEE Transactions on Consumer Electronics, vol. CE-28, no. 3, pp. 250-256, Aug. 1982.
[3] Hertz, S., "English text to speech conversion with delta," IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '86), vol. 11, pp. 2427-2430, Apr. 1986.
[4] Lynch, M.R.; Rayner, P.J., "Optical character recognition using a new connectionist model," Third International Conference on Image Processing and its Applications, pp. 63-67, 18-20 Jul. 1989.
[5] Furui, S., "Speaker independent isolated word recognition using dynamic features of speech spectrum," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 34, no. 1, pp. 52-59, Feb. 1986.
[6] Leija, L.; Santiago, S.; Alvarado, C., "A system of text reading and translation to voice for blind persons," Proceedings of the 18th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, vol. 1, pp. 405-406, 31 Oct.-3 Nov. 1996.
[7] Tanprasert, C.; Koanantakool, T., "Thai OCR: a neural network application," Proceedings of the 1996 IEEE TENCON: Digital Signal Processing Applications, vol. 1, pp. 90-95, 26-29 Nov. 1996.
[8] Breen, A.P., "The future role of text to speech synthesis in automated services," IEE Colloquium on Advances in Interactive Voice Technologies for Telecommunication Services (Digest No: 1997/147), pp. 6/1-6/5, 12 Jun. 1997.
