Intelligent Voice Operating System (IVOS)

Devang Sawant; Prateek Phadtare; Chirag Vyas

doi:10.17577/IJERTCONV3IS06020

ICONECT - 2015 (Volume 3 - Issue 06)


Open Access
Paper ID : IJERTCONV3IS06020
Published (First Online): 24-04-2018
ISSN (Online) : 2278-0181
Publisher Name : IJERT
License: This work is licensed under a Creative Commons Attribution 4.0 International License

Devang Sawant, Prateek Phadtare, Chirag Vyas

BE, Computer Engineering

KC College of Engineering and Management Studies & Research

Thane

Abstract: Microsoft and other OS designers have tried to incorporate features in their operating systems that help differently-abled users get the most out of their PCs. However, these accessibility features have their own limitations, and they rarely allow the user to interact with applications other than the OS itself. For example, using a media player or MS Office is not easily possible. As a result, people with dexterity issues, motion impairment or blindness cannot use all the features of the computer. To let these users use the computer to the fullest, we aim to design a super-layer OS called the Intelligent Voice Operating System (IVOS). With the help of this system, users will be able to open, browse and operate all computer applications through voice commands. More applications can be added as and when needed and operated by voice.

Keywords: Hidden Markov model, dynamic time warping, neural networks.

INTRODUCTION

Hardware and software that help people who are physically challenged are often called "accessibility options" when they refer to enhancements for using the computer. The wider field of assistive technology is vast and even includes ramp and doorway construction in buildings to support wheelchairs. Enhancements for using the computer include alternative keyboard and mouse devices, light signals that replace beeps for deaf users, screen magnifiers and text enlargers, and systems that form tactile Braille letters from on-screen text. Other examples are environmental control units (ECUs), automatic door openers, and Dragon NaturallySpeaking voice recognition software for those with physical challenges.

Mac OS X includes a wide variety of features and assistive technologies known as Universal Access that include screen and cursor magnification, a full-featured screen reader, visual flash alerts, closed-captioning support, and much more. Mac OS X includes all the features your application needs to make it accessible to users with special needs.

Apple strongly encourages developers to support the accessibility APIs in all of their applications so that they are compatible with features built into Mac OS X such as VoiceOver, as well as with third-party products. The Xcode and Interface Builder tools, as well as the Cocoa frameworks, make it easy to add accessibility tags like roles and descriptions.

For example, Interface Builder has an Inspector that allows you to enter a description for any control in the user interface; that description will be synthesized into speech when VoiceOver is enabled.

IVOS (Intelligent Voice Operating System) is an intelligent agent that offers both Speech Recognition and Text-to-Speech capabilities, allowing you to run the computer via voice commands. The menus and submenus of any software can be controlled, including total voice operation of MS Outlook. You can use voice commands to open files, folders or websites and much more. In addition, the Speech Recognition features allow you to convert your spoken words into typed text, compatible with any software that offers a text input window (email messages, forms etc.). Additional features include dynamic voice control, dictation, transcription of recordings and more.
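
As a rough illustration of how recognized phrases could be mapped to actions such as opening files, folders or websites, the following minimal Python sketch dispatches on the first word of an already-recognized command. The names `parse_command`, `dispatch`, `KNOWN_VERBS` and `DEMO_HANDLERS`, as well as the command vocabulary, are hypothetical and are not taken from IVOS; speech recognition itself is assumed to have happened upstream.

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical command verbs; a real vocabulary would be far larger.
KNOWN_VERBS = ("open", "browse", "dictate")


def parse_command(utterance: str) -> Optional[Tuple[str, str]]:
    """Split recognized text into (verb, argument), or None if the verb is unknown."""
    parts = utterance.strip().split(maxsplit=1)
    if not parts or parts[0].lower() not in KNOWN_VERBS:
        return None
    argument = parts[1] if len(parts) > 1 else ""
    return parts[0].lower(), argument


def dispatch(utterance: str, handlers: Dict[str, Callable[[str], str]]) -> str:
    """Route a recognized utterance to the handler registered for its verb."""
    parsed = parse_command(utterance)
    if parsed is None or parsed[0] not in handlers:
        return "unrecognized command"
    verb, argument = parsed
    return handlers[verb](argument)


# Demo handlers only describe the action. A real system might call
# webbrowser.open() or launch an application here instead.
DEMO_HANDLERS: Dict[str, Callable[[str], str]] = {
    "open": lambda target: f"opening {target}",
    "browse": lambda url: f"browsing to {url}",
    "dictate": lambda text: f"typing: {text}",
}

if __name__ == "__main__":
    print(dispatch("Open report.docx", DEMO_HANDLERS))
    print(dispatch("shutdown now", DEMO_HANDLERS))
```

Keeping the handlers in a dictionary matches the idea that more applications can be added as and when needed: supporting a new command only requires registering one more entry.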

LITERATURE SURVEY

Voice-based applications are gaining ground in the market, and voice is expected to replace text input in the near future. Voice also provides good authentication in comparison with text. In the future, all computer and mobile applications are expected to be accessible by voice, which will remove the need to use hands to operate a handheld system.

METHODOLOGY

Voice recognition consists of two main processes: acquiring speech signals, and processing the signals with computer algorithms to remove background noise and detect the speech accurately. The acquired signals can then be used to drive different actions, such as rejecting background and white noise, following the user's command, or accurately moving an object such as a wheelchair as the user wishes.

In voice recognition, a number of DSP algorithms are used to process the speech signal. Often, preloaded libraries intelligently predict upcoming words and complete the word or sentence based on the user's initial words. Speech recognition is generally implemented using Voice Activity Detection (VAD) for start and end detection, together with the zero-crossing method and 4th-order cumulants to determine the presence of speech. To obtain a good-quality speech signal, the bit rate and the sampling frequency of the input signal should not be excessively high.
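
The start and end detection described above can be sketched with two simple per-frame features: short-time energy and zero-crossing rate. The Python example below is a simplified illustration only. The frame length and both thresholds are assumed values, the function names are hypothetical, and the 4th-order cumulant test mentioned in the text is omitted. Treating "high energy and low zero-crossing rate" as speech is itself a simplification, since unvoiced sounds such as fricatives have a high zero-crossing rate.

```python
import math
from typing import List, Optional, Tuple


def frame_features(frame: List[float]) -> Tuple[float, float]:
    """Return (mean energy, zero-crossing rate) for one frame of samples."""
    energy = sum(s * s for s in frame) / len(frame)
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    zcr = crossings / (len(frame) - 1)
    return energy, zcr


def detect_speech(
    samples: List[float],
    frame_len: int = 160,          # assumed: 20 ms at 8 kHz
    energy_thresh: float = 0.01,   # assumed threshold
    zcr_thresh: float = 0.25,      # assumed threshold
) -> List[bool]:
    """Flag each full frame as speech (True) or non-speech (False)."""
    flags = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        energy, zcr = frame_features(samples[start:start + frame_len])
        flags.append(energy > energy_thresh and zcr < zcr_thresh)
    return flags


def endpoints(flags: List[bool]) -> Optional[Tuple[int, int]]:
    """Return (first, last) speech frame indices, or None if no speech was found."""
    active = [i for i, is_speech in enumerate(flags) if is_speech]
    return (active[0], active[-1]) if active else None


if __name__ == "__main__":
    # Synthetic input: silence, a 200 Hz tone sampled at 8 kHz, then silence.
    tone = [0.5 * math.sin(2 * math.pi * 200 * n / 8000) for n in range(320)]
    signal = [0.0] * 160 + tone + [0.0] * 160
    print(endpoints(detect_speech(signal)))
```

In practice the thresholds would be tuned to the microphone and the background noise level rather than fixed in advance.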

In the case of speech detection for patients with dysarthria, the overall algorithm and process become more complex because of differences in the energy and frequency of the tone. Some problems associated with dysarthric speech, which result from neuromuscular deficiency, are velopharyngeal noise, irregular articulation breakdown and mispronunciation of the fricative /v/ as the nasal /m/.

BUILDING BLOCKS FOR IMPLEMENTATION

Both strong software and strong hardware are necessary to implement a speech detection system. The first elements required for speech detection are a DSP algorithm and an ADC/DAC. The microphone is connected at the input of the system, where the speech signal is detected. The input signal is sent to the signal processing CMOS chip, where the DSP algorithm is run on the speech signal, and the signal is finally output to the speaker, which is connected to the PCB using USB. Since Matlab is slower than C/C++, the DSP code for processing signals is mainly written in C++. Overall, the CMOS chip would include modules for acoustics and dictionaries, along with a recognition decoder and an ADC at the input end. Once the speech signal is detected, software processes the signal to detect the speech accurately, and the result is output through the D/A converter. The load end of the system should have a low resistance of approximately 75-800 in order to reduce the overall system power consumption, which would be helpful in low-power CPU systems, or where the utilization of speech recognition systems has been limited by software. In other words, the input impedance should be 5-10 times higher than the output impedance of the system.
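
One way to read the 5-10 times impedance guideline is as a voltage-divider argument; this reading is an assumption, as the source does not derive the rule. If a stage with output impedance Z_out drives a stage with input impedance Z_in, and both are treated as purely resistive, the fraction of the source voltage that reaches the next stage is Z_in / (Z_in + Z_out). The short Python sketch below evaluates that fraction; the function names and the example impedance values are hypothetical.

```python
import math


def voltage_transfer(z_out_ohms: float, z_in_ohms: float) -> float:
    """Fraction of the source voltage that appears across the input impedance."""
    return z_in_ohms / (z_in_ohms + z_out_ohms)


def loss_db(z_out_ohms: float, z_in_ohms: float) -> float:
    """Voltage loss in decibels caused by the divider (always zero or negative)."""
    return 20.0 * math.log10(voltage_transfer(z_out_ohms, z_in_ohms))


if __name__ == "__main__":
    z_out = 100.0  # assumed example value, in ohms
    for ratio in (5, 10):
        z_in = ratio * z_out
        print(ratio, round(voltage_transfer(z_out, z_in), 3),
              round(loss_db(z_out, z_in), 2))
```

Under these assumptions, a ratio of 5 passes about 83% of the voltage and a ratio of 10 about 91%, so the signal loss stays below 2 dB across the recommended range.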

On the hardware side, the main input element is the microphone. For the microphone to supply a good speech signal to the ADC on the chip, it should meet some important specifications. Microphones respond better to frequencies from 20 Hz to 20 kHz than to higher frequencies. The sensitivity of the microphone should not vary by more than +/- 3 dB. In addition, the voltage produced in response to an acoustic stimulus should be in the hundreds of mV/Pa. For example, a sensitivity of 70 mV/Pa means that the microphone produces an output of 70 mV when presented with an input of 1 pascal (94 dB SPL).
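
The sensitivity example above can be checked numerically. The Python sketch below converts a sound pressure level to pascals using the standard 20 micropascal reference, then computes the microphone output voltage for a given sensitivity. The function names are hypothetical, and a perfectly linear microphone is assumed.

```python
import math

REFERENCE_PRESSURE_PA = 20e-6  # standard reference for dB SPL in air


def spl_to_pascal(spl_db: float) -> float:
    """Convert a sound pressure level in dB SPL to pressure in pascals."""
    return REFERENCE_PRESSURE_PA * 10.0 ** (spl_db / 20.0)


def output_millivolts(sensitivity_mv_per_pa: float, spl_db: float) -> float:
    """Output voltage in mV, assuming the microphone responds linearly."""
    return sensitivity_mv_per_pa * spl_to_pascal(spl_db)


def sensitivity_dbv(sensitivity_mv_per_pa: float) -> float:
    """Express sensitivity in dB relative to 1 V/Pa."""
    return 20.0 * math.log10(sensitivity_mv_per_pa / 1000.0)


if __name__ == "__main__":
    print(round(spl_to_pascal(94.0), 3))            # close to 1 Pa
    print(round(output_millivolts(70.0, 94.0), 1))  # close to 70 mV
    print(round(sensitivity_dbv(70.0), 1))          # dB re 1 V/Pa
```

The first result confirms that 94 dB SPL corresponds to roughly 1 Pa, which is why sensitivity figures are usually quoted at that level.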

ADVANTAGES AND FEATURES

DISADVANTAGES

The user may not be able to write an email using free speech; however, the on-screen keyboard will help in this kind of scenario.

FUTURE SCOPE

Human-computer interaction in artificial intelligence is a promising field. IVOS can significantly change how users work with their computers, with minimal use of the keyboard and mouse.

REFERENCES

"Speech recognition for disabled people". http://www.businessweek.com/1998/08/b3566022.htm.
John Pierce (1969). "Whither Speech Recognition". Journal of the Acoustical Society of America.
Janet M. Baker, Li Deng, James Glass, Sanjeev Khudanpur, Chin-Hui Lee, Nelson Morgan, Douglas O'Shaughnessy (May 2009). "Research Developments and Directions in Speech Recognition and Understanding, Part 1". IEEE Signal Processing Magazine. http://research.microsoft.com/pubs/80528/SPM-MINDS-I.pdf. Retrieved May 2010.
S. Suk, S. Chung, and H. Kojimma, Voice/Non-Voice Classification Using Reliable Fundamental Frequency Estimator for Voice Activated
A. Little and L. Reznik, Speech Detection Method Analysis and Intelligent Structure Development, In Proc. Australian New Zealand Conference on Intelligent Information System 96, 1996, pp. 2. [Accessed Sept. 3, 2008]
Ekpe Okorafor, Mensah Kwabena Patrick, Real-time Streaming Analysis for Hadoop and Flume, 2012.
