Creation of Speech Processing Model for Civil Engineering Profile

DOI : 10.17577/IJERTCONV8IS13015


P Saikeerthi Dept. of Ece, Student, GSSSIETW

Poorani A Dept. of Ece, Student, GSSSIETW

Dr S Padmashree Dept. of Ece, Professor, GSSSIETW

Abstract: In recent years, speech recognition technology has become increasingly popular. Speech recognition is the ability of a machine to identify words and phrases in spoken language and convert them to a machine-readable format. Manual error detection in the profiles of Civil Engineering drawings has always been a tedious task for reviewers working on Civil platforms. This paper reduces the engineers' work of commenting on each profile of an Engineering drawing by developing a model based on speech processing. The supporting technologies are the Flask web framework and an HTML page serving as the front-end User Interface. The paper applies this technology to the Engineering field for better and more accurate performance in reviewing profiles of steel structures.

Keywords: Speech Recognition, User Interface, Flask Framework, Web page, Civil Engineering Profile.


    The major issue faced by industries is a lack of awareness of the technologies currently on the market and of ways to use them to improve the production rate and to grow the number of clients and customers by providing satisfactory services and products.

    Figure 1: Civil Engineering Profile

    A drawing that contains information on landscaping, grading or other site details is called a Civil drawing or a Site drawing. Civil drawings are intended to give a civil engineer a complete picture of all the essentials required on a construction site. A Civil Engineering Profile (Figure 1) is obtained from the initial Contract sheet, which consists of a raw output of a civil structure with the name and specification of each profile attached to it.

    This Contract sheet is then replicated into a detailed Erectsheet (E-plan), where each profile of the civil structure is represented in detail. Later the E-plan is redrawn as the Drawing sheet (D-sheet), in which each profile is shown as a separate structure together with its specification. An example of a D-sheet is shown in Figure 2. The engineers review these D-sheets against the Client sheet for variations in profile specifications and annotate any they find, following the relationships between the profiles, which are captured using ontologies. Ontology is the philosophical study of being. Broadly, it studies concepts that directly relate to being, existence, becoming and reality, as well as the basic categories of being and their relations.
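
As a rough illustration of the review described above, the comparison of a D-sheet against the Client sheet can be sketched as follows. This is a minimal sketch and not the authors' implementation: the Profile fields, the profile marks (B1, C1, C2), the section names and the RELATIONS table standing in for the ontology are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    name: str        # hypothetical profile mark, e.g. "B1"
    section: str     # hypothetical section designation
    length_mm: int   # hypothetical specification field


# Hypothetical stand-in for the ontology: which profiles are related to which.
RELATIONS = {"B1": ["C1", "C2"], "C1": ["B1"], "C2": ["B1"]}


def find_variations(client, d_sheet):
    """Return {profile name: [(field, client value, D-sheet value), ...]}."""
    variations = {}
    for name, expected in client.items():
        drawn = d_sheet.get(name)
        if drawn is None:
            # The profile is on the Client sheet but absent from the D-sheet.
            variations[name] = [("missing", expected, None)]
            continue
        diffs = [
            (field, getattr(expected, field), getattr(drawn, field))
            for field in ("section", "length_mm")
            if getattr(expected, field) != getattr(drawn, field)
        ]
        if diffs:
            variations[name] = diffs
    return variations


def profiles_to_recheck(variations, relations):
    """Profiles related to a profile with a variation, excluding the flagged ones."""
    related = set()
    for name in variations:
        related.update(relations.get(name, []))
    return sorted(related - set(variations))


client_sheet = {
    "B1": Profile("B1", "ISMB 300", 6000),
    "C1": Profile("C1", "SHS 200x200x8", 3500),
}
drawing_sheet = {
    "B1": Profile("B1", "ISMB 250", 6000),
    "C1": Profile("C1", "SHS 200x200x8", 3500),
}
found = find_variations(client_sheet, drawing_sheet)
```

Here only B1 differs between the two sheets, so it is the only profile flagged for annotation, and the relation table suggests that the profiles related to it may deserve a second look.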

    Manual error detection in the profiles of engineering drawings has always been a tough task for developers. Rather than performing this tedious and time-consuming process manually, it would be more time-efficient to have a computer do it through speech processing. Based on the survey made, engineers in present-day industries annotate the Engineering Profiles manually, which makes it a tedious process.

    Figure 2: D-Sheet

    Having an Application Program Interface (API) that understands a speech input, recognizes it and reproduces it as text, with a background process working between the User Interface and the web-client interface, not only reduces the time spent on manual typing but also increases accuracy. The speech input used for the Civil Engineering profile consists of the various civil engineering terms used in annotating the profile.

    Speech processing is the study of speech signals and their processing methods, and lies at the intersection of natural language processing and digital signal processing. The signals are usually processed in a digital representation, so speech processing can be regarded as digital signal processing applied to speech signals. Speech recognition is the ability of a machine or program to identify words and phrases in spoken language and convert them to a machine-readable format. Speech processing technologies are used for digital speech coding, text-to-speech synthesis, spoken language dialog systems and automatic speech recognition. In recent years, speech recognition technology has become increasingly popular. Work processes become more efficient because document processing time is reduced; documents can be generated three times faster with speech recognition. Speech recognition is commonly used to perform commands, operate a device, or write without having to use a keyboard, mouse or any key press.

    The User Interface, abbreviated as UI, is the space where human-machine interaction takes place: the operator can reduce the number of inputs to the machine, and the machine, in turn, can reduce the number of unnecessary outputs to the user. The UI is used in this paper to upload the Civil Engineering Profile drawing for review by the engineers.

    Flask is a micro web framework written in Python. It is classified as a microframework because it does not require particular tools or libraries. It has no form validation, database abstraction layer, or other components for which pre-existing third-party libraries already provide common functions. A framework is a code library that makes a developer's life easier by providing reusable code for building scalable, reliable and maintainable applications. A framework is used here to provide the link between the Python code and the User Interface.

    This paper approaches the profile review of Civil Engineering drawings through speech processing, using civil engineering terms and ontologies to build relations. Speech is preferred over text because manual typing costs time and reduces accuracy, while the model to be generated has the potential to recognize the correct terms despite changes in accent and pronunciation.
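
One simple way to favour the correct civil engineering term when a word is recognized slightly wrongly is to match each recognized word against a list of known terms. The sketch below is an assumption for illustration only and is not a step described in the paper: the term list CIVIL_TERMS, the function name and the similarity cutoff of 0.8 are hypothetical, and the matching uses Python's standard difflib module.

```python
import difflib

# Hypothetical vocabulary of annotation terms; a real list would come from the reviewers.
CIVIL_TERMS = ["beam", "column", "gusset", "purlin", "rafter", "stiffener", "weld"]


def snap_to_vocabulary(text, vocabulary=CIVIL_TERMS, cutoff=0.8):
    """Replace each word with its closest vocabulary term, if one is similar enough."""
    corrected = []
    for word in text.lower().split():
        # get_close_matches returns at most n candidates scoring at least `cutoff`.
        match = difflib.get_close_matches(word, vocabulary, n=1, cutoff=cutoff)
        corrected.append(match[0] if match else word)
    return " ".join(corrected)


print(snap_to_vocabulary("check the gussett plate"))
```

A high cutoff keeps ordinary words untouched, at the cost of missing terms that were recognized very differently from their spelling.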


    The literature survey is split into three sections, one each on speech processing, machine learning and Civil Engineering drawings. Various approaches have been applied to accomplish speech recognition. In the survey on speech processing, the Hidden Markov Model (HMM) and Automatic Speech Recognition (ASR) models [1][2][3] proved to be more efficient, as they produced 69.4% accuracy on a random dataset. The work in [4] used GM-based prediction of voice quality for speech disorders, as well as noise extraction. Under machine learning, [5][6] model the phone-specific spectral envelope information in 2-4 ms of speech; the proposed CNN-based approach yields an ASR system that is used mainly for virtual speech assistants. Papers [7][8] worked on steel consumption in steel structure residences. The steel consumption of H-section steel beams and square steel tubular columns, which make up a relatively larger proportion than other members, is taken as the objective function. After the structural optimization design, the consumption of steel GL1 was reduced by 10.2% and the consumption of the square steel tubular column by 15.0% compared to the original design, and the total steel consumption was reduced by 12.6%. The approach proposed in these papers is an effective method for the optimization design of steel structure residences. These papers also summarize recently developed methods and theories for applications, including evolutionary computation, reasoning, expert systems, learning and classification, as well as chaos theory, simulated annealing and knowledge-based engineering.


    The major objective of this paper is to reduce the time consumed in reviewing through speech and to increase accuracy. The proposed model is designed to meet these objectives. It consists of three blocks: the User Interface, the Framework and the Speech Recognition module. The block diagram of the model is shown in Figure 3.

    Figure 3: Block Diagram

    Step 1. An input D-Sheet is obtained through the User Interface and verified for reviewing.

    Step 2. The Python speech recognition code runs in the background to recognize the speech given as a review, and the speech is converted into a readable format.

    Step 3. The Flask framework stores the converted text, which is displayed on the User Interface beside the input D-Sheet.

    Step 4. The process is repeated if further reviews are needed.
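
The four steps can be sketched as a single loop. This is a minimal sketch under stated assumptions rather than the authors' code: the function name review_session is hypothetical, the check in Step 1 is assumed to be a simple file-type test (the paper does not say how the D-Sheet is verified), and the recognizer is passed in as a plain function so that the flow can be followed without audio hardware.

```python
from typing import Callable, Iterable, List

ALLOWED_TYPES = (".png", ".jpg", ".pdf")  # assumed D-Sheet formats, not from the paper


def review_session(d_sheet_name: str,
                   spoken_reviews: Iterable[object],
                   recognize: Callable[[object], str]) -> dict:
    """Run Steps 1-4 for one D-Sheet and return what the UI would display."""
    # Step 1: obtain the D-Sheet and verify it (assumed here: check the file type).
    if not d_sheet_name.lower().endswith(ALLOWED_TYPES):
        raise ValueError(f"unsupported D-Sheet file: {d_sheet_name}")

    stored: List[str] = []
    # Step 4: repeat for as long as further spoken reviews are supplied.
    for audio in spoken_reviews:
        text = recognize(audio)   # Step 2: speech converted into text
        stored.append(text)       # Step 3: converted text stored for display
    return {"d_sheet": d_sheet_name, "reviews": stored}


# Usage with a stand-in recognizer that simply upper-cases its input.
result = review_session("d_sheet_01.pdf", ["check weld", "beam ok"], str.upper)
```

Passing the recognizer as an argument keeps the loop independent of any particular speech library.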

    All three blocks in the model work simultaneously to accomplish the required task.

    The Python code is used for speech recognition and text conversion, and the framework stores the converted text. The User Interface is used to get the input D-Sheet and to display the reviewed output.

    The User Interface consists of a user detail form and a review page whose columns hold the input space and the review space. The input D-Sheet is shown in the input space and the converted text is displayed in the review space.

    The Python code for speech processing is triggered from the User Interface itself. Once triggered, it initiates the speech recognition module, and the recognizer method recognize_google() recognizes the input speech and converts it to text.

    The converted text is stored by the Flask web framework, which acts as the link between the web page and the speech recognition module. The stored data is then displayed by the Python code in the review space of the web page.
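
The linking role of Flask can be illustrated with two small routes. This is a simplified sketch and not the paper's application: the /reviews URL, the in-memory list and the JSON responses are assumptions, whereas the model described here renders the text into the review space of an HTML page.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Hypothetical in-memory store; a real deployment would persist the reviews.
reviews = []


@app.route("/reviews", methods=["POST"])
def add_review():
    """Store one converted review sent by the speech recognition code."""
    payload = request.get_json(silent=True)
    text = payload.get("text", "") if isinstance(payload, dict) else ""
    text = str(text).strip()
    if not text:
        return jsonify(error="empty review"), 400
    reviews.append(text)
    return jsonify(count=len(reviews)), 201


@app.route("/reviews", methods=["GET"])
def list_reviews():
    """Return the stored reviews so that the web page can display them."""
    return jsonify(reviews=reviews)


if __name__ == "__main__":
    app.run(debug=True)  # development server only
```

The speech recognition code would post each transcript to the first route, and the page would fetch the second route to fill its review space.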


    Figure 4 presents the operation process in pictorial form. It depicts the data input, the working of the Python code and the Flask web framework, the conversion of speech to text, the storage of the converted text and its display on the review sheet.

    Figure 4: Flow Chart


    The main objective of this paper is to create a model that can understand the user's speech input and reproduce it as text, so that Civil Engineering profiles can be reviewed accurately and in less time. The proposed model proved to be effective in accomplishing this objective.


This work is supported in part by SANRIA Engineering and Consulting Pvt Ltd., a technology-driven engineering company which provides multidisciplinary solutions in connection design, 3D Modelling, Steel Detailing, As-Built Engineering, Building design and Project Management consulting.


  [1] Keiichi Otani and Takaaki Hasegawa, "The Image Input Microphone - A New Nonacoustic Speech Communication System by Media Conversion from Oral Motion Images to Speech," IEEE Journal on Selected Areas in Communications, Vol. 13, No. 1, January 2005.

  [2] Giuseppe Riccardi and Dilek Hakkani-Tür, "Active Learning: Theory and Applications to Automatic Speech Recognition," IEEE Transactions on Speech and Audio Processing, Vol. 13, No. 4, July 2006.

  [3] Alexander Kain and Michael W. Macon, "Spectral Voice Conversion for Text-to-Speech Synthesis," Center for Spoken Language Understanding (CSLU), Oregon Graduate Institute of Science and Technology, P.O. Box 91000, Portland, OR 97291-1000, USA.

  [4] Douglas O'Shaughnessy, "Interacting with Computers by Voice: Automatic Speech Recognition and Synthesis," Invited paper.

  [5] Oytun Türk and Marc Schröder, "Evaluation of Expressive Speech Synthesis with Voice Conversion and Copy Resynthesise Techniques," IEEE Transactions on Audio, Speech, and Language Processing, Vol. 18, No. 5, July 2010, p. 965.

  [6] Dimitri Palaz, Mathew Magimai and Ronan Collobert, "Analysis of CNN-based Speech Recognition System using Raw Speech as Input," Idiap Research Institute, Martigny, Switzerland; AI Research, Menlo Park, CA, USA.

  [7] Veton Këpuska and Gamal Bohouta, "Next-Generation of Virtual Personal Assistants (Microsoft Cortana, Apple Siri, Amazon Alexa and Google Home)," Electrical & Computer Engineering Department, Florida Institute of Technology, Melbourne, FL, USA.

  [8] Pengzhen Lu, Shengyong Chen and Yujun Zheng, "Artificial Intelligence in Civil Engineering," Faculty of Civil Engineering & Architecture, Zhejiang University of Technology, Hangzhou 310023, China, November 2012.

  [9] Zhang Hao, Liu Tielin, Liu Hong and Wang Zheng, "Optimization design for beam and column of steel structure residence," Shenyang Jianzhu University, Shenyang, Liaoning, 110168, China, 2015 8th International Conference on Intelligent Computation Technology and Automation.
