Sign Language Detector Using Machine Learning

DOI : 10.17577/IJERTCONV11IS05013


Rudresh N C Assistant Professor Department of

Information Science and


PESITM, Shivamogga Email


Chandana K J Under Graduate Department of

Information Science and


PESITM, Shivamogga Email kjchandana.hlk@gmail.com

Gagana D Under Graduate Department of

Information Science and


PESITM, Shivamogga Email gaganadarmaraj@gmail.com

Pallavi K G

Under Graduate Department of Information Science and


PESITM, Shivamogga


Prashanth G V

Under Graduate Department of Information Science and


PESITM, Shivamogga


Keywords – Sign Language detector, signs or hand gestures, Machine Learning.


  1. Machine learning is a branch of AI that enables computers to learn and improve by analyzing data. It works by finding patterns in the collected information, so that models and algorithms can then make predictions based on that experience.

    Put another way, machine learning enables computers to solve problems without requiring explicit human intervention, and to take actions based on prior observations.

    To automate calculations, data processing, and pattern recognition, machine learning algorithms are fed examples of labeled data (training data).
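
    As a minimal sketch of learning from labeled data, the following toy classifier averages the feature vectors of each label and predicts the label with the nearest average. The two-dimensional features, the labels "A" and "B", and the function names are hypothetical and are not taken from the system described in this paper.

```python
from collections import defaultdict
from math import dist


def fit_centroids(samples):
    """Average the feature vectors of each label to get one centroid per label."""
    grouped = defaultdict(list)
    for features, label in samples:
        grouped[label].append(features)
    return {
        label: tuple(sum(column) / len(column) for column in zip(*rows))
        for label, rows in grouped.items()
    }


def predict(centroids, features):
    """Return the label whose centroid is closest to the given features."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


# Hypothetical labeled training data: (feature vector, label).
training_data = [
    ((0.9, 0.1), "A"),
    ((0.8, 0.2), "A"),
    ((0.1, 0.9), "B"),
    ((0.2, 0.8), "B"),
]

centroids = fit_centroids(training_data)
prediction = predict(centroids, (0.85, 0.15))
```

    The "training" step here is only an average per label; real sign detectors learn far richer patterns, but the flow of labeled examples in and predictions out is the same.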

    Communication is the act of transferring information from one place, person, or group to another. Three components make up communication: the speaker, the message, and the listener. Communication is considered successful only if the listener understands the speaker's message. There are various types of communication, including formal and informal communication, oral (face-to-face and distance) and written communication, non-verbal, grapevine, feedback, visual, and active listening.

    Non-verbal communication involves gestures, facial expressions, and body language.

    Although deaf or speech-impaired people can communicate among themselves using sign languages, people with normal hearing find it difficult to communicate with them, and vice versa, due to a lack of knowledge of sign languages. A solution that translates sign language gestures into English, a commonly spoken language, makes this communication much easier.

    The solution is our project, called Sign Language Detector. It addresses the problem faced by hearing people who do not know sign language and need to communicate with deaf or speech-impaired people.

    Characteristics of Machine Learning:

    • Since our paper is based on machine learning, it is important that the signs made in front of a webcam are captured through camera-based image acquisition and image processing, and that the platform is trustworthy. Machine learning enables this.

    • After the signed images or video are captured, the system should predict the outcome. Machine learning is used for this purpose.

    In this age of technology, the demand for computer-based systems is tremendous. Interesting technologies for speech recognition are being developed. The most important tool in sign language is fingerspelling, which spells out words character by character, alongside word-level association, in which hand gestures convey the meaning of a word. Fingerspelling is vital because it enables the communication of names, addresses, and other words that have no meaning in word-level association.

    Work on sign language detection is often driven by a desire to improve society and to develop better means of communication. Applying machine learning to sign language systems can help build trust with people, reach the right people, and improve administrative cost and efficacy.

  2. The study in [1] developed a new sign language detector model based on machine learning and analyses machine learning technology. Its main contribution for our purposes is its explanation of how a detected hand gesture is translated into the user's common language, and of the process involved.

    Paper [2] describes the types of communication and how different people communicate with each other using different communication methods.

    Paper [3] explains computer vision and the most important technique for this project, pattern recognition or sign recognition. From this paper we learned how signs can be detected in front of a webcam.

    Paper [4] gave us an understanding of the sign language recognition process, how to generate accurate output and, importantly, how to translate sign language into a language the user knows.


  3. Using the TensorFlow Object Detection API, the proposed system develops a real-time sign language detector that is trained through transfer learning on the created dataset. A webcam is used to capture images for data acquisition using Python and OpenCV as described in Section. After data acquisition, a label map is created that contains the label of each sign (letter of the alphabet) along with its identifier. This map represents all the objects within the model.
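
    The label map described above can be sketched as follows. This is an illustration only: the helper names are hypothetical, and the `item { name, id }` text layout is assumed to match the label map files commonly used with the TensorFlow Object Detection API.

```python
import string


def build_label_map(names):
    """Assign ids 1..N to the given class names, in order."""
    return [{"name": name, "id": index} for index, name in enumerate(names, start=1)]


def to_pbtxt(labels):
    """Render the labels in the assumed `item { name, id }` text layout."""
    lines = []
    for label in labels:
        lines.append("item {")
        lines.append(f"  name: '{label['name']}'")
        lines.append(f"  id: {label['id']}")
        lines.append("}")
    return "\n".join(lines) + "\n"


# One label per letter of the alphabet, with ids 1 to 26.
labels = build_label_map(string.ascii_uppercase)
label_map_text = to_pbtxt(labels)
```

    Writing `label_map_text` to a file would give a label map that training and inference code can share, so that both sides agree on which id stands for which letter.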

    There are 26 labels in the label map, each representing a letter of the alphabet. Each label has a unique id ranging from 1 to 26, which is used as a reference when looking up the class name. The TF record is TensorFlow's binary storage format and is used to train its object detection API; TF records are created using generate_tfrecord. Binary files take up less space on disk, copy fast, and read efficiently from disk, so storing the data in binary form has a significant positive effect on the performance of the import pipeline and, as a consequence, on training time. Object detection models can be developed, trained, and deployed easily with the TensorFlow Object Detection API, an open-source framework built on TensorFlow.
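
    The id-to-class-name lookup can be illustrated with a short sketch. It assumes that a detector returns parallel lists of class ids and confidence scores; the function name `best_letter` and the 0.5 score threshold are hypothetical choices, not values reported in this paper.

```python
import string

# Assumed mapping, following the label map: id 1 -> "A", ..., id 26 -> "Z".
CATEGORY_INDEX = {
    index: letter for index, letter in enumerate(string.ascii_uppercase, start=1)
}


def best_letter(class_ids, scores, min_score=0.5):
    """Return the letter of the highest-scoring detection, or None if too weak."""
    if not scores:
        return None
    best = max(range(len(scores)), key=scores.__getitem__)
    if scores[best] < min_score:
        return None
    return CATEGORY_INDEX[int(class_ids[best])]


# Hypothetical detector output for one frame: class ids and their scores.
letter = best_letter([3, 1], [0.40, 0.90])
```

    Returning `None` for weak detections lets the caller skip frames in which no sign is recognized with enough confidence.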

    For pipeline configuration, dependencies such as TensorFlow, config_util, pipeline_pb2, and text_format are imported in order to train a pre-trained model on the created dataset. The number of classes is changed from 90 to 26, the number of signs (letters of the alphabet) on which the model is trained. After the configuration was set up and updated, the model was trained for 10000 steps, the value chosen for the training-steps hyperparameter.
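
    The configuration change can be sketched without any TensorFlow dependency by editing the configuration as plain text. This is a simplified stand-in for parsing the configuration as a protobuf message, which is what the imported dependencies are for. The configuration excerpt, its `ssd` block, the starting `num_steps` value, and the `set_field` helper are all hypothetical.

```python
import re

# Hypothetical excerpt of a pipeline configuration in protobuf text format.
PIPELINE_EXCERPT = """\
model {
  ssd {
    num_classes: 90
  }
}
train_config {
  num_steps: 50000
}
"""


def set_field(config_text, field, value):
    """Replace the integer value of the first `field: <int>` entry."""
    pattern = rf"(\b{re.escape(field)}:\s*)\d+"
    updated, count = re.subn(pattern, rf"\g<1>{value}", config_text, count=1)
    if count == 0:
        raise KeyError(f"field not found: {field}")
    return updated


# 26 classes (one per letter) and 10000 training steps, as stated in the text.
config = set_field(PIPELINE_EXCERPT, "num_classes", 26)
config = set_field(config, "num_steps", 10000)
```

    Text substitution is fragile if a field name appears more than once, which is one reason to prefer editing the parsed configuration in a real project.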

  4. System architecture refers to the overall design and structure of a software system or application, including all of its parts, modules, interfaces, and data flows.


    Fig 1: System Architecture

    It entails drawing up a design for the system's construction, identifying the essential elements and how they relate to one another, and establishing the interfaces that connect them.

    When creating a system architecture, it is crucial to keep in mind a number of fundamental principles, such as modularity, scalability, maintainability, and dependability. Modularity refers to breaking the system down into smaller, more manageable components, whereas scalability is the system's capacity to accommodate growing volumes of data or users over time. Dependability is the system's ability to function reliably, whereas maintainability is the ease with which it can be upgraded or adjusted.

    This architecture illustrates the design and functional phases of our application. The user accesses the system through the graphical user interface.

    The next step describes how users interact with the project. First, they register and log in to the portal. They then make signs, which are captured by the webcam; the system detects those signs and predicts the outcome in a known language.


    This figure describes the workflow of the Sign Language Detector.

    Fig 2: Workflow Diagram

    The flow chart of the project starts with the login: every user, whoever they may be, has to enter their login details to enter the platform.

    After a successful login, the user gets access to the webcam and makes hand gestures, which the webcam captures. Image processing is then performed: the captured image is converted to grayscale for better visibility, and the converted image is compared with the trained dataset to extract its meaning. After the extraction, the grayscale image is converted back into the original input image, and finally the result is displayed to the user.
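
    The grayscale step of this workflow can be sketched in plain Python. The paper does not state which conversion is used; the weights below are the standard luminance weights (0.299, 0.587, 0.114) and are an assumption here, as are the function name and the tiny sample frame.

```python
def to_grayscale(image):
    """Convert rows of (R, G, B) pixels to rows of 0-255 luminance values."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in row]
        for row in image
    ]


# Hypothetical 2x2 RGB frame standing in for a captured webcam image.
frame = [
    [(255, 255, 255), (0, 0, 0)],
    [(255, 0, 0), (0, 0, 255)],
]
gray = to_grayscale(frame)
```

    In practice an image library would perform this conversion on whole arrays at once; the sketch only shows what happens to each pixel.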

  5. Although machine learning has several advantages, such as automation and efficient handling of data, it is still difficult to meet the needs of every user because of inaccurate tracking of hand gestures, occlusion of hands, and high computational cost.

    Since detection is one of the main aspects of the Sign Language Detector system, we need to make sure that every sign is predicted by the platform. To address this, we used a well-performing machine learning model for higher accuracy and more exact detection.

    Since this is a real-time project in which signs are captured live, accurate tracking and detection of hand gestures is all the more important.

  6. We will study machine learning methods and their implementation for the detection of hand gestures.

    In the end, we will examine machine learning technology closely and build a platform that is more transparent and secure than other platforms, making it trustworthy by detecting every sign in a way that is clear to users.

    • As machine learning is itself a relatively new technology, it will be widely used in the future.

    • The detection approach proposed in this system can be extended to the sign languages of other countries, which will increase the number of users of our project.

    • As society becomes increasingly digital, our system will help promote this shift, since it uses a graphical user interface to detect sign language rather than requiring users to learn it offline from others.

    1. R. Harini, R. Janani, S. Keerthana, S. Madhubala and S. Venkata Subramanian, "Sign Language Translation," 2020 6th International Conference on Advanced Computing and Communication Systems (ICACCS), Coimbatore, India, 2020.

    2. R. Kapur, "The Types of Communication," MIJ, 6, 2020.

    3. P. Viola and M. Jones, "Rapid object detection using a boosted cascade of simple features," Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2001), December 8-14, 2001.

    4. D. Bragg, O. Koller, M. Bellard, L. Berke, P. Boudreault, A. Braffort, N. Caselli, M. Huenerfauth, H. Kacorri, T. Verhoef, C. Vogler and M. R. Morris, "Sign Language Recognition, Generation, and Translation: An Interdisciplinary Perspective," 21st Int. ACM SIGACCESS Conf. Comput. Access., 2019.