Face Mask Detection and Social Distance Monitor

DOI : 10.17577/IJERTV11IS060192

Mr. K Lakshmanan


Department of Information Technology PSG Polytechnic College

Coimbatore, India

R M Aaditya

Department of Information Technology PSG Polytechnic College

Coimbatore, India

J Dhanu Prasath

Department of Information Technology PSG Polytechnic College

Coimbatore, India

John K Simon

Department of Information Technology PSG Polytechnic College

Coimbatore, India

Abstract: During the COVID-19 pandemic, World Health Organization (WHO) reports suggest that the two main routes of transmission of the COVID-19 virus are respiratory droplets and physical contact. Respiratory droplets are generated when an infected person coughs or sneezes. Any person in close contact (within 1 m) with someone who has respiratory symptoms (coughing, sneezing) is at risk of being exposed to potentially infective respiratory droplets. This work covers social distance monitoring and face mask detection, since the spread of diseases such as coronavirus can be slowed by maintaining social distance and wearing a face mask. The system can easily be integrated into various embedded devices with limited computational capacity, and it detects face masks in images and in real-time video.



    A droplet may also land on surfaces where the virus could remain viable; thus, the immediate environment of an infected individual can serve as a source of transmission (contact transmission). The small step of wearing a face mask and following social distancing could save many lives, as the spread of the novel coronavirus could be mitigated. In the last decade, AI and deep learning have demonstrated promising solutions to a variety of everyday problems, and various daily tasks have been digitalized with the assistance of AI. In this project, we describe how to use Python in conjunction with deep learning and computer vision to monitor social distance and identify face masks. During the COVID-19 pandemic, it is critical to wear masks and adhere to social distancing as preventive measures. People are being encouraged to restrict their contact with one another in order to reduce the risk of the virus spreading through physical or close contact.


    A face mask detector and a social distancing monitoring model are required for this. Face mask detection and social distancing monitoring are important tools for slowing the spread of contagious diseases. In this work, we implement two programs: (1) detection of face masks and (2) monitoring of social distancing. For face mask detection, we train a face mask detector model using public datasets, then develop a deep learning-based version and test the results on a real-time webcam. To monitor social distance, we use object detection (for the person class) to find all persons, after which we apply a distance measure (N pixels). We can also show the distance between persons. For this application, we use Python, OpenCV, YOLO (CNN), deep learning, and computer vision.

    • According to a World Health Organization (WHO) report issued on August 16, 2020, coronavirus disease (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has infected more than 6 million people worldwide, with an unknown number of people dying.

    • According to the Pan American Health Organization (PAHO), maintaining social distance, improving surveillance, strengthening health systems, and wearing a face mask or other covering over the nose and mouth reduce the risk of coronavirus spread. Face mask detection refers to determining whether or not a person is wearing a mask.



    This paper adopts a combination of the neural networks ResNet50 and YOLOv3 (You Only Look Once) with a transfer learning technique to balance resource limitations against recognition accuracy, so that the system can be used in real-time video surveillance of public places to detect whether people are wearing face masks and maintaining safe social distance. Our solution uses neural network models to analyze Real Time Streaming Protocol (RTSP) video streams using OpenCV and ResNet50.

    We combine modern deep learning with classic projective geometry techniques, which not only helps to meet the real-time requirements but also keeps prediction accuracy high. If a person is detected as not following the COVID-19 safety guidelines, violation alerts are sent to the control center at state police headquarters for further action. This automates the solution and enforces mask wearing and adherence to the social distancing guidelines. The model was built to run on a local machine, and the accuracy obtained was between 85% and 95%. In recent years, object detection techniques using deep models have proved potentially more capable than shallow models in handling complex tasks, and they have achieved spectacular progress in computer vision.


    • Our proposed system first takes ordinary photos of people's faces.

    • Then a custom Python script adds a face mask to them, creating an artificial dataset that is saved in a database.

    • To use facial landmarks to create a dataset of people wearing face masks, we must start with a photo of someone who is not wearing a face mask.

    • Next, we use face detection and facial landmarks to calculate the bounding box location of the face in the image and to localize the eyes, nose, mouth, and so on.

    • The system is then trained on images of masks so that it can distinguish between a face with a mask and a face without one.

    • If the system recognizes a mask, it draws a green box; if no mask is present, it draws a red box. Similarly, the detection model recognizes persons and provides bounding box information.

    • Following person detection, the distance between each pair of detected centroids is estimated using the detected bounding boxes and the associated centroid information. Pixel-to-distance assumptions yield a fixed minimum social distance violation threshold. The estimated distance is then checked against this threshold to determine whether it falls below it.

    • The bounding box's color is initially set to green; if the bounding box falls inside the violation set, its color is updated to red.

    • Furthermore, the centroid tracking algorithm records the individual who commits the violation and automatically sends an alert message.
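The violation-set step in the list above can be sketched in plain Python. This is a minimal illustration, not the authors' code: the function names `find_violations` and `box_colors`, the sample centroids, and the 75-pixel threshold are all assumptions made for the example.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+

GREEN = (0, 255, 0)  # BGR channel order, as OpenCV uses
RED = (0, 0, 255)


def find_violations(centroids, min_pixels):
    """Return the indices of every person closer than min_pixels to another person."""
    violations = set()
    for (i, a), (j, b) in combinations(enumerate(centroids), 2):
        if dist(a, b) < min_pixels:
            violations.update((i, j))
    return violations


def box_colors(centroids, min_pixels):
    """Every box starts green; boxes in the violation set turn red."""
    violations = find_violations(centroids, min_pixels)
    return [RED if i in violations else GREEN for i in range(len(centroids))]


# Hypothetical centroids (in pixels) for three detected people.
centroids = [(100, 200), (130, 240), (400, 220)]
colors = box_colors(centroids, min_pixels=75)
```

Here the first two people are 50 pixels apart, so both receive a red box, while the third stays green. Checking all pairs is quadratic in the number of people, which is usually acceptable for the handful of detections in a single frame.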



    The system uses a transfer learning approach to performance optimization, with a deep learning algorithm and computer vision, to monitor people in public places with a camera and detect whether or not they are wearing masks.

    We load MobileNetV2 with pre-trained ImageNet weights, leaving the network head off, building a new fully connected (FC) head, attaching it to the base in place of the old head, and freezing the base layers of the network. For the people detected in the video, if the distance between two people is less than 2 meters, a red bounding box is shown around them, indicating that they are not maintaining social distance.

    The result identifies a person's mask and draws a bounding box. The system monitors public places continuously, and when a person without a mask is detected, his or her face is captured and an alert is sent to the authorities with the face image. At the same time, the distance between people is measured in real time.


    The proposed model uses ResNet50, a type of convolutional neural network, and MobileNetV2 for person detection, using the TensorFlow framework. The key feature of this model is that it can detect multiple classes of objects at the same time. On the downside, the model requires more computation for more accurate results. GPU acceleration is enabled, which allows faster computation than previous models. The model works with a varied set of facial features, such as the eyes, nose, and mouth.

    The model takes a video frame as input and outputs a list of bounding box coordinates, one rectangle for each person detected in the frame. Each rectangular bounding box is represented as [x-min, y-min, width, height]. Each person in the video frame has a centroid computed from the resulting bounding box. The model calculates the distance between two people as the distance between their two centroids, using the Euclidean distance formula. If the computed distance is less than 3 feet, the person is not maintaining social distance; if the distance is 6 feet or greater, the person is maintaining a safe distance.
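As a rough sketch of this calculation: for centroids (x1, y1) and (x2, y2), the Euclidean distance is sqrt((x1 - x2)^2 + (y1 - y2)^2). The code below assumes a hypothetical calibration constant `pixels_per_foot` to convert pixels to feet, since the text does not say how that conversion is done. The text also gives no rule for distances between 3 and 6 feet, so the sketch labels that range "unspecified" rather than guessing.

```python
from math import hypot


def centroid(box):
    """Center of a bounding box given as [x_min, y_min, width, height]."""
    x_min, y_min, width, height = box
    return (x_min + width / 2, y_min + height / 2)


def distance_feet(box_a, box_b, pixels_per_foot):
    """Euclidean distance between two box centroids, converted to feet.

    pixels_per_foot is an assumed calibration constant, not a value from the paper.
    """
    (ax, ay), (bx, by) = centroid(box_a), centroid(box_b)
    return hypot(ax - bx, ay - by) / pixels_per_foot


def classify(distance_ft):
    """Apply the 3-foot and 6-foot thresholds stated in the text."""
    if distance_ft < 3:
        return "violation"
    if distance_ft >= 6:
        return "safe"
    return "unspecified"  # the text gives no rule for 3 to 6 feet


# Two hypothetical detections whose centroids are 60 pixels apart.
box_a = [100, 100, 40, 80]
box_b = [160, 100, 40, 80]
status = classify(distance_feet(box_a, box_b, pixels_per_foot=30))
```

With the assumed scale of 30 pixels per foot, the two people are 2 feet apart and the pair is classified as a violation.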




        GeForce is a brand of graphics processing units (GPUs) designed by Nvidia. As of the GeForce 30 series, there have been seventeen iterations of the design. More recently, GeForce technology has been introduced into Nvidia's line of embedded application processors, designed for electronic handhelds and mobile handsets.


        The most important piece of hardware for our model is the camera. Users can use any kind of camera they need in order to ensure their safety.


    SSD (solid-state drive) adoption began in high-performance technology areas and in enthusiasts' PCs, where the drives' extremely low access times and high throughput justified the higher cost. They have since become a common option, or even the default choice, in lower-cost mainstream laptops and PCs.


    1. PYTHON:

      Python is an interpreted high-level general-purpose programming language. Its design philosophy emphasizes code readability with its use of significant indentation. Python is dynamically typed and garbage-collected. It supports multiple programming paradigms, including structured (particularly, procedural), object-oriented and functional programming.

    2. PYCHARM:

      PyCharm is an integrated development environment (IDE) used in computer programming, specifically for the Python programming language. It provides code analysis, a graphical debugger, an integrated unit tester, integration with version control systems (VCSes), and supports web development.


    The system is a deep learning solution that uses OpenCV and TensorFlow to train the model. We combine the YOLOv3 deep learning module with the SSD (single shot detector) framework for fast and efficient real-time human detection in video streams, and we use a triangle similarity technique to measure the distance between individuals detected by the camera in real time in public places. The system also includes customized data collection to build a face mask detection model that handles the variety of face masks worn by the public, applying transfer learning to a pre-trained SSD face detector.
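The text does not give its triangle similarity formulation. The sketch below shows the common camera-to-object form, as an assumption: a reference object of known width W, placed at a known distance D, appears P pixels wide, which gives a focal length F = (P × D) / W; an object of the same real width that later appears P' pixels wide is then at distance D' = (W × F) / P'. The function names and the sample numbers are illustrative only.

```python
def focal_length_px(known_distance, known_width, width_px):
    """Calibration step: F = (P * D) / W, from one reference measurement."""
    return width_px * known_distance / known_width


def distance_to_camera(known_width, focal_px, width_px):
    """Estimate distance from apparent width: D' = (W * F) / P'."""
    return known_width * focal_px / width_px


# Hypothetical calibration: a 0.5 m wide target at 2.0 m appears 200 px wide.
focal = focal_length_px(known_distance=2.0, known_width=0.5, width_px=200)

# The same target now appears 100 px wide, so it is twice as far away.
estimated = distance_to_camera(known_width=0.5, focal_px=focal, width_px=100)
```

This estimates each person's distance from the camera; combining those depths with image positions to obtain person-to-person distance is a further step that the text does not detail.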

    In the proposed system, three steps are followed:

      1. Model development and training

      2. Model testing

      3. Model implementation


    Our framework uses the transfer learning approach and fine-tunes the MobileNetV2 model, a highly efficient architecture that can be applied to edge devices with limited computing power, such as the Raspberry Pi 4, to detect people in real time. We used 80% of our total custom data set to train our model with a single shot detector, which takes only one shot to detect multiple objects present in an image using multibox. The custom data set is loaded into the project directory, and the algorithm is trained on the labeled images. We also use the YOLOv3 model for calculating the distance between people. It creates a frame and objects and, using the k-means method, finds the distance between the objects.
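An 80% training split of the kind mentioned above can be sketched as follows. The text does not say how the split was made, so the shuffling, the fixed seed, and the file names here are assumptions for illustration.

```python
import random


def split_dataset(items, train_fraction=0.8, seed=42):
    """Shuffle a copy of the items and split it into training and held-out parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical file names standing in for the custom data set.
image_paths = [f"img_{i:03d}.jpg" for i in range(10)]
train_paths, held_out_paths = split_dataset(image_paths)
```

Shuffling before the cut avoids a split that follows the order in which the images were collected.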


    The system operates in an automated way and carries out the social distance inspection process automatically. Once the model is trained with the custom data set and the given pre-trained weights, we check the accuracy of the model on the test dataset by showing each bounding box with the name of the tag and the confidence score at the top of the box. The proposed model first detects all persons within range of the cameras and shows a green bounding box around everyone who is far enough from everyone else. The model then checks the social distances maintained in the public place; if persons breach the social distance norms, the bounding box color changes to red for those persons. Simultaneously, face mask detection is performed by showing bounding boxes on each identified person's face, labeled mask or non-mask, along with confidence scores. If no mask is visible on a face, and if social distance is not maintained, the system generates a warning and sends an alert to the monitoring authorities with a face image. The system detects social distancing and masks with a precision score of 91.7%.


    The system uses a Raspberry Pi 4 with a camera to automatically monitor public spaces in real time to prevent the spread of COVID-19. The model trained on the custom data set is installed on the Raspberry Pi 4, and the camera is attached to it. The camera feeds real-time video of public places to the model on the device, which continuously monitors those places, detects whether people maintain safe social distance, and checks whether they are wearing masks. When a social distance violation is detected continuously for a threshold time, a red alert instructs people to maintain social distance.
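The rule of alerting only after a violation persists for a threshold time can be sketched as a small state tracker. This is an illustration under assumptions: the class name `ViolationAlarm`, the 3-second threshold, and the sample timestamps are not from the paper.

```python
class ViolationAlarm:
    """Signals an alert once a violation has lasted threshold_seconds without a break."""

    def __init__(self, threshold_seconds):
        self.threshold_seconds = threshold_seconds
        self._started_at = None  # time the current violation streak began

    def update(self, violating, now):
        """Feed one frame's result; return True if the alert should fire."""
        if not violating:
            self._started_at = None  # any clear frame resets the streak
            return False
        if self._started_at is None:
            self._started_at = now
        return now - self._started_at >= self.threshold_seconds


# Hypothetical (timestamp in seconds, violation detected) pairs for six frames.
frames = [(0.0, True), (1.0, True), (2.0, False), (3.0, True), (5.0, True), (6.0, True)]
alarm = ViolationAlarm(threshold_seconds=3.0)
alerts = [alarm.update(violating, now) for now, violating in frames]
```

In this sequence the clear frame at 2.0 s resets the streak, so the alert fires only at 6.0 s, three seconds after the violation resumed at 3.0 s. Passing the timestamp in explicitly, rather than reading the clock inside the class, keeps the logic easy to test.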


Persistent coughing and sneezing is one of the key symptoms of COVID-19 infection according to WHO guidelines, and it is also one of the main routes by which the disease spreads to the uninfected public. A deep learning based approach could prove useful here to detect and limit the spread of the disease: our proposed solution could be enhanced with body gesture analysis to recognize whether an individual is coughing or sneezing in a public place while breaching the face mask and social distancing guidelines, and enforcement agencies could be alerted based on the results.


In this project, we have used recent techniques from the fields of computer vision and deep learning. The proposed system detects the presence of a face mask and whether a person is at a safe distance from others. The system is accurate, since we used the MobileNetV2 architecture for detecting face masks and the Euclidean distance formula for computing distance. This also makes it easier to deploy our model to embedded systems, and we believe that this approach will improve the safety of individuals during the pandemic.


