Real-Time Fatigue Detection System using Computer Vision

Abstract-Fatigue lowers productivity and, in the case of drivers, can also cause accidents. According to our study, some work has already been done on the detection of drowsiness. This research aims to detect the onset of fatigue in people, which may also help detect diseases that have fatigue as a symptom. This paper proposes a Fatigue Detection System (FDS) based on Computer Vision. Our main focus is on the person’s facial expressions. Our system will have five stages-Face detection, Facial landmark detection


I. INTRODUCTION
Fatigue means feeling overtired, with low energy and a strong desire to sleep that interferes with normal daily activities. Nowadays, people are more likely to become fatigued because of changes in lifestyle, eating habits and sleep patterns, and for other medical reasons. Fatigued workers are less alert and more likely to exercise poor judgement. Fatigued drivers are not in a fit state to drive, and this can cause accidents. Drowsiness is a major cause of road accidents worldwide, so a Fatigue Detection System (FDS) in vehicles is important for avoiding accidental deaths. Fatigued students are also unable to concentrate in class, which in turn leads to poor judgement. Many companies have been researching the development of FDS.
The proposed system aims to detect fatigue in humans, and so prevent the problems that fatigue causes, using Computer Vision. Computer vision is a field of computer science that focuses on replicating the human visual system and teaches computers to identify and process objects in images and videos. It has various applications, one of which is the detection of facial features. Methods for detecting fatigue generally fall into behavioral measures and physiological measures. Behavioral measures include eye closure, eye blinking, yawning etc. Physiological measures rely on sensors attached to the body to measure pulse, heartbeat etc., and using them increases the cost of the system. Hence, we propose a low-cost, webcam-based approach that extracts the eye and lip status from the face and gives off an alarm when fatigue is detected. We focus on Image Processing and Machine Learning. The main objective is to make the system low cost and easily deployable in Indian markets.
II. RELATED WORK
Existing driver drowsiness detection systems help us build a more efficient "Fatigue Detection System". We learn from the findings of various research papers and intend to improve on the existing systems with our proposed system. The following insights, gathered from different papers, proved helpful for our literature survey.
In the paper "Real-time Driver Drowsiness Detection for Embedded System Using Model Compression of Deep Neural Networks" [1], the authors Bhargava Reddy, Ye-Hoon Kim, Sojung Yun, Chanwon Seo and Junik Jang developed a system using deep neural networks. They prepared an accurate baseline model for a light-weight embedded board. The proposed model achieved an accuracy of 89.5% on 3-class classification and a speed of 14.9 frames per second (FPS) on Jetson TK1.
In the paper "Driver Drowsiness Monitoring System using Visual Behaviour and Machine Learning" [2], the authors Ashish Kumar and Rusha Patra develop a low-cost, real-time driver drowsiness detection system with acceptable accuracy. Their findings indicate that Bayesian classifiers give lower accuracy than SVM, and hence that SVM is the more beneficial classifier.
In the paper "Driver Drowsiness Detection with Audio-Visual Warning" [3], the authors Nikita Prajapati and Pooja Bhatt propose a hardware system built on a Raspberry Pi. Face and eye detection are done using a Haar cascade classifier, and OpenCV is used to increase efficiency.

Conventional Approaches for FDS
To detect fatigue while driving, the driving pattern can be measured in various ways. Steering wheel movement is one such measure; lane deviation and lateral position deviation are others. However, driving pattern-based techniques are highly dependent on driving skill, road conditions and vehicle characteristics. The second class of techniques uses data from physiological sensors, such as EEG, ECG and EOG. EEG signals, for example, contain information about the brain's activity. The major drawback of this class is its intrusiveness: attaching many sensors to the body disturbs the person, and the intrusive method is not always reliable. Even when a non-intrusive sensing method is used, the system is expensive and not cost effective. The last class is based on facial feature extraction using Computer Vision, where behaviors such as eye closure, head movement, yawning duration, gaze or facial expression are used. Our system follows this last approach as an alternative, quick method to detect fatigue.
III. THE PROPOSED SYSTEM AND COMPUTATION OF PARAMETERS
The block diagram of our fatigue detection system is given in Figure 1. The first step is to record video using a webcam placed in front of the subject, so that a clear frontal image of the subject is captured. Once recording starts, the video is split into frames, which are treated as 2D images. We detect the face with a linear SVM over Histogram of Oriented Gradients (HOG) features. Once the face is detected, we mark the positions of the eyes and lips with a facial landmark detector and use the eye aspect ratio (EAR) and mouth opening ratio (MOR) to determine whether the subject is fatigued. If fatigue is detected, an alarm is given off to alert the person. The details are given below.

A. Face Detection-
The video is recorded with the webcam of a laptop and processed, and frames are extracted from the real-time video. Once the frames are extracted, the face is detected. There are various ways to detect faces, such as OpenCV's Haar cascades or deep learning algorithms, but we use the Histogram of Oriented Gradients (HOG) in our system. The training procedure takes B positive samples from the training data and extracts HOG descriptors from them. It then takes C negative samples, from a negative data set that contains nothing we wish to detect, and extracts HOG descriptors from those as well; in practice C >> B. A linear SVM is trained on these samples, and then hard-negative mining is applied: the false positives that the classifier produces on the negative set are collected, and the classifier is retrained with them. We found HOG with a linear SVM to be a fast approach for detecting faces in our system. The detector also draws a bounding box around each detected face.
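To make the HOG descriptor concrete, the following is a minimal sketch of the orientation histogram for a single cell, in plain Python. It assumes unsigned gradients binned into 9 orientation bins over 0-180 degrees with magnitude-weighted hard voting, and omits interpolation and block normalization. The function name `hog_cell_histogram` and these parameter choices are illustrative assumptions, not the detector implementation used in the system.

```python
import math

def hog_cell_histogram(cell, bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations for one cell.

    `cell` is a 2D list of grayscale intensities. Gradients use central
    differences on interior pixels only, so border pixels are skipped.
    """
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    rows, cols = len(cell), len(cell[0])
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]
            gy = cell[y + 1][x] - cell[y - 1][x]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue
            # Fold the direction into [0, 180) so opposite gradients share a bin.
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            index = min(int(angle // bin_width), bins - 1)
            hist[index] += magnitude
    return hist

# A vertical edge: intensity changes only along x, so all votes fall in bin 0.
patch = [[0, 0, 255, 255]] * 4
print(hog_cell_histogram(patch))
```

A full descriptor concatenates such histograms over all cells of the detection window, and that vector is what the linear SVM is trained on.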

B. Facial Landmark detection-
Facial landmarks represent the features of the face, such as the eyes, eyebrows, nose and mouth. We focus on the landmarks of the eyes and mouth, using dlib, OpenCV and Python. With shape prediction methods, it is important to localize the face first and then detect the facial features within the region of interest. The facial landmark detector we use is part of the dlib library and is an implementation of the method by Kazemi and Sullivan. This method takes a training set of images on which the facial landmarks are labeled, with (x, y) coordinates specified for the regions of each facial structure. It also uses priors on the probability of the distance between pairs of input pixels. This data is used to train an ensemble of regression trees, and the end result is a facial landmark detector that gives predictions in real time.
Our facial landmark detector is pre-trained and has 68 coordinates that map onto the face, as shown in Figure 2. The first input given to the detector is an image that is loaded from disk via OpenCV, resized, pre-processed and converted to grayscale. The second input is the number of image pyramid layers. Once the detector has run, we get the landmarks of the required facial features.

C. Facial Feature extraction-
The specific facial features are extracted using the indexes of their landmarks. As shown in Figure 3, we require the indexes for the left eye, the right eye and the mouth in our system. We extracted the regions using simple Python indexing.
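The indexing step can be sketched as follows, assuming the standard ordering of dlib's 68-point model, in which (0-based) points 36-41 cover the right eye, 42-47 the left eye and 48-67 the mouth. The names `LANDMARK_SLICES` and `extract_regions` are hypothetical, and the dummy landmark list stands in for real detector output.

```python
# 0-based index ranges in the 68-point layout (assumed standard dlib ordering).
LANDMARK_SLICES = {
    "right_eye": slice(36, 42),
    "left_eye": slice(42, 48),
    "mouth": slice(48, 68),
}

def extract_regions(landmarks):
    """Split a list of 68 (x, y) points into the regions used by the system."""
    if len(landmarks) != 68:
        raise ValueError("expected 68 landmarks, got %d" % len(landmarks))
    return {name: landmarks[s] for name, s in LANDMARK_SLICES.items()}

# Dummy landmarks: point i sits at (i, i), so the regions are easy to inspect.
points = [(i, i) for i in range(68)]
regions = extract_regions(points)
print(len(regions["left_eye"]), len(regions["right_eye"]), len(regions["mouth"]))
```

Keeping the index ranges in one table means the later EAR and MOR computations never hard-code landmark numbers.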

E. Mouth status detection-
The mouth opening ratio (MOR) is used to detect yawning during fatigue. Similar to EAR, MOR increases when the mouth opens for yawning and then decreases. MOR also has the same relationship between its numerator and denominator as EAR. We initialize the yawn status and the yawn count, and check whether the distance between the upper and lower lip is abnormally large. If a yawn is detected, we increment the yawn counter.
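A minimal sketch of such a ratio follows. It assumes MOR is the mean vertical lip opening divided by the mouth width, mirroring the widely used eye aspect ratio formula (|p2-p6| + |p3-p5|) / (2|p1-p4|). The function name `opening_ratio` and the sample coordinates are hypothetical, and the exact landmark pairs used in the system may differ.

```python
import math

def opening_ratio(vertical_pairs, horizontal_pair):
    """Mean vertical opening divided by horizontal width.

    With two vertical pairs this reduces to the usual eye aspect ratio,
    (|p2-p6| + |p3-p5|) / (2 * |p1-p4|).
    """
    width = math.dist(*horizontal_pair)
    if width == 0:
        raise ValueError("horizontal points coincide")
    heights = [math.dist(a, b) for a, b in vertical_pairs]
    return sum(heights) / (len(heights) * width)

# Hypothetical mouth points: corners 40 px apart, lips 30 px apart when yawning.
corners = ((0, 0), (40, 0))
yawning = [((15, -15), (15, 15)), ((25, -15), (25, 15))]
closed = [((15, -2), (15, 2)), ((25, -2), (25, 2))]

MOR_THRESHOLD = 0.6  # threshold value reported in the results section
print(opening_ratio(yawning, corners) > MOR_THRESHOLD)
print(opening_ratio(closed, corners) > MOR_THRESHOLD)
```

Because the opening is divided by the width, the ratio stays roughly constant as the subject moves closer to or further from the camera.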

F. Fatigue detection-
At the start, the subject is in the awake state, which is the setup state. The last step of the algorithm is to determine the person's condition based on a pre-set condition for drowsiness. The average blink of a person lasts 100-400 milliseconds (i.e. 0.1-0.4 of a second). Hence, if a person is drowsy, the eye closure must last longer than this interval. We set a time frame of 5 seconds: if the eyes remain closed for five or more seconds, drowsiness is detected and an alert pop-up is triggered. To detect this, EAR is calculated for each frame; if it falls below a certain threshold, the number of consecutive frames for which the eyes stay closed is counted. If the number of frames exceeds the limit, the alarm is given off. The MOR is only used as a counter, and at the end the number of yawns is displayed.
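The frame-counting rule can be sketched as below. The EAR threshold of 0.25 and the 5-second window are the values stated in this paper; the frame rate of 10 FPS, the function name `closed_eye_alarm` and the synthetic EAR sequences are assumptions made for illustration.

```python
def closed_eye_alarm(ear_values, fps, ear_threshold=0.25, closed_seconds=5.0):
    """Return the index of the frame at which the alarm fires, or None.

    The alarm fires once EAR has stayed below `ear_threshold` for
    `closed_seconds` worth of consecutive frames; any open-eye frame
    resets the counter, so ordinary blinks never trigger it.
    """
    frame_limit = int(round(fps * closed_seconds))
    consecutive = 0
    for index, ear in enumerate(ear_values):
        if ear < ear_threshold:
            consecutive += 1
            if consecutive >= frame_limit:
                return index
        else:
            consecutive = 0
    return None

FPS = 10  # assumed camera rate; 5 s then corresponds to 50 frames
blink = [0.30] * 20 + [0.10] * 3 + [0.30] * 20   # 0.3 s blink
drowsy = [0.30] * 20 + [0.10] * 60               # 6 s closure
print(closed_eye_alarm(blink, FPS))
print(closed_eye_alarm(drowsy, FPS))
```

Expressing the limit in seconds and converting it with the measured frame rate keeps the behavior the same across cameras of different speeds.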
IV. RESULTS AND DISCUSSION
The above procedure was implemented using OpenCV and Python. The video camera captured frames at run time; because the frames were preprocessed and their resolution was increased, it became easier to detect the facial features. Detecting the face and then the eyes and mouth with a facial landmark detector made the system low cost and feasible. Other approaches, such as deep learning, are slower by comparison and are also not cheap. Requiring eye closure over successive frames ensures that drowsiness, and not just a single blink, gives off the alarm. Hence it was found important to use multiple frames for the drowsiness alarm. If a normal blink was registered, the state of the subject was analyzed again in a loop.
This system has been developed with the help of generated data and tested with the same. The system is connected to the webcam for real-time processing and analysis. The feature values for MOR and EAR are stored for each frame. The threshold for EAR is set to 0.25 and for MOR to 0.6, after testing by taking an average over 300 frames. Our algorithm has been tested with real-time video streams of around 6 people, and its accuracy is acceptable. The real-time videos had different lighting conditions; in low illumination, the system can work with the help of infrared cameras. Subsequently, statistical analysis was done by classifying each frame as normal, yawning or eye-closed. An alarm goes off on both the yawning and eye-closed conditions. SVM was used for classification on the real-time data.
Sensitivity is calculated as the proportion of actual drowsy states that are correctly classified as drowsy, and specificity as the proportion of actual awake states that are correctly classified as awake. Overall accuracy is the proportion of all frames that are correctly classified.
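These three definitions can be written out as a short sketch. The function name `classification_metrics` and the label strings are hypothetical, the example labels are made up for illustration and are not the data behind the reported results, and the sketch assumes both classes occur at least once.

```python
def classification_metrics(actual, predicted, positive="drowsy"):
    """Sensitivity, specificity and accuracy for per-frame labels."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)
    tn = sum(a != positive and p != positive for a, p in pairs)
    fn = sum(a == positive and p != positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    return {
        "sensitivity": tp / (tp + fn),   # drowsy frames found / all drowsy frames
        "specificity": tn / (tn + fp),   # awake frames found / all awake frames
        "accuracy": (tp + tn) / len(pairs),
    }

# Made-up labels for illustration only; not the paper's data.
actual = ["drowsy"] * 4 + ["awake"] * 6
predicted = ["drowsy"] * 3 + ["awake"] * 7
print(classification_metrics(actual, predicted))
```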
Method    Sensitivity    Specificity    Accuracy
SVM       0.956          1              0.958
Table 3. Accuracy of SVM

Figure 1. Graph for Precision rate per Instance

As per the graph in Figure 1, we considered 6 instances of 6 different people to calculate the precision rate of the system. The first instance showed 100% precision, whereas the last instance showed 75% precision. The average precision rate was 90.88%. On this sample, the system is hence accurate and precise.

V. CONCLUSION
In this paper, a low-cost fatigue detection system has been proposed using Image Processing and Machine Learning. Measures such as EAR and MOR are computed from the real-time video and analyzed, and SVM is used for classification. From the study and design of our work, it is clear that OpenCV is well suited to this application in terms of size, cost and power requirements. A minimal set of facial features is used as input to the system. The results showed that the eyes and mouth play the major roles in drowsiness classification, and that detection of the eyes and mouth is accurate and reliable.
In future, this system can be converted into a mobile application in which the subject logs in and the application keeps track of the subject's status. The system can also be implemented in cars and connected to the vehicle's systems. In some cases, such as drowsiness detection for drivers, the alarm alone will not be enough to prevent accidents, and it will be important to sync the fatigue detection system with the motor of the vehicle so that, when drowsiness is detected, the vehicle can give a warning signal and slow down automatically. In educational organizations, multiple faces can be detected together to assess the attentiveness of students, and a warning may be given to the respective heads. A camera suited to low lighting conditions could further increase the accuracy.