Real Time Road Surveillance and Vehicle Detection using Deep Learning

Abstract: There is a need for intelligent transportation infrastructure, and technologies that can support it are now available. Artificial intelligence, and deep learning in particular, could make current systems more efficient. The ability to detect and classify vehicles accurately is essential for such intelligent systems to succeed. In a country like India, with a growing population and limited space, these systems could play a vital role in helping people get around in the near future. This project focuses on a few problems that are especially relevant in the Indian context. The aim is to detect and classify vehicles efficiently in real time, which sets the base for further actions that depend on the type of vehicle, for example detecting helmets, triple riding, or seat belts. Such a system could help reduce traffic violations and improve the safety of those using the road network.


I. INTRODUCTION
Traffic accidents are one of the major causes of death, injury and property damage. Common causes of these accidents include driving over the speed limit, driving under the influence, and not using helmets and seat belts. It is reported that India sees almost 5 lakh traffic-related accidents, which have caused over 1 lakh deaths. Approximately half of them are motorcycle-related accidents. Travelling by motorcycle carries a higher risk of accidents than travelling by car or other vehicles. Motorcycle accidents are highly likely to result in injury, most often concussion and brain damage, and this risk is higher for riders who are not wearing a helmet. Wearing a helmet can protect the rider from fatal head injuries and thus help prevent death. In our country, the law requires citizens to wear a helmet when riding or travelling on a motorcycle, but many people violate it. To make sure that motorcycle riders wear helmets, a system is needed that detects whether a rider is wearing a helmet and issues a penalty to those who are not. Existing systems rely either on manual detection or on algorithms that are slow or less accurate. The proposed system uses the YOLO model for detection, which is fast and has high accuracy. The process of issuing the penalty is also automated: the system reads the registration number of the vehicle by means of optical character recognition (OCR) and messages the owner of the vehicle. The system can be further developed to detect more safety equipment.
As we head towards a more machine-oriented future in which machines take important decisions, it is essential to design state-of-the-art models capable of real-time analysis and output.
This project aims to develop a system that detects traffic violators, whether on two-wheelers or in four-wheelers, and automates reading their license plates so that fines can be issued. For two-wheelers, the system checks whether the rider is wearing a helmet. For four-wheelers, it checks whether the seat belt is being worn. In either case, a detected violation triggers license plate detection.
Another area where technology looks set to play a huge role is the transportation sector as a whole. It is due for a massive revamp that will render current techniques and methods obsolete. Self-driving cars are just around the corner and are expected to take the market by storm. Alongside self-driving cars, autonomous navigation bots are another booming industry. In all these systems, which require minimal human intervention, the basic step is object detection and localization. This is analogous to humans perceiving the things around them and processing that information before acting. Similarly, we need computer vision algorithms that enable computers to see, and on top of these we can build models that take appropriate decisions and can later be used in many applications. One focus of this project is developing a cheap and easy object detection technique for real-time applications. This could provide the base for countless applications, including helmet detection, seat belt detection, targeted business, autonomous bots, self-driving cars and assistive devices.
II. RELATED WORKS
In the paper by Girish G. Desai et al. [1], a Raspberry Pi and a webcam form the basic setup. With this setup only one license plate can be recognized at a time. The system captures back and front images of the number plate and sends them to the OpenALPR system for further processing. Images are captured at 10 frames per second and processed frame by frame. OpenALPR uses an OpenCV cascade classifier for license plate detection and Tesseract OCR to recognize the characters on the plate. Both the cascade classifier and the OCR need to be trained on Indian number plates for recognition.
In the paper by C. Vishnu et al. [2], adaptive background subtraction is first used to extract moving objects. These moving objects are given as input to a CNN classifier, which sorts them into two classes, motorcyclists and non-motorcyclists. Objects other than motorcyclists are discarded, and the objects predicted to be motorcyclists are passed to the next phase, in which a second CNN classifier decides whether or not the motorcyclist is wearing a helmet. The head is assumed to be in the top fourth of each incoming image and is located there. The located head region is then given as input to the second CNN, which is trained to distinguish helmet from no helmet.
In the paper by Zhiheng Yang et al. [3], the YOLO architecture is used to detect pedestrians and vehicles. The pedestrian information is first extracted from the KITTI dataset as label information. Pedestrians appear slim, with height greater than width, while vehicles are square or rectangular, with width typically greater than height. This data is used to train the YOLO model. Hard negatives are then generated from the same dataset, and the model is fine-tuned using the KITTI dataset.
Soumen Santra et al. [4] choose the YOLO model for its high accuracy and speed. YOLO stands for "you only look once". Unlike other models, it does not examine each grid region separately but analyzes the whole image in a single pass. They use this scheme for all detection tasks in their project, such as buildings, pedestrians and vehicles.
In the paper by Rohit C A et al. [5], video from a traffic surveillance camera is fed into an object detection system. When this system identifies a person in the surveillance footage, it draws a box around the person on the motorbike. This output is given to an image classifier, which compares it with a reference image and reports whether the rider and co-passenger are wearing helmets, based on a classification model. The image is classified using the Inception model, which gives more accurate results.
In the paper by Kuna Dahiya et al. [6], background subtraction is used to distinguish moving bikes from non-moving objects. An SVM classifier then classifies each rider as wearing a helmet or not wearing a helmet. The model had an accuracy of 92.87%.
Felix Wilhelm Siebert et al. [7] use the RetinaNet model for helmet detection. RetinaNet was chosen because two-stage approaches are time consuming. The model achieved 76.4% accuracy.

III. PROPOSED METHODOLOGY
Figure 1 shows the flow chart of the proposed model.

A. Vehicle and Person Detection
We trained the YOLOv3 model to detect motorcycles and persons using more than 1000 images for each class. When an image of a bike rider is given as input, the model detects the motorcycle and the person with high accuracy. The image is then used in the next step.
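As a rough illustration of how the detector output could feed the next stage, the sketch below filters detections by confidence and pairs each motorcycle box with the person boxes that overlap it. The detection tuple format, the 0.5 confidence threshold, the overlap-based pairing rule, and the helper names `overlap_area` and `pair_riders` are all assumptions made for illustration. They are not details taken from the trained model.

```python
from typing import List, Tuple

# Assumed formats: a box is (x1, y1, x2, y2) in pixels and
# a detection is (label, confidence, box).
Box = Tuple[int, int, int, int]
Detection = Tuple[str, float, Box]


def overlap_area(a: Box, b: Box) -> int:
    """Return the area of the intersection of two boxes (0 if disjoint)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(0, width) * max(0, height)


def pair_riders(detections: List[Detection], min_conf: float = 0.5) -> List[Box]:
    """Return one crop box per motorcycle that overlaps at least one person.

    Each crop box is the union of the motorcycle box and the person boxes
    that overlap it. Detections below min_conf are ignored.
    """
    kept = [d for d in detections if d[1] >= min_conf]
    bikes = [d[2] for d in kept if d[0] == "motorcycle"]
    people = [d[2] for d in kept if d[0] == "person"]

    crops = []
    for bike in bikes:
        riders = [p for p in people if overlap_area(bike, p) > 0]
        if not riders:
            continue  # motorcycle without a rider: nothing to check
        boxes = [bike] + riders
        crops.append((
            min(b[0] for b in boxes),
            min(b[1] for b in boxes),
            max(b[2] for b in boxes),
            max(b[3] for b in boxes),
        ))
    return crops
```

Taking the union of the boxes keeps the rider's head and the motorcycle's plate in one crop, which is what the later helmet and license plate steps operate on.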

B. Helmet Detection
As with the motorcycle and person classes, the YOLOv3 model is trained for helmet detection with more than 1000 images of helmets and of human heads. The image extracted in the previous step is divided into two halves, and only the top half is used for detection, since the helmet will most probably be in the top half. This cropped top half is given to the model, which also significantly reduces detection time. If a helmet is detected in the top half, the image is discarded. If no helmet is detected in the cropped image, the rider is treated as a non-helmet wearer, and the bottom half of the image, which contains the license plate, is saved to a folder along with the whole image for the next step. The model successfully distinguished helmet-wearing riders from non-helmet-wearing riders.
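The split-and-decide logic described above can be sketched as follows. This is a minimal sketch, assuming the image is a row-major sequence of pixel rows (a NumPy array would slice the same way). `helmet_detector` is a hypothetical callable standing in for the trained YOLOv3 model and is assumed to return a list of detected class labels. The function names are illustrative.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Rows = Sequence[Sequence[int]]  # image as rows of pixel values


def split_halves(image: Rows) -> Tuple[list, list]:
    """Split an image row-wise into (top_half, bottom_half)."""
    middle = len(image) // 2
    return list(image[:middle]), list(image[middle:])


def check_rider(image: Rows,
                helmet_detector: Callable[[list], List[str]]) -> Optional[list]:
    """Return None if a helmet is found, else the bottom half of the image.

    Only the top half is passed to the detector, since that is where the
    helmet is expected. The bottom half is kept for license plate extraction.
    """
    top, bottom = split_halves(image)
    labels = helmet_detector(top)
    if "helmet" in labels:
        return None  # helmet found: the image is discarded
    return bottom  # no helmet: keep the half expected to contain the plate
```

Running the detector on half of the rows is what reduces detection time, at the cost of missing a helmet that falls below the midline of the crop.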

C. License Plate Extraction
The image of the non-helmet wearer from the previous step is the input to this process. The license plate number is extracted using Tesseract OCR. The trained YOLOv3 model first detects the license plate in the image and draws a bounding box around it. Using the coordinates of the bounding box, only the portion inside the box is cropped out. This cropped image is then processed so that the OCR can read it more reliably. Using the OpenCV library, Canny edge detection is applied, followed by dilation. Contours are then drawn and unwanted contours are filtered out. The processed image is given as input to Tesseract OCR, which extracts the license plate number from it.
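The bookkeeping around the OpenCV and OCR calls can be sketched in plain Python. The sketch below covers three of the steps above: cropping the plate from its bounding box, filtering contour rectangles, and tidying the OCR output. It does not perform edge detection, dilation or OCR itself. The rectangle format `(x, y, width, height)` matches what `cv2.boundingRect` returns. The filtering thresholds (`min_rel_height`, `max_aspect`) and the final text clean-up are assumptions for illustration, since the paper does not state how unwanted contours are filtered or how the OCR text is post-processed.

```python
import re
from typing import List, Sequence, Tuple

Rect = Tuple[int, int, int, int]  # (x, y, width, height)


def crop_box(image: Sequence[Sequence[int]],
             box: Tuple[int, int, int, int]) -> List[list]:
    """Crop the region inside box = (x1, y1, x2, y2), clamped to the image."""
    height = len(image)
    width = len(image[0]) if height else 0
    x1, y1 = max(0, box[0]), max(0, box[1])
    x2, y2 = min(width, box[2]), min(height, box[3])
    return [list(row[x1:x2]) for row in image[y1:y2]]


def filter_character_rects(rects: List[Rect], plate_height: int,
                           min_rel_height: float = 0.4,
                           max_aspect: float = 1.0) -> List[Rect]:
    """Keep rectangles that look like characters, ordered left to right.

    A rectangle is kept if it is tall relative to the plate and not wider
    than max_aspect times its height (assumed thresholds).
    """
    kept = [
        r for r in rects
        if r[3] > 0
        and r[3] >= min_rel_height * plate_height
        and r[2] / r[3] <= max_aspect
    ]
    return sorted(kept, key=lambda r: r[0])


def clean_plate_text(raw: str) -> str:
    """Uppercase the OCR output and drop everything but letters and digits."""
    return re.sub(r"[^A-Z0-9]", "", raw.upper())
```

Normalizing the text this way makes the later match against the owner database less sensitive to stray spaces, hyphens or line breaks in the OCR output.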

D. Database Creation
A CSV file is created that holds the details of vehicle owners and their contact details. The license plate numbers recognised by Google Tesseract OCR are saved in a second CSV file. The license plate column of the second CSV file is matched against the first CSV file, and the matching details are merged and saved as another CSV file. The merge is done with a VLOOKUP-style lookup implemented in Python.
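A VLOOKUP-style merge of the two CSV files could look like the sketch below, which uses only the standard `csv` module. The column names `plate`, `owner` and `phone` are hypothetical, since the paper does not give the file layout. The functions take CSV text rather than file paths so that the example is self-contained.

```python
import csv
import io
from typing import Dict, List


def merge_violations(owners_csv: str, plates_csv: str) -> List[Dict[str, str]]:
    """Attach owner details to each recognised plate (a VLOOKUP on 'plate').

    Plates with no matching owner are kept with empty owner fields so that
    they can be reviewed manually.
    """
    owners = {
        row["plate"].strip().upper(): row
        for row in csv.DictReader(io.StringIO(owners_csv))
    }
    merged = []
    for row in csv.DictReader(io.StringIO(plates_csv)):
        plate = row["plate"].strip().upper()
        owner = owners.get(plate)
        merged.append({
            "plate": plate,
            "owner": owner["owner"] if owner else "",
            "phone": owner["phone"] if owner else "",
        })
    return merged


def to_csv(rows: List[Dict[str, str]]) -> str:
    """Serialise merged rows to CSV text, ready to be saved as a third file."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["plate", "owner", "phone"],
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Building a dictionary keyed on the plate number makes each lookup constant time, so the merge scales linearly with the number of recognised plates.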

A. Experiments
The images for the system were captured using mobile phones. The whole system was implemented in Python on the Ubuntu 20.04 operating system. The system aims at detecting vehicles, license plates and helmets. If a violation is detected, the corresponding license plate is detected for further action. Training was done on a custom dataset created from scratch. For each label, i.e. Helmet, Human face, Person, Motorcycle and License plate, 1000 images were manually captured and labeled using LabelImg to obtain the coordinates of the bounding boxes. In total, around 6000 images were collected and used for training. Training ran for 10,000 iterations with a batch size of 64. It took a little over 8 days to complete, and the resulting model had a loss of 0.85.
The next step is detection and taking action, as shown in Fig. 5.1. Here the live video feed is sampled and each frame is checked for objects by our trained YOLO network. The detected objects could be a two-wheeler, a four-wheeler, a helmet, a license plate, etc. These detections can then be used for taking actions. Figure 2 shows the detection of a rider wearing a helmet. Figure 3 shows a rider not wearing a helmet. Since this rider is not wearing a helmet, the bottom half of the image is saved for the next step, which is license plate extraction.

CONCLUSION
Road safety is emerging as a major social concern in the country, and the government has been attempting to tackle the issue for many years. The biggest problem is the lack of strict enforcement of traffic laws, no matter how minor the infringement. The more drivers become aware that they can constantly get away with breaking the rules, the more this behaviour spreads to other drivers. For safety efforts to succeed, people need to follow the rules more consistently.
One law that should be strictly monitored and enforced is the wearing of helmets by motorcycle riders. Wearing a helmet reduces the risk of concussion, brain damage and even death by reducing the impact of a force or collision to the head. A helmet is therefore a protection that one should not ignore while riding. This system checks that motorcycle riders are wearing a helmet and, if not, issues a penalty. This helps to improve road safety and transportation efficiency and reduces the hassle of the manual operations involved in helmet detection and the issuing of penalties. Moreover, it encourages people to abide by the law and set an example for future generations. The system identifies violators who are not wearing a helmet by extracting the license plate and matching it with the database.

ACKNOWLEDGMENT
First of all, we thank God Almighty for his enlightening presence throughout the work and for helping us complete it successfully. We are obliged to our guide, Ms. Ancy S. Anselam, Associate Professor in the Department of ECE, for all the help and guidance given to us in doing this project.