Human Face Recognition Using Image Processing


Khushbu Pandey1, Reshma Lilani2, Pooja Naik3, Geeta Pol4 Electronics & Telecommunication Engineering Department, KCCEMSR, Thane, India

1 khushipandey05@,2 reshmalilani1@,3 Naik.Pooja60@


Image compression is a relatively recent technique based on representing an image by a contractive transform, on the space of images, for which the fixed point is close to the original image. The aim is to discover which techniques are the most efficient and best apply to the project undertaken. Face recognition is a computer application for automatically identifying or verifying a person from a digital image or a video frame from a video source. This paper presents real-time image processing for human face identification by a home service robot (HSR). The vision system is made up of two individual sub-systems. The first is a face detection and tracking sub-system based on an adaptive skin detector, a condensation filter with parallel-computed particles, and a Haar-like classifier. A simple and fast motion predictor is also proposed for face tracking.


    Image processing is a method to convert an image into digital form and perform operations on it, in order to obtain an enhanced image or to extract useful information from it. It is a type of signal processing in which the input is an image, such as a video frame or photograph, and the output may be an image or characteristics associated with that image. An image processing system usually treats images as two-dimensional signals and applies established signal processing methods to them. It is among the most rapidly growing technologies today, with applications in many aspects of business. It involves three basic steps: importing the image with an optical scanner or by digital photography; analyzing and manipulating the image, which includes data compression, image enhancement, and spotting patterns that are not visible to the human eye, as in satellite photographs; and output, the last stage, in which the result can be an altered image or a report based on the image analysis. Computer vision (CV) is computer imaging where the application does not involve a human being in the visual loop. One of the major topics within this field is image analysis. Image analysis involves examining the image data to facilitate solving a vision problem. It includes two further topics: feature extraction, the process of acquiring higher-level image information such as shape or color; and pattern classification, the act of taking this higher-level information and identifying objects within the image. Face recognition has repeatedly shown its importance over recent years, so it is not only a lively research area of image analysis and pattern recognition (more precisely, biometrics), but it has also become part of our everyday lives since it was introduced as one of the identification methods used in e-passports [16]. Our project applies image processing to identify persons by a robot on a real-time basis. We use an image processing technique that can detect multiple faces; it effectively tracks human faces and detects them [6]. The system works by recognizing human faces and then driving a relay on the basis of its result. Software is created, along with hardware, that recognizes the human face using various algorithms. The algorithms compare predefined or learned images with the real video images. The final aim is to improve the current face recognition system, making it more efficient and robust [6].


    1. Face detection is the most fundamental step for automated face analysis. This step can be considered a sub-system whose input is the images from the camera and whose output is the location and size of faces. The face detection output can be the input of a face recognition, face tracking, face authentication, facial expression recognition, or facial gesture recognition system. If the face image is given with its size and location in the frame, we can normalize the scale, illumination, or orientation to continue the face analysis. However, the human face is a dynamic object, so many classes of approach have been proposed to solve this problem. The three main classes are skin-color-based, shape-based, and feature-based. The skin-color-based approach uses the property of skin color distribution in a color space: given a skin color model in a color space, we can build a skin color filter that retains the pixels in the range of the skin color domain. The second class, the shape-based approach, uses a shape model to detect faces, for example by trying to match an ellipse with the edges of the image, on the assumption that the face edge is similar to an ellipse. Our face detection system adopts the Haar classifier approach to detect human faces. The Haar classifier uses a form of AdaBoost and belongs to the feature-based class. It uses Haar-like features, which consist of adding and subtracting image regions, and the integral image technique enables rapid computation.
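The integral-image trick mentioned above can be sketched in a few lines of NumPy (the function names are ours, not OpenCV's API): a summed-area table turns any rectangle sum, and hence a two-rectangle Haar-like feature, into a handful of array lookups.

```python
import numpy as np

def integral_image(img):
    """Summed-area table: ii[y, x] = sum of img[:y+1, :x+1]."""
    return np.cumsum(np.cumsum(img, axis=0), axis=1)

def rect_sum(ii, x, y, w, h):
    """Sum of pixels in the w*h rectangle with top-left corner (x, y),
    using four lookups into a zero-padded integral image."""
    ii = np.pad(ii, ((1, 0), (1, 0)))  # pad so (x, y) = (0, 0) works
    return ii[y + h, x + w] - ii[y, x + w] - ii[y + h, x] + ii[y, x]

def haar_two_rect(img, x, y, w, h):
    """A two-rectangle Haar-like feature: left half minus right half."""
    ii = integral_image(img)
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)
```

However the detection window slides, the integral image is computed once per frame, so each feature costs only a constant number of lookups.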


      This generation of our face detection system, called the Parallel Haar-like Face Detection System (PHFDS), consists of several processes: search region of interest (ROI) determination by a motion predictor, adaptive skin detection, a condensation filter with parallel computing of particle confidences, parallel Haar-like wavelet classification based on AdaBoost implemented with OpenCV, and prediction of the motion for the next time step.

      Fig. 1 shows the flowchart of the PHFDS.

    3. Determine the Region of Interest (ROI) of Image

    The ROI is a region of the image that is of interest and is the only part allowed to be processed. The ROI concept is a kind of local search and a very useful tool to reduce computation and increase the object hit rate. The first advantage is easy to understand, and the second is an important basis of our motion tracking. Given a video or sequence of images, we can assume the motion of the human or object is continuous, meaning that the human or object cannot disappear or appear suddenly. This combines easily with the ROI concept: we can set an ROI slightly bigger than the last region in which the human or object was detected. If a disturbance does not appear inside the ROI, it will not be detected, which increases robustness. In the real world, because the webcam has a maximum frame rate constraint (30 fps), the human or object sometimes moves too fast to track. In that situation, we can reinitialize our motion tracker back to the global search mode. This means we obtain a very low miss rate with high performance.
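The ROI update described above might look like the following sketch (our own names and margin value; boxes are assumed to be (x, y, w, h) tuples): the last detection box is grown by a margin and clamped to the frame, and losing track falls back to a global full-frame search.

```python
def next_roi(last_box, frame_w, frame_h, margin=0.5):
    """Given the last detected face box (x, y, w, h), return a slightly
    larger search ROI clamped to the frame; a lost track triggers a
    global (full-frame) search."""
    if last_box is None:                      # lost track: global search
        return (0, 0, frame_w, frame_h)
    x, y, w, h = last_box
    dx, dy = int(w * margin), int(h * margin)
    x0, y0 = max(0, x - dx), max(0, y - dy)
    x1 = min(frame_w, x + w + dx)
    y1 = min(frame_h, y + h + dy)
    return (x0, y0, x1 - x0, y1 - y0)
```

Searching only this region both cuts computation and rejects disturbances outside the ROI, exactly as argued above.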


    The software work includes the installation of OpenCV with the algorithm, which will first detect the images and learn them. A database will be created containing different images. The recognition is done in the following steps:

    1. Face detection: This generation of our face detection system, called the parallel Haar-like face detection system, consists of several processes: search region of interest (ROI) determination by a motion predictor, adaptive skin detection, parallel Haar-like wavelet classification based on AdaBoost implemented with OpenCV, and prediction of the motion for the next time step.

    2. Facial skin colour model: HSV stands for hue, saturation and value (or brightness); it is particularly common in colour analysis and corresponds intuitively to the human colour system [2]. We adopt an adaptive skin colour detector proposed by Dadgostar and Sarrafzadeh. The algorithm is based on adaptive hue thresholding and a hue histogram of skin pixels. To avoid an undefined mapping, we set H, S, and V to zero when R, G, and B are all equal to zero.

    3. Determine the region of interest (ROI) of the image: The ROI is the region of the image that is of interest and is the only part allowed to be processed. The ROI concept is a kind of local search and a very useful tool to reduce computation and increase the object hit rate. The first advantage is easy to understand, and the second is an important basis of our motion tracking [6]. Given a video or sequence of images, we can assume the motion of the human or object is continuous, meaning that it cannot disappear or appear suddenly. This combines easily with the ROI concept: we can set an ROI slightly bigger than the last region in which the human or object was detected. If a disturbance does not appear inside the ROI, it will not be detected, which increases robustness. In the real world, because the webcam has a maximum frame rate constraint (30 fps), the human or object sometimes moves too fast to track. In that situation, we can reinitialize our motion tracker back to the global search mode. This means we obtain a very low miss rate with high performance.

    4. Creation of a database: After extracting the features of a face, they are stored in the database with an id using the OpenCV library.

    5. Face recognition: The recognition process involves a robot which detects the face using the PCA, LDA and LBPH algorithms, which are built into the OpenCV library for face recognition. The robot moves and captures images on a real-time basis and again performs the face detection process. The robot is a wheeled robot with a castor wheel running at 10 rpm. The speed should be slow so that the camera can detect faces at a proper resolution.
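As a sketch of the colour conversion rule in step 2, the following pure-Python helpers (stdlib `colorsys`; the skin thresholds are illustrative placeholders, not the adaptive Dadgostar-Sarrafzadeh values) show the zero-handling described above:

```python
import colorsys

def rgb_to_hsv_safe(r, g, b):
    """Map 8-bit RGB to HSV, setting H = S = V = 0 when R = G = B = 0
    to avoid the undefined hue at black (the rule from step 2)."""
    if r == g == b == 0:
        return (0.0, 0.0, 0.0)
    return colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)

def is_skin(h, s, v, h_lo=0.0, h_hi=0.11, s_min=0.2, v_min=0.35):
    """Toy fixed-threshold skin test on hue; the adaptive detector would
    instead update h_lo/h_hi from the hue histogram of recent skin pixels."""
    return h_lo <= h <= h_hi and s >= s_min and v >= v_min
```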
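Step 4's database could be sketched as below; the pickle file and function name are our own illustration, standing in for the per-person training data the system stores through the OpenCV library.

```python
import pickle

def save_face(db_path, person_id, feature_vector):
    """Append one (id, feature) pair to a pickle 'database' -- a stand-in
    for the per-person face training data described in step 4."""
    try:
        with open(db_path, "rb") as f:
            db = pickle.load(f)
    except FileNotFoundError:
        db = {}                               # first run: empty database
    db.setdefault(person_id, []).append(feature_vector)
    with open(db_path, "wb") as f:
        pickle.dump(db, f)
    return db
```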
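To make step 5's LBPH feature concrete, here is a minimal NumPy sketch of the Local Binary Pattern histogram that LBPH-style recognizers compare between faces (a from-scratch illustration, not OpenCV's implementation):

```python
import numpy as np

def lbp_code(patch):
    """LBP code of a 3x3 patch: compare the 8 neighbours (clockwise from
    top-left) against the centre pixel and pack the bits into one byte."""
    c = patch[1, 1]
    order = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    bits = [1 if patch[y, x] >= c else 0 for y, x in order]
    return sum(b << i for i, b in enumerate(bits))

def lbp_histogram(img):
    """256-bin histogram of LBP codes over the image interior -- the kind
    of descriptor an LBPH recognizer matches against its database."""
    h, w = img.shape
    hist = np.zeros(256, dtype=int)
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            hist[lbp_code(img[y - 1:y + 2, x - 1:x + 2])] += 1
    return hist
```

Two face images are then compared by a histogram distance; OpenCV's LBPH recognizer additionally splits the face into a grid and concatenates per-cell histograms.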


    1. 89C52 Microcontroller: This project uses the 89C52 microcontroller. This unit is the heart of the complete system and is responsible for all the processes being executed. It monitors and controls all the peripheral devices and components connected in the system. In short, the complete intelligence of the project resides in the software code embedded in the microcontroller. Atmel fabricated the flash ROM version of the 8051, popularly known as the AT89C52 (the C in the part number indicates CMOS). The flash memory can be erased within seconds, which is best for rapid development. Therefore, the 8751 is replaced by the AT89C52 to eliminate the waiting time required to erase the contents and hence expedite development [11]. To build a microcontroller-based system using the AT89C52, it is essential to have a ROM burner that supports flash memory. Note that in flash memory, the entire contents must be erased to program it again; the contents are erased by the ROM burner. Atmel is working on a newer version of the AT89C52 that can be programmed through the serial COM port of an IBM PC in order to eliminate the need for a ROM burner.

    2. L293D Motor Driver: It is a dual H-bridge motor driver integrated circuit (IC). Motor drivers act as current amplifiers since they take a low-current control signal and provide a higher-current signal. This higher-current signal is used to drive the motors. The L293D contains two inbuilt H-bridge driver circuits. In its common mode of operation, two DC motors can be driven simultaneously, both in forward and reverse direction. The operations of the two motors can be controlled by input logic at pins 2 & 7 and 10 & 15. Input logic 00 or 11 will stop the corresponding motor. Logic 01 and 10 will rotate it in clockwise and anticlockwise directions, respectively. Enable pins 1 and 9 (corresponding to the two motors) must be high for the motors to start operating. When an enable input is high, the associated driver is enabled; as a result, its outputs become active and work in phase with their inputs. Similarly, when the enable input is low, that driver is disabled, and its outputs are off and in the high-impedance state [11].

    3. LCD (Liquid Crystal Display): It is an electronic display module and finds a wide range of applications. A 16×2 LCD display is a very basic module and is very commonly used in various devices and circuits. These modules are preferred over seven-segment and other multi-segment LEDs, the reasons being that LCDs are economical, easily programmable, and have no limitation in displaying special and even custom characters (unlike seven segments), animations and so on. A 16×2 LCD can display 16 characters per line, and there are 2 such lines. In this LCD, each character is displayed in a 5×7 pixel matrix. This LCD has two registers, namely Command and Data. The command register stores the command instructions given to the LCD. A command is an instruction given to the LCD to do a predefined task such as initializing it, clearing its screen, setting the cursor position, or controlling the display. The data register stores the data to be displayed on the LCD. The data is the ASCII value of the character to be displayed [10].

    4. MAX232 IC: It is used to convert TTL/CMOS logic levels to RS232 logic levels during serial communication between microcontrollers and a PC. The controller operates at TTL logic levels (0-5 V), whereas serial communication on a PC works on the RS232 standard (-25 V to +25 V). This makes it difficult to establish a direct link between them. The intermediate link is provided by the MAX232. It is a dual driver/receiver that includes a capacitive voltage generator to supply RS232 voltage levels from a single 5 V supply. Each receiver converts RS232 inputs to 5 V TTL/CMOS levels. These receivers (R1 and R2) can accept ±30 V inputs. The drivers (T1 and T2), also called transmitters, convert TTL/CMOS input levels into RS232 levels. The transmitters take input from the controller's serial transmission pin and send the output to the RS232 receiver. The receivers, on the other hand, take input from the transmission pin of the RS232 serial port and give serial output to the microcontroller's receiver pin. The MAX232 needs four external capacitors whose values range from 1 µF to 22 µF.

    5. Power Supply: This unit supplies the various voltage requirements of each unit. It consists of a transformer, rectifier, filter and regulator. The rectifier used here is a bridge rectifier.
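The L293D input logic described in item 2 can be summarized as a small truth-table function; this is an illustrative Python sketch of the behaviour, not firmware code:

```python
def motor_action(in1, in2, enable):
    """Truth table for one motor of the L293D, as described in item 2:
    EN low disables the outputs (high impedance); equal inputs (00/11)
    stop the motor; 01/10 select the rotation direction."""
    if not enable:
        return "off (high impedance)"
    if in1 == in2:
        return "stop"
    return "clockwise" if (in1, in2) == (0, 1) else "anticlockwise"
```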


    OpenCV (Open Source Computer Vision Library) is an open source computer vision and machine learning software library. OpenCV was built to provide a common infrastructure for computer vision applications and to accelerate the use of machine perception in commercial products. Being a BSD-licensed product, OpenCV makes it easy for businesses to utilize and modify the code. The library has more than 2500 optimized algorithms, which include a comprehensive set of both classic and state-of-the-art computer vision and machine learning algorithms [7]. These algorithms can be used to detect and recognize faces, identify objects, classify human actions in videos, track camera movements, track moving objects, extract 3D models of objects, produce 3D point clouds from stereo cameras, stitch images together to produce a high-resolution image of an entire scene, find similar images in an image database, remove red eyes from images taken using flash, follow eye movements, recognize scenery and establish markers to overlay it with augmented reality, etc. OpenCV has a user community of more than 47 thousand people and an estimated number of downloads exceeding 7 million. The library is used extensively in companies, research groups and by governmental bodies [10]. Along with well-established companies like Google, Yahoo, Microsoft, Intel, IBM, Sony, Honda and Toyota that employ the library, there are many start-ups, such as Applied Minds, VideoSurf and Zeitera, that make extensive use of OpenCV. OpenCV's deployed uses span the range from stitching street-view images together, detecting intrusions in surveillance video in Israel, monitoring mine equipment in China, helping robots navigate and pick up objects at Willow Garage, detecting swimming pool drowning accidents in Europe, running interactive art in Spain and New York, checking runways for debris in Turkey, and inspecting labels on products in factories around the world, to rapid face detection in Japan [11].

    It has C++, C, Python, Java and MATLAB interfaces and supports Windows, Linux, Android and Mac OS. OpenCV leans mostly towards real-time vision applications and takes advantage of MMX and SSE instructions when available. Full-featured CUDA and OpenCL interfaces are being actively developed. There are over 500 algorithms and about 10 times as many functions that compose or support those algorithms. OpenCV is written natively in C++ and has a templated interface that works seamlessly with STL containers [10].

    1. .NET: The .NET Framework is a software framework developed by Microsoft that runs primarily on Microsoft Windows. It includes a large class library and provides language interoperability (each language can use code written in other languages) across several programming languages. Programs written for the .NET Framework execute in a software environment (as contrasted with a hardware environment) known as the Common Language Runtime (CLR), an application virtual machine that provides services such as security, memory management, and exception handling. The class library and the CLR together constitute the .NET Framework [11]. The .NET Framework's Base Class Library provides user interface, data access, database connectivity, cryptography, web application development, numeric algorithms, and network communications. Programmers produce software by combining their own source code with the .NET Framework and other libraries. The .NET Framework is intended to be used by most new applications created for the Windows platform. Microsoft also produces an integrated development environment, largely for .NET software, called Visual Studio.

    2. C#: It is a multi-paradigm programming language encompassing strong typing and imperative, declarative, functional, procedural, generic, object-oriented (class-based), and component-oriented programming disciplines. It was developed by Microsoft within its .NET initiative and later approved as a standard by Ecma (ECMA-334) and ISO (ISO/IEC 23270:2006). C# is one of the programming languages designed for the Common Language Infrastructure [10]. C# is intended to be a simple, modern, general-purpose, object-oriented programming language.


    Fig 2. System Block Diagram


    Fig 3. Detecting Face in Real Time

    Fig 4. Detect the face when the person is moving

    Fig 5. Cropping for matching with database

    The result of our project is a system able to detect faces on a real-time basis; it could finally be applied on a wide platform to verify the identity of a person on particular premises.


      1. This system can be used as a home service robot (HSR).

      2. It can also be used as a face detector in restricted organisations.

      3. Other features can also be added to the system in order to use it for security purposes in homes as well as other institutes.

      4. It can be used in electronic gadgets such as PCs, laptops, mobile phones, etc. in order to obtain a better security level.


    1. It can be used on college or company premises for better security, in order to recognise valid students or candidates.

    2. It can also be used on a door for identifying the known persons of a home.


      1. Better algorithms can be developed for better performance.

      2. A larger database can be built so that more images can be learnt.

      3. Cheaper components can be proposed.


On the basis of this project, we can conclude that with the two mentioned methods we can implement the learning and detection procedure for the robot. It presents a real-time parallel vision system for a service robot. The vision system is set up from individual sub-systems: a face detection and tracking sub-system based on an adaptive skin detector, with a simple and fast motion predictor for face tracking. This system is useful in many robot applications, for example face detection, face tracking and face determination. The second is the face recognition system based on algorithms such as PCA and its recognition procedures. It is robust and efficient in recognising many people online in different views and unknown scenes. The future scope for the project is the creation of a much wider database, i.e., with larger space, which can recognise more human faces with more precise algorithms.


[1] Facial Recognition Application. Animetrics. Retrieved 2008-0-04.

[2] Bonsor, K. How Facial Recognition Systems Work. Retrieved 2008-0-02.

[3] Smith, Kelly. Face Recognition.

[4] R. Brunelli and T. Poggio. Face Recognition: Features versus Templates.

[5] Kimmel, Ron. Three-Dimensional Face Recognition.

[6] Real-Time Image Processing Using a Home Surveillance Robot.

[7] Crawford, Mark. Better Face Recognition Software.

[8] Greene, Lisa. Face Scans Match Few Suspects.

[9] IEEE Trans. on PAMI, 1993, 15(10): 1042-1052.

[10] s.aspx?dDocName=en01029

[12] Image Processing with Neural Networks: A Review. M. Egmont-Petersen, D. de Ridder, H. Handels.

[13] A Survey on Image-Based Rendering: Representation, Sampling and Compression. Cha Zhang and Tsuhan Chen, Advanced Multimedia Processing Lab.

[14] A Review of the Fractal Image Coding Literature. Brendt Wohlberg and Gerhard de Jager, Member, IEEE.

[15] Image Enhancement. Ashish Mehta.

[16] Review of "Image Processing and Analysis: Variational, PDE, Wavelet, and Stochastic Methods" by Tony F. Chan and Jianhong (Jackie) Shen.

[17] Image Compression in Face Recognition: A Literature Survey. Kresimir Delac, Sonja Grgic and Mislav Grgic, University of Zagreb, Faculty of Electrical Engineering and Computing, Croatia.

[18] Comparative Review of Image Processing and Computer Vision. Bruce A. Maxwell, University of North Dakota, Department of Computer Science, Grand Forks, ND 58202-9015.
