An Optimistic Approach for Implementing Viola Jones Face Detection Algorithm in Database System and in Real Time

DOI : 10.17577/IJERTV4IS070758


Shaily Pandey
Student, Computer Science and Engineering, PSIT, Kanpur, India

Sandeep Sharma
Assistant Professor, Computer Science and Engineering, PSIT, Kanpur, India

Abstract–This paper presents human face detection both on a database of stored images and in real time, using the Viola-Jones algorithm implemented in MATLAB. The focus is only on tracking human faces; entities other than human faces are ignored. The approach is hybrid in that it supports both real-time and database detection. However, only one of the two modes can run at any given instant; tracking in both modes simultaneously is not possible.

Faces are complex, multidimensional, meaningful visual stimuli, so developing a model for face detection is difficult. We present a Viola-Jones face detection approach, which identifies a face image from the unique features of faces. The technique performs real-time human face detection and tracking using a modified version of the algorithm suggested by Paul Viola and Michael Jones. The paper starts with an introduction to human face detection and tracking, followed by an explanation of the Viola-Jones algorithm and a discussion of its implementation in real video applications. The Viola-Jones algorithm is based on object detection by extracting specific features from the image. We use the same approach for real-time human face detection and tracking. The algorithm computes its results in a fraction of a second.

Keywords- Face detection, Face recognition, Haar features, Integral image, Cascade architecture, MATLAB, AdaBoost algorithm.

This work uses the Viola-Jones algorithm. The framework is demonstrated on, and in part motivated by, the task of face detection.

Toward this end we have constructed a frontal face detection system which achieves competitive detection and false positive rates. Existing face detection algorithms may be divided into two main approaches. The first is based on skin color; the second is based on facial features.

The problem is further complicated by differing lighting conditions, image qualities and geometries, as well as the possibility of partial occlusion and disguise. An ideal face detector would therefore be able to detect the presence of any face under any set of lighting conditions, upon any background. Among the face-based detection algorithms, the one based on the Viola-Jones object detection approach [1] has been shown to be the most robust to environmental lighting changes, and it has therefore been implemented in hardware in digital camera products.



      Face detection involves separating image windows into two classes: one containing faces (targets) and one containing the background (clutter).

      It is difficult because, although commonalities exist between faces, they can vary considerably in terms of age, skin color and facial expression.

      This paper presents a hybrid approach to human face tracking.

      The Viola-Jones framework was the first object detection framework to provide competitive object detection rates in real time. Although it can be trained to detect a variety of objects, it was motivated primarily by face detection. Its algorithm is implemented in OpenCV. It is robust, meaning a very high detection rate (true-positive rate) and a very low false-positive rate. It is also real time, meaning that for practical applications at least 2 frames per second must be processed.

      Finally, it performs face detection and not recognition: the goal is to distinguish faces from non-faces (detection is the first step in the identification process).

      The Viola-Jones algorithm has four main stages:

      1. Haar feature selection

      2. Creating an integral image

      3. AdaBoost training

      4. Cascaded classifiers

      There are three main contributions of our object detection framework. We introduce each of these ideas briefly below and then describe them in detail in subsequent parts.


      The Viola-Jones face detection method uses combinations of simple Haar-like features to classify faces. Haar-like features are rectangular digital image features that get their name from their similarity to Haar wavelets.

      The value of a two-rectangle feature is the difference between the sums of the pixels within two rectangular regions. The regions have the same size and shape and are horizontally or vertically adjacent (see Figure 1). A three-rectangle feature computes the sum within two outside rectangles subtracted from the sum in a center rectangle. Finally, a four-rectangle feature computes the difference between diagonal pairs of rectangles.

      The value of a rectangular feature can be evaluated as Value = Σ(pixels in black area) − Σ(pixels in white area).

      In conjunction with the integral image, the efficiency of the rectangle feature set provides ample compensation for its limited flexibility.

      The first contribution of this paper is a new image representation called an integral image that allows for very fast feature evaluation. Motivated in part by the work of Papageorgiou et al. [2], our detection system does not work directly with image intensities. Our object detection procedure classifies images based on the value of simple features. The integral image can be computed from an image using a few operations per pixel. Once computed, any one of these Haar-like features can be computed at any scale or location in constant time.

      The integral image at location (x, y) contains the sum of the pixels above and to the left of (x, y), inclusive:

      $$ii(x, y) = \sum_{x' \le x,\; y' \le y} i(x', y') \qquad (1)$$

      where ii(x, y) is the integral image and i(x, y) is the original image intensity.

      Fig. 3. ii(x,y) = sum of image intensities in shaded area
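The integral image and the constant-time rectangle sum can be illustrated with a minimal pure-Python sketch. The function names (`integral_image`, `rect_sum`, `two_rect_feature`) and the choice of the left half as the "black" area are assumptions made for illustration; they are not taken from the paper's MATLAB implementation. Any rectangle sum needs at most four lookups in the integral image, whatever the rectangle size.

```python
def integral_image(img):
    """Return ii where ii[y][x] is the sum of img[y'][x'] for all y' <= y, x' <= x."""
    height, width = len(img), len(img[0])
    ii = [[0] * width for _ in range(height)]
    for y in range(height):
        row_sum = 0  # cumulative sum of the current row up to column x
        for x in range(width):
            row_sum += img[y][x]
            ii[y][x] = row_sum + (ii[y - 1][x] if y > 0 else 0)
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of the w-by-h rectangle with top-left corner (x, y), using four lookups."""
    def at(xx, yy):
        # Positions outside the image (index -1) contribute zero.
        return ii[yy][xx] if xx >= 0 and yy >= 0 else 0

    x1, y1 = x + w - 1, y + h - 1
    return at(x1, y1) - at(x - 1, y1) - at(x1, y - 1) + at(x - 1, y - 1)


def two_rect_feature(ii, x, y, w, h):
    """Horizontal two-rectangle feature: left half (black) minus right half (white)."""
    half = w // 2
    black = rect_sum(ii, x, y, half, h)
    white = rect_sum(ii, x + half, y, half, h)
    return black - white


image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]
ii = integral_image(image)
print(ii[2][3])                          # sum of the whole image: 78
print(rect_sum(ii, 1, 1, 2, 2))          # 6 + 7 + 10 + 11 = 34
print(two_rect_feature(ii, 0, 0, 4, 3))  # 33 - 45 = -12
```
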


      Fig. 1. Various Haar-like features.

      Fig. 2. Third and fourth kinds of Haar feature.


      Many different machine-learning techniques can be used to train a classification function. The Viola-Jones method uses a variation of the AdaBoost algorithm, formulated by Freund and Schapire [3], to select a small set of critical features to form an effective classifier. For best results, the training data should contain images over a range of lighting conditions and facial properties (e.g., skin color, glasses, facial hair). The set of possible features in a given sub-window is huge.


        The learning stage uses the AdaBoost training algorithm. The speed with which features may be evaluated does not adequately compensate for their number, however. For example, in a standard 24×24 pixel sub-window, there are a total of M = 162,336 possible features, and it would be prohibitively expensive to evaluate them all when testing an image. Thus, the object detection framework employs a variant of the learning algorithm AdaBoost both to select the best features and to train classifiers that use them. This algorithm constructs a strong classifier as a weighted linear combination of simple weak classifiers.


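        $$h(x) = \operatorname{sign}\Big(\sum_{j} \alpha_j h_j(x)\Big) \qquad (2)$$

        Each weak classifier is a threshold function based on the feature $f_j$:

        $$h_j(x) = \begin{cases} -s_j & \text{if } f_j < \theta_j \\ s_j & \text{otherwise} \end{cases} \qquad (3)$$

        where the threshold $\theta_j$ and the polarity $s_j \in \{-1, +1\}$ are determined in the training, as well as the coefficients $\alpha_j$.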
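How weak threshold classifiers combine into a strong classifier can be sketched as follows. This is the textbook discrete AdaBoost with labels in {-1, +1} and an exhaustive search over thresholds, not the exact variant used by Viola and Jones; the function names and the toy feature table are assumptions made for illustration only.

```python
import math


def weak_classify(feature_value, threshold, polarity):
    """Threshold function: -polarity below the threshold, +polarity otherwise."""
    return -polarity if feature_value < threshold else polarity


def best_stump(features, labels, weights):
    """Return (error, feature index, threshold, polarity) with the lowest weighted error."""
    best = None
    for j in range(len(features[0])):
        for threshold in sorted({row[j] for row in features}):
            for polarity in (1, -1):
                error = sum(
                    w for row, y, w in zip(features, labels, weights)
                    if weak_classify(row[j], threshold, polarity) != y
                )
                if best is None or error < best[0]:
                    best = (error, j, threshold, polarity)
    return best


def train_adaboost(features, labels, rounds):
    """Select one weak classifier per round and return [(alpha, j, threshold, polarity)]."""
    n = len(labels)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        error, j, threshold, polarity = best_stump(features, labels, weights)
        error = min(max(error, 1e-10), 1 - 1e-10)  # keep the logarithm finite
        alpha = 0.5 * math.log((1 - error) / error)
        ensemble.append((alpha, j, threshold, polarity))
        # Increase the weight of misclassified samples, then renormalize.
        weights = [
            w * math.exp(-alpha * y * weak_classify(row[j], threshold, polarity))
            for row, y, w in zip(features, labels, weights)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def strong_classify(ensemble, row):
    """Sign of the weighted sum of the weak classifier outputs."""
    score = sum(a * weak_classify(row[j], t, s) for a, j, t, s in ensemble)
    return 1 if score >= 0 else -1


# Toy data: each row holds two precomputed feature values; +1 = face, -1 = non-face.
features = [[1, 5], [2, 3], [3, 8], [6, 1], [7, 6], [8, 2]]
labels = [-1, -1, -1, 1, 1, 1]
ensemble = train_adaboost(features, labels, rounds=3)
print([strong_classify(ensemble, row) for row in features])
```

In this toy table the first feature separates the two classes on its own, so the first selected weak classifier already has zero training error.
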


      In general, a face detection algorithm based on AdaBoost may be divided into three major parts. First, the integral image is used to extract the rectangle features of faces. Second, weak classifiers are formed, each based on a single rectangle feature; the AdaBoost algorithm trains the weak classifiers and then combines several accurate features into a strong classifier that distinguishes more accurately between face and non-face patterns.

      Third, multiple strong classifiers are cascaded, with the most important ones first. In other words, the strong classifiers that are formed from the most important features and have the simplest structure are placed at the front, where they filter out numerous non-face sub-windows. Detection then focuses on the regions that are more likely to contain a human face, which greatly increases face detection speed.

      Extract face rectangle features → Form weak classifiers, each based on a single feature → Combine several weak classifiers into a strong classifier → Cascade several strong classifiers

      Fig. 4. The face detection algorithm flow based on several cascaded classifiers
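The early-rejection behaviour of the cascade can be sketched in a few lines. The stage structure used here, a list of `(alpha, feature index, threshold, polarity)` tuples plus a per-stage acceptance threshold, and all the numbers in it are hypothetical values chosen for illustration, not trained parameters.

```python
def stage_score(weak_classifiers, features):
    """Weighted vote of one strong classifier over precomputed feature values."""
    return sum(
        alpha * (-polarity if features[j] < threshold else polarity)
        for alpha, j, threshold, polarity in weak_classifiers
    )


def run_cascade(stages, features):
    """Return (accepted, stages_evaluated); stop at the first stage that rejects."""
    for depth, (weak_classifiers, stage_threshold) in enumerate(stages, start=1):
        if stage_score(weak_classifiers, features) < stage_threshold:
            return False, depth
    return True, len(stages)


# Hypothetical two-stage cascade: a cheap one-feature stage first, then a larger stage.
stages = [
    ([(1.0, 0, 10, 1)], 0.0),
    ([(0.6, 1, 5, 1), (0.4, 2, 3, -1)], 0.0),
]

print(run_cascade(stages, [12, 7, 1]))  # passes both stages
print(run_cascade(stages, [3, 7, 1]))   # rejected by the first stage
print(run_cascade(stages, [12, 2, 4]))  # rejected by the second stage
```

Most sub-windows of a typical image are background, so rejecting them after the first, cheapest stage is where the speed gain comes from.
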

    2. ALGORITHM The Original Viola-Jones Algorithm

      Input: A greyscale image, a scaling factor (s) and a scanning factor (p)

      Output: The location and size of each detected face

      size = detector.size
      while size ≤ image.height AND size ≤ image.width do
        for i from 0 to image.width - size in increments of p do
          for j from 0 to image.height - size in increments of p do
            if runCascade(subwindow of image of size size located at (i, j)) then
              add (size, i, j) to detection list
        size = RoundUp(size * s)
      return average of overlapping detections
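The scanning loop of the algorithm can be sketched in Python as follows. The cascade is passed in as a callable; `is_bright` is a hypothetical stand-in that accepts a window when its mean intensity is high, used only so the example runs without a trained detector. The name `scan_image` and the toy image are likewise assumptions for illustration.

```python
import math


def scan_image(image, base_size, scale_factor, step, run_cascade):
    """Slide a square window over the image at increasing sizes; return (size, i, j) hits."""
    if scale_factor <= 1:
        raise ValueError("scale_factor must be greater than 1")
    height, width = len(image), len(image[0])
    detections = []
    size = base_size
    while size <= height and size <= width:
        for i in range(0, width - size + 1, step):       # horizontal position
            for j in range(0, height - size + 1, step):  # vertical position
                window = [row[i:i + size] for row in image[j:j + size]]
                if run_cascade(window):
                    detections.append((size, i, j))
        size = math.ceil(size * scale_factor)
    return detections


def is_bright(window):
    """Hypothetical stand-in for a trained cascade: accept windows with mean above 0.9."""
    pixels = [p for row in window for p in row]
    return sum(pixels) / len(pixels) > 0.9


image = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
print(scan_image(image, base_size=2, scale_factor=1.5, step=1, run_cascade=is_bright))
```
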


      After the training process is complete, the detection algorithm is fairly straightforward. One simply has to scan all possible sub-windows of an image at a range of scales, running the cascade on each window. If a sub-window passes the final level of the cascade then the sub-window likely contains a face. It is important to note that the detector is invariant to small changes in scale and location, so there will often be many hits centered around each face. To filter out these duplicated results, overlapping hits can be averaged together to form a single detection.
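One way to average overlapping hits into a single detection is sketched below. Detections are `(size, x, y)` squares; grouping by simple box intersection is an assumption made here for illustration, and the paper does not specify how its implementation merges hits.

```python
def overlaps(a, b):
    """True if two square detections (size, x, y) intersect."""
    size_a, xa, ya = a
    size_b, xb, yb = b
    return (xa < xb + size_b and xb < xa + size_a and
            ya < yb + size_b and yb < ya + size_a)


def merge_detections(detections):
    """Group overlapping detections and average each group into one (size, x, y)."""
    groups = []
    for det in detections:
        merged = [det]
        remaining = []
        for group in groups:
            if any(overlaps(det, other) for other in group):
                merged.extend(group)     # join every group this hit touches
            else:
                remaining.append(group)
        remaining.append(merged)
        groups = remaining
    return [
        tuple(sum(values) / len(group) for values in zip(*group))
        for group in groups
    ]


hits = [(20, 10, 10), (20, 12, 10), (20, 14, 16), (20, 100, 50)]
print(merge_detections(hits))
```

The three hits near the top-left collapse into one averaged box, while the distant hit is kept as its own detection.
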

      Clearly, running the cascade on every possible sub-window is computationally expensive. Thus the amount by which scale and position are adjusted between tested sub-windows (the scanning factor) must be chosen to balance speed and accuracy.

      Viola and Jones achieved best results using a scaling factor of 1.25, and a scanning factor proportional to the current scale.


      In MATLAB R2013a, we used the Computer Vision System Toolbox System object vision.CascadeObjectDetector, which uses the Viola-Jones algorithm to detect people's faces, noses, eyes, mouths, or upper bodies. The trainCascadeObjectDetector function can also be used to train a custom classifier for use with this System object.

      We carried out our demonstration on a Dell laptop with a third-generation i5 processor and found that detection worked well. We developed the code for real-time detection and tracking of human faces. For this we used the imaqhwinfo command in MATLAB R2013a to get information about all the image acquisition adaptors installed on the laptop.

      >> imaqhwinfo
      ans =
          InstalledAdaptors: {'gentl'  'gige'  'matrox'  'winvideo'}
              MATLABVersion: '8.1 (R2013a)'
                ToolboxName: 'Image Acquisition Toolbox'
             ToolboxVersion: '4.5 (R2013a)'


      Configuration of the camera

      Input the video

      Human Face Detection and tracking algorithm using cascade object Detector

      Creating the bounding box for tracking using the shape inserter

      Output the Video


      Formation of Integral image


      Cascading of classifiers

    5. RESULTS

      Platform on which detection is performed.



      Single face detection in image

      Fig.5. Flowchart for real time detection and tracking algorithm



      Run the main program from the directory

      Select the .jpg file from the database

      Press TURN-ON

      If the picture is present, the detected face is shown

      Otherwise, an invalid-input message is shown


      Fig. 6. Face detection from the database

      Human Face detection in image.

      Real time Face detection in image


      The false positive rate is defined as the rate at which non-face regions are detected as faces for the given input images, whereas the correct detection rate is defined as the rate at which faces are correctly detected for the given set of input images.

      The ROC (Receiver Operating Characteristic) curve is a graphical plot that illustrates the performance of a binary classifier system as its discrimination threshold is varied. The curve is created by plotting the true positive rate against the false positive rate at various threshold settings.
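The construction of an ROC curve from classifier scores can be sketched as follows. The scores and labels are made-up illustrative values, not results from this paper, and `roc_points` is a name chosen here; each distinct score is used in turn as the discrimination threshold.

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) pairs, one per threshold."""
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative sample")
    points = [(0.0, 0.0)]  # threshold above every score: nothing is accepted
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


# Illustrative detector scores; label 1 = face window, 0 = non-face window.
scores = [0.9, 0.8, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0]
for fpr, tpr in roc_points(scores, labels):
    print(f"FPR={fpr:.2f}  TPR={tpr:.2f}")
```

Lowering the threshold moves along the curve from (0, 0) toward (1, 1): more faces are detected, at the cost of more false positives.
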


      The curve is plotted between the correct detection rate and the false positive rate, taking a set of images as input.


      After the first step of detection, the performance of the learning algorithm can be seen in the ROC curve between the correct detection rate and the false positive rate.


      We have presented an approach for face detection which minimizes computation time while achieving high detection accuracy. The approach was used to construct a face detection system which is approximately 15 times faster than any previous approach. Preliminary experiments, which will be described elsewhere, show that highly efficient detectors for other objects, such as pedestrians or automobiles, can also be constructed in this way. This paper brings together new algorithms, representations, and insights which are quite generic and may well have broader application in computer vision and image processing.

      This paper presents a set of detailed experiments on a difficult face detection and tracking data set which has been widely studied. This data set includes faces under a wide range of conditions, including illumination, scale, pose and camera variation. The systems that work under this algorithm are subject to the same set of conditions, but the algorithm is flexible enough to adjust to changing conditions.


  1. P. Viola and M. Jones, "Rapid Object Detection Using a Boosted Cascade of Simple Features," Proc. IEEE CVPR,


  2. C. Papageorgiou, M. Oren, and T. Poggio. A general framework for object detection. In International Conference on Computer Vision, 1998.

  3. Y. Freund and R. E. Schapire. A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences, 55, 1997.

  4. P. Viola and M. Jones. Robust real-time face detection. International Journal of Computer Vision, 57:137–154, 2004.

  5. Y. Freund and R. E. Schapire. A decision-theoretic generalization of on-line learning and an application to boosting. In Computational Learning Theory: Eurocolt 95, pages 23–37. Springer-Verlag, 1995.

  6. H. Schneiderman and T. Kanade. A statistical method for 3D object detection applied to faces and cars. In International Conference on Computer Vision, 2008.

  7. P. Viola and M. Jones. Robust real-time face detection. International Journal of Computer Vision, 57:137–154, 2004.

  8. O. Jesorsky, K. J. Kirchberg, and R. Frischholz. Robust face detection using the Hausdorff distance. In AVBPA 01: Proceedings of the Third International Conference on Audio- and Video-Based Biometric Person Authentication, pages 90–95, London, UK, 2001. Springer-Verlag.
