Depth AAM Based Feature Extraction

DOI : 10.17577/IJERTCONV2IS12016


Abidha Parveen.A, Sivasakthi.J, Jenitta.A, Department of Electronics and Communication Engineering, Idhaya Engineering College for Women, Chinnasalem, Villupuram, Tamilnadu, India.

Abstract- Feature extraction in video sequences plays an important role in face recognition, expression profile analysis, and human-computer interaction. Earlier systems were based on the curvelet transform and the wavelet transform, and the existing system uses the affine transform. Active Appearance Model (AAM) methods for facial feature localization have concentrated on fitting efficiency, with little concrete analysis of the characteristics of the initial position and the model instance, so their location accuracy and speed are not ideal. The main idea of the proposed method is to use a face detection algorithm with a web camera to accurately locate the human head and estimate the head pose. The head position is used to initialize the AAM global shape transformation, which guarantees that the model is fitted at the correct location. The depth AAM algorithm takes four channels (R, G, B and D) into consideration, combining the color and the depth of the input images. To locate facial features robustly and accurately, the weights of the RGB information and the D information in the global energy function are adjusted automatically. Region-based segmentation is also used to separate an individual's features. Experimental results show that the depth AAM algorithm can effectively and accurately locate facial features in video under complex background conditions.


    Facial feature extraction from video images aims to locate the exact positions and shapes of the components of a human face in the input images, including the eyes, nose, mouth and outline. It provides the foundation for face recognition, gesture and expression analysis, human-computer interaction studies and so on. Various facial feature extraction algorithms have recently been proposed; they can be divided into two categories depending on the data dimension: those based on 2D images and those based on 3D images. Because of the limitations of existing face detection technology, locating facial features in 2D images is strongly affected by complex backgrounds and varying poses. Locating facial features in 3D images requires capture systems that are usually very expensive to acquire or operate.

    Significant improvements have been made that have the potential to revolutionize many fields of research, including computer vision, computer graphics and human-computer interaction. The web camera has great advantages in human posture recognition, although it cannot locate facial features directly. However, the depth information provided by the camera plays an important role in head location and head pose estimation. The AAM algorithm has a profound mathematical background and excellent characterization capabilities, and it takes full advantage of a priori information about the object model. A large number of facial feature location algorithms have been proposed.

    This paper is organized as follows: Section II discusses the algorithm for head pose estimation. Section III presents the depth AAM, which supports the use of depth information. Section IV presents the implementation of the depth AAM algorithm with a web camera. The experimental results and a comparison with other experiments are presented in Section V.


    The fitting efficiency of the AAM algorithm depends closely on the initial position and the model instance, which directly limits its potential applications. A widely used approach to pose estimation is the AdaBoost method proposed by Viola and Jones and its improved versions. Face detection is implemented by combining a set of weak detectors, and threshold-based image processing is then used to process and analyze the detected face image. This approach has two drawbacks. First, AdaBoost detection of non-frontal faces has relatively low accuracy and is sensitive to the background and the body gesture. Matching is still carried out when the image does not contain a face, which wastes time. Second, processing the face image with a threshold value is highly subjective: lighting conditions have a serious impact and can easily cause estimation errors. Head pose estimation based on the depth image is more robust to complex backgrounds and varying body poses.
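To make the phrase "combination of weak detectors" concrete, the following is a minimal sketch of how a boosted ensemble votes. It assumes the common AdaBoost form in which each weak detector returns 0 or 1 and the strong classifier accepts when the weighted vote reaches half of the total weight. The stump thresholds, the weights and the names `make_stump` and `strong_classify` are illustrative assumptions, not values or functions from this paper.

```python
from typing import Callable, List, Sequence, Tuple

# A weak detector maps a feature vector to 0 (non-face) or 1 (face).
WeakDetector = Callable[[Sequence[float]], int]


def make_stump(index: int, threshold: float, polarity: int = 1) -> WeakDetector:
    """Hypothetical weak detector: thresholds a single feature value."""
    def stump(features: Sequence[float]) -> int:
        return 1 if polarity * features[index] < polarity * threshold else 0
    return stump


def strong_classify(features: Sequence[float],
                    ensemble: List[Tuple[float, WeakDetector]]) -> int:
    """Weighted vote: report a face if the vote reaches half the total weight."""
    total = sum(alpha for alpha, _ in ensemble)
    score = sum(alpha * detector(features) for alpha, detector in ensemble)
    return 1 if score >= 0.5 * total else 0


# Illustrative ensemble; in practice the weights come from boosting rounds.
ensemble = [
    (0.9, make_stump(0, 0.5)),               # fires when feature 0 < 0.5
    (0.6, make_stump(1, 0.2, polarity=-1)),  # fires when feature 1 > 0.2
    (0.3, make_stump(2, 0.7)),               # fires when feature 2 < 0.7
]

face_like = strong_classify([0.1, 0.9, 0.9], ensemble)   # two strongest stumps fire
background = strong_classify([0.8, 0.1, 0.9], ensemble)  # no stump fires
```

Because every candidate window is pushed through the same vote, the detector does its work whether or not a face is present, which is the wasted effort noted above.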

    Depth imaging technology has advanced dramatically over the last few years, especially with the launch of the web camera. Pixels in a depth image indicate the calibrated depth in the scene rather than a measure of intensity or color. The web camera offers several advantages over traditional intensity sensors: it works in low light levels, gives a calibrated scale estimate, is invariant to color and texture, and resolves silhouette ambiguities in pose. It also greatly simplifies the task of background subtraction. Pairs of depth and body part images are used as fully labeled data for learning the classifier. A boosting classifier is used for face detection, an SVM for face recognition, and a GPU for per-pixel parallel computation, which improves efficiency. A region-based function (RBF) is used to classify the image at the output.
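The claim that depth simplifies background subtraction can be illustrated with a short sketch: when each pixel stores a calibrated distance, the subject can be separated from the background with a simple range test. The depth range, the millimetre units and the function name `depth_foreground_mask` are assumptions made for this example only.

```python
from typing import List


def depth_foreground_mask(depth: List[List[float]],
                          near: float, far: float) -> List[List[int]]:
    """Mark pixels whose depth lies within [near, far] as foreground (1)."""
    return [[1 if near <= d <= far else 0 for d in row] for row in depth]


# Toy 4x4 depth image in millimetres: a subject at about 0.9 m
# in front of a wall at 2.5 m. A value of 0 stands for a missing reading.
depth_mm = [
    [2500, 2500, 2480, 2500],
    [2500,  900,  880, 2500],
    [2500,  910,  890, 2500],
    [0,    2500, 2500, 2500],
]

mask = depth_foreground_mask(depth_mm, near=500, far=1200)
```

The same test on an intensity image would need a background model; with depth, the wall and the missing reading both fall outside the range and are discarded directly.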


    In general the depth information is very accurate, though a closer look at the face region shows that it is still much noisier than laser-scanned results. The traditional AAM algorithm uses only the three color channels (RGB) as input, and facial feature location is inaccurate because of the lack of depth data. In this paper, we propose the depth AAM algorithm, which fits both the texture images and the depth images, both of which come from our web camera. The camera is capable of recording the depth of the image at high pixel resolution.
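The paper states that the weights of the RGB and D terms in the global energy function are adjusted automatically, but it gives neither the energy function nor the update rule. The sketch below is therefore only one plausible reading: it assumes the energy is a weighted sum of squared residuals and, as a purely hypothetical rule, weights each channel group by the inverse of its mean squared residual so that the noisier group counts for less. All function names are illustrative.

```python
from typing import Sequence, Tuple


def mean_sq(residuals: Sequence[float]) -> float:
    """Mean squared residual of one channel group."""
    return sum(r * r for r in residuals) / len(residuals)


def channel_weights(rgb_res: Sequence[float], d_res: Sequence[float],
                    eps: float = 1e-9) -> Tuple[float, float]:
    """Hypothetical rule: inverse mean squared residual, normalised to sum to 1."""
    inv_rgb = 1.0 / (mean_sq(rgb_res) + eps)
    inv_d = 1.0 / (mean_sq(d_res) + eps)
    total = inv_rgb + inv_d
    return inv_rgb / total, inv_d / total


def combined_energy(rgb_res: Sequence[float], d_res: Sequence[float],
                    w_rgb: float, w_d: float) -> float:
    """Assumed form: weighted sum of squared RGB and depth residuals."""
    return (w_rgb * sum(r * r for r in rgb_res)
            + w_d * sum(r * r for r in d_res))


# Toy residuals between the model instance and the input image.
rgb_residuals = [1.0, -1.0, 1.0, -1.0]  # mean squared residual 1
d_residuals = [2.0, -2.0]               # mean squared residual 4 (noisier)

w_rgb, w_d = channel_weights(rgb_residuals, d_residuals)
energy = combined_energy(rgb_residuals, d_residuals, w_rgb, w_d)
```

With these toy residuals the depth term receives roughly a quarter of the weight of the color term, which matches the observation above that depth is noisier in the face region.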

    Fig.1 Block Diagram

    A. Web Camera

    A webcam is a video camera that feeds its image in real time to a computer or computer network.

    B. Face Detection

    Face detection is a computer technology that determines the locations and sizes of human faces in arbitrary digital images. It detects facial features and ignores anything else, such as buildings, trees and bodies.

    Fig.2 Face Detection

    C. Affine Transform

    An affine transformation (also called an affine map or an affinity, from the Latin word affinis) is a function between affine spaces which preserves points, straight lines and planes.

    Fig.3 Affine Transform

    D. Preprocessing

    Image processing is any form of signal processing for which the input is an image, such as a photograph or video frame; the output of image processing may be either an image or a set of characteristics or parameters related to the image. Most image-processing techniques involve treating the image as a two-dimensional signal and applying standard signal-processing techniques to it. Image processing usually refers to digital image processing, but optical and analog image processing are also possible. The description here concerns general techniques that apply to all of them. Preprocessing helps to remove the distortion and noise present in the image.
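As a concrete illustration of the affine transform described in subsection C, the sketch below applies a 2x3 affine matrix to 2D points and checks that points lying on a straight line remain on a straight line after mapping. The matrix values and helper names are chosen for illustration and are not taken from the paper.

```python
from typing import List, Tuple

Point = Tuple[float, float]
# Rows (a, b, tx) and (c, d, ty) of the map x' = a*x + b*y + tx, y' = c*x + d*y + ty.
Matrix = Tuple[Tuple[float, float, float], Tuple[float, float, float]]


def apply_affine(m: Matrix, p: Point) -> Point:
    """Apply a 2D affine map (linear part plus translation) to one point."""
    (a, b, tx), (c, d, ty) = m
    x, y = p
    return (a * x + b * y + tx, c * x + d * y + ty)


def collinear(p: Point, q: Point, r: Point, tol: float = 1e-9) -> bool:
    """Three points are collinear when the cross product of (q-p) and (r-p) is zero."""
    cross = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return abs(cross) <= tol


# Illustrative matrix combining scaling, shear and translation.
m: Matrix = ((2.0, 1.0, 5.0), (0.0, 3.0, -2.0))

line: List[Point] = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
mapped = [apply_affine(m, p) for p in line]
# mapped == [(5.0, -2.0), (8.0, 1.0), (11.0, 4.0)]
still_straight = collinear(*mapped)
```

The middle point also stays the midpoint of the two end points after mapping, which is why landmark layouts such as eye and mouth corners keep their relative arrangement under this kind of transform.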

      1. Facial Feature Extraction

        In image processing, feature extraction is a special form of dimensionality reduction. When the input data to an algorithm is too large to be processed and is suspected to be highly redundant (e.g. the same measurement in both feet and meters), the input data is transformed into a reduced representation set of features (also called a feature vector). Transforming the input data into the set of features is called feature extraction. If the features are carefully chosen, the feature set is expected to capture the relevant information from the input data, so that the desired task can be performed using this reduced representation instead of the full-size input.

        Fig.4 Feature extraction
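The feet-and-meters example of redundancy can be made concrete with a small sketch. It drops any column that is almost perfectly correlated with a column already kept, which is one simple form of reduction and not necessarily the one used in this system. The data values, the 0.999 limit and the name `drop_redundant` are assumptions for illustration.

```python
from math import sqrt
from typing import List, Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient (assumes neither column is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def drop_redundant(columns: List[List[float]], limit: float = 0.999) -> List[int]:
    """Return indices of columns that are not near-duplicates of an earlier kept column."""
    kept: List[int] = []
    for i, col in enumerate(columns):
        if all(abs(pearson(col, columns[j])) < limit for j in kept):
            kept.append(i)
    return kept


heights_m = [1.50, 1.62, 1.75, 1.80, 1.91]
heights_ft = [h * 3.28084 for h in heights_m]  # same measurement in feet
weights_kg = [50.0, 72.0, 64.0, 90.0, 71.0]

kept = drop_redundant([heights_m, heights_ft, weights_kg])
# kept == [0, 2]: the height in feet adds no information and is dropped
```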

      2. Support Vector Machine

        In machine learning, support vector machines (SVMs, also support vector networks) are supervised learning models with associated learning algorithms that analyze data and recognize patterns, used for classification and regression analysis. Given a set of training examples, each marked as belonging to one of two categories, an SVM training algorithm builds a model that assigns new examples to one category or the other, making it a non-probabilistic binary linear classifier. An SVM model is a representation of the examples as points in space, mapped so that the examples of the separate categories are divided by a clear gap that is as wide as possible. New examples are then mapped into that same space and predicted to belong to a category based on which side of the gap they fall on. In addition to performing linear classification, SVMs can efficiently perform non-linear classification using the kernel trick, implicitly mapping their inputs into high-dimensional feature spaces.
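The "which side of the gap" rule can be sketched for the linear case. The example below does not train anything: it assumes a weight vector `w` and bias `b` have already been learned (the values here are hypothetical) and shows only the decision rule and the width of the gap, which for a canonical linear SVM is 2 / ||w||.

```python
from math import sqrt
from typing import Sequence


def decision_value(w: Sequence[float], b: float, x: Sequence[float]) -> float:
    """Signed score w.x + b; its sign tells which side of the hyperplane x lies on."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def predict(w: Sequence[float], b: float, x: Sequence[float]) -> int:
    """Assign class +1 or -1 according to the side of the separating hyperplane."""
    return 1 if decision_value(w, b, x) >= 0 else -1


def margin_width(w: Sequence[float]) -> float:
    """Distance between the two margin hyperplanes w.x + b = +1 and w.x + b = -1."""
    return 2.0 / sqrt(sum(wi * wi for wi in w))


# Hypothetical parameters of an already trained linear SVM.
w, b = (1.0, 1.0), -3.0

label_a = predict(w, b, (3.0, 2.0))  # score  2.0, so class +1
label_b = predict(w, b, (0.5, 1.0))  # score -1.5, so class -1
gap = margin_width(w)                # 2 / sqrt(2)
```

In the non-linear case the dot product is replaced by a kernel evaluation against the support vectors, but the sign test on the resulting score stays the same.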

      3. Region Based Function

        Image segmentation is the process of partitioning a digital image into multiple segments (sets of pixels, also known as superpixels). The goal of segmentation is to simplify or change the representation of an image into something that is more meaningful and easier to analyze. Image segmentation is typically used to locate objects and boundaries (lines, curves, etc.) in images. More precisely, image segmentation is the process of assigning a label to every pixel in an image such that pixels with the same label share certain visual characteristics.
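The idea of assigning a label to every pixel so that same-label pixels share a characteristic can be sketched with a simple region-growing pass. This is a generic illustration, assuming gray-level similarity between 4-connected neighbours as the shared characteristic; the tolerance value and the function name `label_regions` are not from the paper.

```python
from collections import deque
from typing import List


def label_regions(image: List[List[int]], tol: int = 10) -> List[List[int]]:
    """Label every pixel; 4-connected neighbours whose intensities differ
    by at most `tol` are grown into the same region."""
    rows, cols = len(image), len(image[0])
    labels = [[-1] * cols for _ in range(rows)]
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            if labels[r][c] != -1:
                continue  # pixel already belongs to a region
            labels[r][c] = next_label
            queue = deque([(r, c)])
            while queue:  # breadth-first growth from the seed pixel
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and labels[ny][nx] == -1
                            and abs(image[ny][nx] - image[y][x]) <= tol):
                        labels[ny][nx] = next_label
                        queue.append((ny, nx))
            next_label += 1
    return labels


# Toy gray-level image with a dark patch, a bright band and a mid-gray strip.
image = [
    [10, 12, 200, 205],
    [11, 13, 198, 202],
    [90, 92, 199, 201],
]

labels = label_regions(image)
# labels == [[0, 0, 1, 1], [0, 0, 1, 1], [2, 2, 1, 1]]
```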

      4. Training and Testing

    The training and testing procedure is as follows.

    Fig.5 Flow diagram

    The extracted features are used for training and are compared during the testing session. If the value matches, the system gives the corresponding output.

    The system produces one of three outputs. The web camera is used to read the input image. In face detection, the face is cropped from the background of the captured frames. The affine transform is applied to preserve the corner points on the face. Preprocessing reduces the noise and distortion in the image. Facial feature detection locates the assumed features. In facial feature extraction, the assumed features are extracted and compared with the predefined recorded values, and the corresponding output is produced using the region-based function. The three outputs are (1) Authorized Image, (2) Unauthorized Image and (3) No face detected.
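The three-way output described above can be sketched as a small decision function. The paper does not say how extracted features are compared with the recorded values, so this sketch assumes a Euclidean distance and a fixed threshold; both, along with the name `classify_frame`, are hypothetical.

```python
from math import sqrt
from typing import List, Optional, Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two feature vectors of equal length."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify_frame(features: Optional[Sequence[float]],
                   enrolled: List[Sequence[float]],
                   threshold: float = 0.5) -> str:
    """Return one of the three outputs for a single frame.

    `features` is None when the face detector found nothing in the frame.
    """
    if features is None:
        return "No face detected"
    if any(euclidean(features, ref) <= threshold for ref in enrolled):
        return "Authorized Image"
    return "Unauthorized Image"


# Made-up recorded feature vectors for two enrolled people.
enrolled = [(0.2, 0.4, 0.9), (0.7, 0.1, 0.3)]

result_match = classify_frame((0.25, 0.38, 0.88), enrolled)
result_stranger = classify_frame((0.9, 0.9, 0.9), enrolled)
result_empty = classify_frame(None, enrolled)
```

Keeping the "no face" case separate from the "unauthorized" case means a missed detection is never recorded as an intruder.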

    The results show that our algorithm does a good job of locating facial features under complex backgrounds and varying poses.


    These are the experimental results. The authorized image is obtained when the input exactly matches the predefined value. The unauthorized image is obtained when the input does not match the predefined value. When the system does not recognize the face properly, it reports that no face was detected. Finally, we find that the proposed method has lower values of both the false acceptance rate (FAR) and the false rejection rate (FRR). Similarly, the proposed system has higher accuracy in comparison with different cases, such as the pose-invariant and illumination-invariant cases. This is represented in the following tables and graphs.
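For readers unfamiliar with the two error rates, the sketch below computes them from match scores using the standard definitions: FAR is the fraction of impostor attempts that are accepted, and FRR is the fraction of genuine attempts that are rejected. The scores and the threshold are made-up numbers for illustration and are not results from this paper.

```python
from typing import Sequence, Tuple


def far_frr(genuine_scores: Sequence[float],
            impostor_scores: Sequence[float],
            threshold: float) -> Tuple[float, float]:
    """Scores are similarities: an attempt is accepted when score >= threshold."""
    false_accepts = sum(1 for s in impostor_scores if s >= threshold)
    false_rejects = sum(1 for s in genuine_scores if s < threshold)
    return (false_accepts / len(impostor_scores),
            false_rejects / len(genuine_scores))


# Made-up similarity scores.
genuine = [0.91, 0.85, 0.78, 0.66, 0.95]  # attempts by enrolled users
impostor = [0.12, 0.40, 0.71, 0.33]       # attempts by other people

far, frr = far_frr(genuine, impostor, threshold=0.70)
# far == 0.25 (one of four impostors accepted)
# frr == 0.2  (one of five genuine users rejected)
```

Raising the threshold lowers FAR and raises FRR, so the two rates are normally reported together at the same operating point.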

    The real-time face recognition system goes through several processing steps. If the value matches the calculated value, the system grants authentication to open the system. If the value does not match, the image of the person is stored in a folder.

    If the face is not properly identified, the system reports that no face was detected.

    The best frame is chosen for face recognition. The graph helps us to know the rate at which faces are recognized from the various frames.

    Fig.7 No Face Detected


    The table presents our experimental results.

    Fig.8. Recognition rate


    In this paper, we present a face detection method based on the camera. It is able to segment facial images and estimate the head pose accurately. We then introduce the depth AAM algorithm, which can be used to locate facial features using depth images. The algorithm makes comprehensive use of the depth information, and its accuracy and performance are higher than those of traditional AAMs. We also show the effectiveness of our approach on real video images. In future work, we will further improve our method and develop applications in computer vision and human-computer interaction using an IR (infrared) vision sensor.


