A Comparative Study of Face Detection and Recognition Techniques

DOI : 10.17577/IJERTV11IS060142


M. Aishwarya Sri

PG Scholar,

Department of Electrical and Electronics Engineering, Rajalakshmi Engineering College, Thandalam, Chennai 602 105

Dr. V. Geetha Priya


Department of Electrical and Electronics Engineering, Rajalakshmi Engineering College, Thandalam, Chennai 602 105

Abstract: To encourage academics to pursue this field, this study gives an overview of existing algorithms and methods for facial recognition. The review is based on previous and ongoing work by other researchers on the same topic. The most general criteria for distinguishing the algorithms have been chosen. The algorithms covered are Principal Component Analysis (PCA), adaptive boosting, Linear Discriminant Analysis (LDA), Local Binary Pattern Histogram (LBPH), Convolutional Neural Network (CNN) and Artificial Neural Network (ANN). Several parameters were taken into consideration in reviewing the algorithms: database size and type, illumination tolerance, and variations in facial expression and pose. However, because this is a review paper based on prior studies, it does not put forward a new rationale of its own.


Biometrics is a field of technology that uses human body features such as fingerprints, palms, retinas (eyes), and faces to detect and recognise people. Biometric identification is an access method that not only authenticates a person but also confirms that person's identity against the access allowed. Biometrics outperforms the traditional password-based approach in terms of access dependability and security. The password method only "knows" the user once the password has been authenticated, and a password can easily be stolen or hacked. The hacker can then log into the protected system and access sensitive and important information belonging to other people.

The physiological approach to biometric identification (face, fingerprint) is more competent and stable than the behavioural approach (keystrokes, voice). It is more stable because features such as the face are difficult to modify unless the face suffers substantial damage, whereas a behavioural trait such as a voice print can readily vary with health, disease, and stress. Biometric traits are difficult to reproduce, which makes forgery extremely hard. This usefulness may be one of the reasons why facial recognition is so popular. Advances in computing capability, together with market demand for security, have pushed face detection and identification research forward.


    In 2016, a team led by Demis Hassabis at DeepMind created an artificial intelligence (AI) product named AlphaGo. In May 2017 it also defeated Ke Jie, then the best Go player. In October 2017 the DeepMind group announced its new product, AlphaGo Zero [1], the strongest version of the program. Playing a board game and recognising faces share some of the same ideas. However, handling the transformations that a face can undergo is extremely difficult, more so than determining the best move on the board. An explanation from Shang-Hung Lin's research [2] is used as an example in this paper.

    A face image detector and a face recognizer are the two functional elements that make up this architecture. The face image detector examines the image for human faces and isolates them from the background. Once a face has been located, the face recognizer determines who the person is. Both elements use a feature extractor and a pattern recognizer. The feature extractor converts the pixels of an image into a vector representation.

    The widespread adoption of this technology has sparked debate about its role, including how faulty recognition can reinforce gender norms and lead to racial profiling. Various strategies have been employed to implement such systems. A method based on manually evaluated characteristics was introduced in the early 1960s. Because this predated the modern computer era, system specifications limited how well individuals could be detected and recognised. Improvements followed every decade, culminating in current systems, which are by far the best proposed so far. This recently developed technology employs a neural network to simulate the operation of the human brain computationally.
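The recognizer half of the architecture described above (a feature extractor followed by a pattern recognizer) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function names `extract_features` and `recognise`, the labels `person_a` and `person_b`, the tiny 2×2 grey-level "faces", and the choice of a nearest-neighbour rule as the pattern recognizer. It is not the method of any system reviewed here.

```python
import math

def extract_features(face_pixels):
    """Feature extractor: flatten a 2D grid of grey levels into a vector in [0, 1]."""
    return [value / 255.0 for row in face_pixels for value in row]

def recognise(features, gallery):
    """Pattern recognizer (nearest neighbour): return the closest enrolled label."""
    best_label, best_distance = None, math.inf
    for label, enrolled in gallery.items():
        distance = math.dist(features, enrolled)  # Euclidean distance
        if distance < best_distance:
            best_label, best_distance = label, distance
    return best_label, best_distance

# Hypothetical enrolled faces (already cropped by a face detector).
gallery = {
    "person_a": extract_features([[10, 200], [30, 220]]),
    "person_b": extract_features([[240, 20], [250, 40]]),
}

# A new face that is close to the first enrolled person.
probe = extract_features([[12, 198], [35, 215]])
label, distance = recognise(probe, gallery)
print(label, round(distance, 4))
```

In a practical system the flattening step would be replaced by one of the feature extractors reviewed below (PCA, LDA, LBP histograms, or CNN embeddings).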

    The reviewed algorithms are:

    • Principal Component Analysis (PCA)

    • Adaptive boosting

    • Linear Discriminant Analysis (LDA)

    • Local Binary Pattern Histogram (LBPH)

    • Deep Learning Techniques

        1. Principal Component Analysis

          PCA simplifies the problem of finding a compact representation of the data by computing eigenvalues and their corresponding eigenvectors. This is accomplished by reducing the dimension of the representation space, which must be decreased in order to provide rapid and reliable object detection. PCA also retains most of the original information in the data. The Eigenface-based method uses the PCA basis.

          Principal component analysis was brought into face recognition by Turk and Pentland of the MIT Media Laboratory in 1991 [2]. PCA is commonly used to pre-process data before further analysis. It can remove superfluous information and noise from high-dimensional data, keep the important properties of the data, drastically lower the number of dimensions, increase data processing speed, and save considerable time and cost [3], [4]. As a result, this approach is commonly used to reduce dimensionality and visualise multi-dimensional data.

          Fig 2.1.1 Block Diagram of PCA

          The PCA process comprises matrix generation for the input image and centring of the data.
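The PCA steps named above (building a data matrix, centring it, and finding the dominant eigenvector) can be sketched in plain Python. This is a minimal illustration under stated assumptions: each "image" is already flattened to a two-value vector, the function names are hypothetical, and power iteration stands in for a full eigendecomposition, which a real Eigenface system would obtain from a linear algebra library.

```python
import math

def centre(data):
    """Subtract the per-dimension mean from every sample (data centring)."""
    n, dim = len(data), len(data[0])
    mean = [sum(row[j] for row in data) / n for j in range(dim)]
    return mean, [[row[j] - mean[j] for j in range(dim)] for row in data]

def covariance(centred):
    """Sample covariance matrix of already centred data."""
    n, dim = len(centred), len(centred[0])
    return [[sum(row[i] * row[j] for row in centred) / (n - 1)
             for j in range(dim)] for i in range(dim)]

def top_component(cov, iterations=100):
    """Power iteration: approximate the eigenvector with the largest eigenvalue."""
    dim = len(cov)
    vec = [1.0] * dim
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    eigenvalue = sum(vec[i] * cov[i][j] * vec[j]
                     for i in range(dim) for j in range(dim))
    return eigenvalue, vec

def project(centred, component):
    """Reduce each sample to one coordinate along the principal component."""
    return [sum(x * w for x, w in zip(row, component)) for row in centred]

# Toy data matrix: four flattened "images" with two pixels each.
samples = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
mean, centred = centre(samples)
eigenvalue, component = top_component(covariance(centred))
scores = project(centred, component)
print(mean, eigenvalue, component, scores)
```

Because the toy samples lie on a straight line, a single component captures all of their variance, so the two-dimensional data is represented by one score per sample without loss.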

        2. Adaptive boosting

          Adaptive boosting (AdaBoost) is a statistical classification meta-algorithm. It can be combined with a number of different learning techniques to improve performance. The outputs of the other learning algorithms (the weak learners) are combined into a weighted sum that forms the output of the boosted classifier. It is used to detect faces. Any learning algorithm can have its accuracy enhanced by a boosting technique. The fundamental concept is to use some simple criteria to combine several classifiers into a single stronger classifier, resulting in improved overall performance [5]. AdaBoost is adaptive in the sense that it tweaks succeeding weak learners in favour of instances misclassified by earlier classifiers. In certain situations it may be less prone to overfitting than other supervised learning algorithms. Individual learners can be weak, but as long as each performs marginally better than random guessing, the final model will converge to a strong learner. AdaBoost is frequently referred to as the best out-of-the-box classifier. When combined with decision-tree learning, the information gathered at every stage of the AdaBoost algorithm about the relative 'hardness' of each training sample is fed into the tree-growing process, causing succeeding trees to focus on samples that are more difficult to classify.
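The reweighting idea described above can be made concrete with a small sketch. It assumes one-dimensional inputs, labels of +1 and -1, and decision stumps (single thresholds) as the weak learners. The function names and the ten-point toy dataset are illustrative choices, not taken from any of the reviewed systems.

```python
import math

def stump_predict(x, threshold, polarity):
    """Weak learner: predict `polarity` at or above the threshold, else the opposite."""
    return polarity if x >= threshold else -polarity

def train_stump(xs, ys, weights):
    """Pick the threshold and polarity with the lowest weighted error."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            error = sum(w for w, x, y in zip(weights, xs, ys)
                        if stump_predict(x, threshold, polarity) != y)
            if best is None or error < best[0]:
                best = (error, threshold, polarity)
    return best

def adaboost(xs, ys, rounds):
    """Train `rounds` stumps, reweighting samples after each one."""
    weights = [1.0 / len(xs)] * len(xs)
    ensemble = []
    for _ in range(rounds):
        error, threshold, polarity = train_stump(xs, ys, weights)
        error = min(max(error, 1e-10), 1 - 1e-10)  # guard against log(0)
        alpha = 0.5 * math.log((1 - error) / error)  # vote of this stump
        ensemble.append((alpha, threshold, polarity))
        # Misclassified samples gain weight, correctly classified ones lose it.
        weights = [w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
                   for w, x, y in zip(weights, xs, ys)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def predict(ensemble, x):
    """Boosted classifier: sign of the weighted sum of the stump votes."""
    score = sum(alpha * stump_predict(x, t, p) for alpha, t, p in ensemble)
    return 1 if score >= 0 else -1

xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
ensemble = adaboost(xs, ys, rounds=3)
print([predict(ensemble, x) for x in xs])
```

No single threshold separates this dataset: the best stump alone misclassifies three of the ten points. After three rounds the weighted vote of the stumps classifies all ten training points correctly, which is the sense in which weak learners combine into a stronger one.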

        3. Linear Discriminant Analysis

          Linear discriminant analysis (LDA) can be used for face recognition on labelled datasets [4]. It is employed for face classification [6]. After dimensionality reduction, PCA requires the variance of the data to be as large as possible so that the data can be spread as widely as possible. LDA instead requires that, after projection, the variance within the same category of data be as small as possible and the difference across groups be as large as possible [2]. LDA is therefore a supervised dimensionality reduction method: it uses the label information to distinguish between as many data categories as possible. Fisher's Linear Discriminant (FLD) is another name for LDA [7]. The FLD approach is used to shrink the dimension space. It makes use of within-class information, restricting the variation within each class and enhancing the separation between classes. A system [8], [9] with Generalized Singular Value Decomposition was proposed. LDA is a prominent supervised dimensionality reduction method, and numerous LDA-based face recognition systems have been described in the recent decade. Face recognition systems based on LDA, on the other hand, have a scalability issue. An incremental method is a natural way to get around this issue. Handling the inverse of the within-class scatter matrix is the most difficult aspect of designing an incremental LDA (ILDA). The authors designed a novel ILDA technique called GSVD-ILDA, based on the generalised singular value decomposition LDA (LDA/GSVD), which has a net influence on result parameters such as the GSVD-95 percent. The given technique still has scalability concerns, which is a limitation.
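The role of the within-class scatter matrix and its inverse can be seen in a two-class sketch of Fisher's criterion, where the projection direction is the inverse of the within-class scatter applied to the difference of the class means. The sketch assumes two-dimensional feature vectors so that the inverse can be written out by hand. The function names and sample points are illustrative only.

```python
def mean_vector(points):
    """Mean of a list of 2D points."""
    n = len(points)
    return [sum(p[0] for p in points) / n, sum(p[1] for p in points) / n]

def scatter(points, mean):
    """2x2 scatter matrix of one class around its own mean."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        d = [p[0] - mean[0], p[1] - mean[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def fisher_direction(class_a, class_b):
    """Direction w = Sw^-1 (mean_a - mean_b) for two classes of 2D points."""
    mean_a, mean_b = mean_vector(class_a), mean_vector(class_b)
    sa, sb = scatter(class_a, mean_a), scatter(class_b, mean_b)
    sw = [[sa[i][j] + sb[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    if abs(det) < 1e-12:
        raise ValueError("within-class scatter matrix is singular")
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [mean_a[0] - mean_b[0], mean_a[1] - mean_b[1]]
    return [inv[0][0] * diff[0] + inv[0][1] * diff[1],
            inv[1][0] * diff[0] + inv[1][1] * diff[1]]

def project(point, w):
    """Scalar projection of a 2D point onto direction w."""
    return point[0] * w[0] + point[1] * w[1]

class_a = [(1.0, 2.0), (2.0, 3.0), (3.0, 3.0), (4.0, 5.0)]
class_b = [(5.0, 1.0), (6.0, 2.0), (7.0, 2.0), (8.0, 4.0)]
w = fisher_direction(class_a, class_b)
proj_a = [project(p, w) for p in class_a]
proj_b = [project(p, w) for p in class_b]
print(w, proj_a, proj_b)
```

The explicit singularity check hints at why the inverse is the awkward part in practice: with face images the number of pixels usually far exceeds the number of training samples, so the within-class scatter matrix is often not invertible without further treatment.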

        4. Local Binary Pattern

          The Local Binary Pattern (LBP) is a basic yet effective texture operator that labels the pixels of an image by thresholding the neighbourhood of each pixel and treating the result as a binary number. The thresholded neighbourhoods are summarised as a histogram, which is used as the extracted feature. For each pixel, the values of the surrounding matrix are compared with the centre value and converted into binary form as the output.

          Fig. 2.4 Process of LBP in a 3×3 matrix
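The thresholding of a 3×3 neighbourhood can be sketched as follows. The clockwise bit ordering starting at the top-left neighbour, the `>=` comparison, the function names, and the sample pixel values are assumptions made for illustration. Implementations differ in the ordering they use.

```python
from collections import Counter

def lbp_code(patch):
    """Return the LBP code of the centre pixel of a 3x3 patch."""
    centre = patch[1][1]
    # Neighbours visited clockwise from the top-left pixel (assumed ordering).
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for row, col in offsets:
        bit = 1 if patch[row][col] >= centre else 0
        code = (code << 1) | bit  # append the bit to the binary number
    return code

def lbp_histogram(image):
    """Histogram of LBP codes over all interior pixels of a grey-level image."""
    hist = Counter()
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            patch = [row[c - 1:c + 2] for row in image[r - 1:r + 2]]
            hist[lbp_code(patch)] += 1
    return hist

patch = [
    [6, 5, 2],
    [7, 6, 1],
    [9, 8, 7],
]
# Thresholding against the centre value 6 gives the bits 1,0,0,0,1,1,1,1.
print(lbp_code(patch))  # binary 10001111 = 143

flat = [[5] * 4 for _ in range(4)]
print(lbp_histogram(flat))
```

In an LBPH recogniser the image is typically divided into cells, one histogram is computed per cell, and the concatenated histograms form the feature vector that is compared between faces.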

          The study [10] reports that a low-cost face recognition system can provide satisfactory output for an illumination factor of 115 to 130 and a horizontal face rotation of -5 to +5 degrees. The problem is that the system comes with limited internal capability due to its low cost, and the camera included in the device is not suited to extended ranges. The work [11] reports the accuracy of its system: the false positive rate is about 28% and the identification success rate is 77%. Even though it identifies unknown faces at an improved rate of 60%, algorithms of this type cannot compete with deep learning or other machine learning algorithms.

        5. Deep Learning Techniques

          Face recognition technology is currently quite mature when the lighting is controlled and there is little intra-class variation. Face recognition performance under less-than-ideal conditions, on the other hand, still has to be improved. Face recognition should address the PIE problem [12], a non-ideal condition that includes changeable pose, illumination, and expression.

          Deep learning techniques have been used to recognise facial expressions that convey human emotions. The facial expression recognition technology works as follows. First, Haar-like features are used to detect the face in the input image. Second, a deep network is designed to recognise facial expressions from the detected faces. Deep neural networks and convolutional neural networks are two types of deep network that can be employed in this stage. The authors examined the performance of both types and found that the convolutional neural network outperformed the deep neural network.

          A work [13] published in 2017 used a deep learning-based object detection algorithm to monitor operator safety in a harbour scenario. By comparing the position of the identified personnel with the location of the calibration area, the authors were able to determine operator violations. They also use real-time tracking and data retention to limit the number of unlawful operators on the road. Experiments on the Harbours Dataset demonstrate that their method is competitive in detecting the safety of operating individuals and calibrating dangerous operation regions.

          Recently, there have been some fresh research findings. A 2017 paper proposed a multi-task CNN based on multi-task learning for face recognition: a pose-directed multi-task CNN that groups distinct poses and learns pose-specific identity information simultaneously across all poses [14]. To overcome the PIE problem in face recognition, Mahantesh et al. presented a transform domain solution [15]. Another work presented the collaborative representation discriminant projections (CRDP) supervised feature extraction technique [16].

          1. Generative Adversarial Networks

            Generative modelling is an unsupervised learning task that entails automatically finding and learning regularities or patterns in incoming data, so that the model can be used to synthesise new samples that could plausibly have been drawn from the original dataset.

            GANs are a clever way of training a generative model by framing the problem as a supervised learning problem with two sub-models: the generator model, which is trained to create new examples, and the discriminator model, which attempts to classify examples as real or fake (generated). Both models are trained together in an adversarial setting until the discriminator is fooled around half of the time, which suggests that the generator is producing credible examples.

            Fig. Working of the GAN model (flow chart)
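The adversarial training described above is usually written as a two-player minimax objective. The notation below is an assumption introduced here and does not appear in the reviewed papers: \(G\) is the generator, \(D\) the discriminator, \(p_{\text{data}}\) the distribution of real images, and \(p_z\) the noise distribution fed to the generator.

```latex
\min_{G}\,\max_{D}\; V(D, G)
  = \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_{z}(z)}\big[\log\big(1 - D(G(z))\big)\big]
```

The discriminator tries to maximise \(V\) by assigning high scores to real samples and low scores to generated ones, while the generator tries to minimise it. When the generated distribution matches the data distribution, the best the discriminator can do is output \(D(x) = \tfrac{1}{2}\) for every input, which corresponds to being fooled about half of the time.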

            The paper [17] employs a conditional generative adversarial network approach that uses a generator and a discriminator. The generator's job is to recover the occluded region from the supplied image. The processed image may or may not resemble the original. When the images are reconstructed, some losses, such as feature loss and network losses, may be incurred. The other portion of the network compares the synthesised face with the input image. The accuracy of the resulting system is inversely related to the network's overall loss. In some circumstances the regenerated face differs from the ground truth, which results in a false identification of the individual. The reported accuracy rate is 91 percent. The two-player min-max game is a simple yet effective approach to estimating the target distribution and creating new image samples.

            The study [18] from 2018 works with a system that uses the DCGAN method, although the image quality is not up to par. GANs are generative models that use a generator and a discriminator to estimate the distribution of images. DCGANs are a variant that can help with training instability. In-painting approaches for facial images now use generative models such as GANs or autoencoders (AEs), with a block of pixels picked manually as the missing part, and the potential content of this block is constructed semantically. Using a distinct prediction error, a mask is inferred from an occluded facial image and is then used by pre-trained DCGANs to digitally in-paint the occlusions. Semi-occlusion is a new challenge, and the input image is limited to 64×64 pixels.

            The research [19] from 2021 discusses a system that uses an OA-GAN (Occlusion-Aware Generative Adversarial Network) to predict the occluded region while still recovering a non-occluded face image. OA-GAN consists of a generator and a discriminator. The generator is made up of an occlusion-aware module and a face completion module. The occlusion-aware module predicts an occlusion mask from a face image with occlusion. The mask is then fed, along with the occluded face image, to the face completion module for face de-occlusion. The discriminator includes an adversarial loss for distinguishing between real face images without occlusions and de-occluded face images, as well as an attribute-preserving loss to ensure that the de-occluded face images keep the same attributes as the input face images. The findings reveal that, while previous approaches may not operate in the absence of manual occlusion masks, the suggested OA-GAN can still achieve a very good de-occlusion outcome.

            The proposed system [16] is a deep generative adversarial network (FCSR-GAN) that uses multi-task learning to conduct joint face completion and face super-resolution. The FCSR-GAN generator tries to recover a high-resolution face image without occlusion from a low-resolution face image with occlusion as input. To ensure the high quality of the recovered images, the FCSR-GAN discriminator employs a collection of carefully constructed losses (an adversarial loss, a perceptual loss, a pixel loss, a smooth loss, a style loss, and a face prior loss). A two-stage training technique can be used to train the entire FCSR-GAN network end-to-end.

            The suggested system uses the FCSR-GAN algorithm, which has a net influence on parameters such as PSNR and MSSIM. The disadvantage of the strategy is that balancing de-occlusion against preserving subject identity is challenging. The work [20], released in 2020, uses a novel GAN technique whose model offers excellent face recognition results only for a limited type of occlusion. Because a GAN is also an unsupervised learning model, it is commonly employed in both unsupervised and semi-supervised learning [21], [22].

          2. Convolutional Neural Network

      The paper [23], published in 2018, improves the Convolutional Neural Network (CNN) by retraining it on the Pascal VOC 2012 dataset using the fast VGG network (VGG-t). An experiment on the FIFA dataset demonstrates that the suggested technique can effectively isolate the saliency region and discover the same item (a human face) as the query. Experiments on the David l and Face l sequences show that the approach can efficiently deal with a variety of problematic aspects, such as appearance and size fluctuations, shape distortion, and partial occlusion.

      The study [24], published in 2018, gives an overview of object detection methods such as the Haar Cascade and the Convolutional Neural Network (CNN). Convolutional neural networks are a deep learning strategy that can be used to identify various items, whereas the Haar Cascade classifier is a simple face detection technique that can also be trained to detect other things. The custom dataset consists of 2300 photos divided into three classes. The comparison is conducted to see whether a CNN is a suitable technique for this system in terms of performance under real-time conditions.

      The research [25], published in 2018, introduces a novel deep learning-based face recognition attendance system that takes advantage of recent breakthroughs in deep convolutional neural networks (CNNs) for face detection and recognition applications. The entire process of developing a facial recognition model is covered in detail. The CNN cascade for face recognition and the CNN for generating face embeddings are the two key components of this model, both of which were built using today's most sophisticated techniques. The main goal of this research was to see whether these cutting-edge deep learning methods could be applied to real-world facial recognition problems. Because CNNs achieve their best results with large datasets, which are rarely available in commercial applications, the main issue was applying these approaches to smaller datasets. A new method of image augmentation for face recognition applications is proposed.

      A 2020 study notes that the performance of CNNs is heavily influenced by their architectures. The designs of the most cutting-edge CNNs are frequently hand-crafted by experts in both CNNs and the topics under investigation. The study [26], [27] addresses the obstacles that individuals with hardly any familiarity with CNN design face in constructing optimal CNN structures for their own image classification problems. To solve image classification problems efficiently, the research developed an automatic CNN architecture design technique based on evolutionary algorithms. The strongest characteristic of the suggested algorithm is its "automatic" capability, which means that users do not need domain expertise in CNNs and can still obtain a promising classifier model for the given images. For better detection and recognition, various works include autoencoders in the system [28], [29]. The system works with the Viola-Jones algorithm to obtain the best SIFT and SURF values [29].


Other researchers might use the material in this publication as a fundamental reference and guideline for developing models that require face recognition. This document can also be used as a quick read to gain a basic understanding of facial recognition techniques and approaches. Many available review studies have covered only two different algorithms, whereas this study has expanded its review to five different algorithms, providing readers with more information on the existing methods.


[1] D. Silver, J. Schrittwieser, K. Simonyan, I. Antonoglou, A. Huang, A. Guez, T. Hubert, L. Baker, M. Lai, A. Bolton, Y. Chen, T. Lillicrap, F. Hui, L. Sifre, G. van den Driessche, T. Graepel, and D. Hassabis, “Mastering the game of Go without human knowledge,” Nature, vol. 550, no. 7676, p. 354, 2017.

[2] S.-H. Lin, “An introduction to face recognition technology,” Informing Science, Special Issue on Multimedia Informing Technologies, Part 2, vol. 3, no. 1, 2000.

[3] R. Gottumukkal and V. K. Asari, “An improved face recognition technique based on modular PCA approach,” Pattern Recognit. Lett., vol. 25, no. 4, pp. 429–436, Mar. 2004.

[4] D. C. Hoyle and M. Rattray, “PCA learning for sparse high-dimensional data,” Europhys. Lett. (EPL), vol. 62, no. 1, pp. 117–123, Apr. 2003.

[5] K. Vijay and K. Selvakumar, “Brain FMRI clustering using interaction K means algorithm with PCA,” in Proc. Int. Conf. Commun. Signal Process. (ICCSP), Apr. 2015, pp. 0909–0913.

[6] Y. Freund, R. Iyer, E. R. Schapire, Y. Singer, and G. T. Dietterich, “An efficient boosting algorithm for combining preferences,” J. Mach. Learn. Res., vol. 4, no. 6, pp. 170–178, 2004.

[7] J. Li, B. Zhao, H. Zhang, and J. Jiao, “Face recognition system using SVM classifier and feature extraction by PCA and LDA combination,” in Proc. Int. Conf. Comput. Intell. Softw. Eng., Dec. 2009, pp. 1–4.

[8] S. Chintalapati and M. V. Raghunadh, “Automated attendance management system based on face recognition algorithms,” in Proc. IEEE Int. Conf. Comput. Intell. Comput. Res., Dec. 2013, pp. 1–5.

[9] H. Zhao and P. C. Yuen, “Incremental linear discriminant analysis for face recognition,” IEEE Trans. Syst., Man, Cybern., vol. 38, 2008.

[10] J. Lu, K. N. Plataniotis, and A. N. Venetsanopoulos, “Face recognition using LDA-based algorithms,” IEEE Trans. Neural Netw., vol. 14, no. 1, pp. 195–200, Jan. 2003.

[11] R. Sandoval, V. Camino, R. F. Moyano, D. Riofrio, N. Perez, and D. Benitez, “On the use of a low-cost embedded system for face detection and recognition,” in Proc. IEEE Colombian Conf. Applications of Computational Intelligence (ColCACI), 2020.

[12] B. T. Chinimill, T. Anjali, A. Kotturi, V. R. Kaipu, and J. V. Mandapati, “Face recognition based attendance system using Haar cascade and local binary pattern histogram algorithm,” in Proc. Fourth Int. Conf. Trends in Electronics and Informatics, 2020, IEEE Xplore, ISBN: 978-1-7281-5518-0.

[13] J. M. Voas, “PIE: A dynamic failure-based technique,” IEEE Trans. Softw. Eng., vol. 18, no. 8, pp. 717–727, Aug. 1992.

[14] G. Cheng, “Abnormal behaviour detection for harbour safety under complex video surveillance scenes,” in Proc. Int. Conf. Security, Pattern Analysis and Cybernetics, 2017.

[15] X. Yin and X. Liu, “Multi-task convolutional neural network for pose invariant face recognition,” IEEE Trans. Image Process., vol. 27, no. 2, pp. 964–975, Feb. 2018.

[16] K. Mahantesh and H. J. Jambukesh, “A transform domain approach to solve PIE problem in face recognition,” in Proc. Int. Conf. Recent Adv. Electron. Commun. Technol. (ICRAECT), Mar. 2017, pp. 270–274.

[17] D. Zhang and S. Zhu, “Face recognition based on collaborative representation discriminant projections,” in Proc. Int. Conf. Intell. Transp., Big Data Smart City (ICITBS), Jan. 2019, pp. 264–266.

[18] X. Chen, L. Qing, X. He, J. Su, and Y. Peng, “From eyes to face synthesis: A new approach for human-centered smart surveillance,” IEEE Access, vol. 6, 2018.

[19] L. Xu, H. Zhang, J. Raitoharju, and M. Gabbouj, “Unsupervised facial image de-occlusion with optimized deep generative models,” ISBN 9781538664285.

[20] J. Cai, H. Han, J. Cui, J. Chen, L. Liu, and S. K. Zhou, “Semi-supervised natural face de-occlusion,” IEEE Trans. Inf. Forensics Security, vol. 16, 2021.

[21] J. Cai, H. Han, S. Shan, and X. Chen, “FCSR-GAN: Joint face completion and super-resolution via multi-task learning,” IEEE Trans. Biometrics, Behavior, and Identity Science, vol. 2, no. 2, 2019.

[22] J. T. Springenberg, “Unsupervised and semi-supervised learning with categorical generative adversarial networks,” 2015, arXiv:1511.06390.

[23] T. Salimans, I. Goodfellow, W. Zaremba, V. Cheung, A. Radford, and X. Chen, “Improved techniques for training GANs,” in Proc. Adv. Neural Inf. Process. Syst., 2016, pp. 2234–2242.

[24] N. U. Din, K. Javed, S. Bae, and J. Yi, “Effective removal of user-selected foreground object from facial images using a novel GAN-based network,” IEEE Access, vol. 8, 2020.

[25] D. Zhang, “Top-down saliency object localization based on deep learned features,” in Proc. Int. Conf. Image and Signal Processing, Bio-Medical Engineering and Informatics, West University of Timisoara (UVT), Timisoara, Romania, 2018.

[26] S. Shahl and J. Bandariya, “CNN based auto-assistance system as boon for directing visually impaired person,” in Proc. 20th Int. Symp. Symbolic and Numeric Algorithms for Scientific Computing, 2019.

[27] Y. Sun, B. Xue, M. Zhang, G. G. Yen, and J. Lv, “Automatically designing CNN architectures using the genetic algorithm for image classification,” IEEE Trans. Cybern., 2020.

[28] T. M. Hoang, G. P. Nam, J. Cho, and I. J. Kim, “Deface: Deep efficient face network for small scale variations,” IEEE Access, vol. 8, 2020.

[29] S. Gao, Y. Zhang, K. Jia, J. Lu, and Y. Zhang, “Single sample face recognition via learning deep supervised autoencoders,” IEEE Trans. Inf. Forensics Security, vol. 10, 2015. DOI: 10.1109/TIFS.2015.2446438

[30] M. Sajjad, M. Nasir, K. Muhammad, S. Khan, Z. Jan, A. K. Sangaiah, M. Elhoseny, and S. W. Baik, “Raspberry Pi assisted face recognition framework for enhanced law-enforcement services in smart cities,” Future Generation Computer Systems, vol. 108, 2020.
