Face Image Retrieval using Attribute Codewords

DOI : 10.17577/IJERTV3IS20224


P. Petchimuthu #1, K. Benifa #2

1 Professor,

2 PG Scholar,

# Department of Computer Science and Engineering, Anna University-Chennai, SCAD College of Engineering and Technology,

Cheranmahadevi, Tamil Nadu, India.

Abstract – Many applications, including face verification, rely on face image retrieval. It is a challenging task because all faces look similar owing to the common geometric configuration of the face structure. This paper retrieves similar faces using a content-based method. Earlier content-based retrieval used low-level features such as appearance and pose, but low-level features fail to give a correct semantic description of images. For example, faces of different people may be retrieved as similar when only low-level features are used. This problem is addressed by combining low-level features with high-level human attributes such as race, gender, and hair. Two methods, attribute-enhanced sparse coding and attribute-embedded inverted indexing, are proposed to retrieve images more effectively. These techniques can achieve an improvement of more than 43.5% compared with existing techniques.

Index Terms – Face image, human attributes, content-based image retrieval, attribute-enhanced sparse coding, attribute-embedded inverted indexing, high-level features.


    Social networks such as Facebook and Twitter are widely used in daily life. Many users choose human face images for their profiles, and some use celebrities' faces. Nowadays human face images are widely used in tasks such as searching and mining. Content-based face image retrieval is an emerging technology in many real-world applications. Because different people can have similar faces, problems arise when we try to retrieve similar faces. To address this, techniques such as retrieval-based face annotation use a common outline for images of the same category. For example, a kid's cap can be set as a constraint to retrieve children, and long hair to retrieve women.

    Two main challenges must be faced to improve on the existing system. First, we have to efficiently shortlist the similar face images. Second, we have to effectively exploit the shortlist of face images and their weakly labeled information, which differs from the original face. Our main goal is to retrieve similar images from a large-scale database using a content-based method. Existing content-based methods use low-level features for retrieval and cannot detect human faces automatically. The proposed method addresses these issues by combining low-level features and high-level attributes. Low-level features capture only appearance and pose, from which we cannot tell exactly whether two faces belong to the same person or to different people. For this purpose we use high-level attributes, which can distinguish a unique face from all common faces. High-level attributes include gender, race, hair, and so on. The attributes should be selected carefully, since they determine how clearly a face can be separated from all the other faces in a large-scale database. By combining low-level features and high-level attributes, we can obtain promising results in retrieving similar faces from a large-scale database.

    It is clear from the image that when only low-level features are used, the result is unsatisfactory, but when high-level features are combined with the low-level ones, the result is satisfactory. High-level attributes such as gender and hair indicate whether the face belongs to a man or a woman. High-level attributes give semantic meaning, whereas low-level appearance features fail to give a semantic description of the face.

    A face image is given as input to retrieve similar faces from a large-scale database using a content-based image retrieval (CBIR) system. CBIR is also known as QBIR, that is, query-based image retrieval. Before an image is stored in the database, it is given an index number. Using the index number, features are extracted from the image, and the extracted features are stored in a feature database. With this technique, similar faces can be retrieved. It is an important technology in many emerging applications such as crime investigation and face annotation.
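The index-number and feature-database idea described above can be sketched as follows. This is a minimal illustration only: the class name FeatureDatabase, the three-element feature vectors, and the Euclidean ranking are assumptions made for the example, not details given in the paper.

```python
import math


class FeatureDatabase:
    """Toy feature store mapping index number -> feature vector (hypothetical helper)."""

    def __init__(self):
        self._features = {}
        self._next_index = 0

    def add(self, feature_vector):
        """Assign the next index number to an image's features and store them."""
        index = self._next_index
        self._features[index] = list(feature_vector)
        self._next_index += 1
        return index

    def query(self, feature_vector, top_k=3):
        """Return the index numbers of the top_k stored vectors closest to the query."""
        def distance(stored):
            return math.sqrt(sum((a - b) ** 2 for a, b in zip(stored, feature_vector)))

        ranked = sorted(self._features, key=lambda i: distance(self._features[i]))
        return ranked[:top_k]


db = FeatureDatabase()
db.add([0.9, 0.1, 0.3])     # stored under index 0
db.add([0.2, 0.8, 0.5])     # stored under index 1
db.add([0.85, 0.15, 0.35])  # stored under index 2
nearest = db.query([0.88, 0.12, 0.3], top_k=2)
```

A linear scan like this does not scale to a large database, which is the motivation for the indexing methods discussed in the rest of the paper.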


    Whenever an image without background is given as input, the system automatically detects the facial position. The face region is then divided into multiple grid points. From these grid points we obtain the landmarks of the facial features, from those features we obtain the local patches, and from the patches we generate the sparse codewords using attribute-enhanced sparse coding.
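As a rough sketch of the grid-and-patch step, the code below places grid points inside a face-component box and cuts a square patch around one of them. The default 7 × 5 layout follows the grid size stated later in the paper; the function names, the box coordinates, the patch size, and the synthetic image are hypothetical choices for illustration.

```python
def grid_points(top, left, height, width, rows=7, cols=5):
    """Evenly spaced (y, x) grid points inside a box (centres of rows x cols cells)."""
    points = []
    for r in range(rows):
        for c in range(cols):
            y = top + (r + 0.5) * height / rows
            x = left + (c + 0.5) * width / cols
            points.append((int(y), int(x)))
    return points


def extract_patch(image, center, half_size):
    """Square patch around center, clipped at the image borders."""
    cy, cx = center
    rows = range(max(0, cy - half_size), min(len(image), cy + half_size + 1))
    return [image[y][max(0, cx - half_size): cx + half_size + 1] for y in rows]


# Synthetic 70 x 50 grayscale image (list of rows) standing in for a face component.
image = [[(y * x) % 256 for x in range(50)] for y in range(70)]
points = grid_points(top=0, left=0, height=70, width=50)
patch = extract_patch(image, points[0], half_size=2)
```

Each patch produced this way would then be passed to the feature descriptor.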


    Retrieving similar faces from a large-scale database is a tedious process. Content-based image retrieval techniques are used to retrieve similar images, but existing content-based methods use only low-level features. These methods fail to provide a correct semantic description of the face. In addition, face images usually have high intra-class variance (e.g., pose and expression), so the result is unsatisfactory. To overcome this problem, we can combine low-level features and high-level attributes to obtain the semantic description of the image. In this way we can retrieve closely matching faces from a large-scale database. Another major drawback of existing content-based image retrieval is that it cannot detect human faces automatically.

    Human attributes have only a limited number of dimensions. When there are too many people in the dataset, the attributes lose discriminability, because some people have similar face appearance. Human attributes are represented as vectors of floating-point numbers. When the data size is huge, this representation suffers from slow response time and scalability issues, so it does not work well when developing a large-scale indexing method.

    Attribute-embedded inverted indexing compares the sparse codes of the database images with those of the input image and retrieves a ranked list of similar faces. The related images are displayed separately in a window. Similar images are shown with a dark outline, and images without a dark outline are unrelated. This work describes how sparse coding works. It also describes image retrieval from a video source. A video is a collection of multiple frames, so before the operation is performed the video is divided into frames. A point tracker then assigns a pointer to every frame and passes the result to the model detector. The model detector finds the high-level description of the frames, which is then stored in the database. The images are then retrieved from the large database.

    This is how, in the proposed method, images are converted into local patches, the patches are converted into sparse codes, and indexing is performed.

    Traditional CBIR uses two kinds of indexing, known as inverted indexing and hash-table indexing. Both indexing methods suffer from low recall because of the semantic gap. Recently, much research has been carried out to bridge the semantic gap and improve the performance of CBIR. With automatically detected human attributes, many real-time applications have achieved promising results.


    This paper automatically detects human attributes using two methods, attribute-enhanced sparse coding and attribute-embedded inverted indexing. Attribute-enhanced sparse coding creates the sparse codes for an image by combining low-level features and high-level attributes. Only by including high-level attributes can we obtain the semantic description of the image. Each detected face is divided into multiple grids, an image patch is extracted from each grid, and the local binary pattern (LBP) feature descriptor is used to extract local features from these patches. The local patches are then used to generate the sparse codes, and together these sparse codes represent the original image. All of these steps are performed by the attribute-enhanced sparse coding algorithm. Attribute-embedded inverted indexing takes the sparse codewords of the input image and of the database images and retrieves the similar faces from the database. Sparse coding is performed in the offline stage, and inverted indexing is performed in the online stage. By combining these two algorithms with low-level features (appearance) and high-level attributes (gender), we can achieve promising results in retrieving similar faces from a large-scale database. This makes it an efficient procedure for retrieving related faces from a large-scale database.
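To make the LBP descriptor concrete, here is a minimal sketch of a basic 3 × 3 LBP with a uniform-pattern histogram. The paper reports 59 dimensions per grid, which matches the standard 8-neighbour uniform LBP (58 uniform patterns plus one shared bin), but the exact variant the authors used is not stated, so the neighbour ordering, the >= comparison, and the function names here are assumptions.

```python
def lbp_code(patch, y, x):
    """8-bit LBP code of pixel (y, x): each neighbour >= centre sets one bit."""
    center = patch[y][x]
    # Neighbours visited clockwise, starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dy, dx) in enumerate(offsets):
        if patch[y + dy][x + dx] >= center:
            code |= 1 << bit
    return code


def is_uniform(code):
    """Uniform pattern: at most two 0/1 transitions in the circular 8-bit string."""
    bits = [(code >> i) & 1 for i in range(8)]
    transitions = sum(bits[i] != bits[(i + 1) % 8] for i in range(8))
    return transitions <= 2


# Each uniform pattern gets its own bin; all other patterns share one extra bin.
UNIFORM_BINS = {code: i for i, code in enumerate(c for c in range(256) if is_uniform(c))}


def lbp_histogram(patch):
    """Histogram of LBP codes over the interior pixels of a patch."""
    hist = [0] * (len(UNIFORM_BINS) + 1)
    for y in range(1, len(patch) - 1):
        for x in range(1, len(patch[0]) - 1):
            code = lbp_code(patch, y, x)
            hist[UNIFORM_BINS.get(code, len(UNIFORM_BINS))] += 1
    return hist


flat_patch = [[10] * 5 for _ in range(5)]  # constant patch: every code is 255
hist = lbp_histogram(flat_patch)
```

The histogram length, len(hist), is 59, which is the per-grid dimensionality quoted in the results.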

    Attribute embedded inverted indexing: It collects the sparse codewords from attribute-enhanced sparse coding, checks them against the online feature database, and retrieves the images that are similar to the query image.
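The codeword-matching part of inverted indexing can be sketched as below. This is only a plain inverted index under simplifying assumptions: each image is reduced to a set of integer codeword ids, ranking is by the number of shared codewords, and the way attributes are embedded into the index is not specified in the text and is left out. All names and data are hypothetical.

```python
from collections import defaultdict


def build_inverted_index(database_codewords):
    """Map each codeword id to the set of image index numbers that contain it."""
    index = defaultdict(set)
    for image_id, codewords in database_codewords.items():
        for word in codewords:
            index[word].add(image_id)
    return index


def retrieve(index, query_codewords):
    """Rank database images by how many codewords they share with the query."""
    scores = defaultdict(int)
    for word in set(query_codewords):
        for image_id in index.get(word, ()):
            scores[image_id] += 1
    # Highest score first; ties broken by the smaller image index number.
    return sorted(scores, key=lambda i: (-scores[i], i))


database = {0: {3, 17, 42}, 1: {5, 17, 99}, 2: {3, 42, 77}, 3: {8, 9}}
index = build_inverted_index(database)
ranking = retrieve(index, {3, 42, 99})
```

Only images that share at least one codeword with the query are ever touched, which is what keeps the online stage fast on a large database.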

    The architectural design explains the processing of sparse coding and inverted indexing. An image is given as input. Before entering the main algorithm, the query image goes through preprocessing. Preprocessing removes the background and identifies the face region. In MATLAB, imfilter and imadjust are used to filter the face and remove the background region. Noise in the query image is also removed here. Once the face region is extracted, it can be divided into multiple grids.
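For readers without MATLAB, the smoothing and intensity-adjustment part of preprocessing can be approximated as follows. This is a sketch under stated assumptions: a 3 × 3 mean filter and a min-max contrast stretch are used as rough stand-ins for imfilter and imadjust, not ports of them, and background removal and face detection are not covered.

```python
def mean_filter3(image):
    """3x3 box filter; border pixels average over the neighbours that exist."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [image[j][i]
                      for j in range(max(0, y - 1), min(h, y + 2))
                      for i in range(max(0, x - 1), min(w, x + 2))]
            out[y][x] = sum(window) / len(window)
    return out


def stretch_contrast(image, low=0.0, high=255.0):
    """Linearly rescale so the darkest pixel maps to low and the brightest to high."""
    flat = [v for row in image for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        return [[low for _ in row] for row in image]
    scale = (high - low) / (hi - lo)
    return [[low + (v - lo) * scale for v in row] for row in image]


noisy = [[100, 100, 100],
         [100, 190, 100],
         [100, 100, 100]]  # one bright outlier pixel in a flat region
smoothed = mean_filter3(noisy)
adjusted = stretch_contrast(smoothed)
```

The mean filter pulls the outlier at the centre down towards its neighbours, and the stretch then spreads the remaining intensity range over 0 to 255.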

    From the grid points, local patches are extracted, and LBP features are obtained from the patches. Every LBP descriptor is then quantized into sparse codewords. In face recognition, we usually crop only the face region during preprocessing and normalize pose, lighting, and so on. These steps discard rich semantic cues such as hair, color, and skin, so with preprocessing alone we cannot get a correct semantic description of the image. For example, hair is one of the major attributes in deciding whether the image shows a man or a woman; without it, the method fails to identify the correct semantic meaning of the face. After the preprocessing step, the information needed to find the attributes of the image is lost. Moreover, when faces are cropped, the cropped version cannot be compared with the uncropped one. Only by using the surrounding context of the face can we get the exact semantic meaning of the image. This issue is solved by post-processing, which takes the face as the center point and detects its surrounding area, including hair. Post-processing provides extra information about the face. Both preprocessing and post-processing are performed to provide the semantic cues of the image: preprocessing extracts the inner face patterns, that is, the sparse codes, and post-processing extracts outer face attributes such as hair and its color and size.

    Attribute Enhanced Sparse Coding:

    It automatically detects human attributes from the image and also creates the different sparse codes.

    For every image in the database, a face detector is used to locate the face region. Up to 73 attributes can be used, for example hair, color, race, and gender. An active shape model is used to mark the facial landmarks, and the face is aligned using those landmarks. For each face component, 7 × 5 grid points are taken, and each grid is a square patch. The face components include the eyes, nose, and mouth corners. The LBP feature descriptor is used to extract features from those grids. After the features are extracted, they are quantized into codewords; this step is known as sparse coding. All of these codewords are summed to generate a single pattern for the image. These steps are carried out by attribute-enhanced sparse coding. Before an image is stored in the database, it is given an index number by which it can be identified. All of these processes are performed in the offline stage. Attribute-embedded inverted indexing is performed in the online stage: it compares the sparse codewords of the query image with those of the database images and finally returns all the similar faces from the database.
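The quantize-and-sum step can be illustrated with a deliberately simplified sketch. Here each descriptor is assigned to its single nearest codeword (hard vector quantization), which is a stand-in for the sparse coding used in the paper, and the tiny two-dimensional dictionary and descriptors are made-up values.

```python
def nearest_codeword(descriptor, dictionary):
    """Index of the dictionary entry closest to the descriptor (squared L2 distance)."""
    def sq_dist(word):
        return sum((a - b) ** 2 for a, b in zip(descriptor, word))

    return min(range(len(dictionary)), key=lambda k: sq_dist(dictionary[k]))


def image_pattern(descriptors, dictionary):
    """Sum the per-patch codeword assignments into one count vector for the image."""
    pattern = [0] * len(dictionary)
    for descriptor in descriptors:
        pattern[nearest_codeword(descriptor, dictionary)] += 1
    return pattern


# Hypothetical dictionary of three codewords and four patch descriptors.
dictionary = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
descriptors = [[0.9, 0.1], [0.1, 0.8], [0.95, 0.0], [0.1, 0.1]]
pattern = image_pattern(descriptors, dictionary)
```

True sparse coding would instead represent each descriptor as a weighted combination of a few codewords, which is what reduces the quantization error mentioned in the conclusion.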

    This is an emerging technology used in real-time applications.


    This work highlights the improvement achieved by the proposed system. By using rich high-level attributes, we can get an exact semantic description of the image, whereas past content-based methods used only low-level features.

    The proposed technique shows a 43.5% improvement in mean average precision (MAP). The LBP setting is 175 grids × 59 dimensions, which performs better than the original setting of 49 grids × 59 dimensions and is robust to pose variations. The memory needed to store the index numbers is lower than in earlier algorithms. Memory is also needed to store the sparse codewords, but the total is a reasonable amount for a general computer server. Online retrieval is efficient: similar faces are retrieved from an index of one million faces in 0.2 seconds. The proposed method therefore appears to be a high-performance technique for retrieving images from a large-scale database. Combining high-level attributes with low-level features makes the content-based method more efficient.
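Since the results are reported in terms of MAP, a short sketch of how mean average precision is usually computed may help. The definition below is the standard one; the query rankings and relevance sets are invented for illustration and are not the paper's data.

```python
def average_precision(ranked_ids, relevant_ids):
    """Average of precision@k taken at each rank k where a relevant image appears."""
    relevant = set(relevant_ids)
    hits = 0
    precision_sum = 0.0
    for rank, image_id in enumerate(ranked_ids, start=1):
        if image_id in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant) if relevant else 0.0


def mean_average_precision(queries):
    """Mean of average precision over (ranked_ids, relevant_ids) pairs."""
    return sum(average_precision(r, rel) for r, rel in queries) / len(queries)


# Two hypothetical queries: (retrieved ranking, set of truly matching images).
queries = [([0, 2, 1], {0, 1}), ([5, 4], {4})]
map_score = mean_average_precision(queries)
```

The text does not say whether the 43.5% figure is a relative or an absolute change in MAP, so that detail is left open here.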


    Two orthogonal methods worked well in retrieving similar faces from a large-scale database. Automatic detection of attributes is effective in terms of MAP. By using sparse codewords, quantization errors are reduced and salient images are retrieved from the database. Existing systems use appearance features, whereas we propose high-level human attributes, which are effective for image retrieval from a large database. Attribute-enhanced sparse coding retrieves a smaller number of images that exactly match the query image. This is an efficient method compared with existing methods.



P. Petchimuthu is currently working as a professor in the computer science department and has more than 24 years of engineering college teaching experience at both the undergraduate and postgraduate levels, in addition to 4 years of industrial experience.

K. Benifa is currently pursuing an ME in Computer Science at SCAD College of Engineering and Technology, Cheranmahadevi, Tirunelveli district. She completed her undergraduate B.Tech in Information Technology at Bharath University, Chennai, in 2012.
