Facial Expression Recognition based on Deep Learning for Annotate the Music

DOI : 10.17577/IJERTCONV8IS12019


S. Saravanan

Assistant Professor,

Department of Information Technology, Kongunadu College of Engineering and Technology,

Namakkal-Trichy Main Road, Thottiam Taluk, Dist Trichy, Tholurpatti, Tamil Nadu, India

  1. Muthuselvan, B. Akash, A. Veeramalai, K. Ajay


    Department of Information Technology, Kongunadu College of Engineering and Technology,

    Namakkal-Trichy Main Road, Thottiam Taluk, Dist Trichy, Tholurpatti, Tamil Nadu, India

    Abstract – Facial expressions give important indications about emotions. Computer systems based on affective interaction could play an important role in the next generation of computer vision systems. This work implements an emotion-based recognition system to annotate music. The HAAR cascade algorithm is used to extract the facial features, and a deep learning algorithm classifies the emotions. Finally, the K-Nearest Neighbour algorithm predicts the music.

    Index Terms – Expression recognition, Basic emotion, Compound emotion, Deep learning.


      Emotion recognition is the process of identifying human emotion. Using this technology to help blind people recognize emotions is a relatively nascent research area. Emotions are generally detected by integrating information from facial expressions, body movement and gestures, and speech.


      Existing approaches include the following.
      Active Shape Model: extracts the facial feature points.
      AdaBoost classifier: classifies the emotions based on geometrical notations.
      Viola-Jones: uses a wavelet approach to extract facial points in synthetic datasets.


      During facial point extraction, a large number of irrelevant features are extracted, so the emotion classification can go wrong. Each and every facial structure has to be trained for emotion recognition. Expressions are only matched image to image, and the methods are not implemented with real-time camera capture.


      The proposed system tries to provide an interactive way for the user to play music based on emotions. HAAR cascades are used to extract facial regions such as the nose, eyes, lips and cheeks. A deep learning algorithm classifies the emotions, with a voice alert announcing the detected emotion. KNN (K-Nearest Neighbor) is used to play music based on the emotions.


      The system can be used in real-time environments, making it easy for blind people to identify the emotions of those nearby. The application is user friendly and may help reduce depression. Users are spared the troublesome manual selection of songs. A reduced number of features is extracted.


      Background subtraction using binarization:

      The background is assumed to be the frame at time t. The difference image between the current frame and this background shows non-zero intensity only at the pixel locations that have changed between the two frames. Although this seemingly removes the background, the approach only works when all foreground pixels are moving and all background pixels are static.
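
The frame-differencing step can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `frame_difference_mask`, the default threshold of 25 and the toy 3x4 grayscale frames are all assumptions made for the example. Binarization is taken here to mean thresholding the absolute difference.

```python
def frame_difference_mask(background, frame, threshold=25):
    """Return a binary mask: 1 where |frame - background| > threshold, else 0.

    Both inputs are grayscale images given as lists of rows of intensities.
    The threshold value is an assumed example, not taken from the paper.
    """
    return [
        [1 if abs(f - b) > threshold else 0 for f, b in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]


# Toy data: a static background and a frame in which two pixels changed strongly.
background = [
    [10, 10, 10, 10],
    [10, 10, 10, 10],
    [10, 10, 10, 10],
]
frame = [
    [10, 12, 10, 10],
    [10, 200, 180, 10],
    [10, 10, 10, 11],
]

mask = frame_difference_mask(background, frame)
for row in mask:
    print(row)
```

Small intensity changes (such as sensor noise) stay below the threshold and are treated as background, which is why a threshold is applied rather than testing for any non-zero difference.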

      Face detection using HAAR cascade algorithm:

      Integral images can be defined as two-dimensional lookup tables in the form of a matrix with the same size as the original image. Each element of the integral image contains the sum of all pixels located above and to the left of the element's position in the original image. This allows the sum of any rectangular area in the image, at any position or scale, to be computed using only four lookups:

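      Sum = I(C) + I(A) - I(B) - I(D)

      where points A, B, C and D are the corners of the rectangle in the integral image I, with A the top-left corner, B the top-right, C the bottom-right and D the bottom-left.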
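
A small sketch of the integral image and the four-lookup rectangle sum is given below. It assumes the corner labelling A = top-left, B = top-right, C = bottom-right, D = bottom-left, so that the sum is I(C) + I(A) - I(B) - I(D). The function names, the zero-padded table layout and the two-rectangle feature are illustrative choices, not details taken from the paper.

```python
def integral_image(img):
    """Build a zero-padded integral image.

    ii[y][x] holds the sum of img over all rows < y and all columns < x,
    so the table has one extra row and column of zeros.
    """
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, top, left, bottom, right):
    """Sum of img rows top..bottom-1 and columns left..right-1 via four lookups."""
    a = ii[top][left]      # A: top-left corner
    b = ii[top][right]     # B: top-right corner
    c = ii[bottom][right]  # C: bottom-right corner
    d = ii[bottom][left]   # D: bottom-left corner
    return c + a - b - d


def two_rect_feature(ii, top, left, height, width):
    """Haar-like two-rectangle feature: right half sum minus left half sum."""
    half = width // 2
    left_sum = rect_sum(ii, top, left, top + height, left + half)
    right_sum = rect_sum(ii, top, left + half, top + height, left + 2 * half)
    return right_sum - left_sum


img = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
ii = integral_image(img)
inner = rect_sum(ii, 1, 1, 3, 3)            # pixels 6, 7, 10, 11
feature = two_rect_feature(ii, 0, 0, 4, 4)  # right two columns minus left two
print(inner, feature)
```

Because every rectangle sum costs four lookups regardless of its size, Haar-like features can be evaluated at any scale in constant time once the table is built.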

      Face detection using cascade classifier:


Feature Extraction:

In this part, the network will perform a series of convolutions and pooling operations during which the features are detected. If you had a picture of a zebra, this is the part where the network would recognize its stripes, two ears, and four legs.


Here, the fully connected layers serve as a classifier on top of these extracted features. They assign a probability that the object in the image is what the algorithm predicts it to be.
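
The two stages described above (feature extraction by convolution and pooling, then classification by a fully connected layer) can be illustrated with a tiny forward pass in plain Python. Everything specific here is an assumption for illustration: the 5x5 input, the hand-picked edge kernel, the dense weights and the two-class output are hypothetical and are not the network used in the paper.

```python
import math


def conv2d(image, kernel):
    """Valid 2-D cross-correlation (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[y + i][x + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for x in range(out_w)
        ]
        for y in range(out_h)
    ]


def relu(fmap):
    """Element-wise max(0, v) on a feature map."""
    return [[max(0.0, v) for v in row] for row in fmap]


def max_pool2x2(fmap):
    """2x2 max pooling with stride 2 (any odd trailing row/column is dropped)."""
    return [
        [max(fmap[y][x], fmap[y][x + 1], fmap[y + 1][x], fmap[y + 1][x + 1])
         for x in range(0, len(fmap[0]) - 1, 2)]
        for y in range(0, len(fmap) - 1, 2)
    ]


def dense(features, weights, biases):
    """Fully connected layer: one weighted sum plus bias per output class."""
    return [sum(w * f for w, f in zip(row, features)) + b
            for row, b in zip(weights, biases)]


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Toy input containing a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
kernel = [[-1, 1], [-1, 1]]  # hand-picked vertical-edge detector (assumption)

pooled = max_pool2x2(relu(conv2d(image, kernel)))   # feature extraction
features = [v for row in pooled for v in row]       # flatten to a vector

weights = [[0.5, 0.0, 0.5, 0.0], [0.0, 0.5, 0.0, 0.5]]  # hypothetical weights
biases = [0.0, 0.0]
probs = softmax(dense(features, weights, biases))    # classification
print(pooled, probs)
```

In a trained network the kernels and dense weights are learned from labelled images rather than chosen by hand; the data flow, however, is the same.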




K-Nearest Neighbour classification:

for all the unknown samples UnSample(i)
    for all the known samples Sample(j)
        compute the distance between UnSample(i) and Sample(j)
    end for
    find the k smallest distances
    locate the corresponding samples Sample(j1), ..., Sample(jK)
    assign UnSample(i) to the class which appears most frequently
end for
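
The pseudocode above can be written as a short runnable sketch. The Euclidean distance, the function name `knn_classify`, the two-dimensional feature vectors and the "happy"/"sad" labels are assumptions chosen for the example; the paper does not specify the features or the distance measure.

```python
import math
from collections import Counter


def knn_classify(known_samples, known_labels, unknown, k=3):
    """Assign `unknown` to the majority class among its k nearest known samples."""
    # Compute the distance from the unknown sample to every known sample.
    distances = sorted(
        (math.dist(sample, unknown), label)
        for sample, label in zip(known_samples, known_labels)
    )
    # Keep the labels of the k smallest distances and take a majority vote.
    top_k = [label for _, label in distances[:k]]
    return Counter(top_k).most_common(1)[0][0]


# Hypothetical 2-D feature vectors for songs already tagged with a mood.
known_samples = [
    (0.90, 0.80), (0.80, 0.90), (0.85, 0.70),  # tagged "happy"
    (0.10, 0.20), (0.20, 0.10), (0.15, 0.30),  # tagged "sad"
]
known_labels = ["happy", "happy", "happy", "sad", "sad", "sad"]

unknown_samples = [(0.70, 0.75), (0.20, 0.20)]
predictions = [
    knn_classify(known_samples, known_labels, u, k=3) for u in unknown_samples
]
print(predictions)
```

An odd k is a common choice with two classes because it avoids tied votes.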
