Music Tune Generation based on Facial Emotion

DOI : 10.17577/IJERTV8IS110257


Pooja Mishra

Professor, (SPPU) Dept. of Computer Engineering, Dr. D. Y. Patil Institute of Engineering, Management & Research, Akurdi, Pune, India

Himanshu Talele

(SPPU) Dept. of Computer Engineering, Dr. D. Y. Patil Institute of Engineering, Management & Research, Akurdi, Pune, India

Yogesh Sawarkar

(SPPU) Dept. of Computer Engineering, Dr. D. Y. Patil Institute of Engineering, Management & Research, Akurdi, Pune, India

Rohit Vidhate

(SPPU) Dept. of Computer Engineering, Dr. D. Y. Patil Institute of Engineering, Management & Research, Akurdi, Pune, India

Ganesh Naikare

(SPPU) Dept. of Computer Engineering, Dr. D. Y. Patil Institute of Engineering, Management & Research, Akurdi, Pune, India

Abstract: An individual's face is an important part of the human body, and it plays an especially important role in understanding an individual's temperament. Extracting the required input from an individual's face can now be done directly using a camera. This input can then be used in various ways; one application is to extract information for analyzing the individual's temperament. That data can in turn be used to obtain a list of songs that match the temperament derived from the input, removing the time-consuming and tedious task of manually segregating or grouping songs into different lists and helping to produce a good playlist based on the user's emotional features. Various algorithms have been developed and proposed for automating the playlist generation process. Facial-expression-based music tune generation intends to scan and interpret this data and accordingly create a playlist based on the parameters provided. The scanning and interpretation include audio feature extraction and classification to obtain a list of songs belonging to a similar genre, or a list of similar-sounding songs. Human emotions are important for mutual understanding and for sharing feelings and intentions with each other, and they are manifested in verbal and facial expressions. This paper mainly focuses on the methodologies available for detecting human emotions in order to develop emotion-based music tunes. These methodologies are then used to create a music player that not only detects facial expressions but also generates a unique music tune for the user. Here we capture the face and detect the emotion using Viola-Jones and a CNN. The paper describes how the system works and how playlists are generated based on emotion classification. The application is developed in such a way that it can manage the content accessed by the user, analyze image properties, and determine the user's temperament, which is then used to identify MP3 file properties so that songs can be added to emotion-based playlists according to the temperament.

Keywords: Facial Emotion, Facial Expression, Audio Feature Extraction, Viola Jones, CNN.


  1. INTRODUCTION

    Music plays an important role in an individual's life. It is an important source of entertainment and is often associated with a therapeutic role. With the advent of technology and continuous advancements in multimedia, sophisticated music players have been designed and enriched with numerous features, including volume modulation, genre classification, etc. Even though these features address the user's needs, the user still has to search his playlist for songs that match his emotions. In a traditional music player, a user has to surf through his playlist on his own and select songs that suit his mood and emotional experience. This method of choosing songs is complex and time consuming, and the user may face a dilemma in finding an appropriate song. The advent of Audio Emotion Recognition (AER) and Music Information Retrieval (MIR) equipped traditional systems with a feature that automatically parses the playlist based on different classes of emotions.

    In the existing system, the user has to select a tune to play every time, so we introduce a system that detects the user's emotion and generates tunes automatically. Users get bored when they hear the same tune every time, so we create a new tune each time the user accesses the system. If the system generated the same tune whenever the same emotion was detected, listening to the same song again and again would also become tiresome for the user. In our system we therefore use a method that generates a new tune as an advancement of the previously generated tune. Hence, whenever the same facial expression is detected, a new tune is generated.
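    The paper does not specify how a new tune is derived from the previous one; as a hedged illustration only, a tune represented as a list of (MIDI pitch, duration) pairs (a representation assumed here, not taken from the paper) could be randomly perturbed so that a repeated emotion still yields a fresh variation:

```python
import random

# A tune is modeled here as a list of (midi_pitch, duration_beats) pairs.
# This representation and the mutation rules are illustrative assumptions,
# not the method described in the paper.

def vary_tune(tune, seed=None):
    """Return a new tune derived from a previously generated one."""
    rng = random.Random(seed)
    new_tune = []
    for pitch, dur in tune:
        # Occasionally shift a note a step or two so the result stays
        # familiar but is not identical.
        if rng.random() < 0.3:
            pitch = min(108, max(21, pitch + rng.choice([-2, -1, 1, 2])))
        # Occasionally halve or double a duration for rhythmic variety.
        if rng.random() < 0.2:
            dur = dur * rng.choice([0.5, 2.0])
        new_tune.append((pitch, dur))
    return new_tune

previous = [(60, 1.0), (62, 0.5), (64, 0.5), (67, 2.0)]  # C-D-E-G motif
print(vary_tune(previous, seed=42))
```

    Seeding the generator differently on each access would guarantee a new variation every time the same emotion is detected.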

  2. LITERATURE SURVEY

    1. Paper Name: An Emotional Symbolic Music Generation System based on LSTM Networks

      Author: Kun Zhao, Siqi Li, Juanjuan Cai, Hui Wang, Jingling Wang

      Description: With the evolution of AI technology in recent years, artificial neural networks have been applied to the task of algorithmic music composition and have shown significant results. Music is highly associated with human emotion; however, there have been few attempts at intelligent music composition aimed at expressing different emotions. In this work, Biaxial LSTM networks are used to generate polyphonic music, and the idea of Look Back is also introduced into the architecture to improve long-term structure. Above all, the authors design a novel system for emotional music generation with steerable parameters for four basic emotions, divided according to Russell's two-dimensional valence-arousal (VA) emotional space. The evaluation indices of the music generated by this model are closer to those of real music, and a human listening test shows that the different emotions expressed by the generated samples can be distinguished correctly in the majority of cases [1].
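      The Biaxial LSTM with Look Back used in [1] is more involved than space allows; the following is only a minimal Keras sketch of the core idea of conditioning a note-prediction LSTM on a valence-arousal pair (the vocabulary size, sequence length, and layer widths are assumptions):

```python
from tensorflow.keras import layers, Model

# Minimal sketch of emotion-conditioned note prediction, assuming a one-hot
# vocabulary of 128 pitches and 32-step sequences. This simplification is
# not the Biaxial LSTM architecture of [1].
TIMESTEPS, VOCAB = 32, 128

notes_in = layers.Input(shape=(TIMESTEPS, VOCAB), name="note_history")
va_in = layers.Input(shape=(2,), name="valence_arousal")  # Russell's VA point

# Embed the static VA condition and broadcast it across every timestep.
va_rep = layers.RepeatVector(TIMESTEPS)(layers.Dense(16, activation="relu")(va_in))
x = layers.Concatenate()([notes_in, va_rep])
x = layers.LSTM(256)(x)
next_note = layers.Dense(VOCAB, activation="softmax")(x)

model = Model([notes_in, va_in], next_note)
model.compile(optimizer="adam", loss="categorical_crossentropy")
model.summary()
```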

    2. Paper Name: Algorithmic Music Composition Based on Artificial Intelligence: A Survey

      Author: Omar Lopez-Rincon, Oleg Starostenko, Gerardo Ayala-San Martín

      Description: This paper presents a taxonomy of the Artificial Intelligence (AI) methods currently applied to algorithmic music composition. Algorithmic music composition is the research area concerned with processes for composing music pieces automatically with a computer system. The use of artificial intelligence for algorithmic music means applying AI methods as the main tool for the composition of music. Various AI models are used in music composition: generative models, heuristics in evolutionary algorithms, neural networks, stochastic methods, agents, decision trees, declarative programming, and grammatical representation. The survey aims to present the trending techniques for automatic music composition [2].

    3. Paper Name: Emotion Based Music Player Using Facial Recognition

      Author: Prof. Vijaykumar R. Ghule, Abhijeet B. Benke, Shubham S. Jadhav, Swapnil A. Joshi

      Description: An individual's face is an important part of the human body and plays an especially important role in understanding an individual's temperament. Extracting the required input from an individual's face can now be done directly using a camera. This input can then be used in various ways; one application is to extract information for analyzing the individual's temperament, and that data can be used to obtain a list of songs that match the derived temperament. This removes the time-consuming and tedious task of manually segregating or grouping songs into different lists and helps to produce a good playlist based on the user's emotional features. Various algorithms have been developed and proposed for automating the playlist generation process. The system intends to scan and interpret the data and accordingly create a playlist based on the parameters provided; the scanning and interpretation include audio feature extraction and classification to obtain a list of songs belonging to a similar genre or a list of similar-sounding songs. The paper focuses on the methodologies available for detecting human emotions for an emotion-based music player: the face is captured and the emotion detected using Viola-Jones and a CNN, and the application manages the content accessed by the user, analyzes image properties, and determines the user's temperament, which is used to identify MP3 file properties so that songs can be added to emotion-based playlists according to the temperament [3].

    4. Paper Name: An Accurate Algorithm for Generating a Music Playlist based on Facial Expressions

      Author: Anukriti Dureha

      Description: Manual segregation of a playlist and annotation of songs in accordance with the current emotional state of a user is labor intensive and time consuming. Various algorithms have been used to automate this process; however, the existing algorithms are slow, increase the overall cost of the system by using additional hardware (e.g. EEG systems and sensors), and have lower accuracy. This paper presents an algorithm that automates the process of generating an audio playlist based on the facial expressions of a user, saving the time and labor invested in performing the process manually. The proposed algorithm aims to reduce the overall computational time and cost of the designed system while increasing its accuracy. The facial expression recognition module is validated by testing the system against user-dependent and user-independent datasets. Experiments show that user-dependent results give 100% accuracy. For the user-independent dataset, joy and surprise are recognized with 100% accuracy, while sad, anger and fear reach 84.3%, 80% and 66% respectively; overall, the emotion recognition algorithm achieves 86% accuracy on the user-independent dataset. In audio, the recognition rates obtained for joy and anger are 95.4% and 90% respectively, while 100% accuracy is achieved for sad, sad-anger and joy-anger; the overall efficiency of the audio emotion recognition algorithm is 98%. Implementation and testing are carried out using an inbuilt camera, so the proposed algorithm successfully reduces the overall cost of the system. On average, the algorithm takes 1.10 sec to generate a playlist based on a facial expression, yielding better performance in terms of computational time than the algorithms in the existing literature [4].

    5. Paper Name: An AI Based Intelligent Music Composing Algorithm: Concord

      Author: Saurabh Malgaonkar, Yudhajit Biswajit Nag, Rohit Devadiga and Tejas Hirave

      Description: This algorithm automates the composition of music. The main concern is producing music good enough that the user cannot discern the computer's composition from an individual's work of art. Art is a principal aim of human intelligence, as it deals with ideas as well as emotions; one of the most esteemed forms of art is music, and making a computer generate a good musical composition is a rare achievement. Music is composed by the system using a completely automated algorithmic process. The system has some prerequisite knowledge of music chord and scale patterns: it selects the right (valid) notes and plays them [5].
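      Concord's knowledge base is not public; a minimal sketch of the stated idea of selecting only valid notes from known scale patterns might look like the following (the scale table and selection rule are illustrative assumptions, not Concord's actual logic):

```python
import random

# Illustrative scale table; these patterns are assumptions for
# demonstration only.
SCALES = {
    "C_major": [60, 62, 64, 65, 67, 69, 71, 72],  # MIDI pitches C4..C5
    "A_minor": [57, 59, 60, 62, 64, 65, 67, 69],
}

def compose_melody(scale_name, length=8, seed=None):
    """Pick only notes that are valid within the chosen scale."""
    rng = random.Random(seed)
    scale = SCALES[scale_name]
    return [rng.choice(scale) for _ in range(length)]

print(compose_melody("C_major", seed=7))
```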

  3. EXISTING SYSTEM

    Music tune recommendation has been extensively studied from different angles, taking into account a variety of features and factors that could influence a user's choices or preferences. The most common approaches are variations of collaborative filtering and content-based filtering. These methods perform satisfactorily in the long run, but immediate preferences can be heavily influenced by a range of different factors and characteristics, generally referred to as context.
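    As a hedged illustration of the content-based branch, songs can be ranked by the cosine similarity of their audio feature vectors to those of songs the user already likes; the feature choice below (tempo, energy, valence) is an assumption made for demonstration:

```python
import numpy as np

# Toy content-based filtering: recommend songs whose audio feature vectors
# are closest to a liked song's vector, by cosine similarity.
songs = {
    "song_a": np.array([120.0, 0.8, 0.9]),  # tempo, energy, valence
    "song_b": np.array([ 70.0, 0.3, 0.2]),
    "song_c": np.array([118.0, 0.7, 0.8]),
}

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def recommend(liked, candidates):
    """Rank candidate songs by similarity to a liked song's features."""
    scores = {name: cosine(liked, vec) for name, vec in candidates.items()}
    return sorted(scores, key=scores.get, reverse=True)

print(recommend(songs["song_a"],
                {k: v for k, v in songs.items() if k != "song_a"}))
```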

  4. EXISTING SYSTEM DISADVANTAGES

    • Existing systems focus on the listening activity itself.

    • Existing systems do not ask about the intent of listening.

  5. PROBLEM STATEMENT

    Develop a web application that detects the user's emotions and generates new tunes from multiple songs by removing the vocals from the songs and keeping only the notes, then merging those different notes and playing the result for the user. In this music tune generation based on the facial emotion of the user, face detection is done by the Viola-Jones algorithm and emotion classification by a CNN.
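    Assuming the per-song note data is already available as instrumental MIDI (vocal removal from raw audio is a separate problem, out of scope for this sketch), merging the notes of multiple songs could look like the following, using the pretty_midi library; the file names are placeholders:

```python
import pretty_midi

# Merge the note events of several source songs into one new tune.
# File names are placeholders; inputs are assumed to be instrumental MIDI.
def merge_tunes(paths, out_path="merged.mid"):
    merged = pretty_midi.PrettyMIDI()
    track = pretty_midi.Instrument(program=0)  # acoustic grand piano
    offset = 0.0
    for path in paths:
        pm = pretty_midi.PrettyMIDI(path)
        for inst in pm.instruments:
            if inst.is_drum:
                continue  # keep only pitched notes
            for note in inst.notes:
                track.notes.append(pretty_midi.Note(
                    velocity=note.velocity,
                    pitch=note.pitch,
                    start=note.start + offset,
                    end=note.end + offset,
                ))
        offset += pm.get_end_time()  # next song's notes play afterwards
    merged.instruments.append(track)
    merged.write(out_path)

merge_tunes(["song_one.mid", "song_two.mid"])
```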

  6. PROPOSED SYSTEM

    In our web application, the user logs in to the system and the user's face is captured by the laptop camera. Only the face region is then detected using the Viola-Jones algorithm, which detects faces accurately using Haar-like features. The user's facial emotion is then classified using a CNN. According to the detected emotion, the system generates different new tunes, merges them, and plays the merged tune for the user. A new tune is generated for the user every time, even if the same emotion is detected.
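    A condensed sketch of this pipeline is given below; classify_emotion, generate_tune, and play_tune are hypothetical stand-ins for the CNN classifier and tune-generation modules described above, not APIs defined in this paper:

```python
import cv2

# Hypothetical glue for the proposed pipeline:
# capture -> detect face -> classify emotion -> generate and play tune.
def run_session(classify_emotion, generate_tune, play_tune):
    cam = cv2.VideoCapture(0)                        # laptop camera
    ok, frame = cam.read()
    cam.release()
    if not ok:
        raise RuntimeError("camera capture failed")

    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    faces = cascade.detectMultiScale(gray)           # Viola-Jones detection
    if len(faces) == 0:
        raise RuntimeError("no face detected")

    x, y, w, h = faces[0]
    emotion = classify_emotion(gray[y:y + h, x:x + w])  # e.g. "happy"
    play_tune(generate_tune(emotion))                # new tune on every call
```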

  7. PROPOSED SYSTEM ADVANTAGES

      • Generates new tunes for the user based on detected emotions.

      • Generates a new tune even when the user's emotion is the same as before.

      • It is efficient and fast.

  8. SYSTEM ARCHITECTURE

    Figure 1: System Architecture

  9. ALGORITHM DETAILS

    • CNN

      • Step 1: Convolution Operation.

      • Step 1(b): ReLU Layer.

      • Step 2: Pooling.

      • Step 3: Flattening.

      • Step 4: Full Connection (see the sketch below).
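    The following is a minimal Keras sketch of a CNN assembled from these steps; the 48x48 grayscale input and seven emotion classes are assumptions (a common FER-2013 convention), as the paper does not specify them:

```python
from tensorflow.keras import layers, models

# Minimal CNN following the listed steps; input size and class count
# are assumptions, not values given in the paper.
model = models.Sequential([
    layers.Conv2D(32, (3, 3), input_shape=(48, 48, 1)),  # Step 1: convolution
    layers.Activation("relu"),                           # Step 1(b): ReLU layer
    layers.MaxPooling2D((2, 2)),                         # Step 2: pooling
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),                                    # Step 3: flattening
    layers.Dense(128, activation="relu"),                # Step 4: full connection
    layers.Dense(7, activation="softmax"),               # one unit per emotion
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
```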

    • Viola-Jones Algorithm

      • Set the minimum window size, and sliding step corresponding to that size.

      • For the chosen window size, slide the window vertically and horizontally with the same step.

      • At each step, a set of N face recognition filters is applied.

      • If one filter gives a positive answer, a face is detected in the current window.

      • If the size of the window is the maximum size, stop the procedure.

      • Otherwise, increase the size of the window and the corresponding sliding step to the next chosen size and go to step 2 (see the sketch below).
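    In practice these steps are exposed through OpenCV's cascade detector; the brief sketch below shows how the listed steps map onto detectMultiScale's parameters (the input file name is a placeholder):

```python
import cv2

# The minimum window size is minSize, and scaleFactor controls how the
# window (and its sliding step) grows between passes until the maximum
# size is reached.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

img = cv2.imread("user_frame.jpg")           # placeholder input image
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

faces = cascade.detectMultiScale(
    gray,
    scaleFactor=1.1,   # grow the search window by 10% per pass
    minNeighbors=5,    # require several overlapping positive detections
    minSize=(30, 30),  # minimum window size
)
for (x, y, w, h) in faces:
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
```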

  10. APPLICATION AND FUTURE SCOPE

    • Tutorial for beginner musicians

    • Music Pattern Study

    • Music Enhancement

    • Music Testing

  11. CONCLUSION

    The algorithm proposed in this paper aims to generate music tunes based on facial expressions. Experimental results indicate that the proposed algorithm successfully automates music tune generation on the basis of facial expressions, and hence reduces the labor and time incurred in performing the task manually. The use of a laptop camera eliminates the requirement for any additional hardware, such as EEG systems and sensors, and thus helps to curtail the cost involved. Since face emotion recognition is not performed in real time, the total time taken by the algorithm equals the time taken to recognize facial expressions plus the time taken to query the metadata file. Hence, the proposed algorithm yields better performance, in terms of computational time, than the algorithms reported in the existing literature. Also, since the time taken to query the metadata file is negligible (0.0008 sec), the total time taken by the algorithm is proportional to the time taken to recognize facial expressions. The Viola-Jones algorithm detects the user's face accurately; its output is then given as input to the convolutional neural network, which classifies the emotion.

  12. REFERENCES

    1. Kun Zhao, Siqi Li, Juanjuan Cai, Hui Wang, Jingling Wang, An Emotional Symbolic Music Generation System based on LSTM Networks, 2019 IEEE 3rd Information Technology, Networking, Electronic and Automation Control Conference (ITNEC), 2019.

    2. Omar Lopez-Rincon, Oleg Starostenko, Gerardo Ayala-San Martín, Algorithmic Music Composition Based on Artificial Intelligence: A Survey, 978-1-5386-2363-3/18/$31.00 ©2018 IEEE.

    3. Prof. Vijaykumar R. Ghule, Abhijeet B. Benke, Shubham S. Jadhav, Swapnil A. Joshi, Emotion Based Music Player Using Facial Recognition, International Journal of Innovative Research in Computer and Communication Engineering, Vol. 5, Issue 2, February 2017, DOI: 10.15680/IJIRCCE.2017.

    4. Anukriti Dureha, An Accurate Algorithm for Generating a Music Playlist based on Facial Expressions, International Journal of Computer Applications (0975-8887), Volume 100, No. 9, August 2014.

    5. Saurabh Malgaonkar, Yudhajit Biswajit Nag, Rohit Devadiga and Tejas Hirave, An AI Based Intelligent Music Composing Algorithm: Concord, Manuscript received September 8, 2012.

    6. W. Elliot, D. Eck, A. Roberts, and D. Abolafia, Project Magenta: Generating long-term structure in songs and stories, 2016.

    7. H. Lim, S. Rhyu, K. Lee, Chord Generation from Symbolic Melody Using BLSTM Networks, unpublished.

    8. H. Chu, R. Urtasun, and S. Fidler, Song from pi: a musically plausible network for pop music generation, unpublished.

    9. N. Boulanger-Lewandowski, Y. Bengio, P. Vincent, Modeling temporal dependencies in high-dimensional sequences: application to polyphonic music generation and transcription, Proceedings of the 29th International Conference on Machine Learning, vol. 18, issue 13, pp. 3981-3991, 2012.

    10. S. Lattner, M. Grachten, and G. Widmer, Imposing higher-level structure in polyphonic music generation using convolutional restricted Boltzmann machines and constraints, unpublished.

    11. G. Hadjeres, F. Pachet, and F. Nielsen, DeepBach: a steerable model for Bach chorales generation, Proceedings of the 34th International Conference on Machine Learning, pp. 1362-1371, 2017.

    12. O. Mogren, C-RNN-GAN: continuous recurrent neural networks with adversarial training, Constructive Machine Learning Workshop (CML) at NIPS 2016, Barcelona, Spain, 2016.
