Survey on Sentiment Analysis to Predict Twitter Data using Machine Learning and Deep Learning

DOI : 10.17577/IJERTV11IS070253


Monalisha Sahoo

Computer Science and Engineering Odisha University of Technology and Research

Bhubaneswar, India

Jyotirmayee Rautaray

Computer Science and Engineering Odisha University of Technology and Research

Bhubaneswar, India

Abstract: Since the beginning of time, written text has been a means to communicate, express, and document something of significance. Even in the modern age, it has been shown many times that an individual's writing style can be a defining aspect of one's psyche. Since social media emerged, microblogging has become the new form of writing, expressing, or documenting an event. Text classification means classifying unstructured data into various categories. To classify unstructured text data, we can follow two approaches. We can write a few rules in which a collection of words decides the sentiment of the input text; this approach can be helpful for a handful of data, but for a large dataset it is neither efficient nor cost-effective. A better approach is to use natural language processing and classification with machine learning. In this work, two methods have been taken into consideration: machine learning and deep learning.

Keywords: Sentiment; Naïve Bayes; Preprocessing; Stemming; Tokenization


    With the power of the internet, businesses today receive a huge amount of customer feedback through their websites, social media pages, and similar channels, but the majority of businesses do not know how to use this information to improve themselves. A huge number of people share their product and service experiences directly on these platforms. However, this feedback is unstructured. So the problem is: how do businesses analyze this unstructured feedback at scale? This is where machine learning comes into the picture. We can process textual data in one of two ways: with machine learning or with deep learning. The drawback of classical machine learning algorithms is that they ignore context. They work from how often words occur, derive a probability from those counts, and then execute a classification task. This is the reason why deep learning models are used for NLP tasks. In deep learning, we have the recurrent neural network (RNN), LSTM, the transformer network, Google's BERT, and many more. These deep learning models learn the pattern of words and then try to forecast the expected result. We can perform NLP using deep learning in one of two ways: by using pre-trained models such as Google's Word2Vec or GloVe (Global Vectors), or by training our own model, for which we would require a very large amount of data and the compute power to support it.

    Many professionals, ranging from salespeople and psychologists to entrepreneurs, have used NLP tenets to succeed in their line of work and have also helped others to reach their life goals in the process. Using NLP, you do not just understand human behavior; you can alter and control it too. According to NLP theory, our behavior and decisions are products of our neural patterns, and these patterns find expression in verbal and nonverbal cues. So, the job of NLP is to translate between structured and unstructured data. When we translate from one language to another, we need to understand the context of that language. NLP contains many different tasks, such as speech recognition, machine translation, speech segmentation, sentiment analysis, auto summarization, search auto-correct and auto-complete, question answering, and so on. Sentiment analysis deals with detecting the sentiment in a set of data. Sentiment analysis makes machines and computer software capable of recognizing human emotions. Emotions are broken into three categories: positive, negative, and neutral. The corresponding scores are +1, -1, and 0, respectively. People express thoughts and feelings openly on social media or networks, and customers express opinions or feedback about products on e-commerce websites. The primary objective of sentiment analysis techniques is to find the polarity (positive, negative, or neutral) of textual information or data. Sentiment analysis is becoming crucial for keeping track of people's feelings and emotions, like happiness, anger, and sadness, and people's intentions, like interested or not interested. There are three different classification levels of sentiment analysis.

    Document level: – Here polarity is extracted from the entire document to determine whether its sentiment is positive, negative, or neutral.

    Sentence level: – A dataset contains two kinds of sentences: subjective and objective. Subjective means influenced by emotions. Subjective sentences contain loaded words like right, wrong, best, better, worst, etc. These words carry connotations: they are either positively or negatively connoted, and they convey a feeling to the reader. So subjective sentences are particularly important for finding sentiment. Objective sentences, on the other hand, are neutral; they are not influenced by any kind of emotion. Objective sentences do not help to detect the sentiment, so they can be removed if desired.

    Attribute level: – At this level of sentiment analysis, we need to identify the features of each entity. Here we deal with every aspect of the entire dataset. The ultimate goal is to give an outline of all the aspects and their emotions. This kind of sentiment analysis is very helpful for companies to determine customers' opinions by monitoring the aspects of all reviews given by customers.


    Barakat AlBadani, Ronghua Shi, et al. [1] created a deep learning model that combines fine-tuning with SVM. They evaluated the model on three datasets and obtained accuracies of 99.78%, 99.71%, and 95.78% on Twitter US Airlines, IMDB, and GOP debate, respectively. Their work considers document-level sentiment analysis but not aspect-level analysis.

    A. Pasumpon Pandian et al. [2] used six datasets and presented models that combine automatic feature extraction with hand-crafted features. In the future, they are going to extend their work and test it on different languages.

    Kanhav Gupta, Dr. Munish Mehta, et al. [3] give a brief overview of lexicon-based and machine learning approaches to sentiment analysis. According to their research, the lexicon-based approach works best for small amounts of data, as it does not require training, testing, or multiple preprocessing steps, whereas abundant data calls for a machine learning approach. They used SVM, Naïve Bayes, and logistic regression and obtained accuracies of 90.85%, 85.6%, and 89.4%, respectively.

    Shanshan Yi, Xiaofang Liu, et al. [4] analyzed customer reviews to determine the sentiment of customers towards a product, and created a hybrid recommendation system using a machine learning regression model. The performance of the model was analysed using three metrics, namely MAE (mean absolute error), MSE (mean squared error), and MAPE (mean absolute percentage error). In the future, they are going to extend their work to customer interest across different geographical locations.

    Peng Cen, Kexin Zhang, and Desheng Zheng et al. [5] used RNN, LSTM, and CNN to analyze the sentiment of movie reviews from the IMDB dataset. CNN was more accurate than RNN and LSTM: the accuracies of CNN, RNN, and LSTM are 88.22%, 68.64%, and 85.32%, respectively.

    Ayushi Mitra et al. [6] used a lexicon-based approach for sentiment analysis. The results were not very efficient when the size of the lexicon increased. To overcome this deficiency, they are going to use machine learning classifiers.

    Nirag T. Bhatt, Asst. Prof. Saket J. Swarndeep et al. [7] describe sentiment analysis, its levels, its advantages and disadvantages, and its applications. Their main focus is on different feature extraction techniques and how they perform with different machine learning approaches.

    Alpna Patel and Arvind Kumar Tiwari et al. [8] discuss methods for sentiment analysis, including the lexicon-based approach and the deep learning approach. They analyzed the sentiment of the IMDB movie review dataset using CNN and RNN. According to their results, RNN is more accurate than CNN, with an accuracy of 87.42%.

    Hassan Raza, M. Faizan, Ahsan Hamza, et al. [9] consider scientific text for sentiment analysis, using different machine learning classifiers (NB, SVM, LR, KNN, and RF) and three feature extraction techniques (unigram, bigram, and trigram).

    Brian Keith Norambuena, Exequiel Fuentes Lettura, et al. [10] used three approaches in their study of sentiment analysis: supervised, unsupervised, and hybrid. NB and SVM are used as supervised methods, a scoring algorithm as the unsupervised method, and HS-SVM as the hybrid model.

    Mr. Kundan Reddy Manda et al. [11] analyze sentiment in Twitter data using machine learning algorithms such as Naïve Bayes, XGBoost, and Random Forest, as well as CNN-LSTM. They conclude that CNN-LSTM gives satisfactory accuracy in predicting sentiment.

    Gurshobit Singh Brar, Asst. Prof. Ankit Sharma et al. [12] created a web-based API that determines whether a sentiment is positive or negative. They tested it on more than 50 different reviews and obtained an average accuracy of 81.22%. In the future, they are going to extend the model and test it on a large dataset.

    Wahyu Calvin Frans Mariel, Siti Mariyah, and Setia Pramana, et al. [13] combined different classification techniques and feature extraction techniques. They found that a deep learning neural network gives better results than SVM and Naïve Bayes. According to their analysis, the deep learning neural network combined with bigrams provides the best result.

    Dr. M. Sujithra et al. [14] discussed data pre-processing, feature extraction, and different machine learning methods, and predicted the emotion of Twitter data using machine learning algorithms. In the future, existing models can be further improved by increasing semantic knowledge.

    Dipak R. Kawade, Dr. Kavita S. Oza et al. [15] collected Twitter data about the Uri attack, which took place on 18 September 2016, and analyzed the sentiment of people using preprocessing and machine learning classifiers. In the future, they are going to use big data analysis techniques to classify the emotions in large amounts of data.

    Jaspreet Singh, Gurvinder Singh, and Rajinder Singh et al. [16] considered four machine learning classifiers for detecting emotions in three different datasets. According to their results, OneR is more accurate than the others, whereas Naïve Bayes is the fastest of all. In the future, they are going to continue the same research using CNN and RNN.

    Ms. Bhumika Gupta et al. [17] examine the correctness of a model by feeding training data to a machine learning model, which is then used to predict the result or analyze the sentiment of a different dataset. In the future, different topics can be taken into consideration to check the accuracy of the model.


    Fig. 1. Proposed Model

    Data collection is the action of gathering information on diverse topics. The purpose of data collection is to obtain the information you are looking for and to make informed decisions about issues. Primary and secondary data collection are the two data-gathering techniques. Primary data is fresh data collected for the first time. Secondary data has already been collected by someone else.

    Data preprocessing is concerned with improving the quality of the data. It removes noisy, inconsistent, and incomplete data and transforms the data into an understandable form. There are different ways of preprocessing data.

    Lower Casing: – This is the initial step of pre-processing. Lowercasing means converting every capital letter to its lowercase form. This is also helpful during parsing.

    Tokenization: – Separating a sentence or text into tokens is called tokenization. A token may be a word or a character.

    Stemming: – It is the process of reducing inflected words to their root form, mapping a group of related words to the same stem. We used it to remove noise from the data. Stemming removes suffixes; for example, walking becomes walk and eating becomes eat.

    Lemmatization: – It converts a word to its base form. In lemmatization, the base form is always a meaningful word. In many languages, words appear in many different forms, so both stemming and lemmatization differ from language to language. Developing efficient lemmatization algorithms is an open area of research.
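    The lowercasing, tokenization, and stemming steps can be sketched as follows. This is a minimal illustration using only the Python standard library, not the pipeline used in this work: the suffix list and the function names are assumptions made for the example, and a production stemmer (for instance the Porter algorithm) applies many more rules. Lemmatization is left out because it needs a dictionary of base forms.

```python
import re

# Toy suffix list for illustration only (an assumption of this sketch);
# a real stemmer applies a much larger rule set.
SUFFIXES = ("ing", "ed", "es", "s")


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def stem(token):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Lowercase, tokenize, and stem a piece of text."""
    return [stem(token) for token in tokenize(text)]


tokens = preprocess("Walking and eating")
print(tokens)
```

    With this toy rule set, "Walking" and "eating" are reduced to "walk" and "eat", matching the example in the text.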

    Feature extraction is normally used when the original raw data is very heterogeneous and cannot be used directly by a classifier, so the raw data needs to be transformed into a suitable form. It is the method of constructing a new, compact set of features that captures the most useful information in the raw data. In real-world machine learning problems we rarely get data in the shape of a CSV file, so we must extract useful information from raw data. Some popular types of raw data from which features can be extracted are text, images, date and time, web data, and sensor data. We are working here with text data. In machine learning, we feed the model a lot of data; the model finds patterns in the data and learns from them, and as a result it can make new predictions. But a computer cannot readily understand data in the form of text, whereas it can easily work with numerical data. So text data must be transformed into numerical data, and this is where feature extraction comes in.

    Text data has to be converted into a vector or matrix format. Converting text data into a vector or matrix by some procedure is called feature extraction from text data.

    Bag of Words: – In this model, text is represented as the bag of its words, disregarding grammar and even word order but keeping multiplicity. A bag of words represents text as a document-term matrix; the bag-of-words representation of a document is the sum of the sparse one-hot encoded vectors of its words. Bag of words is normally abbreviated as BOW. Suppose there is a corpus of 1000 words, of which 800 are unique; the BOW vocabulary will then contain all 800 unique words, and the order of words within sentences is not preserved.
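    A small sketch of the document-term matrix described above, assuming whitespace tokenization and two made-up example documents (the function names are illustrative, not taken from this work):

```python
from collections import Counter


def build_vocabulary(documents):
    """Sorted list of the unique tokens across all documents."""
    return sorted({token for doc in documents for token in doc.split()})


def bag_of_words(document, vocabulary):
    """Count vector of one document over the shared vocabulary."""
    counts = Counter(document.split())
    return [counts[word] for word in vocabulary]


# Two made-up documents used only for illustration.
docs = ["good movie good plot", "bad movie"]
vocab = build_vocabulary(docs)
matrix = [bag_of_words(doc, vocab) for doc in docs]

print(vocab)
for row in matrix:
    print(row)
```

    Each row is one document and each column one vocabulary word; word order is lost, but the repeated word "good" keeps its count of 2.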

    TF-IDF: – TF-IDF stands for term frequency-inverse document frequency. Here we create a list of all the words present in the dataset and count the number of times each word occurs. The term frequency is TF = (number of times term t appears in the document) / (total number of terms in the document), which helps to identify which words are more important within a document. The inverse document frequency is IDF = log(N/n), where N is the total number of documents and n is the number of documents in which t appears. The IDF value of a rare word is high, whereas the IDF value of a frequent word is low. The dataset may contain function words such as articles and "and"; these words are repeated many times in the dataset, so their IDF value is low.
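    The two formulas can be checked on a tiny example. The sketch below follows the definitions given above (TF as a relative count, IDF as log(N/n) with the natural logarithm); the three-sentence corpus is made up, and returning 0 for a term that appears in no document is an assumption of this sketch rather than part of the definition.

```python
import math
from collections import Counter


def term_frequency(term, document_tokens):
    """TF = occurrences of the term / total number of terms in the document."""
    return Counter(document_tokens)[term] / len(document_tokens)


def inverse_document_frequency(term, corpus_tokens):
    """IDF = log(N / n), with n the number of documents containing the term."""
    n = sum(1 for doc in corpus_tokens if term in doc)
    # Assumption of this sketch: a term unseen in the corpus gets IDF 0.
    return math.log(len(corpus_tokens) / n) if n else 0.0


def tf_idf(term, document_tokens, corpus_tokens):
    return term_frequency(term, document_tokens) * inverse_document_frequency(
        term, corpus_tokens
    )


# Made-up corpus used only for illustration.
corpus = [
    "the flight was great".split(),
    "the flight was late".split(),
    "the crew was rude".split(),
]

print(tf_idf("the", corpus[0], corpus))    # word present in every document
print(tf_idf("great", corpus[0], corpus))  # word present in one document
```

    The word "the" occurs in all three documents, so its IDF is log(3/3) = 0, while the rarer word "great" receives a positive weight.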


    1. SVM

      It is a supervised machine learning algorithm. It is used to create the best hyperplane, which separates an n-dimensional space into different classes so that new objects can easily be put in the correct category. The hyperplane is determined by the extreme points. There are multiple possible boundaries, but we need to find the best one, which is our hyperplane. The features of the dataset play a pivotal role in establishing the correct hyperplane. The data points that are closest to the hyperplane are called support vectors.

    2. Naive Bayes

      It is a simple and effective supervised machine learning classifier. It predicts the result based on the probability of an object, so it is also called a probabilistic classifier. It is based on Bayes' theorem.
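      As an illustration of how such a probabilistic classifier can work on text, the sketch below implements a small multinomial Naive Bayes with add-one (Laplace) smoothing in plain Python. The smoothing choice, the function names, and the four training examples are assumptions made for this example; they are not the configuration used in the experiments.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(labeled_docs):
    """labeled_docs: list of (token_list, label) pairs."""
    class_counts = Counter(label for _, label in labeled_docs)
    word_counts = defaultdict(Counter)
    for tokens, label in labeled_docs:
        word_counts[label].update(tokens)
    vocabulary = {token for tokens, _ in labeled_docs for token in tokens}
    return class_counts, word_counts, vocabulary


def predict(tokens, model):
    """Return the label with the highest posterior log-probability."""
    class_counts, word_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        # log P(label) + sum of log P(token | label), add-one smoothing.
        score = math.log(count / total_docs)
        total_words = sum(word_counts[label].values())
        for token in tokens:
            likelihood = (word_counts[label][token] + 1) / (
                total_words + len(vocabulary)
            )
            score += math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up training data used only for illustration.
training_data = [
    ("great flight".split(), "positive"),
    ("friendly crew".split(), "positive"),
    ("late flight".split(), "negative"),
    ("rude crew".split(), "negative"),
]
model = train_naive_bayes(training_data)
print(predict("great crew".split(), model))
print(predict("rude flight".split(), model))
```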

    3. Random forest

      This supervised machine learning algorithm is used for both classification and regression. Random forest builds a number of decision trees on various subsets of the given dataset. Increasing the number of trees generally improves accuracy. Combining different classifiers to boost the performance of the model in this way is called ensemble learning.

    4. Logistic Regression

      This probabilistic classifier is used for classification, not for regression. It predicts the probability of the target variable. It is a linear model, although its function is somewhat more complex than that of linear regression. The predicted value of the variable is converted into a binary label, 0 or 1.
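      The conversion from a predicted probability to a binary label can be sketched as follows. The weights, bias, and the 0.5 threshold are hypothetical values chosen for the example; in practice the weights are learned from training data.

```python
import math


def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(features, weights, bias):
    """Probability of class 1 under a logistic regression model."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(z)


def predict_label(features, weights, bias, threshold=0.5):
    """Convert the predicted probability into a binary label."""
    return 1 if predict_proba(features, weights, bias) >= threshold else 0


# Hypothetical parameters used only for illustration.
weights, bias = [0.5, -0.25], 0.0
print(predict_proba([2.0, 1.0], weights, bias), predict_label([2.0, 1.0], weights, bias))
print(predict_proba([0.0, 1.0], weights, bias), predict_label([0.0, 1.0], weights, bias))
```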

    5. CNN

      It is a class of deep neural networks that is commonly used for analyzing visual imagery. When we talk about computer vision, the term convolutional neural network, abbreviated to CNN, comes to mind. Applications of CNN include face recognition, image classification, etc. It is quite similar to a basic neural network: a CNN also has learnable parameters, namely weights and biases.

      To recognize an image, a human being looks at the image and then applies their own knowledge to predict what it shows. But a computer is not a human being, so how does it recognize the image? An image is an array or matrix of square pixels arranged in columns and rows. This is how a computer sees an image. The number of weights that the pixels generate in an ordinary neural network is very large and becomes unmanageable. To manage the weights and biases we use a CNN. Because of its characteristics, a CNN reduces the image to a more compact form without losing its features, which is essential for getting a satisfactory result.

    6. LSTM

      It allows a neural network to remember what it needs to keep hold of the context and also to forget what is no longer relevant. For example, suppose we need to predict the next letter in a sequence. By looking at the letters individually, it is not possible to predict the next letter, but if we go through the whole preceding sequence, it becomes possible.

      LSTM is a type of recurrent neural network (RNN). A neural network contains nodes that receive some input and process it in some way, so some kind of computation results in an output. But an RNN differs from other neural networks in that it loops around. In an RNN, step one produces some output by processing some input. Step two then takes in new input but also receives the output of the previous step. That is what makes an RNN different: it can remember prior steps that are part of a sequence. But an RNN suffers from the problem of long-term dependency. Over time, as the information keeps piling up, the RNN's ability to learn new things decreases. LSTM provides a solution to this long-term dependency problem by adding an internal state to the RNN node. Now when an input to the RNN comes in, the node receives the state information as well. So a step collects the output from the prior step, the input of the new step, and some state information from the LSTM state.
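      The way a step combines the new input, the previous output, and the internal state can be illustrated with the standard LSTM cell equations. The sketch below uses scalar inputs and hand-picked, hypothetical weights so that it stays readable; a real LSTM layer uses learned weight matrices over vectors.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, params):
    """One step of a scalar LSTM cell.

    params maps each gate name to (input weight, recurrent weight, bias).
    Returns the new output h and the new internal state c.
    """

    def gate(name, activation):
        w_x, w_h, b = params[name]
        return activation(w_x * x + w_h * h_prev + b)

    f = gate("forget", sigmoid)        # how much of the old state to keep
    i = gate("input", sigmoid)         # how much new information to store
    g = gate("candidate", math.tanh)   # the new information itself
    o = gate("output", sigmoid)        # how much of the state to expose
    c = f * c_prev + i * g
    h = o * math.tanh(c)
    return h, c


# Hypothetical weights used only for illustration.
params = {
    name: (0.5, 0.5, 0.0)
    for name in ("forget", "input", "candidate", "output")
}

h, c = 0.0, 0.0
for x in [1.0, -1.0, 0.5]:
    h, c = lstm_step(x, h, c, params)
    print(round(h, 4), round(c, 4))
```

      The forget gate decides how much of the previous state survives, which is how the cell can drop information that is no longer relevant while carrying context across steps.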


      1. Experimental result of Lexicon-based approach

        Lexicon-based methods are easy to understand. They can also be used to detect multi-labeled, mixed emotions. A human-generated sentence can convey multiple and sometimes contradicting emotions, and this can be detected by lexicon-based methods. Another major advantage is that we do not need to provide labeled training data to start using this method, and we can use off-the-shelf packages to quickly implement emotion detection. However, a lexicon-based method cannot recognize new words; for instance, COVID-19 is a word created in 2019, and the algorithm will not understand this word unless we update the dictionary supporting the package. Machine learning and deep learning methods are also better than lexicon-based methods at detecting negation. In this experiment, the lexicon-based approach was used first to determine the sentiment.

        1. Dataset collection

        2. Preprocessing of data

        3. Using TextBlob, determine the polarity of each sentence: TextBlob(text).sentiment.polarity

        4. Determine the sentiment of each sentence

          1. if [polarity]>0, sentiment is +ve.

          2. if [polarity]<0, sentiment is -ve.

          3. if [polarity]=0, sentiment is neutral.

          4. Delete all the neutral sentences.

        5. Use the groupby() function to group all sentences according to their sentiment.

        6. Use count() to determine the total number of positive and negative sentences present in the text.
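        Steps 3 to 6 above can be sketched as follows. So that the example runs without extra packages, a tiny hypothetical lexicon stands in for TextBlob's polarity score, and a Counter stands in for the groupby() and count() calls; the lexicon, sentences, and function names are assumptions of this sketch.

```python
from collections import Counter

# Hypothetical toy lexicon standing in for TextBlob's polarity score.
# With TextBlob installed, step 3 would instead be:
#     polarity = TextBlob(text).sentiment.polarity
TOY_LEXICON = {"good": 1.0, "great": 1.0, "bad": -1.0, "terrible": -1.0}


def toy_polarity(text):
    """Average lexicon score of the known words; 0.0 if none are known."""
    scores = [TOY_LEXICON[w] for w in text.lower().split() if w in TOY_LEXICON]
    return sum(scores) / len(scores) if scores else 0.0


def label(polarity):
    """Step 4: map a polarity score to a sentiment label."""
    if polarity > 0:
        return "positive"
    if polarity < 0:
        return "negative"
    return "neutral"


def count_sentiments(sentences, polarity_fn=toy_polarity):
    labels = [label(polarity_fn(sentence)) for sentence in sentences]
    # Step 4.4: drop neutral sentences; steps 5 and 6: group and count.
    return Counter(lbl for lbl in labels if lbl != "neutral")


# Made-up sentences used only for illustration.
sentences = ["great flight", "terrible delay", "the plane landed", "good crew"]
counts = count_sentiments(sentences)
print(dict(counts))
```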

        Word Cloud is strongly supported in the lexicon analysis method. But analyzing sentiment manually with this algorithm is very difficult when a large amount of data is present as input. For that reason, machine learning and deep learning came into the picture, where models are trained to predict the sentiment using different classification methods.

        Word Cloud of frequently occurring words: –

        Fig. 2. Word Cloud

      2. Experimental result of machine learning approach

        After the basic steps of data collection and data pre-processing, feature extraction is performed. For feature extraction, the BOW algorithm has been used. The input data has been split into two parts, training and testing, with 75% of the data used for training and 25% for testing. Then different classification models are used, and the accuracy of each model has been recorded as shown below.
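        The 75%/25% split described above can be sketched with the standard library as follows. This is a stand-in for whatever splitting utility was actually used in the experiments; the function name, the fixed seed, and the shuffle-then-slice strategy are assumptions of this sketch.

```python
import random


def train_test_split(samples, test_fraction=0.25, seed=0):
    """Shuffle the samples and split them into (train, test) lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # fixed seed for repeatability
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# 100 made-up sample ids used only for illustration.
train, test = train_test_split(range(100))
print(len(train), len(test))
```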


        [Table: accuracy of each machine learning classifier. Columns: Sl no., classifier. Rows: Logistic regression, Random forest, Naïve Bayes.]

      3. Experimental result of deep learning method

    After analyzing the results of some machine learning classifiers, two deep learning approaches have been used as well: LSTM and CNN.


    The long short-term memory (LSTM) approach is a kind of RNN. After the data cleansing step, a tokenizer is used to convert all the strings into integers. The dataset is divided into two parts: 80% of the data is for training and 20% is for testing. Then an LSTM model is created in which the first layer is an embedding layer that takes the total number of words present in the dataset and passes its output to the LSTM layer. A stack of LSTM layers is used, so the output of the first LSTM layer is passed to the second LSTM layer; the cell size of both LSTM layers is the same. The output layer comes last, with sigmoid as the activation function. As sigmoid values lie between 0 and 1, it is well suited to binary classification. Adam is used as the optimizer. The accuracy of this model is 59.20%. The model is trained on the training data and validated on the validation data. Training runs for multiple epochs; one epoch means the network passes over the entire training set once. Since one epoch is too big to feed to the model at once, the data is divided into smaller batches. Val loss and Val accuracy represent the error and accuracy on the validation data, respectively. Loss and accuracy represent the error and accuracy on the training data, respectively.
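    The relationship between epochs and batches can be sketched as follows. The batch size and sample count are made-up values for illustration; the batch size used in the experiments is not stated in the text.

```python
import math


def iterate_batches(samples, batch_size):
    """Yield consecutive slices of the training data, one batch at a time."""
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]


def batches_per_epoch(n_samples, batch_size):
    """Number of batches needed to pass over the whole training set once."""
    return math.ceil(n_samples / batch_size)


# Made-up training set of 10 sample ids and a batch size of 4.
training_samples = list(range(10))
for epoch in range(2):
    for batch in iterate_batches(training_samples, batch_size=4):
        print(epoch, batch)  # a real loop would update the model here
```

    One epoch here consists of three batches; the last batch is smaller because 10 is not a multiple of 4.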

    Table 2 shows how loss, Val Loss, accuracy, and Val Accuracy vary for each epoch of the LSTM model. Loss and Val Loss decrease with each epoch and converge at a specific point, which indicates that the model is in an optimal condition, as shown in Fig. 3, and the accuracy and Val Accuracy of the model are constant across epochs, as shown in Fig. 4.




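    [Table 2: loss, Val Loss, accuracy, and Val Accuracy per epoch for the LSTM model]

    [Table 3: loss, val_loss, accuracy, and val_accuracy per epoch for the CNN model]
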
    Fig. 3. Loss, Val Loss graph of LSTM model

    Fig. 4. Accuracy and Val Accuracy of LSTM model

    Fig. 5. Loss, Val Loss graph of CNN Model

    Fig. 6. Accuracy and Val Accuracy of CNN

    The model created using the CNN approach is a sequential model, which means the layers are arranged one after another and the output of one layer is provided as input to the next. The first layer is the embedding layer. The next layer is the Conv1D layer, in which we use 64 filters of size 3 with ReLU as the activation function. The next layer is the MaxPooling1D layer. At the end, a dense layer takes input from the MaxPooling1D layer and produces the final output, with SoftMax as the activation function. The accuracy of the CNN model is 67.46%.
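    What the Conv1D, MaxPooling1D, and SoftMax stages compute can be illustrated on a single short sequence. The sketch below is a plain-Python simplification with one filter over one input channel; the actual model applies 64 filters to embedding vectors. The input sequence, the kernel values, and the pool size of 2 are assumptions chosen for the example.

```python
import math


def conv1d_relu(sequence, kernel, bias=0.0):
    """Slide the kernel over the sequence (no padding), then apply ReLU."""
    k = len(kernel)
    return [
        max(0.0, sum(kernel[j] * sequence[i + j] for j in range(k)) + bias)
        for i in range(len(sequence) - k + 1)
    ]


def max_pool1d(sequence, pool_size=2):
    """Keep the maximum of each non-overlapping window."""
    return [
        max(sequence[i:i + pool_size])
        for i in range(0, len(sequence) - pool_size + 1, pool_size)
    ]


def softmax(logits):
    """Turn a list of scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [v / total for v in exps]


# Made-up input and a hypothetical filter of size 3.
features = conv1d_relu([1.0, 3.0, 2.0, 5.0, 4.0, 6.0], kernel=[-1.0, 0.0, 1.0])
pooled = max_pool1d(features, pool_size=2)
print(features)
print(pooled)
print(softmax(pooled))
```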

    Table 3 shows how loss, val_loss, accuracy, and val_accuracy vary for each epoch of the CNN model. The validation loss is greater than the training loss, as shown in Fig. 5, which indicates that the model performs well on training data but poorly on new data in the validation set. In Fig. 6, val_accuracy and accuracy are both decreasing, so the model is in an overfitting state.


  6. CONCLUSION AND FUTURE WORK

    In this research work, the sentiment or polarity of the dataset has been determined using the lexicon-based approach, machine learning approaches (Naïve Bayes, SVM, Random Forest, Logistic Regression), and deep learning approaches (LSTM, CNN), with reported accuracies of 88.38%, 92.31%, 92.39%, 91.32%, 59.20%, and 67.46%. Comparing the accuracy percentages, it is clear that machine learning gives more precise results than deep learning and the lexicon-based approach. Sentiment extraction is useful in many different fields, such as education, politics, entertainment, health, and many more. In the future, this model could be improved further, as deep learning did not give satisfactory results in this work. Here we trained for 10 epochs; the number of epochs could be increased to obtain higher accuracy.

Fig. 7. Comparison graph


[1] AlBadani, Barakat, Ronghua Shi, and Jian Dong. "A Novel Machine Learning Approach for Sentiment Analysis on Twitter Incorporating the Universal Language Model Fine-Tuning and SVM." Applied System Innovation 5, no. 1 (2022): 13.

[2] Ligthart, Alexander, Cagatay Catal, and Bedir Tekinerdogan. "Systematic reviews in sentiment analysis: a tertiary study." Artificial Intelligence Review 54, no. 7 (2021): 4997-5053.

[3] Yi, Shanshan, and Xiaofang Liu. "Machine learning based customer sentiment analysis for recommending shoppers, shops based on customers review." Complex & Intelligent Systems 6, no. 3 (2020): 621-634.

[4] Qin, Zhou, Fang Cao, Yu Yang, Shuai Wang, Yunhuai Liu, Chang Tan, and Desheng Zhang. "CellPred: A behavior-aware scheme for cellular data usage prediction." Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 4, no. 1 (2020): 1-24.

[5] Mitra, Ayushi. "Sentiment analysis using machine learning approaches (Lexicon based on movie review dataset)." Journal of Ubiquitous Computing and Communication Technologies (UCCT) 2, no. 03 (2020):145-152.

[6] Pareek, Prashant, Neha Sharma, Mr. Ashish Ghosh, and Mr. Kota Nagarohith. "Sentiment Analysis for Amazon Product Reviews Using Logistic Regression Model." Center for Development Economic Studies 9, no. 11 (2022): 29-42.

[7] Patel, Alpna, and Arvind Kumar Tiwari. "Sentiment analysis by using recurrent neural network." In Proceedings of 2nd International Conference on Advanced Computing and Software Engineering (ICACSE). 2019.

[8] Raza, Hassan, M. Faizan, Ahsan Hamza, Ahmd Mushtaq, and Naeem Akhtar. "Scientific text sentiment analysis using machine learning techniques." International Journal of Advanced Computer Science and Applications 10, no. 12 (2019): 157-165.

[9] Kim, Hannah, and Young-Seob Jeong. "Sentiment classification using convolutional neural networks." Applied Sciences 9, no. 11 (2019): 2347.

[10] Manda, Kundan Reddy. "Sentiment Analysis of Twitter Data Using Machine Learning and Deep Learning Methods." (2019).

[11] Brar, Gurshobit Singh, and Ankit Sharma. "Sentiment analysis of movie review using supervised machine learning techniques." International Journal of Applied Engineering Research 13, no. 16 (2018): 12788-12791.

[12] Mariel, Wahyu Calvin Frans, Siti Mariyah, and Setia Pramana. "Sentiment analysis: a comparison of deep learning neural network algorithm with SVM and naïve Bayes for Indonesian text." In Journal of Physics: Conference Series, vol. 971, no. 1, p. 012049. IOP Publishing, 2018.

[13] Siddharth, S., R. Darsini, and M. Sujithra. "Sentiment analysis on twitter data using machine learning algorithms in python." Int. J. Eng. Res. Comput. Sci. Eng 5, no. 2 (2018): 285-290.

[14] Kawade, Dipak R., and Kavita S. Oza. "Sentiment analysis: machine learning approach." International Journal of Engineering and Technology 9, no. 3 (2017): 2183-2186.

[15] Singh, Jaspreet, Gurvinder Singh, and Rajinder Singh. "Optimization of sentiment analysis using machine learning classifiers." Human-centric Computing and information Sciences 7, no. 1 (2017): 1-12.

[16] Gupta, Bhumika, Monika Negi, Kanika Vishwakarma, Goldi Rawat, Priyanka Badhani, and B. Tech. "Study of Twitter sentiment analysis using machine learning algorithms on Python." International Journal of Computer Applications 165, no. 9 (2017): 29-34.

[17] Cliche, Mathieu. "BB_twtr at SemEval-2017 task 4: Twitter sentiment analysis with CNNs and LSTMs." arXiv preprint arXiv:1704.06125 (2017).