TWITTER SENTIMENT ANALYSIS
PROJECT SYNOPSIS
OF MINI PROJECT
BACHELOR OF TECHNOLOGY
CSE(DS)
SUBMITTED BY
SARANSH SHARMA (2000321540052)
DEEP SHARAN (2000321540025)
GUIDED BY
Ms. Mimansha Singh, Assistant Professor
ABES ENGINEERING COLLEGE
GHAZIABAD
2020-2024
AFFILIATED TO
DR. A.P.J. ABDUL KALAM TECHNICAL
UNIVERSITY U.P., LUCKNOW
STUDENT’S DECLARATION
We hereby declare that the work presented in this report, entitled “Twitter
Sentiment Analysis”, is an authentic record of our own work carried out under the
supervision of Mr. Prabhat Singh, Assistant Professor, CSE-DS. The matter embodied
in this report has not been submitted by us for the award of any other degree.
Date:
Signature of student Signature of student
Name: Saransh Sharma Name: Deep Sharan
Roll No.2000321540052 Roll No.2000321540025
Department: CSE-DS Department: CSE-DS
This is to certify that the above statement made by the candidate(s) is correct to the best
of my knowledge.
Signature of HOD Signature of Supervisor
…………………… Mr. Prabhat Singh
CSE-DS Assistant Professor
Date: CSE-DS
ACKNOWLEDGEMENT
We would like to convey our sincere thanks to Ms. Mimansha Singh for her
motivation, knowledge and support throughout the course of the project. Her continuous
support helped us complete the project successfully, and the knowledge she shared was
very useful to us.
We would also like to give special thanks to the department of Information and Technology
for its continuous support and for the opportunities it gave us to carry out our project.
We would also like to extend our sincere gratitude to Mr. Prabhat Singh, Head
of Department, CSE(DS), for providing this opportunity to us.
Signature of student Signature of student
Saransh Sharma Deep Sharan
2000321540052 2000321540025
ABSTRACT
The use of social networking sites such as Instagram, Twitter and
Snapchat is growing rapidly, and with it the volume of data that
their users provide. Because each site has millions of users, it may
generate millions or perhaps billions of data points: text, video
and audio content is posted every day. These users want to express
their ideas and opinions on whatever subject they choose, although
not every post is clearly written. The posts are brief, so each one
typically reflects a single user's viewpoint on a particular
subject. In this report, we attempt to extract the sentiments that
underlie these posts. We have selected Twitter as our social
networking platform; posts on this site are called tweets. We
investigate approaches for extracting and cleaning Twitter data
using Python in order to infer the sentiments underlying tweets.
After that, we use a classifier to train on and assess the data.
CHAPTER 1
INTRODUCTION
Microblogging sites have become a sea of data that analysts can use,
because a great many people today use a microblogging platform to
express their views on various topics. It would not be incorrect to
suggest that everyone who has access to these sites now has, in some
way, a platform for free speech. In real time, people from all over the
world are free to talk, comment, and express their thoughts on any
subject of their choice. These posts primarily consist of complaints or
expressions of appreciation regarding an issue of the author's choosing.
Organisations benefit from getting a fair assessment of their business
or product, which helps them understand consumer demand and the
changes that need to be made in order to provide better goods in the
future. Applying sentiment analysis to these microblogging sites could
therefore benefit a variety of organisations, both public and private.
Sentiment analysis, sometimes described as the analysis of feelings, is
an effective tool for examining the many websites where individuals
publish their ideas on a topic of interest. With this type of analysis,
businesses can learn what consumers think about a specific entity or
product that interests them by reading their comments, tweets, or
reviews.
CHAPTER 2
RELATED WORK
The related work associated with our project is given below:
2.1. Existing Approaches
Twitter sentiment analysis using Python:
This work performs sentiment analysis of Twitter data using Python and finds the
percentage of positive and negative tweets [5].
Word frequency and sentiment analysis of Twitter messages during the Coronavirus
pandemic:
This work finds the frequency of each word and performs sentiment analysis of a
pandemic dataset [2].
2.2. Comparative Analysis of Existing Works
The existing projects obtain the words with positive or negative polarity, whereas
our project obtains the polarity of the overall dataset.
The existing projects do not specify which machine learning model is best for
sentiment analysis, whereas our project will determine that as well.
CHAPTER 3
PROJECT OBJECTIVE
To analyze the emotions that people express in tweets.
To implement an algorithm for automatic classification of tweets as positive,
negative or neutral.
To compare different algorithms and find the one with the best accuracy.
CHAPTER 4
PROPOSED METHODOLOGY
The proposed methodology related to our project is given below:
Step 1: Identify the hashtags that were popular on Twitter in India during the pandemic.
Tweets under those hashtags are extracted from the Twitter API using the Tweepy library.
Step 2: Preprocess the dataset. This involves the following steps:
Removal of hashtags.
Removal of links, GIFs, emoji, images and special characters.
Removal of stop words.
Removal of non-English words.
Lemmatization.
Step 3: Analyze the polarity of the dataset.
Step 4: Feed the output of Step 3 into different machine learning algorithms and compare
them to find the algorithm with the best accuracy.
Step 5: Represent the results using different charts.
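The cleaning in Step 2 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the project's actual code: the function name clean_tweet, the tiny stop-word set and the choice to drop the whole hashtag (rather than only the "#" symbol) are all assumptions, and the removal of non-English words and lemmatization are left out for brevity. The report itself uses the Natural Language Toolkit for preprocessing.

```python
import re

# Tiny illustrative stop-word set (an assumption). NLTK's English
# stop-word list, which the report relies on, is much larger.
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "in", "and", "at"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
HASHTAG_RE = re.compile(r"#\w+")
NON_LETTER_RE = re.compile(r"[^a-z\s]")


def clean_tweet(text):
    """Return the lowercase word tokens left after basic cleaning."""
    text = text.lower()
    text = URL_RE.sub(" ", text)         # remove links
    text = HASHTAG_RE.sub(" ", text)     # remove hashtags (whole tag, an assumption)
    text = NON_LETTER_RE.sub(" ", text)  # remove emoji, digits and punctuation
    return [word for word in text.split() if word not in STOP_WORDS]


print(clean_tweet("Stay safe at home! #lockdown https://t.co/xyz 😷"))
```

Links are removed before the non-letter filter so that fragments of a URL are not left behind as stray tokens.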
Fig.1. Proposed Approach (flowchart: extraction of dataset from Twitter API; pre-processing of
data to remove special characters, punctuation, stop words and images; processing of data to
analyze the polarity of the dataset; use of algorithms to find which fits best for performing
sentiment analysis; results)
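The comparison in Step 4 can be sketched as follows, assuming that each candidate algorithm has already produced predicted labels for the same held-out tweets. The report does not name the algorithms to be compared, so the model names, labels and helper functions below are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one label is required")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def best_model(y_true, predictions):
    """Return the name of the most accurate model and all accuracy scores."""
    scores = {name: accuracy(y_true, y_pred) for name, y_pred in predictions.items()}
    return max(scores, key=scores.get), scores


# Hypothetical labels and predictions, for illustration only.
y_true = ["positive", "neutral", "negative", "neutral"]
predictions = {
    "model_a": ["positive", "neutral", "negative", "positive"],
    "model_b": ["positive", "negative", "neutral", "neutral"],
}
print(best_model(y_true, predictions))
```

Accuracy is only one possible selection criterion; it is used here because it is the measure named in the methodology.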
CHAPTER 5
DESIGN AND IMPLEMENTATION
The design and implementation of our project is as follows:
5.1. Work Flow Diagram
The dataset has been extracted from the Twitter API using the Tweepy library in
Python. The Python library NumPy is used for numerical computation and pandas is
used for data manipulation. The Natural Language Toolkit (NLTK) is used for
preprocessing the dataset. The TextBlob library is used for spelling checks and
for analyzing the sentiments.
Matplotlib is used for the graphical representation of results.
Fig.2. Work Flow Diagram
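One way to turn a polarity score into a sentiment label is sketched below. It assumes the score lies in the range -1.0 to 1.0, as the value returned by TextBlob(text).sentiment.polarity does. The report does not state its labelling rule, so the function name label_polarity and the zero threshold are assumptions.

```python
def label_polarity(polarity, threshold=0.0):
    """Map a polarity score in [-1.0, 1.0] to a sentiment label."""
    if not -1.0 <= polarity <= 1.0:
        raise ValueError("polarity must lie in [-1.0, 1.0]")
    if polarity > threshold:
        return "positive"
    if polarity < -threshold:
        return "negative"
    return "neutral"


# Hypothetical scores; in the project they would come from TextBlob.
scores = [0.6, 0.0, -0.25]
print([label_polarity(score) for score in scores])
```

Raising the threshold widens the neutral band, which may be useful when many tweets have scores very close to zero.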
CHAPTER 6
RESULTS AND DISCUSSION
The result of analyzing the tweets is shown in Fig.3.
Fig.3. Proportion of positive, negative and neutral tweets.
Fig.3 shows that 46% of the total tweets are neutral, about 36.5% are positive
and 17.5% are negative.
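Proportions of this kind can be computed from a list of per-tweet labels, as in the sketch below. The function name sentiment_shares is an assumption, and the eight example labels are hypothetical values chosen for illustration, not the project's data.

```python
from collections import Counter

LABELS = ("positive", "negative", "neutral")


def sentiment_shares(labels):
    """Return the percentage of tweets carrying each sentiment label."""
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    total = len(labels)
    return {label: 100 * counts[label] / total for label in LABELS}


# Hypothetical labels for eight tweets (not the report's data).
example = ["neutral"] * 4 + ["positive"] * 3 + ["negative"]
print(sentiment_shares(example))
```

The resulting dictionary maps each label to its percentage, so its values and keys could be passed to a charting library such as Matplotlib to draw a chart like Fig.3.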
CHAPTER 7
CONCLUSION AND FUTURE SCOPE
The project will give the overall polarity score of tweets and will determine which
algorithm is best for performing sentiment analysis.
From the analysis of the tweets, we observe that the largest share of tweets is neutral
(46%), followed by positive (about 36.5%) and negative (17.5%).
In future, we plan to perform the analysis on other social platforms such as
Instagram and Facebook, and also to classify the sentiments further.
REFERENCES
[1] Medford, R. J., Saleh, S. N., Sumarsono, A., Perl, T. M., & Lehmann, C. U. (2020). An
"Infodemic": Leveraging High-Volume Twitter Data to Understand Public Sentiment
Outbreak. medRxiv.
[2] Rajput, N. K., Grover, B. A., & Rathi, V. K. (2020). Word frequency and sentiment
analysis of twitter messages. arXiv preprint arXiv:2004.03925.
[3] Samuel, J., Ali, G. G., Rahman, M., Esawi, E., & Samuel, Y. (2020). Covid-19 public
sentiment insights and machine learning for tweets classification. Information, 11(6), 314.
[4] Kumar, A., Khan, S. U., & Kalra, A. (2020): a sentiment analysis. European
Heart Journal.
[5] Ahuja, S., & Dubey, G. (2017, August). Sentiment analysis on Twitter data. In 2017
2nd International Conference on Telecommunication and Networks (TEL-NET) (pp. 1-5).
IEEE.
[6] Suman, C., Saha, S., Bhattacharyya, P., & Chaudhari, R. S. (2020). Emoji Helps! A
Multi-modal Siamese Architecture for Tweet User Verification. Cognitive Computation,
1-16.