sustainability
Article
Evaluating Polarity Trend Amidst the Coronavirus Crisis in
Peoples’ Attitudes toward the Vaccination Drive
Rakhi Batra 1 , Ali Shariq Imran 2, * , Zenun Kastrati 3 , Abdul Ghafoor 1 , Sher Muhammad Daudpota 1
and Sarang Shaikh 1
1 Department of Computer Science, Sukkur IBA University, Sukkur 65200, Pakistan;
  rakhi@iba-suk.edu.pk (R.B.); aghafoor.mscsf19@iba-suk.edu.pk (A.G.); sher@iba-suk.edu.pk (S.M.D.);
  sarang.msse17@iba-suk.edu.pk (S.S.)
2 Department of Computer Science (IDI), Norwegian University of Science & Technology (NTNU),
  2815 Gjøvik, Norway
3 Department of Informatics, Linnaeus University, 351 95 Växjö, Sweden; zenun.kastrati@lnu.se
* Correspondence: ali.imran@ntnu.no
Citation: Batra, R.; Imran, A.S.; Kastrati, Z.; Ghafoor, A.; Daudpota, S.M.; Shaikh, S. Evaluating Polarity
Trend Amidst the Coronavirus Crisis in Peoples’ Attitudes toward the Vaccination Drive.
Sustainability 2021, 13, 5344. https://doi.org/10.3390/su13105344
Abstract: It has been more than a year since the coronavirus (COVID-19) engulfed the whole world,
disturbing daily routines, bringing down economies, and killing two million people across
the globe at the time of writing. The pandemic brought the world together in a joint effort to find
a cure and work toward developing a vaccine. As widely anticipated, the first batch of vaccines
started rolling out by the end of 2020, and many countries began their vaccination drives early on while
others were still waiting for a successful trial. Social media, meanwhile, was bombarded
with both positive and negative stories about vaccine development and the evolving coronavirus
situation. Many people were looking forward to the vaccines, while others were cautious because of
the side effects and conspiracy theories, resulting in mixed emotions. This study explores users’
tweets concerning the COVID-19 vaccine and the sentiments expressed on Twitter. It evaluates
the polarity trend and its shift from the start of the coronavirus outbreak to the vaccination drive across six
countries. The findings suggest that people of neighboring countries have shown quite similar
attitudes regarding vaccination, in contrast to their different reactions to the coronavirus outbreak.
Keywords: coronavirus; COVID-19; pandemic; polarity assessment; opinion mining; emotion detection; Twitter posts; BERT; GloVe; DNN; LSTM; FastText; global crisis
Academic Editors: Ohbyung Kwon, Kyoung-yun “Joseph” Kim, Namgyu Kim and Namyeon Lee
Received: 30 March 2021
Accepted: 7 May 2021
Published: 11 May 2021
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and
institutional affiliations.
Copyright: © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution (CC BY)
license (https://creativecommons.org/licenses/by/4.0/).
1. Introduction
COVID-19 is an infectious disease whose first case was reported in December 2019 in
Wuhan, China. It spread rapidly around the globe and was declared a pandemic on 11
March 2020 by the World Health Organization (WHO)
(https://www.who.int/news/item/27-04-2020-who-timeline---covid-19, accessed on 23 March 2021).
As of 22 March 2021, 123,868,982 infections and 2,727,738 deaths had been reported around the
globe according to Worldometer (https://www.worldometers.info/coronavirus/, accessed
on 23 March 2021). The USA, Brazil, and India are the worst affected countries in terms of
both case count and mortality
(https://www.nationalgeographic.com/science/graphics/mapping-coronavirus-infections-across-the-globe,
accessed on 25 March 2021), as shown in Figure 1. Multiple variants of the coronavirus have been
detected, for instance the UK and South African variants. On 14 December 2020, UK authorities
notified the WHO about the new variant, and initial studies indicated that it may spread more
rapidly from person to person. Researchers have stated that the COVID-19 variant first reported
in the UK is up to 100 percent more fatal than earlier strains
(https://www.aljazeera.com/news/2021/3/10/uk-covid-19-variant-30-100-more-deadly-study-finds,
accessed on 26 March 2021).
Sustainability 2021, 13, 5344. https://doi.org/10.3390/su13105344
https://www.mdpi.com/journal/sustainability
Figure 1. COVID-19 cases and deaths reported country-wise, 19 March 2021.
The COVID-19 emergency has affected individuals’ mental health, causing insecurity,
emotional isolation, confusion, and depression due to losses in business, education, and
work [1]. The pandemic changed the normal routine of people around the
world: academic activities shifted from physical to online mode, and the
way people interact daily, conduct business, and shop changed. Although it disrupted all
of these activities, people from different cultures did not react and respond to the pandemic
in the same way. Our previous study discussed this cultural difference concerning the
COVID-19 outbreak [2]. Twitter data from six countries on three continents were
collected to explore the emotions of people from different cultures about the decisions
their respective governments took to control the coronavirus outbreak. The selected
countries were India and Pakistan from Asia, Sweden and Norway from Europe, and the
USA and Canada from North America. Experimental results showed a high correlation
between emotions from India and Pakistan, and between those from the USA and Canada.
In contrast, Norway and Sweden, although neighboring countries with many cultural
similarities, showed opposite polarity trends.
Almost one year later, many countries worldwide have rolled out COVID-19 vaccines to combat
this infectious disease. Western countries are leading in COVID-19 vaccination, whereas
African countries are lagging, as can be seen in Figure 2.
The United Kingdom (UK) became the first nation in the world to approve the BioNTech-Pfizer
vaccine, and a UK grandmother, Margaret Keenan, became the first person in the world
to receive the COVID-19 vaccine on 8 December 2020. Both the USA and Canada started
mass COVID-19 vaccination programs outside clinical trials on 14 December 2020. Sandra
Lindsay was the first American vaccinated, at Long Island Jewish Medical Center, and the
first person in Canada was Anita Quidangen, a personal support worker vaccinated in
Toronto. The neighboring Nordic countries Sweden and Norway rolled out their coronavirus
vaccination drives on 27 December 2020. Svein Andersen, 67, was the first person
in Norway to receive the vaccine, and Gunn-Britt Johnsson, 91, was the first person
in Sweden. Manish Kumar, a hospital cleaning worker, was the first Indian
to receive the vaccine on 16 January 2021. Pakistan kicked off vaccination on 2 February 2021,
and Rana Imran Sikander from PIMS hospital, Islamabad, was the first person to receive
the vaccine.
Figure 2. COVID-19 vaccine doses covered population, 24 March 2021 (Source Bloomberg).
According to the Bloomberg vaccine tracker
(https://www.bloomberg.com/graphics/covid-vaccine-tracker-global-distribution/, accessed on
25 March 2021), as of 24 March 2021,
more than 468 million COVID-19 shots had been given across 135 countries. The USA is leading
with more than 128 million doses, which cover 19.7% of the US population. Canada has
vaccinated 4.2 million people. India has vaccinated 50 million people, and its bordering
country Pakistan has administered just 325,000 doses. Sweden has vaccinated 1.4 million
people and Norway 771,000 people. A few countries have also reported
side effects of the COVID-19 vaccine. On 18 February 2021, the Norwegian Medicines Agency
acknowledged more than 1200 side-effect reports (https://tinyurl.com/db2x86j7, accessed
on 15 March 2021). Two Swedish regions (https://tinyurl.com/2empedan, accessed on 15
March 2021) stopped vaccination after receiving side-effect reports on 14 February 2021.
In early March, following Denmark, Norway and other Nordic and central
European countries halted the administration of AstraZeneca vaccine shots to their citizens
amid deaths due to blood clotting reported as a side effect.
People are generally quick to share such news and personal experiences over social
networks, and to base their opinions upon what they hear. Many react and express
various sentiments while commenting. This paper is motivated by the fact that such trends
can pick up quickly: social trends can easily turn into mass gatherings and protests,
which can ultimately turn into chaos, as was observed in the Arab Spring. Timely analysis of
people’s sentiment on social platforms could help avoid such situations, and sentiment
analysis is an efficient tool to automatically examine sentiment expressed in social media.
Deep neural networks, especially LSTM networks and their variants, have shown
good promise in processing text for sentiment polarity extraction. Performance on this task
has also benefited hugely from pretrained word embeddings such as GloVe, FastText, and BERT.
Our previous study [2] demonstrated the potential of these networks to extract
sentiments related to COVID-19 from tweets posted from six countries, i.e., Pakistan, India,
Norway, Sweden, the USA and Canada. The purpose of this study, therefore, is to detect
changes in the polarity and emotions that people expressed in tweets after the launch of the
vaccine and the reports of its side effects, and to find connections between the events that
took place during the vaccination drive across various countries and the emotions expressed
on social networks. We propose to use deep natural language models to analyse the tweets for
sentiment polarity as well as emotion detection.
The key contributions of this study are:
1. Collection of tweets on COVID-19 related hashtags for a period of two months
   during the vaccination drive to analyze sentiment polarity and emotions.
2. Providing insights into the collective reactions amidst the second wave, and establishing
   links with ongoing events.
3. Finding correlations between emotions expressed at the start of COVID-19 and during the
   vaccination drive a year later for six countries across three continents.
4. Analysing polarity and emotions via state-of-the-art deep learning based NLP models
   trained on the benchmark data sets Sentiment140 (for polarity assessment) and EmotionTweet
   (for emotion classification) and tested on COVID-19 tweets.
The rest of the paper is organized as follows. Section 2 presents the related
work. Section 3 describes the methodology and the models used to study people’s attitudes from
their tweets posted on Twitter. Results and their analysis are presented in Section 4, and
conclusions are drawn in Section 5.
2. Related Work
A recent development in sentiment analysis and affective computing is the exploration of
textual data to obtain public views on financial markets [3], politics [4], and education [5,6],
to name just a few. Various research studies have also discussed people’s reactions to
events as expressed on social media in general, and Twitter in particular. Types of events
include pandemics [7], protests [8], criminal and terrorist events [9], natural disasters [10],
healthcare-related events [11], and so forth [12,13].
Many research studies have been conducted for different purposes, including investigating
Twitter data to find information-spreading patterns on Ebola [14] and on
the COVID-19 outbreak [15], tracking public views on Twitter amid the pandemic [16,17],
examining the insights that global health can draw from social networks [18],
and studying the reaction of people from different nations during the pandemic toward the actions
their respective governments took to control the coronavirus outbreak [2]. Fung et al. [19]
investigated people’s reactions toward the Ebola outbreak on Twitter and Google. Experimental
results showed that the majority of emotions expressed negative sentiment. The authors
in [20] examined people’s emotional responses during the Middle East Respiratory Syndrome
(MERS) outbreak in South Korea. They found that 80% of tweets were neutral. Anger
increased over time. The majority of people blamed the Korean government, and a
decline in fear and sadness tweets was reported over time.
Many sentiment analysis studies related to COVID-19 have been conducted on
social media data, as shown in Figure 3. They mainly focus on sentiment concerning
the use of masks [21], fake information detection [22], emotion classification [23], polarity
detection [24], depression monitoring [25], tourism [26], and so on.
2.1. Sentiment Polarity Assessment on COVID-19 Data
Research has been done to classify the sentiment polarity of Twitter data on the
coronavirus. Sakun et al. [27] explored the Twitter trends related
to COVID-19. They collected 107,990 English tweets about the coronavirus and used
sentiment analysis and topic modeling to explore them. Experimental results showed
three main aspects of the tweets: (1) trends related to symptoms and the spread of COVID-19
can be divided into three stages; (2) sentiment analysis reveals that most people’s
views about the coronavirus were negative; and (3) COVID-19 tweets fall into three
topics, namely the COVID-19 pandemic emergency, how to control COVID-19, and reports
on COVID-19. Barkur et al. [28] explored Twitter data for the sentiments of people in
India about the COVID-19 lockdown and observed that the majority of views
about the lockdown were negative, although there were also some positive opinions. In another
research study [29], the authors proposed a machine learning model to predict an
individual’s awareness of the protective measures against the coronavirus in Saudi Arabia.
In this study, Arabic tweets related to COVID-19 were collected and machine learning
models (support vector machine, K-nearest neighbors, and naïve Bayes) were trained
and tested on them; the SVM model performed best, with an accuracy of 85%.
Figure 3. Comparison of sentiment analysis studies related to COVID-19 on Twitter data.
The research article [30] proposed a deep learning model for sentiment analysis
of coronavirus tweets. The study collected two sets of tweets: (1) the 23,000 most
retweeted tweets posted between 1 January 2020 and 23 March 2020, most of which
were found to be neutral or negative, and
(2) 226,668 tweets gathered between December 2019 and May 2020, most of which
were positive or neutral. The study concluded that the overall reaction
of people to COVID-19 on Twitter was positive, yet citizens mostly retweeted negative
tweets. The authors in [24] investigated the relationship between
public sentiment and coronavirus cases. The study used the TextBlob sentiment corpus
to compute the polarity of tweets. Results reveal that there is a connection between
public sentiment and COVID-19 cases. Important events, such as government
regulations to slow down the spread or the celebration of important days, can affect people’s
sentiment. The study showed only a weak correlation between sentiment polarity and the
increase in the number of COVID-19 cases: public sentiment is affected by rising case
numbers, but not strongly.
Pastor et al. [31] used Twitter sentiment analysis to classify
the views of Filipinos on the extreme community quarantine measures announced by the
Philippine government to slow the spread of the coronavirus. Sentiment results revealed that
food supply and support from the government were the major problems faced by the people, and
the study concluded that most people showed negative sentiment while some users also
posted positive opinions. The authors of another research paper [32] analyzed people’s
reactions regarding the coronavirus vaccine. The study collected 2,349,659 tweets over the
month after the first dose was administered in the UK. Experimental results indicate that most of
the tweets were neutral, while tweets in favor of the vaccine outnumbered the tweets against
it. Kaur et al. [33] collected 16,138 tweets from three
different months of 2020, namely February, May, and June, to monitor the polarity of tweets
amid COVID-19. As expected, the number of negative tweets surpassed the neutral and positive
tweets in all time intervals. Comparing the share of polarity classes from
February to June, the negative tweets decreased from 43.90% to 38.05%, while the
share of positive tweets increased from 21.38% to 27.01%. The share of neutral tweets
remained nearly the same, at 34.07% and 34.94%. The research study [34] explored
tweets from Europe regarding COVID-19. The authors collected 4.6 million geotagged
tweets from December 2019 to April 2020. Experimental results showed a
downward trend in negative sentiment over time.
2.2. Emotion Classification on COVID-19 Data
The authors of [2] investigated Twitter data from six countries on
three continents to learn the emotions of people from different cultures about the
actions their respective governments had taken on COVID-19. The countries include India
and Pakistan from Asia, Sweden and Norway from Europe, and the USA and Canada
from North America. Deep learning-based LSTM models were trained and tested on the data.
The study reveals a high correlation between tweets from India and Pakistan, and between
those from the USA and Canada. Although the two Nordic countries have many cultural
similarities, Norway and Sweden showed opposite emotions about COVID-19. The research
study [35] collected coronavirus-related tweets from twelve countries and explored them to learn
the opinions of people from different countries about COVID-19. Experimental results conclude
that the majority of people showed positive and hopeful thoughts, but fear, sadness,
and disgust were also observed. However, the USA, France, the Netherlands, and
Switzerland showed distrust and anger more than the other eight countries. Xue et al. [36]
analyzed 11 topics identified from 1.5 million tweets
related to the coronavirus. The authors applied a Latent Dirichlet Allocation (LDA)
topic modeling algorithm to explore all topics. Experimental results found that fear was the
dominant emotion in all topics.
3. Methodology
This section starts with an explanation of our process for collecting tweets related to
COVID-19 during the second wave of the coronavirus. We then elaborate on the process of
sentiment and emotion analysis on tweets from six countries: Pakistan, India,
Norway, Sweden, the USA and Canada.
3.1. Data Set—Tweets Related to Second COVID-19 Wave
The data set used in this study contains tweets from Twitter for cross-cultural emotion
recognition during the second wave of the coronavirus. For reliable cross-cultural polarity
measurement, six countries were selected from three continents, two from each continent that
share a similar culture. The selected countries were India and Pakistan from Asia, Norway and
Sweden from Europe, and Canada and the USA from North America. These six countries
were chosen in particular to compare the polarity trend expressed during the
first wave, reported in [2], with that of the second wave during the vaccination drive.
Data Collection: Twitter provides APIs to extract bulk data from its platform for
analysis. There are two types of API, i.e., the Stream API and the Search API. The Stream API
is used to get live data, whereas the Search API is used to extract historical data (up to the
last 7 days) by applying filters. We used the Twitter Search API, accessed through the Tweepy
library, to collect the required data set. As we aimed to analyze people’s sentiment over the
progress of the COVID-19 vaccine and the second wave, we collected the data for a time period
Tp = [Sd, Ed], where Sd is the start date of the second wave and Ed is the end date. The
following query was used to extract the data:
[Keyword] lang:[en/ur] until:Ed since:Sd -filter:links -filter:retweets
The keywords were selected such that they are directly linked to the coronavirus and
appear to have been trending on Twitter since the start of the pandemic. The keywords used for
extracting tweets are: lockdown, COVID19Pandemic, StayHomeSaveLives, stayhome, Covid_19,
COVID, Coronavirus, secondwave, pandemic, covid19, vaccine. Links and retweets were
filtered out to exclude less informative and repetitive tweets. Extracted tweets
were cataloged in an xlsx file as a raw data set, where each tweet record contains 72 fields
that describe tweet content and user information. For our purposes, we retained only six
fields, i.e., tweetid, tweettext, date, language, userid, and location.
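The query template above can be assembled programmatically. The following is a minimal sketch, assuming ISO-formatted dates for Sd and Ed and one query per keyword and language; the names `build_search_query` and `build_all_queries` are illustrative and do not come from the paper, and the step that submits each query through Tweepy is deliberately omitted.

```python
from datetime import date

# Keywords and languages as listed in the text.
KEYWORDS = [
    "lockdown", "COVID19Pandemic", "StayHomeSaveLives", "stayhome",
    "Covid_19", "COVID", "Coronavirus", "secondwave", "pandemic",
    "covid19", "vaccine",
]
LANGUAGES = ["en", "ur"]


def build_search_query(keyword: str, lang: str, start: date, end: date) -> str:
    """Fill the query template for one keyword and one language."""
    return (
        f"{keyword} lang:{lang} "
        f"until:{end.isoformat()} since:{start.isoformat()} "
        "-filter:links -filter:retweets"
    )


def build_all_queries(start: date, end: date) -> list:
    """One query per (keyword, language) pair for the period [start, end]."""
    return [
        build_search_query(keyword, lang, start, end)
        for keyword in KEYWORDS
        for lang in LANGUAGES
    ]


if __name__ == "__main__":
    # Example dates only; they follow the period given in the Figure 7 caption.
    for query in build_all_queries(date(2020, 12, 1), date(2021, 2, 9))[:3]:
        print(query)
```

Keeping query construction separate from the API call makes it easy to inspect the exact strings before any requests are sent.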
Data preparation: The raw data set was processed further to clean up the tweet text
and to extract the emojis from it. In preprocessing, we first removed unnecessary
symbols, spaces, and user mentions from the tweet text, and then used the NLTK library to
remove punctuation and stop words to obtain the cleaned tweet text. As we aim to use this
data set for emotion recognition, we extracted the emojis from the tweet text to help the
sentiment analyzer produce accurate results, because emojis are a direct expression of users’
reactions and emotions in any textual composition.
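The cleaning steps described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: it substitutes a tiny hypothetical stop-word set for the NLTK list so that it runs without downloads, and the emoji pattern covers only common Unicode pictograph ranges.

```python
import re
import string

# Hypothetical, deliberately tiny stop-word set; the paper uses NLTK's list.
STOP_WORDS = {"a", "an", "the", "is", "are", "i", "my", "to", "of", "and", "in", "for"}

# Assumption: common emoji/pictograph ranges only, not the full emoji set.
EMOJI_PATTERN = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")
MENTION_PATTERN = re.compile(r"@\w+")
PUNCTUATION_TABLE = str.maketrans("", "", string.punctuation)


def preprocess_tweet(text: str):
    """Return (cleaned_text, emojis) for one raw tweet."""
    emojis = EMOJI_PATTERN.findall(text)         # keep emojis as a separate field
    text = EMOJI_PATTERN.sub(" ", text)          # then drop them from the text
    text = MENTION_PATTERN.sub(" ", text)        # remove @user mentions
    text = text.translate(PUNCTUATION_TABLE)     # remove ASCII punctuation
    tokens = [t for t in text.lower().split() if t not in STOP_WORDS]
    return " ".join(tokens), emojis


if __name__ == "__main__":
    print(preprocess_tweet("@user I got my vaccine today!!! 💉😀"))
```

Mentions are removed before punctuation so that the `@` marker is still present when the mention pattern is applied.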
In the final dataset (https://tinyurl.com/u47h9y7t, accessed on 28 March 2021),
each tweet was cataloged by Tweetid, date, language, cleanedtext, emoji, sentimentscore,
subjectivity, polarity, userid, and countrycode. There are 801,692 tweets from six countries
in the final data set. Country-wise distribution of tweets is shown in Table 1.
Table 1. Country-wise Distribution of Tweets.
Country          December-2020    January-2021    February-2021    Total
United States    131,254          317,016         177,950          626,220
Canada           12,171           60,389          39,456           112,016
India            18,772           21,350          11,862           51,984
Norway           489              2481            143              3113
Pakistan         1147             2627            1612             5386
Sweden           688              1332            953              2973
Total            164,521          405,195         231,976          801,692
3.2. Classification Models
As this work is an extension of our previous work [2], we keep the models the same as in
that work in order to assess the change in people’s sentiment and emotion after almost a
year. Readers are advised to consult Section V in [2]
for further details on the algorithms for sentiment and emotion detection. Figure 4 shows the
abstract model of the proposed classification system.
All three classifiers (A, B, and C) are based on deep neural networks (DNN), Long
Short-Term Memory (LSTM) networks, and Convolutional Neural Networks (CNN).
Deep Neural Network (DNN): A DNN is the simplest form of neural network. It is a
layered architecture in which every neuron in one layer is fully connected to every neuron in
the next layer, and each neuron applies an activation function to the weighted sum of its inputs.
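A fully connected layer of this kind can be sketched in a few lines. The example below is only an illustration: the weights, biases, and layer sizes are arbitrary values chosen for the sketch and are not parameters of the models in this paper.

```python
import math


def relu(z: float) -> float:
    return max(0.0, z)


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def dense_layer(inputs, weights, biases, activation):
    """Fully connected layer: weights[j][i] links input i to neuron j."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(inputs):
    """Two-layer network with arbitrary illustrative parameters."""
    hidden = dense_layer(inputs, [[0.5, -0.5], [1.0, 1.0]], [0.0, -1.0], relu)
    output = dense_layer(hidden, [[1.0, -1.0]], [0.0], sigmoid)
    return output[0]


if __name__ == "__main__":
    print(forward([1.0, 1.0]))
```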
Long Short-Term Memory (LSTM) Network: Although fully connected deep neural
networks are good at processing text and other short sequences, their performance degrades
when sequences are longer. To address this issue, LSTM
networks process the current input while also retaining the previous state, which is the output
from previous inputs. The capability of an LSTM to retain the previous state enables it to capture
word context; therefore, it is able to outperform DNNs and other networks at processing
long sequences.
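How an LSTM carries state from one input to the next can be illustrated with a single-unit cell using the standard gate equations. This is a simplified sketch, not the architecture used in the paper: the state is scalar, and the parameter values in `EXAMPLE_PARAMS` are arbitrary assumptions.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, params):
    """One time step of a single-unit LSTM.

    params maps each gate name to (w_x, w_h, b): the weight on the current
    input, the weight on the previous hidden state, and the bias.
    """
    def gate(name, squash):
        w_x, w_h, b = params[name]
        return squash(w_x * x + w_h * h_prev + b)

    forget = gate("forget", sigmoid)          # how much old cell state to keep
    write = gate("input", sigmoid)            # how much new information to store
    candidate = gate("candidate", math.tanh)  # the new information itself
    expose = gate("output", sigmoid)          # how much cell state to output

    c = forget * c_prev + write * candidate   # updated cell state (memory)
    h = expose * math.tanh(c)                 # updated hidden state (output)
    return h, c


def run_sequence(xs, params):
    """Feed a sequence through the cell, threading the state along."""
    h, c = 0.0, 0.0
    for x in xs:
        h, c = lstm_step(x, h, c, params)
    return h, c


# Arbitrary illustrative parameters (assumption, not from the paper).
EXAMPLE_PARAMS = {
    "forget": (0.0, 0.0, 1.0),
    "input": (1.0, 0.0, 0.0),
    "candidate": (1.0, 0.0, 0.0),
    "output": (0.0, 0.0, 0.0),
}

if __name__ == "__main__":
    # The same inputs in a different order give a different final state.
    print(run_sequence([1.0, 0.0], EXAMPLE_PARAMS))
    print(run_sequence([0.0, 1.0], EXAMPLE_PARAMS))
```

Because the cell state is threaded through every step, the final output depends on the order of the inputs, which is what lets the network capture word context.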
Convolutional Neural Network (CNN): A CNN relies on two major
operations, convolution and pooling. The convolution operation is performed on the input text
or image with filters of different sizes to produce feature maps, which can be further used for
classification. The pooling operation involves sliding a two-dimensional filter
over each channel of the convolved feature map to summarize the features lying in sub-regions
of the image or text. Traditionally, CNNs have been more appropriate for image processing; however,
they have recently also shown promise in sequence processing.
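The two operations can be illustrated on a one-dimensional sequence, which is the simpler case relevant to text. This sketch is an assumption-laden simplification: it uses a single filter, no padding, and non-overlapping max pooling, and it computes the sliding dot product without flipping the kernel, as is conventional in deep learning libraries.

```python
def conv1d(sequence, kernel):
    """Slide the kernel over the sequence and take a dot product at each position."""
    k = len(kernel)
    return [
        sum(kernel[j] * sequence[i + j] for j in range(k))
        for i in range(len(sequence) - k + 1)
    ]


def max_pool1d(feature_map, size):
    """Summarize each non-overlapping window of the feature map by its maximum."""
    return [
        max(feature_map[i:i + size])
        for i in range(0, len(feature_map) - size + 1, size)
    ]


if __name__ == "__main__":
    features = conv1d([1, 2, 3, 4, 5], [1, 0, -1])
    print(features)
    print(max_pool1d([1, 3, 2, 5], 2))
```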
Classifier A, based on an LSTM with pretrained FastText [37] embeddings, is trained on
Sentiment140 [38], which contains a total of 1.4 million tweets equally distributed
between positive and negative sentiment polarities. Table 2 shows the results of the different
models on the Sentiment140 data set. The model based on LSTM and pretrained FastText
outperforms all other models. The summary of LSTM + FastText model is shown in
Figure 5.
Figure 4. Abstract model for tweets’ classification.
Table 2. F1 and accuracy scores of Six Deep Learning Models.
Model #    Model Name                       F1 Score    Accuracy
1          DNN (Baseline)                   79.0%       78.4%
2          LSTM + FastText                  82.4%       82.4%
3          LSTM + GloVe                     81.5%       81.4%
4          LSTM + GloVe Twitter             80.4%       80.4%
5          LSTM + w/o Pretrained Embed.     81.6%       81.4%
6          CONV Based on [39]               81.7%       81.1%
Figure 5. Summary of Classifier A for Sentiment Polarity Classification.
The positive polarity tweets are further checked for positive emotions (joy and surprise)
through classifier B, whereas negative polarity tweets are forwarded to classifier C
for negative emotions (sad, disgust, fear, anger). For both classifiers B and C, five different
models were assessed on an Emotional Tweet data set [40], and the summary of results
for positive and negative emotions is shown in Tables 3 and 4, respectively. In both cases,
the LSTM with GloVe Twitter word embeddings outperformed all other models; therefore, it
is used for assessing tweets’ emotions. The summary of the LSTM + GloVe Twitter model is
shown in Figure 6.
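The routing between the three classifiers can be sketched as a small cascade. The classifiers below are hypothetical keyword-based stand-ins used only to make the example runnable; in the paper, each stage is a trained LSTM model.

```python
from typing import Callable, Tuple

Classifier = Callable[[str], str]


def classify_tweet(
    text: str,
    classifier_a: Classifier,
    classifier_b: Classifier,
    classifier_c: Classifier,
) -> Tuple[str, str]:
    """Route a tweet through the polarity stage, then one emotion stage."""
    polarity = classifier_a(text)
    if polarity == "positive":
        emotion = classifier_b(text)   # positive emotions, e.g. joy or surprise
    else:
        emotion = classifier_c(text)   # negative emotions, e.g. sad, fear, anger
    return polarity, emotion


# Hypothetical stand-ins for the trained models (assumption, not the paper's).
def toy_polarity(text: str) -> str:
    return "positive" if "thank" in text.lower() else "negative"


def toy_positive_emotion(text: str) -> str:
    return "surprise" if "wow" in text.lower() else "joy"


def toy_negative_emotion(text: str) -> str:
    return "fear" if "afraid" in text.lower() else "sad"


if __name__ == "__main__":
    for tweet in ["Thankful for my first dose", "I am afraid of side effects"]:
        print(tweet, "->", classify_tweet(
            tweet, toy_polarity, toy_positive_emotion, toy_negative_emotion))
```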
Table 3. F1 and Accuracy Scores of Five Proposed Models on Positive Emotions (Joy and Surprise).
Model #    Model Name                       F1 Score    Accuracy
1          DNN (Baseline)                   62.7%       78.4%
2          LSTM + FastText                  67.5%       80.8%
3          LSTM + GloVe                     69.0%       80.3%
4          LSTM + GloVe Twitter             69.9%       81.9%
5          LSTM + w/o Pretrained Embed.     68.4%       79.8%
Table 4. F1 and Accuracy Scores of Five Models on Negative Emotions (Sad, Anger, Fear).
Model #    Model Name                       F1 Score    Accuracy
1          DNN (Baseline)                   59.0%       64.5%
2          LSTM + FastText                  62.1%       66.0%
3          LSTM + GloVe                     65.8%       67.7%
4          LSTM + GloVe Twitter             69.2%       69.9%
5          LSTM + w/o Pretrained Embed.     62.1%       66.0%
Figure 6. Model Summary for Classifier B and C.
4. Results & Analysis
Figure 7 shows a side-by-side country-wise comparison of sentiment polarity detection
for the investigated period of two months. The sentiments are normalized to the range of
0–1 by dividing the number of tweets per day by the total number of tweets for a given
country. As shown in the graphs in Figure 7, relatively few tweets concerning
the vaccination were posted over the second half of December 2020 and the first half of January
2021. There were also a few days with no tweets at all: two days for Norway
(10 January and 24 January 2021) and one day for Sweden (10 January 2021).
It is also interesting to note that
the number of tweets posted over the examined period increased rapidly only in
the second half of January 2021, and this growing trend of tweets concerning the vaccination
drive is seen in all six countries.
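One way to read the normalization described above is as a per-day share of a country's tweets for a given polarity. The sketch below follows that interpretation, which is an assumption; the record layout and the name `daily_polarity_share` are illustrative and not taken from the paper.

```python
from collections import Counter


def daily_polarity_share(records, polarity):
    """Share of a country's tweets per day that carry the given polarity.

    records: list of (day, polarity_label) pairs for one country.
    Each value is in [0, 1]: tweets with that polarity on that day divided
    by the total number of tweets for the country.
    """
    total = len(records)
    if total == 0:
        return {}
    counts = Counter(day for day, label in records if label == polarity)
    return {day: n / total for day, n in sorted(counts.items())}


if __name__ == "__main__":
    sample = [
        ("2021-01-20", "positive"),
        ("2021-01-20", "negative"),
        ("2021-01-21", "positive"),
        ("2021-01-21", "positive"),
    ]
    print(daily_polarity_share(sample, "positive"))
```

Dividing by the country's total rather than the daily total keeps countries with very different tweet volumes on a comparable scale.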
There is a sudden change in the emotions on particular days, as shown in Figure 7, especially
on 20 January, when the peak of both negative and positive emotions expressed on
Twitter is registered. One possible reason for this could be the spread of new variants of the
coronavirus. Multiple variants of the COVID-19 virus emerged at the end of 2020, most
notably the variants first detected in the UK (known as 20I/501Y.V1, VOC 202012/01,
or B.1.1.7) and in South Africa (known as 20H/501Y.V2 or B.1.351)
(https://www.cdc.gov/coronavirus/2019-ncov/more/science-and-research/scientific-brief-emerging-variants.html,
accessed on 25 February 2021). These new variants quickly spread around the globe. Nordre
Follo Municipality in Norway went into lockdown after the British variant of the coronavirus
spread on 22 January 2021. The new variant killed two nursing home residents and
was identified in 22 employees at the Langhus center.
Figure 7. Side-by-side country-wise comparison of sentiment analysis on collected data for the period 1 December
2020 to 9 February 2021. Cumulative positive and negative sentiment graphs along with the averaged tweets’ polarity
for Sweden (top-left), Norway (top-right), Canada (middle-left), USA (middle-right), Pakistan (bottom-left), and India
(bottom-right).
Next, we analyzed the relationship between neighboring countries to see the sentiment
polarity and emotion trends during the vaccination period. To this end, Pearson’s
correlation between countries was computed, as shown in Table 5. The correlation
values indicate a high correlation in both positive and negative sentiments of people from
Pakistan and India (PK-IN), higher than for people’s sentiment toward the vaccination drive in
Canada and the USA (US-CA) and in Norway and Sweden (NO-SW). It is interesting to note
that the Pearson’s correlation between Norway and Sweden is 70% for positive and more
than 60% for negative sentiments. This shows a higher correlation of sentiments about
vaccination expressed in tweets by the people of both countries, unlike their
different sentiments about the coronavirus outbreak and lockdown reported in [2].
Table 5. Pearson’s Correlation for Sentiment Polarity Between Neighbouring Countries.
No.    Correlation b/w    Positive    Negative
1      US-CA              0.623       0.624
2      PK-IN              0.837       0.865
3      NO-SW              0.703       0.616
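For reference, Pearson's correlation between two daily sentiment series can be computed as in the sketch below. It is a plain standard-library implementation of the usual formula; the assumption that the two inputs are equal-length, date-aligned series for a pair of countries is ours, and the series in the usage example are made up.

```python
import math


def pearson(xs, ys):
    """Pearson's correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length of at least 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return covariance / (spread_x * spread_y)


if __name__ == "__main__":
    # Made-up daily shares of positive tweets for two countries.
    country_a = [0.01, 0.02, 0.05, 0.04]
    country_b = [0.02, 0.02, 0.06, 0.03]
    print(round(pearson(country_a, country_b), 3))
```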
Further, we examined the Pearson’s correlation for emotions between neighbouring
countries and a similar trend to sentiment polarity is observed. As can be seen in Table 6,
the highest Pearson’s correlation values across all the five emotions are shown for Pakistan
and India, followed by the USA and Canada.
Table 6. Pearson’s Correlation for Emotions Between Neighbouring Countries.
No    Correlation b/w    Joy      Surprise    Sad      Fear     Anger
1     US-CA              0.627    0.611       0.625    0.627    0.596
2     PK-IN              0.817    0.833       0.858    0.806    0.754
3     NO-SW              0.679    0.714       0.573    0.622    0.538
5. Conclusions and Future Work
This study aimed to analyze the emotions and sentiment polarity of people after
the launch of the vaccine and during the COVID-19 second wave. It also examined whether there
has been any change in people’s sentiments since our cross-cultural sentiment analysis
study about one year ago. To achieve this objective, the same architecture
was used as in the previous study, which utilized deep learning LSTM models with pretrained
embeddings to detect emotions from users’ tweets on Twitter. Users’ tweets were
collected by querying trending COVID-19 keywords from December 2020 to mid-February
2021, when different countries started to provide vaccine shots to the public. In order
to examine the change in people’s sentiments since the start of the pandemic, we limited the
tweets to the six countries that were used in the previous study.
The analysis of results showed that in December, people were mostly neutral about the
vaccine and the second wave, but there was a sudden change in emotions after 15 January 2021.
People started to express positive as well as negative sentiments due to the new variant of the
coronavirus and governments’ efforts to handle the situation. We also applied Pearson’s
correlation to examine the relationship in emotion expression between neighbouring
countries during the vaccination period. It indicated a high correlation in both positive and
negative emotions of people from Pakistan and India (PK-IN), while people’s sentiments
toward the vaccination drive in Canada and the USA (US-CA) were 62% correlated, and in Norway
and Sweden (NO-SW) the correlation was 70% for positive and 61% for negative sentiments,
despite their different emotions during the COVID-19 outbreak in 2020.
The study covered varying cultures, including Europe, the USA, Canada, and South
Asia; however, it considered only tweets written in English. People in South Asia often
express their emotions in local languages such as Urdu, Hindi, and Sindhi. The work
can be extended in the future to perform multilingual emotion and sentiment
extraction from social media text related to COVID-19. Another popular trend
on social media is writing Urdu, Hindi, and other local languages in Roman script. There is a
strong need to account for this aspect of language when performing emotion and sentiment
analysis on any topic of interest from social media.
Transformer- and attention-based approaches to text processing have considerable potential to further improve the accuracy of the proposed model. Contextual
word embeddings such as BERT and ELMo need to be assessed for their suitability for
sentiment and emotion analysis of social media text.
In this work, we limited our focus to tweets; other social media platforms such as Facebook and Instagram should be considered to gain more insight into people’s
opinions on COVID-19 and its vaccination process.
Finally, as the saying goes, “a picture is worth a thousand words”; processing images
to extract people’s sentiments and emotions could therefore be another dimension of
this work in the future.
Author Contributions: R.B. prepared the data set by extracting and preprocessing tweets related
to COVID-19. She also assisted Z.K. with the data analysis. A.S.I. conceived the original idea,
finalized the contribution of this research, wrote the introduction, and coordinated
the efforts of the research group on this paper. S.M.D. performed the experiments on the tweet data
set and wrote the methodology section. S.S. contributed to the literature review and improved
the visualization and overall readability of the paper. Z.K. analyzed the results and led the
results section, along with multiple cycles of reviewing the manuscript to improve its
readability. A.G. led the literature review and reviewed the paper multiple times to
improve its readability. All authors have read and agreed to the published version of the manuscript.
Funding: The APC is covered by the Department of Computer Science (IDI), Faculty of Information
Technology and Electrical Engineering, Norwegian University of Science & Technology (NTNU),
Gjøvik, Norway.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: The data set used in this study can be found at
https://tinyurl.com/u47h9y7t (accessed on 29 March 2021).
Conflicts of Interest: The authors declare no conflict of interest.
Abbreviations
The following abbreviations are used in this manuscript:
COVID-19  Coronavirus disease 2019
PIMS      Pakistan Institute of Medical Sciences
WHO       World Health Organization
LSTM      Long short-term memory
GloVe     Global Vectors for Word Representation
BERT      Bidirectional encoder representations from transformers
DNN       Deep neural networks