Fake news detection in social media
Kelly Stahl *
B.S. Candidate, Department of Mathematics and Department of Computer Sciences, California State University Stanislaus, 1 University Circle, Turlock, CA 95382
Received 20 April 2018; accepted 15 May 2018
Abstract
  Due to the exponential growth of information online, it is becoming impossible to decipher the true from the false; this leads to the problem of fake news. This research considers previous and current methods for fake news detection in textual formats while detailing how and why fake news exists in the first place. The paper includes a discussion of the Linguistic Cue and Network Analysis approaches, and proposes a three-part method using the Naïve Bayes classifier, Support Vector Machines, and semantic analysis as an accurate way to detect fake news on social media.
Keywords: fake news, false information, deception detection, social media, information manipulation, network analysis, linguistic cue, fact-checking, Naïve Bayes classifier, SVM, semantic analysis
Introduction

     How much of what we read on social media and on supposedly “credible” news sites is trustworthy? It is extremely easy for anyone to post whatever they desire, and although that can be acceptable, some take it a step too far: posting false information online in order to cause a panic, using lies to manipulate another person’s decisions, or doing essentially anything else that can have lasting repercussions. There is so much information online that it is becoming impossible to decipher the true from the false. Thus, this leads to the problem of fake news.

Significance

     Using social media as a medium for news updates is a double-edged sword. On one hand, social media provides easy access, little to no cost, and the spread of information at an impressive rate (Shu, Sliva, Wang, Tang, & Liu, 2017). On the other hand, social media provides the ideal place for the creation and spread of fake news. Fake news can become extremely influential and has the ability to spread exceedingly fast. As more people use social media, they are exposed to new information and stories every day. Misinformation can be difficult to correct and may have lasting implications. For example, people can base their reasoning on what they are exposed to, either intentionally or subconsciously, and if the information they are viewing is not accurate, then they are establishing their logic on lies. In addition, since false information is able to spread so fast, not only can it harm people, but it can also be detrimental to huge corporations and even the stock market. For instance, in October of 2008, a journalist posted a false report that Steve Jobs had a heart attack. The report was posted through CNN’s iReport.com, an unedited and unfiltered site, and people immediately retweeted the fake news report. There was much confusion and uncertainty because of how widespread it became in such a short amount of time. The stock of Jobs’s company, Apple Inc., fluctuated dramatically that day due to one false news report that had been mistaken for authentic news reporting (Rubin, 2017).

Literature review

     What is fake news? Fake news is the deliberate spread of misinformation via traditional news media or via social media. False information spreads extraordinarily fast: when one fake news site is taken down, another promptly takes its place. In addition, fake news can become indistinguishable from accurate reporting because of how quickly it spreads. People download articles from sites, share the information, and re-share it from others, and by the end of the day the false information has traveled so far from its original source that it becomes indistinguishable from real news (Rubin, Chen, & Conroy, 2016).
* Corresponding author. Email: kstahl@csustan.edu
     However, the biggest reason why false information is able to thrive is that humans fall victim to Truth-Bias, Naïve Realism, and Confirmation Bias. People are naturally “truth-biased”: they bring “the presumption of truth” to social interactions and have “the tendency to judge an interpersonal message as truthful, and this assumption is possibly revised only if something in the situation evokes suspicion” (Rubin, 2017). Basically, humans are very poor lie detectors and rarely realize that they may be being lied to. Users of social media tend to be unaware that there are posts, tweets, articles, and other written documents whose sole purpose is to shape the beliefs of others in order to influence their decisions. Information manipulation is not a well-understood topic and is generally not on anyone’s mind, especially when fake news is being shared by a friend. Users tend to let their guard down on social media and may absorb false information as if it were the truth. This is even more detrimental considering how young users tend to rely on social media to inform them of politics, important events, and breaking news (Rubin, 2017). For instance, “sixty-two percent of U.S. adults get news on social media in 2016, while in 2012, only forty-nine percent reported seeing news on social media,” which demonstrates how more and more people are becoming tech savvy and relying on social media to keep them updated (Shu et al., 2017). In addition, people tend to believe that their own views on life are the only correct ones and to label those who disagree as “uninformed, irrational, or biased,” a tendency known as Naïve Realism (Shu et al., 2017).
     This leads to the problem of Confirmation Bias, the notion that people favor information that verifies their current views. Consumers only want to hear what they already believe and do not want to find any evidence against their views. For instance, a strong believer in unrestricted gun ownership may use any information they come across to support and justify their beliefs further, whether that is random articles from uncredible sites, posts from friends, re-shared tweets, or anything else online that agrees with their principles. Consumers do not wish to find anything that contradicts what they believe because that is simply not how humans function. People cannot help but favor what they like to hear; they have a predisposition for confirmation bias. Only those who strive for certain academic standards may be able to avoid or limit this bias, but the average person, unaware of the false information to begin with, will not be able to fight these unintentional urges.
     In addition, fake news does not only negatively affect individuals; it is also harmful to society in the long run. With all this false information floating around, fake news is capable of ruining the “balance of the news ecosystem” (Shu et al., 2017). For instance, in the 2016 Presidential Election, the “most popular fake news was even more widely spread on Facebook” than the “most popular authentic mainstream news” (Shu et al., 2017). This demonstrates how users may pay more attention to manipulated information than authentic facts. This is a problem not only because fake news “persuades consumers to accept biased or false beliefs” in order to communicate a manipulator’s agenda and gain influence, but also because fake news changes how consumers react to real news (Shu et al., 2017). People who engage in information manipulation desire to cause confusion so that a person’s ability to decipher the true from the false is further impeded. This, along with influence, political agendas, and manipulation, is among the many motives for generating fake news.

Contributors of fake news

     While many social media users are very much real, those who are malicious and out to spread lies may or may not be real people. There are three main types of fake news contributors: social bots, trolls, and cyborg users (Shu et al., 2017). Since the cost to create social media accounts is very low, the creation of malicious accounts is not discouraged. A social media account controlled by a computer algorithm is referred to as a social bot. A social bot can automatically generate content and even interact with social media users. Social bots are not always harmful; it depends entirely on how they are programmed. If a social bot is designed with the sole purpose of causing harm, such as spreading fake news on social media, it can be a very malicious entity and contribute greatly to the creation of fake news. For example, “studies show that social bots distorted the 2016 US presidential election discussions on a large scale, and around 19 million bot accounts tweeted in support of either Trump or Clinton in the week leading up to election day,” which demonstrates how influential social bots can be on social media (Shu et al., 2017).
     However, fake humans are not the only contributors to the dissemination of false information; real humans are very much active in the domain of fake news. As implied, trolls are real humans who “aim to disrupt online communities” in hopes of provoking social media users into an emotional response (Shu et al., 2017). For instance, there is evidence that “1,000 Russian trolls were paid to spread fake news on Hillary Clinton,” which reveals how actual people perform information manipulation in order to change the views of others (Shu et al., 2017). The main goal of trolling is to resurface negative feelings harbored by social media users, such as fear and even anger, so that users will develop strong emotions of doubt and
distrust (Shu et al., 2017). When users have doubt and distrust in their minds, they will not know what to believe and may start doubting the truth and believing the lies instead.
     While contributors of fake news can be either real or fake, what happens when it is a blend of both? Cyborg users are a combination of “automated activities with human input” (Shu et al., 2017). The accounts are typically registered by real humans as a cover, but use programs to perform activities on social media. What makes cyborg users even more powerful is that they are able to switch the “functionalities between human and bot,” which gives them a great opportunity to spread false information (Shu et al., 2017).
     Now that we know some of the reasons why and how fake news propagates, it is worth discussing the methods for detecting online deception in word-based formats, such as e-mails. The two main categories for detecting false information are the Linguistic Cue and Network Analysis approaches.

Linguistic cue methods

     In Linguistic Cue approaches, researchers detect deception through the study of different communicative behaviors. Researchers believe that liars and truth-tellers have different ways of speaking. In text-based communication, deceivers tend to have a total word count greater than that of a truth-teller. Liars also tend to use fewer self-oriented pronouns than other-oriented pronouns, along with more sensory-based words. Hence, these properties found in the content of a message can serve as linguistic cues for detecting deception (Rubin, 2017). Essentially, Linguistic Cue approaches detect fake news by catching information manipulators through the writing style of the news content. The main methods implemented under the Linguistic Cue approaches are Data Representation, Deep Syntax, Semantic Analysis, and Sentiment Analysis.
     In the Data Representation approach, each word is treated as a single significant unit, and the individual words are analyzed to reveal linguistic cues of deception, such as parts of speech or location-based words (Conroy, Rubin, & Chen, 2015).
     The Deep Syntax method is implemented through Probabilistic Context-Free Grammars (PCFG): sentences are transformed into a set of rewrite rules in order to describe their syntax structure (Conroy, Rubin, & Chen, 2015).
     Another approach, Semantic Analysis, determines the truthfulness of authors by characterizing the degree of compatibility of a personal experience. The assumption is that since a deceptive writer has no previous experience with the particular event or object, they may end up including contradictions or may even leave out important facts that were present in profiles on related topics (Conroy, Rubin, & Chen, 2015).
     Finally, the last linguistic approach, Sentiment Analysis, focuses on opinion mining: scrutinizing written texts for people’s attitudes, sentiments, and evaluations with analytical techniques. However, this approach is still not perfect, considering that the issues of credibility and verification are addressed with less priority (Rubin, 2017).

Network analysis methods

     In contrast to Linguistic Cue approaches, which rely on deceptive language cues to predict deception, Network Analysis approaches need “an existing body of collective human knowledge to assess the truth of new statements” (Conroy, Rubin, & Chen, 2015). This is the most straightforward way of detecting false information: checking the “truthfulness of major claims in a news article” in order to determine “the news veracity” (Shu et al., 2017). This approach is fundamental for the further progress and development of fact-checking methods. The underlying goal is to use outside sources to fact-check projected statements in news content by assigning a “truth value to a claim in a particular context” (Shu et al., 2017).
     Moreover, the three existing fact-checking methods are expert-oriented, crowdsourcing-oriented, and computational-oriented. Expert-oriented fact-checking is intellectually demanding and time consuming, since it relies heavily on human experts to analyze “relevant data and documents” and compose their “verdicts of claim veracity” (Shu et al., 2017). A great example of expert-oriented fact-checking is PolitiFact, which requires its researchers to spend time analyzing claims by seeking out credible information. When enough evidence has been gathered, a truth value ranging over True, Mostly True, Half True, Mostly False, False, and Pants on Fire is assigned to the original claim.
     In addition, crowdsourcing-oriented fact-checking uses the “wisdom of the crowd” concept, which allows ordinary people, instead of only experts, to discuss and analyze news content using annotations, which are then aggregated into an “overall assessment of the news veracity” (Shu et al., 2017). An example of this in action is Fiskkit, an online commenting website that aims to improve the dialogue around online articles by allowing its users to identify inaccurate facts or negative behavior. This enables users to discuss and comment on the truthfulness of particular parts and sections of a news article (Shu et al., 2017).
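The crowdsourcing idea can be sketched as a simple aggregation of reader annotations. The claims, ratings, and 0-to-1 truthfulness scale below are invented for illustration and are not drawn from any real platform:

```python
from statistics import mean

# Invented annotations: each reader rates a claim from 0 (false) to 1 (true).
annotations = {
    "claim-1": [0.9, 0.8, 1.0, 0.7],
    "claim-2": [0.1, 0.3, 0.0, 0.2],
}

def crowd_veracity(scores, threshold=0.5):
    # "Wisdom of the crowd": average the individual ratings into an
    # overall assessment of the news veracity.
    avg = mean(scores)
    return "likely true" if avg >= threshold else "likely false"

for claim, scores in annotations.items():
    print(claim, crowd_veracity(scores))
```

A real platform would also weight annotators by reliability; this sketch treats every rating equally.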
     Finally, the last type of fact-checking is computational-oriented, which provides “an automatic scalable system to classify true and false claims” and tries to solve the two biggest problems: (i) identifying any “claims that are check-worthy” and (ii) determining the validity of those factual claims (Shu et al., 2017). Statements in the content that reveal core claims and viewpoints are extracted; these are identified as factual claims that need to be verified, which enables the fact-checking process. Fact-checking specific claims requires external resources such as the open web and knowledge graphs. Open web sources are used as “references that can be compared with given claims in terms of both consistency and frequency” (Shu et al., 2017). Knowledge graphs, instead, are “integrated from the linked open data as a structural network topology” and aspire to find out whether the statements in the news content can be deduced from “existing facts in the knowledge graph” (Shu et al., 2017).
     Moreover, the two main methods used under the Network Analysis approach are Linked Data and Social Network Behavior. In the Linked Data approach, the false statements being analyzed can be extracted and examined alongside accurate statements known to the world (Conroy, Rubin, & Chen, 2015). Accurate statements “known to the world” are facts proven to be true or statements that are widely accepted, such as “Earth is the name of the planet we live on.”
     The Social Network Behavior approach uses centering resonance analysis (CRA) in order to represent “the content of large sets of text by identifying the most important words that link other words in the network” (Conroy, Rubin, & Chen, 2015). The approaches discussed so far are the main methods by which researchers have been detecting fake news; however, these practices have primarily been applied to textual formats, such as e-mails or conference call records (Rubin, 2017). The real question is: how do predictive cues of deception in micro-blogs, such as Twitter and Facebook, differ from those in other textual formats?
     Fake news detection in social media is therefore a relatively new area. Only a handful of research studies have been completed in this domain, so more research needs to be conducted. To address this area, researchers are currently working on creating software that has the ability to detect deception. Deception detection software generally implements the different types of Linguistic Cue approaches. However, when dealing with false information detection on social media, the problem is much more complex, and using one method is no longer enough. Since linguistic cues are only one part of the problem, other aspects essentially need to be incorporated: the positioning of the message sources in the network, the reputation of sites, trustworthiness, credibility, expertise, and the tendency to spread rumors should all be considered (Rubin, 2017).

Selected methods explored further

     Furthermore, the methods to be further explored in relation to fake news detection in social media are the Naïve Bayes classifier, SVM, and semantic analysis.

Naïve Bayes Classifier

     Naïve Bayes is derived from Bayes’ Theorem, which is used for calculating conditional probability: the “probability that something will happen, given that something else has already occurred” (Saxena, 2017). Thus we are able to compute the likelihood of a certain outcome by using past knowledge of it.
     Furthermore, Naïve Bayes is a classifier considered to be a supervised learning algorithm, belonging to the Machine Learning family, which works by predicting “membership probabilities” for each individual class, for instance, the likelihood that the given evidence, or record, belongs to a certain class (Saxena, 2017). The class with the highest probability is determined to be the “most likely class,” a decision rule also known as Maximum A Posteriori (MAP) (Saxena, 2017).
     Another way of thinking about the Naïve Bayes classifier is that this method uses the “naïve” assumption that all features are unrelated. In most cases, this assumption of independence is outrageously false. Suppose the Naïve Bayes classifier is scanning an article and comes across “Barack”; in many cases the same article will also contain “Obama.” Even though these two features are clearly dependent, the method will still calculate the probabilities “as if they were independent,” which does end up overestimating “the probability that an article belongs to a certain class” (Fan, 2017). Since the Naïve Bayes classifier overestimates the probabilities of dependent features, it gives the impression that it would not work well for text classification. On the contrary, the Naïve Bayes classifier still has a high performance rate even with “strong feature dependencies,” since the dependencies largely end up cancelling each other out (Fan, 2017).
     In addition, what makes the Naïve Bayes classifier desirable is that it is relatively fast and a highly accessible technique. It can be used for binary or multiclass classification, making it an excellent choice for “Text Classification problems” as mentioned earlier (Saxena, 2017). Also, the Naïve Bayes classifier is a straightforward algorithm that really only relies on performing many counts. Thus, it can be “easily trained on a small dataset” (Saxena, 2017).
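As a concrete illustration of the “many counts” just described, here is a minimal Naïve Bayes text classifier in pure Python. The toy headlines, labels, and vocabulary are invented for illustration and are not from any real dataset:

```python
from collections import Counter
import math

# Toy training data: invented headlines, labeled "fake" or "real".
train = [
    ("celebrity dies in shocking secret hoax", "fake"),
    ("you wont believe this miracle cure", "fake"),
    ("senate passes budget bill after debate", "real"),
    ("local council approves new school funding", "real"),
]

# Count word frequencies per class (the "many counts" the method relies on).
word_counts = {"fake": Counter(), "real": Counter()}
class_counts = Counter()
for text, label in train:
    class_counts[label] += 1
    word_counts[label].update(text.split())

vocab = {w for counts in word_counts.values() for w in counts}

def log_posterior(text, label):
    # log P(class) + sum of log P(word | class), with Laplace (add-one)
    # smoothing so an unseen word does not zero out the product.
    total = sum(word_counts[label].values())
    score = math.log(class_counts[label] / sum(class_counts.values()))
    for w in text.split():
        score += math.log((word_counts[label][w] + 1) / (total + len(vocab)))
    return score

def classify(text):
    # Maximum A Posteriori: pick the class with the highest posterior.
    return max(word_counts, key=lambda label: log_posterior(text, label))

print(classify("shocking miracle cure hoax"))    # "fake"
print(classify("council passes school budget"))  # "real"
```

The smoothing step keeps a word seen in only one class from driving the other class’s probability to zero; the MAP step then simply picks the class with the larger log-posterior.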
     However, the biggest downfall of this method is that it deems all the features to be separate, which may not always be the case. Hence, no relationships are learned among the features (Saxena, 2017).

SVM

     A support vector machine (SVM), a term that can be used interchangeably with support vector network (SVN), is also considered a supervised learning algorithm. SVMs work by being trained with specific data already organized into two different categories. Hence, the model is constructed after it has already been trained.
     Furthermore, the goal of the SVM method is to distinguish which category any new data falls under; in addition, it must maximize the margin between the two classes (Brambrick). The optimal outcome is that the SVM finds a hyperplane that divides the dataset into two groups.
     To elaborate further, the support vectors are “the data points nearest to the hyperplane”: if removed, they would modify the location of the dividing hyperplane (Brambrick). Thus, support vectors are crucial elements of a data set. The hyperplane can be thought of as “a line that linearly separates and classifies a set of data,” and “the further from the hyperplane our data points lie,” the higher the chance that the data points have been accurately classified (Brambrick).
     Moreover, the advantages of the SVM method are that it tends to be very accurate and performs extremely well on datasets that are smaller and more concise. In addition, the technique is very flexible, since it can be used for classification and even regression. Also, support vector machines can handle high-dimensional spaces and tend to be memory efficient (Ray, Srivastava, Dar, & Shaikh, 2017).
     On the contrary, the disadvantages of the SVM approach are that it has difficulty with large datasets, since “the training time with SVMs can be high,” and that it is “less effective on noisier [meaningless] datasets with overlapping classes” (Brambrick). In addition, the SVM method will not “directly provide probability estimates” (Ray et al., 2017).

Semantic Analysis

     Semantic analysis is derived from the natural language processing (NLP) branch of computer science. As discussed earlier, the method of semantic analysis examines indicators of truthfulness by defining the “degree of compatibility between a personal experience,” as compared to a “content ‘profile’ derived from a collection of analogous data” (Conroy, Rubin, & Chen, 2015). The idea is that the fake news author is not familiar with the specific event or object; for example, they have never even visited the location in question, thus they may neglect facts present in “profiles on similar topics” or potentially include ambiguities that semantic analysis can detect (Conroy, Rubin, & Chen, 2015).
     Furthermore, a huge reason for using semantic analysis is that this method is able to precisely classify a document through the use of association and collocation (Unknown, 2013). This is especially useful for languages that have words with multiple meanings and close synonyms, such as English. If one decided to use a simple algorithm that is unable to distinguish among different word meanings, the result might be ambiguous and inaccurate. Thus, by considering rules and relations when searching through texts, semantic analysis operates similarly to how the human brain functions (Unknown, 2013).
     However, in light of the situation of comparing profiles with the “description of the writer’s personal experience” discussed above, there are potentially two limitations of the semantic analysis method (Conroy, Rubin, & Chen, 2015). First, in order to even “determine alignment between attributes and descriptors,” a great amount of excavated content for profiles is needed in the first place (Conroy, Rubin, & Chen, 2015). Second, there also exists the challenge of accurately associating “descriptors with extracted attributes” (Conroy, Rubin, & Chen, 2015).

Proposed method

     Due to the complexity of fake news detection in social media, it is evident that a feasible method must contain several aspects to accurately tackle the issue. This is why the proposed method is a combination of the Naïve Bayes classifier, Support Vector Machines, and semantic analysis. The proposed method is entirely composed of Artificial Intelligence approaches, which is critical to accurately classify between the real and the fake, as opposed to algorithms that are unable to mimic cognitive functions. The three-part method is a combination of Machine Learning algorithms, which subdivide into supervised learning techniques, and natural language processing methods. Although each of these approaches could be used on its own to classify and detect fake news, they have been combined into an integrated algorithm in order to increase the accuracy and be applicable to the social media domain.
     Furthermore, SVM and the Naïve Bayes classifier tend to “rival” each other because they are both supervised learning algorithms that are efficient at classifying data. Both techniques are moderately accurate at categorizing fake news in experiments, which is why this proposed method focuses on combining SVM and the Naïve Bayes classifier to get even more accurate results.
System,” the authors integrate both methods of SVM           the future, I wish to test out the proposed method of
and Naïve Bayes classifier in order to create a more         Naïve Bayes classifier, SVM, and semantic analysis, but,
precise method that classifies better than each method       due to limited knowledge and time, this will be a project
individually. They found that their “hybrid algorithm”       for the future.
effectively minimized “false positives as well as                 It is important that we have some mechanism for
maximize balance detection rates,” and performed             detecting fake news, or at the very least, an awareness
slightly better than SVM and Naïve Bayes classifier did      that not everything we read on social media may be true,
individually (Sagale, & Kale, 2014). Even though this        so we always need to be thinking critically. This way we
experiment was applied to Intrusion Detection Systems        can help people make more informed decisions and they
(IDS), it clearly demonstrates that merging the two          will not be fooled into thinking what others want to
methods would be relevant to fake news detection.            manipulate them into believing.
     Moreover, introducing semantic analysis to SVM and the Naïve Bayes classifier can improve the algorithm further. The biggest drawback of the Naïve Bayes classifier is that it treats all features of a document, or of whatever textual format is being used, as independent, even though that is usually not the case. This lowers accuracy, because relationships between features are never learned when everything is assumed to be unrelated. As mentioned earlier, one of the biggest advantages of semantic analysis is its ability to find relationships among words. Thus, adding semantic analysis helps fix one of the biggest weaknesses of the Naïve Bayes classifier.
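The independence assumption can be seen directly in how the classifier scores a document: it multiplies per-word probabilities as though each word occurred on its own, ignoring correlated phrases. A minimal sketch, with hypothetical word probabilities standing in for estimates from a labeled corpus:

```python
import math

# Hypothetical per-word likelihoods P(word | class); in practice these
# would be estimated from labeled training documents.
p_word_given_fake = {"shocking": 0.08, "miracle": 0.06, "cure": 0.05}
p_word_given_real = {"shocking": 0.01, "miracle": 0.005, "cure": 0.02}
p_fake, p_real = 0.5, 0.5  # assumed equal class priors

def log_score(words, prior, likelihoods):
    # log P(class) + sum of log P(word | class): the "naive" step is the
    # sum, which multiplies likelihoods as if the words were independent.
    return math.log(prior) + sum(math.log(likelihoods[w]) for w in words)

doc = ["shocking", "miracle", "cure"]
fake_score = log_score(doc, p_fake, p_word_given_fake)
real_score = log_score(doc, p_real, p_word_given_real)
print("fake" if fake_score > real_score else "real")  # → fake
```

Because every word is scored in isolation, a phrase whose words are individually innocuous but jointly suspicious contributes nothing extra, which is exactly the gap semantic analysis is meant to fill.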
     In addition, adding semantic analysis to SVM can improve the performance of the classifier. In “Support Vector Machines for Text Categorization Based on Latent Semantic Indexing,” the author shows that combining the two methods improves efficiency by “focusing attention of Support Vector Machines onto informative subspaces of the feature spaces” (Huang, 2001). In that experiment, semantic analysis was able to capture the “underlying content of document in semantic sense” (Huang, 2001). This improved the efficiency of SVM, since the method wasted less time classifying meaningless data and spent more time organizing relevant data with the help of semantic analysis. As outlined earlier, a major benefit of semantic analysis is its ability to extract important data through relationships between words; hence, semantic analysis can use this fundamental strength to further improve SVM.
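The combination Huang describes can be approximated with latent semantic analysis (a truncated SVD over tf-idf vectors) feeding a linear SVM, so the SVM trains on a small semantic subspace rather than raw term counts. The corpus and labels below are hypothetical stand-ins, and scikit-learn is assumed rather than Huang's original implementation:

```python
# Sketch of latent semantic indexing in front of an SVM:
# tf-idf -> truncated SVD (the LSA step) -> linear SVM. The SVD maps
# words that co-occur into shared latent dimensions, so the SVM sees
# the "informative subspaces" rather than every individual term.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline

texts = [
    "scientists confirm vaccine safety in large trial",
    "aliens secretly control the world banks",
    "city reports steady drop in unemployment",
    "miracle pill melts fat overnight say insiders",
]
labels = [0, 1, 0, 1]  # hypothetical: 1 = fake, 0 = real

model = make_pipeline(
    TfidfVectorizer(),
    TruncatedSVD(n_components=2, random_state=0),  # project to 2 latent topics
    LinearSVC(),
)
model.fit(texts, labels)
print(model.predict(["insiders say this pill is a miracle"]))
```

The number of latent components is a tuning choice; too few collapse distinct topics together, while too many reintroduce the noisy raw-term space the projection was meant to filter out.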

Conclusion

     As mentioned earlier, deception detection in social media is a particularly new field, and research is ongoing in hopes that scholars can find more accurate ways to detect false information in this booming, fake-news-infested domain. For this reason, this research may help other researchers discover which combination of methods should be used to accurately detect fake news in social media. The proposed method described in this paper is an idea for a more accurate fake news detection algorithm. In the future, I wish to test the proposed method of Naïve Bayes classifier, SVM, and semantic analysis, but, due to limited knowledge and time, this will be a project for the future.
     It is important that we have some mechanism for detecting fake news, or at the very least, an awareness that not everything we read on social media may be true, so we always need to think critically. This way we can help people make more informed decisions, and they will not be fooled into believing whatever others want to manipulate them into thinking.

References

Brambrick, Aylien, N. (n.d.). KDnuggets. Retrieved February 20, 2018, from https://www.kdnuggets.com/2016/07/support-vector-machines-simple-explanation.html
Chen, Y., Conroy, N., & Rubin, V. (2015). News in an online world: The need for an “automatic crap detector”. Proceedings of the Association for Information Science and Technology, 52(1), 1-4.
Conroy, N., Rubin, V., & Chen, Y. (2015). Automatic deception detection: Methods for finding fake news. Proceedings of the Association for Information Science and Technology, 52(1), 1-4.
Fan, C. (2017). Classifying fake news. Retrieved February 18, 2018, from http://www.conniefan.com/2017/03/classifying-fake-news
Huang, Y. (2001). Support Vector Machines for Text Categorization Based on Latent Semantic Indexing.
Ray, S., Srivastava, T., Dar, P., & Shaikh, F. (2017). Understanding Support Vector Machine algorithm from examples (along with code). Retrieved March 2, 2018, from https://www.analyticsvidhya.com/blog/2017/09/understaing-support-vector-machine-example-code/
Rubin, V., Chen, Y., & Conroy, N. (2015). Deception detection for news: Three types of fakes. Proceedings of the Association for Information Science and Technology, 52(1), 1-4.
Rubin, V., Chen, Y., & Conroy, N. J. (2016). Education and Automation: Tools for navigating a sea of fake news. UNDARK.
Rubin, V., Conroy, N. J., & Chen, Y. (2015, January). Towards News Verification: Deception Detection Methods for News Discourse. ResearchGate. Retrieved April 11, 2017, from https://www.researchgate.net/publication/270571080_Towards_News_Verification_Deception_Detection_Methods_for_News_Discourse doi:10.13140/2.1.4822.8166
Rubin, V., Conroy, N., Chen, Y., & Cornwell, S. (2016). Fake News or Truth? Using Satirical Cues to Detect Potentially Misleading News. Proceedings of the Second Workshop on Computational Approaches to Deception Detection. doi:10.18653/v1/w16-0802
Rubin, V. (2017). Deception detection and rumor debunking for social media. Handbook of Social Media Research Methods.
Sagale, A. D., & Kale, S. G. (2014). Combining Naive Bayesian and Support Vector Machine for Intrusion Detection System. International Journal of Computing and Technology, 1(3). Retrieved April 1, 2018.
Saxena, R. (2017). How the Naive Bayes Classifier works in Machine Learning. Retrieved October 20, 2017, from https://dataaspirant.com/2017/02/06/naive-bayes-classifier-machine-learning/
Shu, K., Sliva, A., Wang, S., Tang, J., & Liu, H. (2017). Fake News Detection on Social Media: A Data Mining Perspective. ACM SIGKDD Explorations Newsletter, 19(1), 22-36.
Unknown. (2013). Why Semantics is Important for Classification. Retrieved March 19, 2018, from http://www.skilja.de/2013/why-semantics-is-important-for-classification/