Final Year of Computer Engineering 2022-23 Semester VII Project Synopsis
Team Members :
In our modern era, where the internet is ubiquitous, everyone relies on various
online resources for news. With the increasing use of social media platforms
like Facebook and Twitter, news spreads rapidly among millions of users within
a very short span of time. The spread of fake websites and news has
far-reaching consequences, from the creation of biased opinions to the swaying
of election outcomes for the benefit of certain candidates. Moreover, spammers
use appealing news headlines as click-bait to generate advertising revenue. In
this project we aim to perform binary classification of news articles and
websites available online, using concepts from Artificial Intelligence, Natural
Language Processing and Machine Learning. We aim to give the user the ability
to classify a website or news item as fake or real and to check the
authenticity of the website publishing the news.
Background and Motivation
There is a vital need to deal with the fake data spread across online
platforms, since it creates problems for users in the form of rumors, identity
theft, lack of authenticity and confidentiality, fake profiles, etc. The
dissemination of false information through social media undermines trust in
the news ecosystem, harms the reputations of individuals and organizations,
and causes fear in the public at large, all of which have the potential to
undermine societal stability. Fake news is very difficult to spot because its
wording is comparable to that of real news and it is written with the goal of
winning the public's confidence. As a result, fake news identification is
required.
Problem Definition and Objectives
Fake news is deliberately written misleading material meant to deceive the public.
Authenticity and purpose are the two most important aspects of this concept. Fake
news has two characteristics: first, it contains incorrect material that can be
verified as false; second, it is produced with the dishonest goal of misleading
readers. The distribution of false material through social media may have
important implications, such as weakening public faith in the news ecosystem,
hurting an individual’s or organization’s reputation, or causing fear among the
general public, all of which can affect society’s stability. The data may be
represented as a collection of tuples consisting of the headlines and text of a
certain number of news articles. The fake news identification problem is then to
determine whether or not a given piece of information is fake.
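The tuple representation described above can be sketched in Python as follows. This is only an illustration: the `Article` type, its field names, the `label` field with its 1/0 encoding, and the two sample rows are assumptions made for this sketch and are not taken from the project's dataset.

```python
from typing import List, NamedTuple, Tuple


class Article(NamedTuple):
    """One news article; the field names are assumptions for illustration."""
    headline: str
    text: str
    label: int  # assumed encoding: 1 = fake, 0 = real


# Made-up sample rows, not real data.
dataset: List[Article] = [
    Article("Aliens endorse candidate", "Sources say ...", 1),
    Article("City council approves budget", "The council voted ...", 0),
]


def split_inputs_labels(rows: List[Article]) -> Tuple[list, list]:
    """Separate the (headline, text) inputs from the binary labels."""
    inputs = [(row.headline, row.text) for row in rows]
    labels = [row.label for row in rows]
    return inputs, labels


inputs, labels = split_inputs_labels(dataset)
```

With the data in this shape, binary classification amounts to learning a function from each `(headline, text)` pair to its label.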
The methods used to manipulate information differentiate real news from fake
news. For example, news material may use deceptive tactics such as fabricating
facts to make the reader believe something they would not otherwise believe. It is
also possible to present material that seems to come from reputable sources when
it does not. Additionally, fraudulent features of fake news include the use of
altered material, such as headlines and pictures that do not match the information
delivered, or the contextualization of fake news using real components and
information but in a misleading context.
Literature Survey
Mykhailo Granik et al., in their paper, present a simple approach to fake news
detection using a naive Bayes classifier. The approach was implemented as a
software system and tested against a data set of Facebook news posts. The posts
were collected from three large Facebook pages each from the right and from the
left, as well as three large mainstream political news pages (Politico, CNN, ABC
News). They achieved a classification accuracy of approximately 74%.
Classification accuracy for fake news is slightly worse. This may be caused by the
skewness of the dataset: only 4.9% of it is fake news.
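To make the idea of a naive Bayes text classifier concrete, here is a minimal sketch written from scratch. It is a generic multinomial naive Bayes with add-one smoothing, not a reproduction of the system by Granik et al.; the `NaiveBayes` class, the whitespace tokenizer, and the four training posts are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)       # documents per class
        self.word_counts = defaultdict(Counter)   # word frequencies per class
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(doc_count / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(text):
                if word not in self.vocab:
                    continue  # ignore words never seen in training
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up training posts, for illustration only.
train_texts = [
    "shocking secret cure doctors hate",
    "you will not believe this shocking trick",
    "senate passes budget bill after debate",
    "city council approves new budget",
]
train_labels = ["fake", "fake", "real", "real"]

model = NaiveBayes().fit(train_texts, train_labels)
prediction = model.predict("shocking trick doctors hate")
```

Because the class prior enters the score directly, a heavily skewed training set (such as one with very few fake posts) pulls predictions toward the majority class, which is consistent with the weaker fake-news accuracy noted above.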
Himank Gupta et al. proposed a framework based on different machine learning
approaches that addresses several problems, including accuracy shortage, time lag
(BotMaker) and the high processing time needed to handle thousands of tweets per
second. First, they collected 400,000 tweets from the HSpam14 dataset. They then
characterized the 150,000 spam tweets and 250,000 non-spam tweets. They also
derived some lightweight features along with the Top-30 words that provide the
highest information gain from a Bag-of-Words model. They were able to achieve an
accuracy of 91.65% and surpassed the existing solution by approximately 18%.
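Selecting the words with the highest information gain from a bag-of-words model can be sketched as below. This is an assumed, simplified formulation (binary word presence, entropy in bits) and not the implementation used by Gupta et al.; the function names and the four sample tweets are hypothetical, and the toy example keeps the top 3 words rather than the Top-30.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(docs, labels, word):
    """Reduction in label entropy from knowing whether `word` occurs."""
    present = [y for d, y in zip(docs, labels) if word in d]
    absent = [y for d, y in zip(docs, labels) if word not in d]
    total = len(labels)
    conditional = ((len(present) / total) * entropy(present)
                   + (len(absent) / total) * entropy(absent))
    return entropy(labels) - conditional


def top_k_words(texts, labels, k):
    """Rank vocabulary words by information gain; ties break alphabetically."""
    docs = [set(t.lower().split()) for t in texts]  # bag of words as sets
    vocab = sorted(set().union(*docs))
    scored = [(information_gain(docs, labels, w), w) for w in vocab]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [w for _, w in scored[:k]]


# Made-up tweets, for illustration only.
texts = [
    "win free prize now",
    "free prize click here",
    "meeting moved to monday",
    "lunch on monday at noon",
]
labels = ["spam", "spam", "ham", "ham"]

top_words = top_k_words(texts, labels, 3)
```

In this toy data, the words that appear in every tweet of one class and in none of the other class split the labels perfectly, so they receive the maximum gain of 1 bit.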
Marco L. Della Vedova et al. first proposed a novel ML fake news detection
method which, by combining news content and social context features,
outperforms existing methods in the literature, increasing accuracy up to
78.8%. Second, they implemented their method within a Facebook Messenger
chatbot and validated it with a real-world application, obtaining a fake news
detection accuracy of 81.7%.
Methodology
The system will be developed in three parts. The first part is static and is
based on a machine learning classifier. We will study and train the model with
four different classifiers and choose the best classifier for final execution.
The second part is dynamic: it takes a keyword or text from the user and searches
online for the truth probability of the news. The third part checks the
authenticity of the URL entered by the user.
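The "train several classifiers and keep the best" step of the static part can be sketched as follows. The synopsis does not name the four classifiers, so the two candidates here (`train_majority` and `train_keyword`) are hypothetical stand-ins, as are the helper names and the eight sample rows. The split is deterministic for the sake of the example; a real pipeline would normally shuffle the data first.

```python
from collections import Counter


def train_test_split(rows, every=4):
    """Hold out every `every`-th row as test data (deterministic, no shuffle)."""
    train = [r for i, r in enumerate(rows) if (i + 1) % every != 0]
    test = [r for i, r in enumerate(rows) if (i + 1) % every == 0]
    return train, test


def train_majority(train_rows):
    """Stand-in candidate: always predict the most common training label."""
    majority = Counter(label for _, label in train_rows).most_common(1)[0][0]
    return lambda text: majority


def train_keyword(train_rows):
    """Stand-in candidate: predict fake if the text shares a fake-only word."""
    fake_words, real_words = set(), set()
    for text, label in train_rows:
        target = fake_words if label == "fake" else real_words
        target.update(text.lower().split())
    fake_only = fake_words - real_words
    return lambda text: "fake" if fake_only & set(text.lower().split()) else "real"


def accuracy(predict, rows):
    """Fraction of rows whose predicted label matches the true label."""
    return sum(predict(text) == label for text, label in rows) / len(rows)


def select_best(candidates, train_rows, test_rows):
    """Train every candidate and return the name with the best test accuracy."""
    scores = {name: accuracy(train(train_rows), test_rows)
              for name, train in candidates.items()}
    return max(scores, key=scores.get), scores


# Made-up labelled headlines, for illustration only.
rows = [
    ("shocking miracle cure revealed", "fake"),
    ("council approves city budget", "real"),
    ("senate debates budget bill", "real"),
    ("miracle secret trick exposed", "fake"),
    ("city opens new library", "real"),
    ("shocking celebrity secret exposed", "fake"),
    ("mayor opens new bridge", "real"),
    ("senate approves new bill", "real"),
]

train_rows, test_rows = train_test_split(rows)
candidates = {"majority": train_majority, "keyword": train_keyword}
best, scores = select_best(candidates, train_rows, test_rows)
```

The same harness would apply to any set of candidate classifiers: each one is trained on the training split, scored on the held-out split, and the highest-scoring one is kept for final execution.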
In this project, we are going to use Python and its scikit-learn library. Python
has a huge set of libraries and extensions that can easily be used for machine
learning. The scikit-learn library is a convenient source of machine learning
algorithms: a wide range of algorithms is readily available for Python, so easy
and quick evaluation of ML algorithms is possible. We are also going to use Flask
for the web-based deployment of the model, with the client side implemented in
HTML, CSS and JavaScript.
[Figure: block diagram of the static part. Blocks: Dataset, Pre Processing,
Data split (Train Test), Feature Extraction, Truth Probability, True, False.]
[Figure: block diagram of the dynamic part. Blocks: News, Retrieve, News Website,
Extract feature, Display result on Webpage.]
Functionalities:
1. To check whether the given web URL is fake or not. The output is given as
true or false.
2. To check whether the given news is real or fake. The output is given as
true or false.
Software and Hardware Requirements
Framework: Flask.
Language: Python for backend and HTML, CSS, JS for front end.
Hardware:
1. Processor: i5
2. Hard Disk: 1 TB
3. Memory: 8 GB RAM
References