Received 3 August 2024, accepted 26 August 2024, date of publication 28 August 2024, date of current version 10 September 2024.

Digital Object Identifier 10.1109/ACCESS.2024.3451128

Arabic Fake News Detection Using Deep Learning


NERMIN ABDELHAKIM OTHMAN1,2 , DOAA S. ELZANFALY1,2 ,
AND MOSTAFA MAHMOUD M. ELHAWARY 1
1 Information Systems Department, Faculty of Computers and Artificial Intelligence, Helwan University, Cairo 11795, Egypt
2 Faculty of Informatics and Computer Science, The British University in Egypt, Cairo 11837, Egypt
Corresponding author: Mostafa Mahmoud M. Elhawary (elhawary465@gmail.com)

ABSTRACT In recent years, the explosive growth of social media platforms has led to the rapid spread of vast amounts of false news and rumors on the internet. This disrupts news sources such as online news outlets, radio and television stations, and newspapers, especially in Arab countries. Fake news detection has therefore become a worldwide concern. Research on Arabic in this field is scarce compared to research on English. Earlier studies relied largely on machine learning and deep learning techniques, while more recent ones use pre-trained models. Our proposed model uses an Arabic pre-trained Bidirectional Encoder Representations from Transformers model (Arabic BERT) to extract features from the text, then uses a Convolutional Neural Network (1D-CNN or 2D-CNN) to reduce the size of the features and extract the important ones, and finally passes them to an artificial neural network that performs the classification. We introduce a novel hybrid system consisting of two main parts. In the first part we try three Arabic pre-trained Bidirectional Encoder Representations from Transformers models (APBTM): AraBERT, GigaBERT, and MARBERT. In the second part we use a 1D-CNN or a 2D-CNN. This yields six architectures. We train and evaluate every architecture on three datasets: Arabic News Stance (ANS), AraNews, and Covid19Fakes. The proposed model is compared with other recent models that used the same dataset. We ran three sets of experiments, one per dataset. Each set includes a group of experiments, and the results are presented in tables. The proposed hybrid of AraBERT and 2D-CNN achieved the best F1-scores of 0.6188, 0.7837, and 0.8009 on the ANS, AraNews, and Covid19Fakes datasets, respectively. Furthermore, the model reduces training time by achieving better results with fewer epochs. The results indicate that the proposed model offers the best performance, with 71% accuracy on the Arabic News Stance (ANS) dataset, outperforming the models of Sorour and Abdelkader (2022) and Shishah (2022), which achieved accuracies of 67% and 66%, respectively.

INDEX TERMS BERT, CNN, deep learning, fake news detection, machine learning.

The associate editor coordinating the review of this manuscript and approving it for publication was Xiong Luo.
2024 The Authors. This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License. For more information, see https://creativecommons.org/licenses/by-nc-nd/4.0/
VOLUME 12, 2024

I. INTRODUCTION
The emergence of social networking sites such as Facebook and Twitter has sped up the spread of information. Anyone on these sites can publish whatever they like without regard for its credibility or reliability, which makes it difficult to assess the reliability of any published information. People now prefer to get their news from these platforms because they are easy to use and easy to access. The platforms also offer another advantage: users can share opinions through comments and freely express their feelings about the news through the Like feature or, on Twitter, through retweets. As a result, a huge amount of unconfirmed information has accumulated over the past few years [9].
Such features attract people to these platforms, making them a fertile environment for manufacturing false news and a good medium for spreading it. There are often malicious intentions behind false news, and those intentions may cause harm to people's health, as happened during the COVID-19 pandemic. False news may also cause severe damage to the political system of
countries, especially developing countries [10]. It has become necessary to find modern ways to detect false news, because the huge daily volume of false news means that manual or traditional methods can no longer be relied upon [11].
A lot of research has been done to overcome this problem. For example, there are sites specially designed to verify the authenticity of news, such as the International Fact-Checking Network [12] and The Washington Post Fact Checker [13]. These sites are trusted, but their limited capacity makes them insufficient against this huge daily volume of news. The term automatic detection of false news appeared to address this problem, which is classified as a natural language processing (NLP) problem. The NLP domain suffers from the difficulty of capturing nuanced interdependencies between different elements in a document, which is important in this domain [14]. Researchers have put great effort into fake news detection. For example, there are models based on semantic analysis of the news [15], [16], and models [17], [18], [19] that extract features from the text, such as Term Frequency-Inverse Document Frequency (TF-IDF) [20], a numerical statistic that reflects the importance of a word in a document relative to a collection of documents (the corpus), and then classify with Machine Learning (ML) methods such as the Support Vector Machine (SVM) [21], a supervised learning model that analyzes data for classification and regression analysis. Recently, however, Deep Learning (DL) has shown better results than previous methods in classifying false news, for example in models that use a Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) [7], [22], [23], [24]. LSTMs are a type of recurrent neural network (RNN) capable of learning long-term dependencies, which makes them suitable for tasks such as language modeling and sequence prediction. Similarly, Gated Recurrent Units (GRUs) are another type of RNN that, like LSTMs, are designed to capture long-term dependencies. GRUs are known for being simpler and faster to train than LSTMs. The Transformer model [25], introduced by Google in 2017, reduced the training period compared to models based on LSTM [26] and GRU [27] while also achieving better results. Therefore, pre-trained models based on the Transformer [28], [29], [30], such as Bidirectional Encoder Representations from Transformers (BERT) [31] and Generative Pre-trained Transformer 2 (GPT-2) [32], have been frequently used for text classification.
Although this problem has been well studied, most fake news detection research is done on English-language datasets. Thus, the detection of Arabic fake news remains an interesting problem. Detecting fake news in Arabic faces two main challenges. First, no large Arabic fake news dataset is publicly available, and most of the few available datasets contain only the record (tweet, post, or news) ID and its label, so researchers face difficulties in retrieving or scraping the dataset records. The second challenge is that the Arabic language is very different from English, so better text representation and text embedding are needed for the learning part of the detection algorithm. In [33], Nasution and Onan discuss the challenges of low-resource languages, which affect any NLP task. In [34], the authors address the limited annotated data in one low-resource language, Turkish, by using data augmentation to generate synthetic data that expands the diversity and quantity of the annotated dataset. In [35], the authors proposed a text augmentation system that expands the training data of an NLP model using semantic role labeling (SRL) and ant colony optimization (ACO).
In this paper, two models are proposed. Both are based on Transformer-based pre-trained Arabic models and a CNN, and they differ in the structure of the CNN: the first model applies a one-dimensional CNN to the output of the pre-trained Arabic model, while the second applies a two-dimensional CNN to the outputs of every layer of the pre-trained Arabic model.
This paper is organized as follows: Section II presents the background, Section III presents related work, Section IV presents the proposed model, Section V presents the performance evaluation, Section VI presents the experimental results and discussion, and Section VII concludes the paper.

II. BACKGROUND
In this section, we introduce and explain the main elements employed in the proposed model.

A. BIDIRECTIONAL ENCODER REPRESENTATIONS FROM TRANSFORMERS
In 2018, Google researchers developed a pre-trained language model based on transformers [31]. Over time, BERT has become a very powerful tool in Natural Language Processing (NLP), achieving state-of-the-art performance on a wide range of tasks, including question answering, sentiment analysis, text classification, and language translation. BERT can be combined with other techniques to enhance its ability. In [36], the authors discuss the effect of integrating BERT with fuzzy logic to enhance contextual understanding in NLP tasks. In [37], the authors present a text classification method that integrates hierarchical graph-based modeling with BERT for dynamic fusion across many stages. What is new in BERT is its ability to perform bidirectional training, which means that it can take the context of a word into account by looking both forward and backward in a sentence. This is achieved by using transformers, a neural network architecture that can process sequential data such as text. BERT is pre-trained on a large corpus of text using the masked language modeling (MLM) objective. In MLM, some words are masked, and the model is trained to predict the masked words based on the context provided by the surrounding words. This way of training helps the model to learn general language patterns that can
be applied to a wide range of NLP tasks. After pre-training, BERT can be fine-tuned by training it on a specific dataset and adding a special output layer, so that the enhanced BERT can be used in a specific domain. There are variants of the BERT model designed for different languages, each with its own datasets and architecture, such as AraBERT [1], MARBERT [3], and GigaBERT [2].

B. CONVOLUTIONAL NEURAL NETWORKS
DL is a set of machine learning techniques that have become increasingly popular in recent years. One of the most widely used DL algorithms is the CNN, which has been successfully applied to areas such as computer vision, Natural Language Processing (NLP), and image classification. CNNs are a powerful tool for DL applications because they can accurately detect and classify features in data with minimal pre-processing, extracting meaningful features from images and capturing patterns and relationships in textual data.
CNNs consist of several layers of neurons that are connected through convolutional operations. The convolutional layers are the primary components of a CNN and are responsible for feature extraction. Each convolutional layer is designed to detect different features in the data, such as edges and shapes. This capability allows the model to recognize patterns in data and make predictions based on those patterns. In addition, CNNs can learn from data and adapt as new data is added, which allows the model to learn more complex tasks than those it was originally trained for. As a result, CNNs have been used effectively in image classification and object recognition applications [38].
Furthermore, CNNs can process large data sets quickly and accurately, which, together with their feature extraction ability, makes them well suited to DL applications.

1) ONE-DIMENSION CNN
A one-dimension CNN (1D-CNN) is a deep learning architecture commonly used for processing and analyzing one-dimensional sequential data, such as text or time series data. It operates on a linear sequence of data points, where each point represents a feature or an element in the sequence. 1D-CNNs have been applied to fake news detection to capture local patterns and dependencies within textual data. In this case, the input can be a sequence of words in which each word is represented as a vector.

2) TWO-DIMENSION CNN
A two-dimensional CNN (2D-CNN) is a deep learning architecture commonly used for processing and analyzing two-dimensional data, such as images. It operates on a grid-like structure, where each element in the grid represents a specific feature or pixel. The Arabic pre-trained bidirectional transformer model (APBTM) works serially, in that the output of each layer is the input of the following layer. In the proposed model, we introduce a novel way to benefit from each layer's output, because we assume that the output of each layer carries some information about the features. Implementing this idea requires the 2D-CNN.
In the proposed model, the CNN is used to extract the important features from the output of the Arabic pre-trained model (AraBERT, GigaBERT, or MARBERT). There are two CNN architectures for this task. The first uses a one-dimensional CNN to extract features from the last layer, where the size of the input is equal to the word-embedding size of the Arabic pre-trained model and the depth of the input is equal to the max-length of the Arabic pre-trained model. Figure 1 shows the first CNN architecture. The second uses a two-dimensional CNN to extract features from all the layers, where the height of the input is equal to the word-embedding size of the Arabic pre-trained model and the width is equal to the max-length. The depth of the input is equal to the number of layers plus one. Figure 2 shows the second CNN architecture.

FIGURE 1. Using 1D-CNN.

FIGURE 2. Using 2D-CNN.

III. RELATED WORKS
In this section we present the related work on fake news detection, which has been studied intensively in the artificial intelligence community using two main methods: the ML method and the DL method.
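Several of the ML approaches surveyed in this section start from TF-IDF features [20], which Section I describes as a statistic reflecting the importance of a word in a document relative to the corpus. As a rough illustration only, the sketch below computes one common TF-IDF variant in plain Python. The function name `tf_idf`, the weighting choice (term frequency divided by document length, times log(N / document frequency)), and the toy corpus are assumptions made for this example and are not taken from any of the cited works.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / document length, idf = log(N / df).
    Other smoothing schemes exist and give different numbers.
    Documents are assumed to be non-empty lists of tokens.
    """
    n_docs = len(docs)
    # Document frequency: number of documents in which each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy corpus, already tokenized.
corpus = [
    ["vaccine", "causes", "harm"],
    ["vaccine", "is", "safe"],
    ["officials", "confirm", "vaccine", "is", "safe"],
]
scores = tf_idf(corpus)
```

With this variant, a term that occurs in every document (here "vaccine") receives weight zero, while rarer terms receive larger weights. The resulting per-document weights are the kind of feature vector that would then be passed to a classifier such as an SVM.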

A. MACHINE LEARNING METHOD
Nakov et al. [17] classify false news using an SVM on a dataset [39] that has seven classifications for each type of news item. They use classes two and six along with the technique described in [40].
Alkhair et al. [18] classify false news in YouTube comments about the deaths of public personalities using the ML classifiers SVM, Decision Tree (DT), and Multinomial Naïve Bayes (MNB) after extracting TF-IDF features from sentences. This study suffers from the small size of the dataset and the imbalance between fake and real news.
Abonizio et al. [19] used ML to detect fake news in three languages: English, Spanish, and Portuguese. They applied three steps: preprocessing, feature extraction, and classification. Preprocessing includes converting the text to UTF-8, filtering, and noise reduction, while feature extraction uses NLP to extract grammatical information (complexity, stylistic, and psychological) from each document or sentence. Classification uses SVM, k-nearest neighbor (k-NN), Random Forest (RF), and eXtreme Gradient Boosting (XGB) to classify news into three classes: fake, legitimate, and satire. The authors found that the classifier detects satirical news with higher accuracy than fake and legitimate news, i.e., its high performance appears only on satirical news. The Spanish and Portuguese datasets do not contain a satire class, so the authors had to take additional steps to fill this gap, which may affect the quality of the datasets. The authors also did not consider that every language has its own characteristics, so each dataset may need additional preprocessing steps.
Aphiwongsophon and Chongstitvatana [41] use a Naïve Bayes (NB) algorithm, and then an SVM and a neural network (NN) to improve the results. On the Twitter datasets, the SVM and the neural network perform better than NB. Although the results are satisfying, the dataset was scraped over a narrow period of at most two months, which could limit the diversity of its content. The study feeds twenty-two attributes to the classifier as raw data, such as ID, name, is verified, and friends count. Relying on this many attributes restricts the reliability of the model, because in most cases not all of these attributes are available.
Himdi et al. [42] extract four categories of features (emotional expressiveness, syntactic-semantic roles, part of speech, and contextual polarity) from an Arabic text news dataset consisting of 1098 records. They use three classifiers, NB, RF, and SVM, to detect fake news. The best classifier was RF, with a 79% F1-score. This model cannot be generalized because it is trained only on a very restricted topic, "the Hajj". Another limitation is that the fake news in the dataset was written for the study, so it cannot simulate naturally occurring fake news. In addition, the dataset itself is relatively small.
Alawadh et al. [43] use an Arabic dataset consisting of 322 records. They apply different ML and DL classifiers to detect fake news. They try different ratios for splitting the dataset into training and testing records (70/30, 80/20, and 90/10), where the first number is the training ratio and the second is the testing ratio. The best performance was achieved by DT and RF, with an F1-score of 98.73%. This study used a very small dataset.

B. DEEP LEARNING METHOD
Wang et al. [22] use a multi-modal model based on a CNN for feature extraction and a neural network for fake news classification and event discrimination. The model has two inputs: text and image. The text is converted into a vector using word embedding, and the image features are extracted using the Visual Geometry Group network-19 (VGG-19) model. The feature model is used for news classification and the event discriminator for detecting which event the news belongs to. The model is called Event Adversarial Neural Networks (EANN). Although this study has its advantages, it also suffers from some limitations. The model takes images as input alongside texts to extract features and train the classifier, but not all news is accompanied by images, so performance would decrease in those cases. In addition, fake textual news may be accompanied by a real image that is unrelated to the event, so the model can be fooled. Adversarial training may suffer from instability and a limited ability to generate variations. The model also consists of three main parts, which increases its complexity and makes it harder to integrate.
Bharadwaj and Shao [44] pre-process the raw texts to extract semantic features for Recurrent Neural Networks (RNNs). The model achieved an accuracy of 95.66% with the random forest classifier.
Wang et al. [45] propose a model named SemSeq4FD, which aims to enhance the text representation for early fake news detection. The model uses a pair of representations for the same text: the semantic relations between sentences, modeled as a complete graph, and a text-local representation obtained with a one-dimension convolutional neural network (1D-CNN). Combining the two produces an enhanced representation, which is then passed to an LSTM to detect fake news. This model achieved a 2.315% improvement in F1-score on the English datasets and 1.553% on the Chinese datasets. The model consists of three sections, so its complexity is high.
Adrian et al. [46] bring semantic methods into the fake news detection process. They used semantic features extracted from the texts, such as sentiments, alongside the labeled texts (the original features of the datasets) to train ML and DL classifiers. The DL classifiers beat the ML ones before the semantic features were integrated. After adding the semantic features, the accuracy increased by 4.2%.
Ajao et al. [23] use three different DL models trained on a dataset. The first model uses word embedding to convert the words, uses a CNN to reduce the dimensionality
of the word-embedding vectors, then trains an LSTM and a dense (fully connected) layer, and finally performs classification. The second model contains word embedding, an LSTM with dropout, a dense (fully connected) layer, and classification. The last model contains word embedding, an LSTM, a dense (fully connected) layer, and classification. Although this study achieves high accuracy, it suffers from low values in the other evaluation metrics, such as recall, precision, and F-measure. These low values may be caused by limitations in the training dataset, such as class imbalance or bias.
Padnekar et al. [47] present a model for fake news detection based on Bidirectional Long Short-Term Memory (Bi-LSTM) and an Autoencoder, trained on the Fake News Challenge dataset FNC-1 [48]. The model takes two inputs, the headline and the body of the news item, which are converted to vectors by word embedding before features are extracted from them. The Autoencoder is a pair of LSTMs that connects the two outputs. The output of the Autoencoder passes to a dense layer for normalization and conversion to a linear vector, and the classification stage then assigns the news to one of four classes: Agree, Disagree, Discuss, and Unrelated. The authors do not explain the essential difference between the Unrelated and Disagree classes. The dataset is very imbalanced, because 73% of it belongs to the Unrelated class, and this can affect the model's performance.
Sorour and Abdelkader [7] use a CNN and LSTM-based model to detect fake news from Twitter with the Arabic News Stance (ANS) [4] dataset. The first step embeds the text to convert it to a vector. The second step is convolution and pooling along one dimension of the word embeddings. The third step passes the output of the CNN model into an LSTM model. The last step uses a dense (fully connected) layer to label the output of the LSTM. This study suffers from many mistakes in the table that presents the distribution of the ANS dataset [4], and in the confusion matrices as well.
Hussein et al. [28] use AraBERT to detect false news on Twitter using Arabic dataset content. The special [CLS] token, which is used as the first input token for any classifier, provides the output that the authors connect to a feed-forward neural network (FFNN) for classification, and the output is then normalized between 0 and 1 using the sigmoid function. The dataset is relatively small, with only 2556 tweets and 7 labels. This number of labels combined with the small size may increase the metric values but decrease the model's efficiency.
Mehta et al. [29] proposed two models based on BERT for fake news classification using the LIAR-PLUS [47] and LIAR [49] datasets. Each model uses one BERT for each branch of the dataset, so the model consists of two BERTs when dealing with LIAR [49] and three BERTs when dealing with LIAR-PLUS [47]. Each model shares the weights between its BERTs and concatenates their outputs. Dropout is used to avoid overfitting. The last layer uses the full connection and soft-max function for output classification. This model is complex since it uses multiple BERTs.
Shishah [8] uses Relative Features Classification (RFC) and Named Entity Recognition (NER) to enhance the ability of BERT to detect fake news. The RFC and NER are integrated with BERT through a shared parameter layer.
Amoudi et al. [50] conducted a comparative study to detect COVID-19 rumors using a dataset of 4299 Arabic records classified into three classes: true, false, and other. The best F1-score, 79%, was achieved by LSTM and Bi-LSTM with the root mean square optimizer. In [51], the authors study eight BERT transformer-based models using two datasets: one originally in Arabic, consisting of 10000 records, and one translated from English, consisting of 16000 tweets. The original Arabic dataset was balanced, and the other was unbalanced. The original Arabic dataset gave the best results. The best F1-score was 98.9%, obtained by the ARBERT-based model.
In [43], the authors use a pre-trained mini-BERT classifier with the different splitting ratios mentioned above. The best performance was achieved with the 90/10 ratio, with an F1-score of 98.73%.
Albalawi et al. [52] work on texts with images to detect rumors. They create two branches, one for extracting text features and the other for extracting visual features, then concatenate the two and pass them to a classifier. They used a dataset of 4025 tweets, split into 1726 tweets from the Arafacts dataset [53] and 2299 tweets collected by the authors. They obtained a 0.85 F1-score. Models of this type, which use images in the classification process, suffer from the fact that not all news is accompanied by images, and most fake news that contains images tries to fool the public by using real images that are unrelated to the event, so the classifier can be fooled as well.
Wotaifi and Dhannoon [54] use the AraNews [5] dataset with a total of 16600 records, split into 8406 fake records and 8194 real records. They proposed a hybrid model consisting of a CNN and an LSTM to detect fake news and achieved 91.4% accuracy. We note that the authors evaluated the model using accuracy only and did not present the other metrics, especially the F-measure. Table 1 briefly summarizes the previous work done in this field.

IV. PROPOSED MODEL
In this section, a detailed description of the proposed framework is presented. The primary goal is to build a fake news detection model that includes two main parts. One part is the Arabic pre-trained bidirectional transformer models (APBTM), which extract features, and the other is the CNNs, which are applied to those features. Two system architectures based on APBTM and CNN are presented. The first passes the output of the last APBTM layer to the one-dimensional CNN
(1D-CNN) for convolutional processing. This architecture failed to achieve the best results in most of the experiments, as shown in Section VI. Figure 3 shows the first architecture. The second architecture passes the outputs of all APBTM layers to the two-dimensional CNN (2D-CNN) for convolutional processing. Our experimental results showed that the proposed hybrid model combining AraBERT (APBTM) and 2D-CNN, shown in Figure 4, achieves the highest performance in most of the experiments discussed in Section VI.

TABLE 1. A brief summarization of the previous work done in this field.

A. TEXT PRE-PROCESSING
This step is applied before using the dataset to train our model. We remove any non-Arabic characters, hashtags, URLs, user mentions, and all emojis. Then we should
remove repeated sentences. We also consider the text’s length; for example, we eliminate all texts that consist of only two or three words and those of more than ten words. We can observe that the number of pre-processing steps in DL is smaller than in ML. This is because of the ability of DL classifiers to mimic the human brain through their complex neural networks.

FIGURE 3. APBTM with 1D-CNN.

FIGURE 4. Proposed model architecture.

B. ARABIC PRE-TRAINED BIDIRECTIONAL TRANSFORMER MODELS
The Arabic pre-trained Bidirectional Transformer Models (APBTM) are based on the BERT [31] architecture, pre-trained with Arabic data such as Arabic wiki. AraBERT [1], MARBERT [3], and GigaBERT [2] are variants of BERT that are adapted to specific languages and dialects. In this study, we combine each of the mentioned APBTMs with the CNNs to benefit from their individual advantages and study the results of using them. AraBERT excels in Modern Standard Arabic (MSA) and some dialects but is limited by computational demands and varying dialect performance; MARBERT handles both MSA and dialects better but faces similar resource and dialect precision challenges; GigaBERT offers broad multilingual capabilities, suitable for cross-language tasks, but may not match the performance of specialized models for individual languages and demands significant resources. Figure 5 presents the bidirectional encoder architecture, which is the base of the BERT architecture.

FIGURE 5. Bidirectional encoder architecture [55].

Any Transformer model works based on general steps such as tokenization, embedding, position encoding, and multi-head attention.

1) TOKENIZATION AND EMBEDDING
This step splits the input sentence into tokens; for example, the sentence ‘‘the BERT is best encoder’’ is tokenized to [‘‘the’’, ‘‘BERT’’, ‘‘is’’, ‘‘best’’, ‘‘encoder’’]. Each pre-trained model has its own tokenizer. The embedding phase creates a vector of size d for each token in the dataset and assigns an index number to each token. Figure 6 presents an example of embedding in any transformer.

FIGURE 6. Embedding phase example.

In the previous example, the tokenization index of ‘‘the BERT is best encoder’’ is [120, 1780, 2679, 3179, 7835].

C. POSITIONAL ENCODING STEP
Positional encoding defines an entity’s place or position within a sequence, giving each point a distinct representation. In transformer models, the position of an item is not represented by a single number, such as the index value, for several reasons. The indices can grow very large for lengthy sequences. Variable-length sequences may also cause issues, because they would be normalized differently if the index number were normalized to be between 0 and 1. The used functions are of different frequencies in the positional


encoding [25]:

\[ PE(pos, 2i) = \sin\left( \frac{pos}{10000^{2i/d}} \right) \tag{1} \]

\[ PE(pos, 2i+1) = \cos\left( \frac{pos}{10000^{2i/d}} \right) \tag{2} \]

where $pos$ is the position of the word in the sentence, $i$ is the index in the position vector, and $d$ is the length of the position vector.
Figure 7 shows the positional encoding steps applied to the sentence ‘‘the BERT is best encoder’’ and how it looks with the position information added. Each pre-trained Transformer-based model has its own special embedding tokens.

FIGURE 7. BERT positional encoding phase.

D. MULTI-HEAD ATTENTION STEP
The encoder’s function is to process input vectors and generate information about which parts of the input are relevant to each other; the generated information is then the input for the next encoder.
In Multi-Head Attention, the attention function maps a query and a set of key-value pairs to an output, where the query, keys, values, and output are all vectors. The output is computed as a weighted sum of the values [25]. The weight assigned to each value is computed by a compatibility function of the query with the corresponding key, as shown in Figure 8.
The Scaled Dot-Product Attention is used to compute the attention. It has three inputs: queries (Q), keys (K), and values (V). The attention function is defined by Equation 3 [25].

\[ \mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left( \frac{Q K^{T}}{\sqrt{d_k}} \right) V \tag{3} \]

where $d_k$ is the dimension of the keys, $Q$ is the queries, $K$ is the keys, and $V$ is the values.
The Multi-Head Attention layer allows the model to focus on different parts of the input data simultaneously in different representations. This is done in Equation 5 by concatenating the outputs of the individual attention heads, where each head is computed as in Equation 4 by applying the attention function of Equation 3 to its own weighted queries, keys, and values [25].

\[ \mathrm{head}_i = \mathrm{Attention}(Q W_i^{Q}, K W_i^{K}, V W_i^{V}) \tag{4} \]

\[ \mathrm{MultiHead}(Q, K, V) = \mathrm{Concat}(\mathrm{head}_1, \ldots, \mathrm{head}_h) \cdot W^{O} \tag{5} \]

where $\mathrm{head}_i$ is a head from a set of heads of size $h$, $W_i^{Q}$ is the weight of $\mathrm{head}_i$ for queries, $W_i^{K}$ is the weight of $\mathrm{head}_i$ for keys, and $W_i^{V}$ is the weight of $\mathrm{head}_i$ for values.
The computation of multi-head attention over all heads is given by Equation (5), where $W^{O}$ is the weight of the multi-head output.

E. APBTM DESCRIPTION
The two presented models use three APBTMs: AraBERT [1], GigaBERT [2], and MARBERT [3]. The three APBTMs are pre-trained on Arabic data using different pre-training datasets; Table 2 presents the description of the three APBTMs.

TABLE 2. Three APBTMs properties.

FIGURE 8. BERT multi-head attention [25].

V. PERFORMANCE EVALUATION
This section includes our experiment implementation and the description of the used datasets. We used Google Colab Pro, a 9th-generation Core i7 laptop, a 16 GB GPU, and 32 GB of RAM. We also used the Scikit-learn library in our experiment.

A. DESCRIPTION OF DATASETS
In this section, we introduce all the datasets used to train and test our model. We used three datasets: Covid19Fakes [6], AraNews [5], and ANS [4].
Covid19Fakes [6]: a COVID-19 Twitter dataset that is automatically annotated in Arabic and English. Covid19Fakes contains 22000 tweet IDs with a list of labels for each tweet ID in the Arabic dataset. Some of the tweets and users have been removed from Twitter. Figure 9 presents the method for dataset collection from Twitter. In the Covid19Fakes dataset, we apply a pre-processing step and then remove the very short samples. We select samples with sentence lengths


between 9 and 32. We select 7000 fake news samples and 7000 real news samples. Hence, our Covid19Fakes dataset consists of 14000 samples.

FIGURE 9. Dataset collection from Twitter.

AraNews [5]: a collection of Arabic disinformation drawn from a variety of newspapers in 15 different Arabic countries. This dataset also uses a machine to generate news learned from real news. The AraNews dataset contains 486961 samples. The disadvantages of this dataset are the empty samples, the very short samples (one or two words, although no news item is only two words long), the very long samples (more than 512 words), some news items that contain numbers unrelated to their content, and the fact that it is unbalanced. We remove the short, long, and empty samples. We select samples with sentence lengths between 9 and 32, and finally we get 7000 fake news samples and 7000 real news samples. Hence, our Ara-news dataset consists of 14000 samples.
ANS [4]: a dataset for the task of stance detection in Arabic news articles. The data was provided by several news outlets, including the BBC and CNN. It is a good dataset, but unfortunately it is small in size. Table 3 shows the description of the datasets used in the experiment. Figure 10 illustrates the distribution of the datasets.

FIGURE 10. Used datasets distribution.

TABLE 3. Description of datasets used.

B. EXPERIMENT DESIGN
In supervised learning, algorithms are trained on labeled datasets, in which the data has pre-defined input values and corresponding outputs, to learn a mapping function from the inputs to the outputs so that the model can make predictions on new data. In our experiment, we use three datasets, AraNews [5], Covid19Fakes [6], and ANS [4], to train three BERT models, AraBERT, GigaBERT, and MARBERT, one at a time, and then pass the output from each one to a CNN, which can be a 1D-CNN or a 2D-CNN. In each trial, we use several different max-length values, and then we evaluate the results at different numbers of epochs, so our experiment depends on a hybrid system consisting mainly of two parts: BERT and CNN.

C. EVALUATION METRICS
In the classification task, there are many metrics for evaluating the model’s performance. The metrics used are accuracy, recall, precision, and F1-score. The evaluation metrics are calculated as follows:
• Accuracy
In ML, the accuracy score is commonly used to evaluate model performance. When implementing a binary classification algorithm, we use terms like False-Positive (FP), False-Negative (FN), True-Negative (TN), and True-Positive (TP) so that we can obtain an accurate evaluation of the model’s performance. A true-positive is a positive instance that the model correctly predicts as positive, while a false-positive is a negative instance that the model incorrectly predicts as positive. Accuracy is calculated as follows:

\[ \mathrm{Accuracy} = \frac{TP + TN}{TP + FP + FN + TN} \tag{6} \]

• Precision
Precision, or positive predictive value, is the fraction of relevant instances among the retrieved instances. Precision depends on TP and FP and is calculated as follows:

\[ \mathrm{Precision} = \frac{TP}{TP + FP} \tag{7} \]

• Recall
Recall, or sensitivity, is the fraction of relevant instances that were retrieved. Recall depends on TP and FN and is calculated as follows:

\[ \mathrm{Recall} = \frac{TP}{TP + FN} \tag{8} \]

• F1-score
The F1 score takes both FP and FN into account and weights precision and recall evenly. F1 is more informative than accuracy in cases where the spread of the classes is uneven. The precision and recall values must first be computed before the F1 score can be determined. A classifier’s


precision can be used to determine how accurate it is. A significant number of false positives is a sign of poor precision. A classifier’s recall can be used to determine how comprehensive it is. A significant number of false negatives is a sign of poor recall. The F1 score is calculated as follows:

\[ F1 = \frac{2 \times \mathrm{Recall} \times \mathrm{Precision}}{\mathrm{Recall} + \mathrm{Precision}} \tag{9} \]

We present three experiments with three datasets. In the first experiment, we present the results of APBT without using CNN; the second experiment presents the results of APBT with 1D-CNN applied to the output of the last APBT layer; and the third experiment presents the results of APBT with 2D-CNN applied to the output of all APBT layers. We apply these three experiments to three datasets: ANS, Ara-news, and Covid19Fakes.

VI. EXPERIMENT RESULTS AND DISCUSSION
We have three sets of experiments, and each set is performed on the three datasets. In the first set, we use the APBTs without any CNN and measure their performance on the three datasets. In the second set, we apply the APBTs with 1D-CNN to the three datasets and measure their performance. In the third set, we apply the APBTs with 2D-CNN to the three datasets and record the results. The results of the three sets are divided into three sections according to the dataset.

A. FIRST EXPERIMENT RESULTS (USING ANS)
The results of applying the APBT to the ANS dataset without using CNN for 6 epochs are presented in Table 4 using the four metrics F1, Precision, Recall, and Accuracy.

TABLE 4. ANS datasets use APBT without CNN.

The results of applying the APBT to the ANS dataset using 1D-CNN and 2D-CNN, with max lengths of 16 and 32, in the third epoch are presented in Table 5 using the four metrics F1, Precision, Recall, and Accuracy.

TABLE 5. ANS datasets use APBT with 1D-CNN, and 2D-CNN in the third epoch.

Table 6 presents the APBT with 2D-CNN in the second epoch.

TABLE 6. ANS datasets use APBT with 2D-CNN in the second epoch.

We observe that the APBT with 2D-CNN has better results on some metrics in the second epoch. This is because the model then starts to overfit the training data, leading to a decrease in performance on unseen data patterns, and performing well on unseen data is the main concept behind training the model.

B. SECOND EXPERIMENT RESULTS - (USING ARA-NEWS)
The results of applying the APBT to the Ara-news dataset without using CNN for 6 epochs are presented in Table 7 using the four metrics F1, Precision, Recall, and Accuracy.

TABLE 7. Ara-news datasets using APBT without CNN.

The results of applying the APBT to the Ara-news dataset using 1D-CNN and 2D-CNN, with max lengths of 16 and 32, in the third epoch are presented in Table 8 using the four metrics F1, Precision, Recall, and Accuracy.
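The four metrics reported in these tables follow Equations (6) to (9). As a minimal sketch of how they are obtained from the confusion counts, the plain-Python example below computes them for a toy set of labels. The function names and the toy labels are illustrative assumptions, not part of the original experiments; the Scikit-learn library mentioned in Section V provides equivalent functions in sklearn.metrics.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, FP, FN, TN) for a binary classification task."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1          # positive instance predicted as positive
        elif pred == positive:
            fp += 1          # negative instance predicted as positive
        elif truth == positive:
            fn += 1          # positive instance predicted as negative
        else:
            tn += 1          # negative instance predicted as negative
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 as in Equations (6)-(9)."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    # Guard every division so that empty or one-sided inputs return 0.0.
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = recall + precision
    f1 = 2 * recall * precision / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels invented for illustration: 1 = fake news, 0 = real news.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

With these toy labels there are 3 true positives, 1 false positive, 1 false negative, and 3 true negatives, so all four metrics equal 0.75.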


TABLE 8. Ara-news datasets using APBT with 1D-CNN, and 2D-CNN in the third epoch.

The results of applying the APBT to the Ara-news dataset using 2D-CNN, with max lengths of 16 and 32, in the second epoch are presented in Table 9.

TABLE 9. Ara-news datasets use APBT 2D-CNN in the second epoch.

We observe that the APBT with 2D-CNN has better results on some metrics in the third epoch, because the dataset is balanced and large, which delays the onset of overfitting.

C. THIRD EXPERIMENT RESULTS - (USING COVID19FAKE)
The results of applying the APBT to the Covid19Fakes dataset without using CNN for 6 epochs are presented in Table 10 using the four metrics F1, Precision, Recall, and Accuracy.

TABLE 10. Covid19Fake dataset uses APBT without CNN.

The results of applying the APBT to the Covid19Fakes dataset using 1D-CNN and 2D-CNN, with max lengths of 16 and 32, in the third epoch are presented in Table 11 using the four metrics F1, Precision, Recall, and Accuracy.

TABLE 11. Covid19Fakes datasets use APBT with 1D-CNN, and 2D-CNN in the third epoch.

The results of applying the APBT to the Covid19Fakes dataset using 2D-CNN, with max lengths of 16 and 32, in the second epoch are presented in Table 12.

TABLE 12. Covid19Fakes dataset with APBT and 2D-CNN in the second epoch.

We observe that the APBT with 2D-CNN has better results on some metrics in both the third and second epochs, because the dataset is very restricted to one topic and the 2D-CNN uses the output of all the layers of the APBT. Table 11 presents the APBT with 2D-CNN in the third epoch and Table 12 presents the APBT with 2D-CNN in the second epoch.

D. DISCUSSION
In this section, we discuss the results obtained by our model and by the previous papers. We also discuss the training dataset, which is one of the most important factors affecting the classifier performance. It is worth mentioning that the F1-score is a very important performance evaluation metric, especially when dealing with imbalanced datasets [8]. It is the harmonic mean of precision and recall, providing a single value that balances the trade-off between them. Precision measures the accuracy of the positive predictions (how many of the predicted positives are actually positive), and Recall (or Sensitivity) measures the ability of the model to find all the relevant cases within a dataset (how many of the actual positives were correctly identified). These two metrics can be at odds: increasing precision can decrease recall and vice versa. The F1-score combines them into a single metric that provides a more balanced evaluation, so we consider the F1-score alongside the accuracy.
In addition to the lack of Arabic datasets, there is no agreement on the method of studying the datasets. The ANS [4] dataset is published with training, validation, and test sets. We observe that our model outperforms the other two


models in two important metrics (Accuracy and Precision). These two metrics are more related to the TP. Our model outperforms the other two models because we use three powerful BERTs, as shown in Table 2, which are trained on large datasets; furthermore, we add another part, the 2D-CNN, which improves the results and the training time.
The APBTMs used achieved their highest performance in the sixth epoch, but the proposed model succeeded in outperforming them in the third and second epochs. This is an important point for detecting rumors, because time is very critical nowadays. To avoid any suspicion of overfitting, we only consider the results in the second epoch to evaluate the model and compare it with other models. Using the 2D-CNN enhances the results; this can be due to the way it works, because it takes the output from all layers of the APBTM. This enriches the information extracted from the tweets, so the features become clearer. As a result, the classification step becomes more accurate.
In the case of the ANS dataset, we predict that our model would outperform the other models in all of the evaluation metrics.

TABLE 13. Comparison results of the proposed model on ANS dataset with related models.

As per the previous results, we can conclude that the datasets play a major role in fake news detection processes. The dataset used in [50] was very unbalanced and small, since it has 768 tweets labeled false, 1428 tweets labeled true, and 3296 tweets labeled as others. The highest accuracy was 80% and the best F1-score was 0.79. In [51], the authors used two datasets: the first was collected and written in Arabic, and the second was from Kaggle and translated into Arabic. The best results were from the first dataset, since it is a very natural one that reflects the Arabic language and is easy for the pre-trained model to train on. The other dataset was translated and unbalanced, so its results were not as good as those of the first dataset. In [52], the authors tried to enhance the detection performance by including images with the texts. Although their dataset was relatively small (only 4025 records), they achieved a good F1-score of 0.85. In [54], the authors used a balanced dataset consisting of 16600 entries, which achieved an accuracy of 91.4%. In [43], the authors used a small dataset consisting of only 322 data records, which achieved an F1-score of 98.73%. Therefore, we have found that, to develop an effective classifier, we should initially train our model on a balanced dataset composed of natural language samples written in the desired language. This dataset should be of a suitable size and manually annotated by experts.

FIGURE 11. Performance comparison between the proposed model with another two models on the ANS dataset.

VII. CONCLUSION
In this study, a novel architecture is presented for Arabic fake news detection. The architecture consists of two main parts. The first part is the APBT, which can be AraBERT [1], GigaBERT [2], or MARBERT [3]. This part encodes the sentences and extracts the features from them, then passes its output to a CNN, which can be a 1D-CNN in one experiment or a 2D-CNN in another, to reduce the features so that the last layer can easily classify the text. This model is evaluated using three datasets: Covid19Fakes [6], AraNews [5], and ANS [4].
The experimental results are promising compared to APBTMs without CNN. Our model outperforms the other two models developed by Sorour and Abdelkader [7] and Shishah [8] because we use a powerful APBTM (AraBERT), as shown in Table 13, which is trained on large datasets, as shown in Table 3. Figure 11 shows a comparison between our proposed model and the other two mentioned models. Also, we can note the power of AraBERT in Tables 4, 7, and 10, where we test the three APBTMs without CNN and find that AraBERT achieves the best results in most experiments compared to MARBERT and GigaBERT. Furthermore, we added another part, the 2D-CNN, which improved the results and the training time. The model reaches 71% accuracy on the ANS dataset in the second epoch with a max-length of 32, 77.51% accuracy on the Aranews dataset in the second epoch with a max-length of 32, and 81.07% on the Covid19fakes dataset in the second epoch with a max-length of 32, so we can say that the proposed model reduces the number of epochs from six in the APBTMs to two in our proposed model, AraBERT with 2D-CNN.
For future work, we plan to add more datasets. In addition, a pre-processing step will be added to discover syntax errors in sentences and correct them to improve the results. Moreover, we can add some pre-processing steps to catch the latest attempts to blur some word features in order to deceive the classifiers while still reaching the audience, such as writing words without diacritical marks or without the dots that distinguish similar Arabic letters from each other. We aim to build a model that goes beyond classification and provides the real news after detecting the fake ones, to help the audience know the truth.
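The text pre-processing used in this study (removing URLs, user mentions, hashtags, emojis, and non-Arabic characters, dropping repeated sentences, and filtering by length) can be illustrated with a minimal sketch, given below together with one hypothetical way of handling the diacritical marks mentioned above. This is not the authors' implementation: the function names, the regular expressions, the choice to drop a whole hashtag rather than only the ‘‘#’’ sign, the removal of diacritics so that text written with and without them maps to the same form, and the measurement of length in words are all assumptions made for illustration.

```python
import re

# Assumed patterns; the paper does not specify the exact rules.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#\w+")
# Arabic diacritical marks U+064B..U+0652, plus the tatweel U+0640.
DIACRITICS_RE = re.compile(r"[\u064B-\u0652\u0640]")
# Anything that is neither an Arabic letter (U+0621..U+064A) nor whitespace.
NON_ARABIC_RE = re.compile(r"[^\u0621-\u064A\s]")


def clean_text(text):
    """Strip URLs, mentions, hashtags, diacritics, and non-Arabic characters."""
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = HASHTAG_RE.sub(" ", text)
    text = DIACRITICS_RE.sub("", text)
    text = NON_ARABIC_RE.sub(" ", text)   # also removes emojis and digits
    return " ".join(text.split())         # collapse repeated whitespace


def preprocess(samples, min_words=9, max_words=32):
    """Clean each sample, drop repeats, and keep lengths in [min_words, max_words]."""
    seen = set()
    kept = []
    for sample in samples:
        cleaned = clean_text(sample)
        n_words = len(cleaned.split())
        if cleaned in seen or not (min_words <= n_words <= max_words):
            continue
        seen.add(cleaned)
        kept.append(cleaned)
    return kept


# Illustrative input only; the length bounds are lowered for the short demo.
raw = ["خبر عاجل http://t.co/abc @user", "خبر عاجل!", "news only"]
print(preprocess(raw, min_words=1, max_words=5))
```

The default bounds of 9 and 32 mirror the sentence lengths selected for the Covid19Fakes and AraNews datasets; whether those lengths are counted in words or tokens is not stated, so the unit here is an assumption.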


REFERENCES
[1] W. Antoun, F. Baly, and H. Hajj, ‘‘AraBERT: Transformer-based model for Arabic language understanding,’’ 2020, arXiv:2003.00104.
[2] W. Lan, Y. Chen, W. Xu, and A. Ritter, ‘‘An empirical study of pre-trained transformers for Arabic information extraction,’’ 2020, arXiv:2004.14519.
[3] M. Abdul-Mageed, A. Elmadany, and E. M. B. Nagoudi, ‘‘ARBERT & MARBERT: Deep bidirectional transformers for Arabic,’’ 2020, arXiv:2101.01785.
[4] J. Khouja, ‘‘Stance prediction and claim verification: An Arabic perspective,’’ 2020, arXiv:2005.10410.
[5] E. M. B. Nagoudi, A. Elmadany, M. Abdul-Mageed, T. Alhindi, and H. Cavusoglu, ‘‘Machine generation and detection of Arabic manipulated and fake news,’’ 2020, arXiv:2011.03092.
[6] M. K. Elhadad, K. F. Li, and F. Gebali, ‘‘COVID-19-fakes: A Twitter (Arabic/English) dataset for detecting misleading information on COVID-19,’’ in Proc. 12th Int. Conf. Adv. Intell. Netw. Collaborative Syst. (INCoS). Victoria, BC, Canada: Springer, 2021, pp. 256–268.
[7] S. E. Sorour and H. E. Abdelkader, ‘‘AFND: Arabic fake news detection with an ensemble deep CNN-LSTM model,’’ J. Theor. Appl. Inf. Technol., vol. 100, no. 14, pp. 5072–5086, 2022.
[8] W. Shishah, ‘‘JointBert for detecting Arabic fake news,’’ IEEE Access, vol. 10, pp. 71951–71960, 2022.
[9] K. Comeforo, ‘‘Review essay: Manufacturing consent: The political economy of the mass media,’’ Global Media Commun., vol. 6, no. 2, pp. 218–230, Aug. 2010.
[10] C. Dewey, ‘‘Facebook has repeatedly trended fake news since firing its human editors,’’ Washington Post, 2016, vol. 12.
[11] X. Zhou, R. Zafarani, K. Shu, and H. Liu, ‘‘Fake news: Fundamental theories, detection strategies and challenges,’’ in Proc. 12th ACM Int. Conf. Web Search Data Mining, Jan. 2019, pp. 836–837.
[12] International Fact-Checking Network. Accessed: Mar. 17, 2024. [Online]. Available: https://www.poynter.org/ifcn/
[13] The Washington Post. (Year) The Washington Post Fact Checker. Accessed: Mar. 17, 2024. [Online]. Available: https://www.washingtonpost.com/politics/fact-checker/
[14] A. Onan and H. Alhumyani, ‘‘Contextual hypergraph networks for enhanced extractive summarization: Introducing multi-element contextual hypergraph extractive summarizer (MCHES),’’ Appl. Sci., vol. 14, no. 11, p. 4671, May 2024.
[15] S. Feng, R. Banerjee, and Y. Choi, ‘‘Syntactic stylometry for deception detection,’’ in Proc. 50th Annu. Meeting Assoc. Comput. Linguistics, vol. 2, Jul. 2012, pp. 171–175.
[16] M. Potthast, J. Kiesel, K. Reinartz, J. Bevendorff, and B. Stein, ‘‘A stylometric inquiry into hyperpartisan and fake news,’’ 2017, arXiv:1702.05638.
[17] P. Nakov, F. Alam, S. Shaar, G. Da San Martino, and Y. Zhang, ‘‘A second pandemic? Analysis of fake news about COVID-19 vaccines in Qatar,’’ 2021, arXiv:2109.11372.
[18] M. Alkhair, K. Meftouh, K. Smaili, and N. Othman, ‘‘An Arabic corpus of fake news: Collection, analysis and classification,’’ in Proc. 7th Int. Conf. Arabic Language Process., Nancy, France, Jul. 2019, pp. 292–302.
[19] H. Q. Abonizio, J. I. de Morais, G. M. Tavares, and S. B. Junior, ‘‘Language-independent fake news detection: English, Portuguese, and Spanish mutual features,’’ Future Internet, vol. 12, no. 5, p. 87, May 2020.
[20] J. Beel, B. Gipp, S. Langer, and C. Breitinger, ‘‘Paper recommender systems: A literature survey,’’ Int. J. Digit. Libraries, vol. 17, pp. 305–338, Nov. 2016.
[21] C. Cortes and V. Vapnik, ‘‘Support-vector networks,’’ Mach. Learn., vol. 20, pp. 273–297, 1995.
[22] Y. Wang, F. Ma, Z. Jin, Y. Yuan, G. Xun, K. Jha, L. Su, and J. Gao, ‘‘EANN: Event adversarial neural networks for multi-modal fake news detection,’’ in Proc. 24th ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining, Jul. 2018, pp. 849–857.
[23] O. Ajao, D. Bhowmik, and S. Zargari, ‘‘Fake news identification on Twitter with hybrid CNN and RNN models,’’ in Proc. 9th Int. Conf. Social Media Soc., Jul. 2018, pp. 226–230.
[24] M. Umer, Z. Imtiaz, S. Ullah, A. Mehmood, G. S. Choi, and B.-W. On, ‘‘Fake news stance detection using deep learning architecture (CNN-LSTM),’’ IEEE Access, vol. 8, pp. 156695–156706, 2020.
[25] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, and I. Polosukhin, ‘‘Attention is all you need,’’ in Proc. Adv. Neural Inf. Process. Syst., vol. 30, 2017.
[26] S. Hochreiter and J. Schmidhuber, ‘‘Long short-term memory,’’ Neural Comput., vol. 9, no. 8, pp. 1735–1780, Nov. 1997.
[27] K. Cho, B. van Merrienboer, C. Gulcehre, D. Bahdanau, F. Bougares, H. Schwenk, and Y. Bengio, ‘‘Learning phrase representations using RNN encoder–decoder for statistical machine translation,’’ 2014, arXiv:1406.1078.
[28] A. Hussein, N. Ghneim, and A. Joukhadar, ‘‘DamascusTeam at NLP4IF2021: Fighting the Arabic COVID-19 infodemic on Twitter using AraBERT,’’ in Proc. 4th Workshop NLP Internet Freedom, Censorship, Disinformation, Propaganda, 2021, pp. 93–98.
[29] D. Mehta, A. Dwivedi, A. Patra, and M. A. Kumar, ‘‘A transformer-based architecture for fake news classification,’’ Social Netw. Anal. Mining, vol. 11, no. 1, pp. 1–12, Dec. 2021.
[30] R. K. Kaliyar, A. Goswami, and P. Narang, ‘‘FakeBERT: Fake news detection in social media with a BERT-based deep learning approach,’’ Multimedia Tools Appl., vol. 80, no. 8, pp. 11765–11788, Mar. 2021.
[31] J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, ‘‘BERT: Pre-training of deep bidirectional transformers for language understanding,’’ 2018, arXiv:1810.04805.
[32] A. Radford, J. Wu, R. Child, D. Luan, D. Amodei, and I. Sutskever, ‘‘Language models are unsupervised multitask learners,’’ OpenAI blog, vol. 1, no. 8, p. 9, 2019.
[33] A. H. Nasution and A. Onan, ‘‘ChatGPT label: Comparing the quality of human-generated and LLM-generated annotations in low-resource language NLP tasks,’’ IEEE Access, vol. 12, pp. 71876–71900, 2024.
[34] A. Onan and K. F. Balbal, ‘‘Improving Turkish text sentiment classification through task-specific and universal transformations: An ensemble data augmentation approach,’’ IEEE Access, vol. 12, pp. 4413–4458, 2024.
[35] A. Onan, ‘‘SRL-ACO: A text augmentation framework based on semantic role labeling and ant colony optimization,’’ J. King Saud Univ.-Comput. Inf. Sci., vol. 35, no. 7, Jul. 2023, Art. no. 101611.
[36] A. Onan and H. A. Alhumyani, ‘‘FuzzyTP-BERT: Enhancing extractive text summarization with fuzzy topic modeling and transformer networks,’’ J. King Saud Univ.-Comput. Inf. Sci., vol. 36, no. 6, Jul. 2024, Art. no. 102080.
[37] A. Onan, ‘‘Hierarchical graph-based text classification framework with contextual node embedding and BERT-based dynamic fusion,’’ J. King Saud Univ.-Comput. Inf. Sci., vol. 35, no. 7, Jul. 2023, Art. no. 101610.
[38] L. Alzubaidi, J. Zhang, A. J. Humaidi, A. Al-Dujaili, Y. Duan, O. Al-Shamma, J. Santamaría, M. A. Fadhel, M. Al-Amidie, and L. Farhan, ‘‘Review of deep learning: Concepts, CNN architectures, challenges, applications, future directions,’’ J. Big Data, vol. 8, no. 1, pp. 1–74, Mar. 2021.
[39] F. Alam, H. Sajjad, M. Imran, and F. Ofli, ‘‘CrisisBench: Benchmarking crisis-related social media datasets for humanitarian information processing,’’ in Proc. Int. AAAI Conf. Web Social Media, vol. 15, 2021, pp. 923–932.
[40] F. Alam, F. Dalvi, S. Shaar, N. Durrani, H. Mubarak, A. Nikolov, D. S. Martino, A. Abdelali, H. Sajjad, and K. Darwish, ‘‘Fighting the COVID-19 infodemic in social media: A holistic perspective and a call to arms,’’ in Proc. Int. AAAI Conf. Web Social Media, vol. 15, May 2021, pp. 913–922.
[41] S. Aphiwongsophon and P. Chongstitvatana, ‘‘Detecting fake news with machine learning method,’’ in Proc. 15th Int. Conf. Electr. Eng./Electron., Comput., Telecommun. Inf. Technol. (ECTI-CON), Jul. 2018, pp. 528–531.
[42] H. Himdi, G. Weir, F. Assiri, and H. Al-Barhamtoshy, ‘‘Arabic fake news detection based on textual analysis,’’ Arabian J. Sci. Eng., vol. 47, no. 8, pp. 10453–10469, Aug. 2022.
[43] H. M. Alawadh, A. Alabrah, T. Meraj, and H. T. Rauf, ‘‘Attention-enriched mini-BERT fake news analyzer using the Arabic language,’’ Future Internet, vol. 15, no. 2, p. 44, Jan. 2023.
[44] P. Bharadwaj and Z. Shao, ‘‘Fake news detection with semantic features and text mining,’’ Int. J. Natural Lang. Comput., vol. 8, no. 3, pp. 17–22, Jun. 2019.
[45] Y. Wang, L. Wang, Y. Yang, and T. Lian, ‘‘SemSeq4FD: Integrating global semantic relationship and local sequential order to enhance text representation for fake news detection,’’ Exp. Syst. Appl., vol. 166, Mar. 2021, Art. no. 114090.
[46] A. M. P. Braşoveanu and R. Andonie, ‘‘Integrating machine learning techniques in semantic fake news detection,’’ Neural Process. Lett., vol. 53, no. 5, pp. 3055–3072, Oct. 2021.
[47] S. M. Padnekar, G. S. Kumar, and P. Deepak, ‘‘BiLSTM-autoencoder architecture for stance prediction,’’ in Proc. Int. Conf. Data Sci. Eng. (ICDSE), Dec. 2020, pp. 1–5.


[48] D. Pomerleau and D. Rao. (2017). Fake News Challenge Stage 1 (FNCI): Stance Detection. [Online]. Available: www.fakenewschallenge.org
[49] W. Y. Wang, ‘‘‘Liar, liar pants on fire’: A new benchmark dataset for fake news detection,’’ 2017, arXiv:1705.00648.
[50] G. Amoudi, R. Albalawi, F. Baothman, A. Jamal, H. Alghamdi, and A. Alhothali, ‘‘Arabic rumor detection: A comparative study,’’ Alexandria Eng. J., vol. 61, no. 12, pp. 12511–12523, Dec. 2022.
[51] A. B. Nassif, A. Elnagar, O. Elgendy, and Y. Afadar, ‘‘Arabic fake news detection based on deep contextualized embedding models,’’ Neural Comput. Appl., vol. 34, no. 18, pp. 16019–16032, Sep. 2022.
[52] R. M. Albalawi, A. T. Jamal, A. O. Khadidos, and A. M. Alhothali, ‘‘Multimodal Arabic rumors detection,’’ IEEE Access, vol. 11, pp. 9716–9730, 2023.
[53] Z. S. Ali, W. Mansour, T. Elsayed, and A. Al-Ali, ‘‘Arafacts: The first large Arabic dataset of naturally occurring claims,’’ in Proc. 6th Arabic Natural Lang. Process. Workshop, 2021, pp. 231–236.
[54] T. A. Wotaifi and B. N. Dhannoon, ‘‘An effective hybrid deep neural network for Arabic fake news detection,’’ Baghdad Sci. J., vol. 20, no. 4, p. 1392, Jan. 2023.
[55] A. Alokla, W. Gad, W. Nazih, M. Aref, and A.-B. Salem, ‘‘Pseudocode generation from source code using the BART model,’’ Mathematics, vol. 10, no. 21, p. 3967, Oct. 2022.
[56] I. Abu El-khair, ‘‘1.5 billion words Arabic corpus,’’ 2016, arXiv:1611.04033.
[57] I. Zeroual, D. Goldhahn, T. Eckart, and A. Lakhouaja, ‘‘OSIAN: Open source international Arabic news corpus—Preparation and integration into the CLARIN-infrastructure,’’ in Proc. 4th Arabic Natural Lang. Process. Workshop, 2019, pp. 175–182.
[58] P. J. O. Suárez, B. Sagot, and L. Romary, ‘‘Asynchronous pipeline for processing huge corpora on medium to low resource infrastructures,’’ in Proc. 7th Workshop Challenges Manag. Large Corpora, 2019, pp. 1–9.

NERMIN ABDELHAKIM OTHMAN received the Ph.D. degree in information systems from Helwan University. She is currently a Lecturer of informatics with Helwan University and The British University in Egypt. Her research interests include data mining, machine learning, deep learning, and sentiment analysis.

DOAA S. ELZANFALY received the joint Ph.D. degree in distributed query processing from Helwan University, Egypt, and the University of Connecticut, USA. She is currently an Associate Professor of informatics with Helwan University and The British University in Egypt. Her research interests include data management and mining, Arabic sentiment analysis and opinion mining, keyword search using ontologies, and rumor detection and identification.

MOSTAFA MAHMOUD M. ELHAWARY received the B.Sc. degree (Hons.) from the Military Technical College, Cairo, Egypt, in 2016. He is currently pursuing the M.Sc. degree in artificial intelligence with the Faculty of Computers and Artificial Intelligence, Helwan University, Cairo. His research interests include artificial intelligence, data science, machine learning, social network analysis, and software engineering.
