Arabic Fake News Detection Using Deep Learning
ABSTRACT In recent years, the explosive growth of social media platforms has led to the rapid spread of vast amounts of false news and rumors on the internet. This disrupts various news sources such as online news outlets, radio and television stations, and newspapers, especially in Arab countries. The fake news detection problem has therefore been raised worldwide, yet Arabic research in this field is scarce compared to English research. Previous researchers used machine learning and deep learning techniques on a large scale, but have recently turned to pre-trained models. Our proposed model uses an Arabic pre-trained Bidirectional Encoder Representations from Transformers (Arabic BERT) model to extract features from the text, then uses a Convolutional Neural Network (1D-CNN or 2D-CNN) to reduce the feature size and extract the important features, and finally passes them to an artificial neural network for classification. In our experiments, we introduce a novel hybrid system consisting of two main parts. In the first part, we try one of three Arabic pre-trained Bidirectional Encoder Representations from Transformers models (APBTM): AraBERT, GigaBERT, or MARBERT; in the second part, we use a 1D-CNN or a 2D-CNN. This leads to six architectures. We train and evaluate every architecture using three datasets: Arabic News Stance (ANS), AraNews, and Covid19Fakes. A comparison is made between the proposed model and other modern models that used the same datasets. We conducted three sets of experiments depending on the dataset used; each set includes a group of experiments, and the results are presented in tables. Our proposed model, the hybrid of AraBERT and the 2D-CNN, achieved the best F1-scores of 0.6188, 0.7837, and 0.8009 on the ANS, AraNews, and Covid19Fakes datasets, respectively. Furthermore, the model reduces the training time by achieving better results with fewer epochs. The results indicate that the proposed model offers the best performance, with 71% accuracy on the Arabic News Stance (ANS) dataset, outperforming the models by Sorour and Abdelkader (2022) and by Shishah (2022), which achieved accuracies of 67% and 66%, respectively.
INDEX TERMS BERT, CNN, deep learning, fake news detection, machine learning.
countries, especially developing countries [10]. It has become necessary to find modern ways to detect false news, because the huge daily volume of false news makes it impossible to rely on manual or traditional methods [11].

Many studies have been conducted to overcome this problem. For example, there are sites specially designed to verify the authenticity of news, such as the International Fact-Checking Network [12] and The Washington Post Fact Checker [13]. All of these sites are trusted, but they have become insufficient because of their limited capacity in the face of the huge daily volume of news. The term automatic detection of false news emerged to address this problem, which is classified as a natural language processing (NLP) problem. The NLP domain suffers from the difficulty of capturing nuanced interdependencies between different elements in a document, which is important in this field [14]. Researchers have put great effort into fake news detection. For example, there are models based on the semantic analysis of the news [15], [16], and there are also models [17], [18], [19] that extract features from the text, such as Term Frequency-Inverse Document Frequency (TF-IDF) [20], a numerical statistic that reflects the importance of a word in a document relative to a collection of documents (the corpus), and then classify them with Machine Learning (ML) models such as the Support Vector Machine (SVM) [21], a supervised learning model that analyzes data for classification and regression.
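For illustration, the following minimal sketch shows this classical TF-IDF-plus-SVM pipeline in scikit-learn; the CSV path and the "text"/"label" column names are hypothetical, not taken from any of the cited studies.

```python
# A minimal sketch of the TF-IDF + SVM pipeline described above.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import train_test_split
import pandas as pd

df = pd.read_csv("news.csv")  # hypothetical columns: text, label (0=real, 1=fake)
X_train, X_test, y_train, y_test = train_test_split(
    df["text"], df["label"], test_size=0.2, random_state=42)

# TF-IDF turns each document into a sparse vector of term weights;
# LinearSVC then learns a separating hyperplane over those vectors.
clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2), max_features=50_000),
                    LinearSVC())
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
```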
However, the use of Deep Learning (DL) has recently shown better results than the previous methods in classifying false news, for example in models that use the Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) [7], [22], [23], [24]. LSTMs are a type of recurrent neural network (RNN) capable of learning long-term dependencies, which makes them suitable for tasks such as language modeling and sequence prediction. Similarly, Gated Recurrent Units (GRUs) are another type of RNN designed to capture long-term dependencies; GRUs are known for being simpler and faster to train than LSTMs. However, the Transformer model [25], introduced by Google in 2017, reduced the training time compared to models based on the LSTM [26] and the GRU [27], while also achieving better results. Therefore, pre-trained models based on the Transformer [28], [29], [30], such as Bidirectional Encoder Representations from Transformers (BERT) [31] and the Generative Pre-trained Transformer 2 (GPT-2) [32], are now frequently used in text classification.

Although this problem has been well studied, most fake news detection research is done on English-language datasets. Thus, the detection of Arabic fake news remains an interesting problem. Detecting fake news in Arabic faces two main challenges. First, there is no large Arabic fake news dataset publicly available; most of the few available datasets contain only the record (tweet, post, or news) ID and its label, so researchers face difficulties in retrieving or scraping the dataset records. The second challenge is that the Arabic language is very different from the English language, so better text representation and text embedding are needed for the learning part of the detection algorithm. In [33], Nasution and Onan discuss the challenges of low-resource languages, which affect any NLP task. In [34], the authors handle the limited annotated data in a low-resource language, Turkish, by using data augmentation, generating synthetic data to expand the diversity and quantity of the annotated dataset. In [35], the authors propose a text augmentation system that expands an NLP model's training data using semantic role labeling (SRL) and ant colony optimization (ACO).

In this paper, two models are proposed. Both are based on pre-trained Arabic Transformer models combined with a CNN; the difference between them is the structure of the CNN. The first model applies a one-dimensional CNN to the output of the pre-trained Arabic model, while the second model applies a two-dimensional CNN to the outputs of all layers of the pre-trained Arabic model.

This paper is organized as follows: Section II presents the background, Section III presents related work, Section IV shows the proposed model, Section V shows the performance evaluation, Section VI presents the experiment results and discussion, and Section VII concludes the paper.

II. BACKGROUND
In this section, we introduce and explain the main elements to be employed in the proposed model.

A. BIDIRECTIONAL ENCODER REPRESENTATIONS FROM TRANSFORMERS
In 2018, Google researchers developed a pre-trained language model that depends fundamentally on Transformers [31]. Over time, BERT has become a very powerful tool in Natural Language Processing (NLP), achieving state-of-the-art performance on a wide range of tasks, including question answering, sentiment analysis, text classification, and language translation. BERT can also be augmented with other techniques to enhance its ability. In [36], the authors describe the effect of integrating BERT with fuzzy logic to enhance contextual understanding in NLP tasks. In [37], the authors build a text classification method that integrates hierarchical graph-based modeling with BERT for dynamic fusion across many stages. What is new with BERT is its ability to perform bidirectional training, which means it can take into account the context of a word by looking both forward and backward in a sentence. This is achieved by using Transformers, a neural network architecture that can process sequential data such as text. BERT is pre-trained on a large corpus of text using the masked language modeling (MLM) objective. In MLM, some words are masked, and the model is trained to predict the masked words based on the context provided by the surrounding words. This way of training helps the model learn general language patterns that can be applied to a wide range of NLP tasks.
After the pre-training process, BERT is ready to be fine-tuned by training it on a specific dataset and adding a special output layer; the fine-tuned BERT can then be used in a specific domain. There are variants of the BERT model designed for different languages, each with its own training data and architecture, such as AraBERT [1], MARBERT [3], and GigaBERT [2].
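As a hedged illustration of this fine-tuning setup, the sketch below loads one of these variants for binary classification with the Hugging Face transformers library; the checkpoint name is AraBERT's public one, the added two-label output head stands in for the special output layer, and the training loop is omitted.

```python
# Minimal sketch: load a pre-trained Arabic BERT and add a binary
# classification head. Any APBTM variant could be swapped in here.
from transformers import AutoTokenizer, AutoModelForSequenceClassification

checkpoint = "aubmindlab/bert-base-arabertv2"   # AraBERT's public checkpoint
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(
    checkpoint, num_labels=2)                   # fake vs. real

batch = tokenizer(["نص الخبر هنا"], padding=True, truncation=True,
                  max_length=32, return_tensors="pt")
outputs = model(**batch)                        # outputs.logits: shape (1, 2)
```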
B. CONVOLUTIONAL NEURAL NETWORKS
DL is a set of machine learning techniques that has become increasingly popular in recent years. One of the most widely used DL algorithms is the CNN, which has been successfully applied to areas such as computer vision, Natural Language Processing (NLP), and image classification. CNNs have emerged as a promising tool for DL applications due to their ability to accurately detect and classify features in data with minimal pre-processing. They can extract meaningful features from images, and they can also capture patterns and relationships in textual data.

CNNs consist of several layers of neurons connected through convolutional operations. The convolutional layers are the primary components of a CNN and are responsible for feature extraction. Each convolutional layer is designed to detect different features in the data, such as edges and shapes. This capability allows the model to recognize patterns in data and make predictions based on those patterns. In addition, CNNs can learn from data and adapt as new data is added, which allows the model to learn tasks more complex than those it was originally trained for. As a result, CNNs have been used effectively in image classification and object recognition applications [38]. Furthermore, CNNs can process large datasets quickly and accurately, making them well suited for DL applications.
1) ONE-DIMENSION CNN
A one-dimensional CNN (1D-CNN) is a deep learning architecture commonly used for processing and analyzing one-dimensional sequential data, such as text or time series. It operates on a linear sequence of data points, where each point represents a feature or an element in the sequence. 1D-CNNs have been applied to fake news detection to capture local patterns and dependencies within textual data; in this case, the input is a sequence of words, each represented as a vector.

FIGURE 1. Using 1D-CNN.
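A minimal sketch of such a 1D-CNN classification head in PyTorch is shown below. It follows the input layout described in the proposed model (token positions as channels, convolving along the embedding axis); the filter count and kernel size are our assumptions, not the paper's.

```python
# Sketch of a 1D-CNN head over the last BERT layer; hyperparameters
# (n_filters, kernel) are assumptions for illustration only.
import torch
import torch.nn as nn

max_length, hidden_size = 32, 768          # typical BERT-base dimensions

class OneDCNNHead(nn.Module):
    def __init__(self, n_filters=64, kernel=5):
        super().__init__()
        self.conv = nn.Conv1d(max_length, n_filters, kernel_size=kernel)
        self.pool = nn.AdaptiveMaxPool1d(1)  # keep the strongest response
        self.fc = nn.Linear(n_filters, 2)    # fake vs. real

    def forward(self, last_hidden_state):    # (batch, max_length, hidden_size)
        x = torch.relu(self.conv(last_hidden_state))
        x = self.pool(x).squeeze(-1)          # (batch, n_filters)
        return self.fc(x)

logits = OneDCNNHead()(torch.randn(4, max_length, hidden_size))
print(logits.shape)                           # torch.Size([4, 2])
```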
2) TWO-DIMENSION CNN
A two-dimensional CNN (2D-CNN) is a deep learning architecture commonly used for processing and analyzing two-dimensional data, such as images. It operates on a grid-like structure, where each element in the grid represents a specific feature or pixel. As we know, an Arabic pre-trained bidirectional transformer model (APBTM) works serially, in that each layer's output is the input of the following layer. In the proposed model, we introduce a novel way to benefit from each layer's output, because we assume that each layer contributes some information about the features in its output. To carry out this model, we have to use the 2D-CNN.

In our system, the CNN is used to extract the important features from the output of the Arabic pre-trained model (AraBERT, GigaBERT, or MARBERT). There are two CNN architectures for this task. The first uses a one-dimensional CNN to extract features from the last layer, where the input size equals the word embedding size of the Arabic pre-trained model and the input depth equals its max-length; Figure 1 shows this architecture. The second uses a two-dimensional CNN to extract features from all the layers, where the word embedding size of the Arabic pre-trained model is the input height, the max-length is the width, and the input depth equals the number of layers plus one; Figure 2 shows this architecture.

FIGURE 2. Using 2D-CNN.
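The following hedged sketch shows how this all-layers input can be assembled and convolved in PyTorch for a BERT-base-style APBTM (12 layers plus the embedding layer gives 13 channels); the filter count and kernel size are assumptions.

```python
# Sketch of the 2D-CNN head over the stacked outputs of all APBTM layers:
# channels = layers + 1, height = embedding size, width = max-length.
import torch
import torch.nn as nn

n_layers, hidden_size, max_length = 13, 768, 32   # 12 layers + embeddings

class TwoDCNNHead(nn.Module):
    def __init__(self, n_filters=64, kernel=3):
        super().__init__()
        self.conv = nn.Conv2d(n_layers, n_filters, kernel_size=kernel)
        self.pool = nn.AdaptiveMaxPool2d(1)
        self.fc = nn.Linear(n_filters, 2)

    def forward(self, all_hidden_states):  # (batch, 13, hidden_size, max_length)
        x = torch.relu(self.conv(all_hidden_states))
        x = self.pool(x).flatten(1)          # (batch, n_filters)
        return self.fc(x)

# model(..., output_hidden_states=True) returns a tuple of 13 tensors of
# shape (batch, max_length, hidden_size); stack them and swap the axes:
hs = [torch.randn(4, max_length, hidden_size) for _ in range(n_layers)]
stacked = torch.stack(hs, dim=1).transpose(2, 3)  # (4, 13, 768, 32)
print(TwoDCNNHead()(stacked).shape)               # torch.Size([4, 2])
```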
III. RELATED WORKS
In this section, we present the related work on fake news detection, which has been studied intensively with two main methods in the artificial intelligence community: the ML method and the DL method.

A. MACHINE LEARNING METHOD
Nakov et al. [17] classify false news using an SVM with a dataset [39] that has seven classes for each type of news item. They use classes two and six along with the technique described in [40].

Alkhair et al. [18] classify false news from YouTube comments about celebrity deaths using the ML classifiers SVM, Decision Tree (DT), and Multinomial Naïve Bayes (MNB), after extracting TF-IDF features from the sentences. This study suffers from the small size of the dataset and the imbalance between fake and real news.

Abonizio et al. [19] used ML to detect fake news in three languages: English, Spanish, and Portuguese. They applied three steps: preprocessing, feature extraction, and classification. Preprocessing includes conversion of text to UTF-8, filtering, and noise reduction, while feature extraction uses NLP to extract grammatical information (complexity, stylistic, and psychological) from each document or sentence. Classification uses SVM, k-nearest neighbor (k-NN), Random Forest (RF), and eXtreme Gradient Boosting (XGB) to classify news into three classes: fake, legitimate, and satire. The authors found that the classifier detects satirical news with higher accuracy than it predicts fake and legitimate news, i.e., the classifier performs well only on satirical news. The Spanish and Portuguese datasets do not contain a satire class, so the authors had to take additional steps to fill this gap, which may affect the datasets' quality. The authors also did not consider that every language has its own characteristics, so any dataset may need additional preprocessing steps.

Aphiwongsophon and Chongstitvatana [41] use a Naïve Bayes (NB) algorithm, and then an SVM and a neural network (NN) to improve the results. The results show that the SVM and the neural network perform better than NB on the Twitter datasets. Although the results are satisfactory, the dataset was scraped over a narrow period of at most two months, which could limit the diversity of its content. The study feeds the classifier twenty-two raw attributes, such as ID, name, is-verified, and friends count; relying on this many attributes restricts the applicability of the model, because in most cases we cannot provide all of them.

Himdi et al. [42] extract four categories of features (emotional expressiveness, syntactic-semantic roles, part of speech, and contextual polarity) from an Arabic textual news dataset of 1098 records. They use three classifiers, NB, RF, and SVM, to detect fake news. The best classifier was RF with a 79% F1-score. This model cannot be generalized because it is trained only on a very restricted topic, the Hajj. Another limitation is that the fake news in the dataset was fabricated empirically, so it cannot simulate natural fake news; in addition, the dataset itself is relatively small.

Alawadh et al. [43] use an Arabic dataset consisting of 322 records. They apply different ML and DL classifiers to detect fake news. They try different ratios when splitting the dataset into training and testing records, namely 70/30, 80/20, and 90/10, where the first number is the training ratio and the second is the testing ratio. The best performance was achieved by DT and RF with an F1-score of 98.73%. This study used a very small dataset.

B. DEEP LEARNING METHOD
Wang et al. [22] use a multi-modal model based on a CNN for feature extraction and a neural network for fake news classification and event discrimination. The model has two inputs: text and image. The text is converted into a vector using word embedding, and image features are extracted using the Visual Geometry Group network-19 (VGG-19) model. The feature model is used for news classification and the event discriminator for detecting which event is involved. The model is called Event Adversarial Neural Networks (EANN). Although this study has its advantages, it also suffers from some limitations. The model takes images as input side by side with texts to extract features and train the classifier, but not all news is accompanied by images, so performance would decrease; in addition, fake textual news may be accompanied by a real image unrelated to the event, so the model can be fooled. Adversarial training may also suffer from a limited ability to generate variations and from instability. Finally, the model consists of three main parts, which increases its complexity and makes its integration difficult.

Bharadwaj and Shao [44] pre-process the raw texts to extract semantic features for Recurrent Neural Networks (RNNs). The model achieved an accuracy of 95.66% with the random forest classifier.

Wang et al. [45] propose a model named SemSeq4FD, which aims to enhance text representation for early fake news detection. The model uses a pair of representations of the same text: one captures the semantic relations between sentences as a complete graph, and the other is a local text representation obtained with a one-dimensional convolutional neural network (1D-CNN). Combining the two representations yields an enhanced representation, which is then passed to an LSTM to detect fake news. This model achieved a 2.315% improvement in F1-score on the English datasets and 1.553% on the Chinese datasets. The model consists of three sections, so its complexity is high.

Adrian et al. [46] involve semantic methods in the fake news detection process. They use semantic features extracted from the texts, such as sentiments, alongside the labeled texts (the original features of the datasets) to train ML and DL classifiers. The DL classifiers beat the ML ones before integrating the semantic features; after adding the semantic features, the accuracy increased by 4.2%.

Ajao et al. [23] use three different DL models trained on one dataset.
The first model includes word embedding to convert the words, uses a CNN to reduce the dimensionality of the embedded word vectors, trains an LSTM and a dense (fully connected) layer, and then performs classification. The second model contains word embedding, an LSTM with dropout, a dense (fully connected) layer, and classification. The last model contains word embedding, an LSTM, a dense (fully connected) layer, and classification. Although this study achieves high accuracy, it suffers from low values in the other evaluation metrics, such as recall, precision, and F-measure. The drop in these values may be caused by limitations of the training dataset, such as class imbalance or bias.

Padnekar et al. [47] present a model for fake news detection based on Bidirectional Long Short-Term Memory (Bi-LSTM) and an Autoencoder, trained on the Fake News Challenge dataset FNC-1 [48]. The model takes two news inputs, the headline and the body, which are converted to vectors by word embedding and feature extraction. The Autoencoder is a pair of LSTMs connecting the two outputs. The output of the Autoencoder passes to a dense layer for normalization into a linear vector, and the news is then classified into four classes: Agree, Disagree, Discuss, and Unrelated. The authors do not explain the essential difference between the Unrelated and Disagree classes. The dataset is very imbalanced, as 73% of it is Unrelated, and this can affect the model's performance.

Sorour and Abdelkader [7] use a CNN- and LSTM-based model to detect fake news from Twitter with the Arabic News Stance (ANS) [4] dataset. The first step embeds the text to convert it to a vector. The second step applies one-dimensional convolution and pooling to the embedded words. The third step passes the output of the CNN into an LSTM. The last step uses a dense (fully connected) layer to label the LSTM output. This study suffers from many mistakes in the table that presents the distribution of the ANS dataset [4], and in the confusion matrices too.

Hussein et al. [28] use AraBERT for Twitter false news detection on Arabic content, with the special token [CLS] used as the first input token for the classifier. The output of [CLS] is connected to a feed-forward neural network (FFNN), and the output is then normalized between 0 and 1 using the sigmoid function. The dataset is relatively small, only 2556 tweets with 7 labels; this number of labels combined with the small dataset size may inflate the metric values while decreasing the model's efficiency.

Mehta et al. [29] proposed two models based on BERT for fake news classification using the LIAR-PLUS [47] and LIAR [49] datasets. Each model uses one BERT per branch of the dataset, so the model consists of only two BERTs when dealing with LIAR [49] and of three BERTs when dealing with LIAR-PLUS [47]; each model shares the weights between its BERTs and concatenates their outputs. Dropout is used to avoid overfitting, and the last layer uses a fully connected layer with the soft-max function for output classification. This model is complex since it uses multiple BERTs.

Shishah [8] uses Relative Features Classification (RFC) and Named Entity Recognition (NER) to enhance the ability of BERT to detect fake news. The RFC and NER are integrated with BERT through a shared parameter layer.

Amoudi et al. [50] conducted a comparative study on detecting COVID-19 rumors using a dataset of 4299 Arabic records classified into three classes: true, false, and other. The best F1-score, 79%, was achieved by LSTM and Bi-LSTM with the root mean square optimizer. In [51], the authors conduct a study using eight BERT transformer-based models on two datasets: one originally Arabic, consisting of 10000 records, and one translated from English, consisting of 16000 tweets. The original Arabic dataset was balanced, the other unbalanced. The original Arabic dataset gave the best results; the best F1-score was 98.9%, obtained by the ARBERT-based model.

In [43], the authors use a pre-trained mini-BERT classifier with different splitting ratios, as mentioned before. The best performance was achieved with the 90/10 ratio, with an F1-score of 98.73%.

Albalawi et al. [52] work on texts with images to detect rumors. They create two branches, one for extracting text features and the other for extracting visual features, then concatenate them and pass them to a classifier. They used a dataset of 4025 tweets, 1726 from the AraFacts dataset [53] and 2299 collected by the authors, and obtained a 0.85 F1-score. Models that use images in the classification process suffer from the fact that not all news is accompanied by images, and most fake news containing images tries to fool the public with real images unrelated to the event, so the classifier can be fooled.

Wotaifi and Dhannoon [54] use the AraNews [5] dataset with a total of 16600 records, split into 8406 fake records and 8194 real records. They propose a hybrid model consisting of a CNN and an LSTM to detect fake news. They used accuracy as the indication of model performance and achieved 91.4%. We note that the authors evaluated the model with accuracy only and did not present the other metrics, especially the F-measure. Table 1 briefly summarizes the previous work done in this field.

IV. PROPOSED MODEL
In this section, a detailed description of the proposed framework is presented. The primary goal is to build a fake news detection model that includes two main parts. One part is the CNN, which operates on the features extracted by the Arabic pre-trained bidirectional transformer model (APBTM), the other main part. Two system architectures based on the APBTM and CNN are presented. The first one depends on passing the output of the last layer of the APBTM to the one-dimensional CNN (1D-CNN) for convolutional processing.
This architecture failed to achieve the best results in most of the experiments, as shown in Section VI. Figure 3 shows the first architecture. The second architecture depends on passing the outputs of all layers of the APBTM to the two-dimensional CNN (2D-CNN) for convolutional processing. Our experimental results showed that the proposed hybrid model between AraBERT (APBTM) and the 2D-CNN, shown in Figure 4, achieves the highest performance in most of the experiments, which are discussed in Section VI.

A. TEXT PRE-PROCESSING
This step is applied before using the dataset to train our model. We remove any non-Arabic characters, hashtags, URLs, user mentions, and all the emojis. Then we should
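A minimal sketch of this cleaning step is shown below; the exact rules are not fully specified in the text, so the regular expressions are assumptions that implement the removals just listed.

```python
# Sketch of the pre-processing step: strip URLs, mentions, hashtags, and
# everything outside the basic Arabic Unicode block (emojis included).
import re

ARABIC_BLOCK = r"\u0600-\u06FF"   # basic Arabic Unicode range (assumption)

def clean_text(text: str) -> str:
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)  # URLs
    text = re.sub(r"[@#]\S+", " ", text)                # mentions, hashtags
    text = re.sub(fr"[^{ARABIC_BLOCK}\s]", " ", text)   # non-Arabic chars
    return re.sub(r"\s+", " ", text).strip()            # collapse whitespace

print(clean_text("تحقق من هذا الخبر https://t.co/x #عاجل @user 😀"))
```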
Attention(Q, K, V) = softmax(QK^T / √d_k) · V    (3)

where d_k is the dimension of the keys, Q is the queries, K is the keys, and V is the values.

The multi-head attention layer allows the model to focus on different parts of the input data simultaneously in different representations. This is done in Equation 5 by concatenating the outputs of the individual attention heads defined in Equation 3, where the weights of an individual head are calculated as in Equation 4 [25]:

head_i = Attention(QW_i^Q, KW_i^K, VW_i^V)    (4)
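As a concrete transcription of Equations 3 and 4, the following sketch implements scaled dot-product attention and a single head in PyTorch; the dimensions are example values of our choosing.

```python
# Direct transcription of Equations 3 and 4 with example dimensions.
import torch
import torch.nn.functional as F

def attention(Q, K, V):
    d_k = K.size(-1)
    scores = Q @ K.transpose(-2, -1) / d_k ** 0.5   # QK^T / sqrt(d_k)
    return F.softmax(scores, dim=-1) @ V            # Equation (3)

seq, d_model, d_k = 32, 768, 64
W_q, W_k, W_v = (torch.randn(d_model, d_k) for _ in range(3))
x = torch.randn(seq, d_model)
head_i = attention(x @ W_q, x @ W_k, x @ W_v)       # Equation (4)
print(head_i.shape)                                  # torch.Size([32, 64])
```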
A. DESCRIPTION OF DATASETS
In this section, we introduce all the datasets used to train and test our model. We use three datasets: Covid19Fakes [6], AraNews [5], and ANS [4].

Covid19Fakes [6]: a COVID-19 Twitter dataset that is automatically annotated in Arabic and English. Covid19Fakes contains 22000 tweet IDs with a list of labels for each tweet ID in the Arabic part. Some of the tweets and users have since been removed from Twitter. Figure 9 presents the method of dataset collection from Twitter.

FIGURE 9. Dataset collection from Twitter.

FIGURE 10. Used datasets distribution.

For the Covid19Fakes dataset, we apply a pre-processing step and then remove the very short samples.
We select samples with sentence lengths between 9 and 32 and keep 7000 fake news samples and 7000 real news samples. Hence, our Covid19Fakes dataset consists of 14000 samples.

AraNews [5]: a collection of Arabic disinformation drawn from a variety of newspapers in 15 different Arabic countries. This dataset also contains machine-generated news learned from real news. The AraNews dataset contains 486961 samples. Its disadvantages are the empty samples, the very short samples of one or two words (no news item is two words), the very long samples (more than 512 words), samples containing numbers unrelated to the news content, and class imbalance. After removing the short, long, and empty samples, we select samples with sentence lengths between 9 and 32 and finally obtain 7000 fake news samples and 7000 real news samples. Hence, our AraNews dataset consists of 14000 samples.

ANS [4]: a dataset for the task of stance detection in Arabic news articles. The data was drawn from several news outlets, including the BBC and CNN. It is a good dataset, but unfortunately small in size.

Table 3 shows the description of the datasets used in the experiments, and Figure 10 illustrates their distribution.

TABLE 3. Description of datasets used.
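The selection rule above can be sketched as follows; the file name and column names are hypothetical, and taking the first 7000 samples per class is our assumption, since the text does not specify how the 7000 are chosen.

```python
# Sketch of the sample-selection rule: keep sentence lengths between
# 9 and 32 words, then balance to 7000 fake + 7000 real samples.
import pandas as pd

df = pd.read_csv("aranews.csv")            # hypothetical columns: text, label
n_words = df["text"].str.split().str.len()
df = df[(n_words >= 9) & (n_words <= 32)]

balanced = pd.concat([df[df["label"] == "fake"].head(7000),
                      df[df["label"] == "real"].head(7000)])
print(len(balanced))                        # 14000
```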
B. EXPERIMENT DESIGN
In supervised learning, algorithms are trained on labeled datasets, where the data has pre-defined input values and corresponding outputs, to learn a mapping function from the inputs to the outputs so that the model can make predictions on new data. In our experiment, we use three datasets, AraNews [5], Covid19Fakes [6], and ANS [4], to train three BERTs, AraBERT, GigaBERT, and MARBERT, one at a time, and then pass the output from each one to a CNN, which can be a 1D-CNN or a 2D-CNN. In each trial, we use several different max-lengths and then evaluate the results at different numbers of epochs. Our experiment therefore depends on a hybrid system consisting mainly of two parts: BERT and CNN.
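The resulting experiment grid can be expressed compactly; the sketch below only enumerates the combinations (training code omitted), with the two max-length values mirroring those used in Section VI.

```python
# Sketch of the experiment grid: 3 APBTMs x 2 CNN heads x 3 datasets
# x 2 max-lengths; each combination is trained and evaluated separately.
from itertools import product

apbtms = ["AraBERT", "GigaBERT", "MARBERT"]
cnn_heads = ["1D-CNN", "2D-CNN"]
datasets = ["ANS", "AraNews", "Covid19Fakes"]
max_lengths = [16, 32]

for ds, bert, head, max_len in product(datasets, apbtms, cnn_heads, max_lengths):
    print(f"train {bert} + {head} on {ds} (max_length={max_len})")
```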
C. EVALUATION METRICS
In the classification task, there are many metrics for evaluating a model's performance. The metrics used here are accuracy, recall, precision, and F1-score, calculated as follows:

• Accuracy: In ML, the accuracy score is commonly used to evaluate model performance. When implementing a binary classification algorithm, we use the terms False-Positive (FP), False-Negative (FN), True-Negative (TN), and True-Positive (TP) to obtain an accurate evaluation of the model's performance. A true-positive is a correct prediction of a positive instance, while a false-positive is an incorrect prediction in which a negative instance is predicted as positive. Accuracy is calculated as:

Accuracy = (TP + TN) / (TP + FP + FN + TN)    (6)

• Precision: Precision, or positive predictive value, is the fraction of relevant instances among the retrieved instances; it depends on TP and FP:

Precision = TP / (TP + FP)    (7)

• Recall: Recall, or sensitivity, is the fraction of relevant instances that were retrieved; it depends on TP and FN:

Recall = TP / (TP + FN)    (8)

• F1-score: The F1-score includes FP and FN findings and weights them evenly. F1 is more informative than accuracy when the class distribution is uneven. The precision and recall values must first be computed before the F1-score can be determined. A classifier's precision indicates how accurate it is: a significant number of false results is a sign of poor precision. A classifier's recall indicates how comprehensive it is: failing to retrieve the relevant outcomes reveals numerous false-negatives. The F1-score is calculated as:

F1 = (2 × Recall × Precision) / (Recall + Precision)    (9)
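Equations 6-9 map directly onto standard library calls; the sketch below computes all four metrics with scikit-learn on toy label vectors.

```python
# The four metrics of Equations 6-9 via scikit-learn (toy labels).
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="binary")
print(f"accuracy={accuracy_score(y_true, y_pred):.3f} "
      f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```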
We present three sets of experiments on the three datasets. In the first experiment, we present the results of the APBT without using a CNN; the second experiment presents the results of the APBT with a 1D-CNN applied to the output of its last layer; and the third experiment presents the results of the APBT with a 2D-CNN applied to the outputs of all its layers. These three experiments are applied to the three datasets ANS, Ara-news, and Covid19Fakes.

A. FIRST EXPERIMENT RESULTS (USING ANS)
The results of applying the APBT to the ANS dataset without using a CNN, with 6 epochs, are presented in Table 4 using the four metrics F1, Precision, Recall, and Accuracy.

TABLE 4. ANS datasets use APBT without CNN.

The results of applying the APBT to the ANS dataset using a 1D-CNN and a 2D-CNN, with max-lengths of 16 and 32, in the third epoch are presented in Table 5 using the four metrics F1, Precision, Recall, and Accuracy.

TABLE 5. ANS datasets use APBT with 1D-CNN, and 2D-CNN in the third epoch.

Table 6 presents the APBT with the 2D-CNN in the second epoch.

TABLE 6. ANS datasets use APBT with 2D-CNN in the second epoch.

B. SECOND EXPERIMENT RESULTS (USING ARA-NEWS)
The results of applying the APBT to the Ara-news dataset without using a CNN, with 6 epochs, are presented in Table 7 using the four metrics F1, Precision, Recall, and Accuracy.

TABLE 7. Ara-news datasets using APBT without CNN.

The results of applying the APBT to the Ara-news dataset using a 1D-CNN and a 2D-CNN, with max-lengths of 16 and 32, in the third epoch are presented in Table 8 using the four metrics F1, Precision, Recall, and Accuracy.
TABLE 8. Ara-news datasets using APBT with 1D-CNN, and 2D-CNN in the third epoch.

The results of applying the APBT to the Ara-news dataset using the 2D-CNN, with max-lengths of 16 and 32, in the second epoch are presented in Table 9.

TABLE 9. Ara-news datasets use APBT 2D-CNN in the second epoch.

C. THIRD EXPERIMENT RESULTS (USING COVID19FAKES)

TABLE 11. Covid19Fakes datasets use APBT with 1D-CNN, and 2D-CNN in the third epoch.

The results of applying the APBT to the Covid19Fakes dataset using the 2D-CNN, with max-lengths of 16 and 32, in the second epoch are presented in Table 12.

TABLE 12. Covid19Fakes dataset with APBT and 2D-CNN in the second epoch.
[48] D. Pomerleau and D. Rao. (2017). Fake News Challenge Stage 1 (FNC-1): Stance Detection. [Online]. Available: www.fakenewschallenge.org
[49] W. Y. Wang, ''Liar, liar pants on fire': A new benchmark dataset for fake news detection,'' 2017, arXiv:1705.00648.
[50] G. Amoudi, R. Albalawi, F. Baothman, A. Jamal, H. Alghamdi, and A. Alhothali, ''Arabic rumor detection: A comparative study,'' Alexandria Eng. J., vol. 61, no. 12, pp. 12511–12523, Dec. 2022.
[51] A. B. Nassif, A. Elnagar, O. Elgendy, and Y. Afadar, ''Arabic fake news detection based on deep contextualized embedding models,'' Neural Comput. Appl., vol. 34, no. 18, pp. 16019–16032, Sep. 2022.
[52] R. M. Albalawi, A. T. Jamal, A. O. Khadidos, and A. M. Alhothali, ''Multimodal Arabic rumors detection,'' IEEE Access, vol. 11, pp. 9716–9730, 2023.
[53] Z. S. Ali, W. Mansour, T. Elsayed, and A. Al-Ali, ''AraFacts: The first large Arabic dataset of naturally occurring claims,'' in Proc. 6th Arabic Natural Lang. Process. Workshop, 2021, pp. 231–236.
[54] T. A. Wotaifi and B. N. Dhannoon, ''An effective hybrid deep neural network for Arabic fake news detection,'' Baghdad Sci. J., vol. 20, no. 4, p. 1392, Jan. 2023.
[55] A. Alokla, W. Gad, W. Nazih, M. Aref, and A.-B. Salem, ''Pseudocode generation from source code using the BART model,'' Mathematics, vol. 10, no. 21, p. 3967, Oct. 2022.
[56] I. Abu El-Khair, ''1.5 billion words Arabic corpus,'' 2016, arXiv:1611.04033.
[57] I. Zeroual, D. Goldhahn, T. Eckart, and A. Lakhouaja, ''OSIAN: Open source international Arabic news corpus—Preparation and integration into the CLARIN-infrastructure,'' in Proc. 4th Arabic Natural Lang. Process. Workshop, 2019, pp. 175–182.
[58] P. J. O. Suárez, B. Sagot, and L. Romary, ''Asynchronous pipeline for processing huge corpora on medium to low resource infrastructures,'' in Proc. 7th Workshop Challenges Manag. Large Corpora, 2019, pp. 1–9.

MOSTAFA MAHMOUD M. ELHAWARY received the B.Sc. degree (Hons.) from the Military Technical College, Cairo, Egypt, in 2016. He is currently pursuing the M.Sc. degree in artificial intelligence with the Faculty of Computers and Artificial Intelligence, Helwan University, Cairo. His research interests include artificial intelligence, data science, machine learning, social network analysis, and software engineering.

NERMIN ABDELHAKIM OTHMAN received the Ph.D. degree in information systems from Helwan University. She is currently a Lecturer of informatics with Helwan University and The British University in Egypt. Her research interests include data mining, machine learning, deep learning, and sentiment analysis.

DOAA S. ELZANFALY received the joint Ph.D. degree in distributed query processing from Helwan University, Egypt, and the University of Connecticut, USA. She is currently an Associate Professor of informatics with Helwan University and The British University in Egypt. Her research interests include data management and mining, Arabic sentiment analysis and opinion mining, keyword search using ontologies, and rumor detection and identification.