Search Results (111)

Search Parameters:
Keywords = deepfake

20 pages, 25584 KiB  
Article
LIDeepDet: Deepfake Detection via Image Decomposition and Advanced Lighting Information Analysis
by Zhimao Lai, Jicheng Li, Chuntao Wang, Jianhua Wu and Donghua Jiang
Electronics 2024, 13(22), 4466; https://doi.org/10.3390/electronics13224466 - 14 Nov 2024
Viewed by 302
Abstract
The proliferation of AI-generated content (AIGC) has empowered non-experts to create highly realistic Deepfake images and videos using user-friendly software, posing significant challenges to the legal system, particularly in criminal investigations, court proceedings, and accident analyses. The absence of reliable Deepfake verification methods threatens the integrity of legal processes. In response, researchers have explored deep forgery detection, proposing various forensic techniques. However, the swift evolution of deep forgery creation and the limited generalizability of current detection methods impede practical application. We introduce a new deep forgery detection method that utilizes image decomposition and lighting inconsistency. By exploiting inherent discrepancies in imaging environments between genuine and fabricated images, this method extracts robust lighting cues and mitigates disturbances from environmental factors, revealing deeper-level alterations. A crucial element is the lighting information feature extractor, designed according to color constancy principles, to identify inconsistencies in lighting conditions. To address lighting variations, we employ a face material feature extractor using Pattern of Local Gravitational Force (PLGF), which selectively processes image patterns with defined convolutional masks to isolate and focus on reflectance coefficients, rich in textural details essential for forgery detection. Utilizing the Lambertian lighting model, we generate lighting direction vectors across frames to provide temporal context for detection. This framework processes RGB images, face reflectance maps, lighting features, and lighting direction vectors as multi-channel inputs, applying a cross-attention mechanism at the feature level to enhance detection accuracy and adaptability. Experimental results show that our proposed method performs exceptionally well and is widely applicable across multiple datasets, underscoring its importance in advancing deep forgery detection.
(This article belongs to the Special Issue Deep Learning Approach for Secure and Trustworthy Biometric System)
Figures:
Figure 1: Imaging process of digital image.
Figure 2: Process of image generation using generative adversarial networks.
Figure 3: Architecture of the proposed method.
Figure 4: Illustration of artifacts in deep learning-generated faces. The right-most image shows over-rendering around the nose area.
Figure 5: Illustration of inconsistent iris colors in generated faces.
Figure 6: Visualization of illumination maps for real images and four forgery methods from the FF++ database.
Figure 7: Face material map after illumination normalization. Abnormal traces in the eye and mouth regions are more noticeable.
Figure 8: Visualization of face material maps for the facial regions in real images and four forgery methods from the FF++ database for the same frame.
Figure 9: Three-dimensional lighting direction vector.
Figure 10: Two-dimensional lighting direction vector.
Figure 11: Calculation process of lighting direction.
Figure 12: Calculation of the angle of lighting direction.
Figure 13: Comparison of lighting direction angles between real videos and their corresponding Deepfake videos.
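The Lambertian lighting model mentioned in the abstract admits a compact classical formulation: pixel intensity is approximately albedo times the dot product of surface normal and light direction, so the light direction can be recovered by least squares from lit pixels. The following is a minimal sketch of that generic estimate, not the authors' implementation; in practice the normals would come from a 3D face fit, and all data below are synthetic.

```python
import numpy as np

def estimate_light_direction(normals: np.ndarray, intensities: np.ndarray) -> np.ndarray:
    """Least-squares lighting direction under the Lambertian model.

    Assumes I_i ~ albedo * dot(n_i, l) with roughly constant albedo, so
    stacking lit pixels gives the linear system N @ l = I; the solution
    is normalized to a unit direction vector.
    """
    l, *_ = np.linalg.lstsq(normals, intensities, rcond=None)
    return l / np.linalg.norm(l)

# Toy check: shade random normals with a known light, then recover it.
rng = np.random.default_rng(0)
normals = rng.normal(size=(500, 3))
normals /= np.linalg.norm(normals, axis=1, keepdims=True)
true_l = np.array([0.3, 0.5, 0.81]) / np.linalg.norm([0.3, 0.5, 0.81])
shading = np.clip(normals @ true_l, 0.0, None)  # shadowed pixels clip to 0
lit = shading > 0
print(estimate_light_direction(normals[lit], shading[lit]))  # ~= true_l
```

Comparing such per-frame direction estimates over a video, as Figure 13 does, is one way temporal lighting inconsistency can surface in fakes.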
59 pages, 11596 KiB  
Review
Fake News Detection Revisited: An Extensive Review of Theoretical Frameworks, Dataset Assessments, Model Constraints, and Forward-Looking Research Agendas
by Sheetal Harris, Hassan Jalil Hadi, Naveed Ahmad and Mohammed Ali Alshara
Technologies 2024, 12(11), 222; https://doi.org/10.3390/technologies12110222 - 6 Nov 2024
Viewed by 947
Abstract
The emergence and acceptance of digital technology have caused information pollution and an infodemic on Online Social Networks (OSNs), blogs, and online websites. The malicious broadcast of illegal, objectionable and misleading content causes behavioural changes and social unrest, impacts economic growth and national security, and threatens users’ safety. The proliferation of AI-generated misleading content has further intensified the situation. State-of-the-art (SOTA) methods have been implemented for Fake News Detection (FND) in the previous literature, but the existing research lacks multidisciplinary considerations for FND grounded in theories about FN and OSN users. Analysis of such theories provides insights into effective and automated detection mechanisms for FN, as well as the intentions and causes behind wide-scale FN propagation. This review evaluates the available datasets, FND techniques and approaches, and their limitations. Its novel contribution is the analysis of FND in linguistics, healthcare, communication, and other related fields. It also summarises explicable methods for FN dissemination, identification and mitigation. The review identifies that the prediction performance of pre-trained transformer models provides fresh impetus for multilingual (including resource-constrained languages), multidomain, and multimodal FND. Their limits and prediction capabilities must be harnessed further to combat FN, which is possible through the curation and use of large, multidomain, multimodal, cross-lingual, multilingual, labelled and unlabelled datasets. SOTA Large Language Models (LLMs) are a key innovation, and their strengths should be researched and harnessed to combat FN, deepfakes, and AI-generated content on OSNs and online sources. The study highlights the significance of human cognitive abilities and the potential of AI in the domain of FND. Finally, we suggest promising future research directions for FND and mitigation.
Figures:
Figure 1: FN types in terms of veracity value and velocity [52].
Figure 2: Different aspects of FN.
Figure 3: Statistical overview of digital news readers worldwide [61].
Figure 4: Review paper flow.
Figure 5: The methodology of the literature review and research process.
Figure 6: Dataset curation process.
Figure 7: A pie chart of the benchmark datasets used in the studies of Fake News Detection.
Figure 8: A pie chart of the benchmark datasets used in the studies of fake news articles.
Figure 9: A pie chart of the benchmark datasets used in the studies of other datasets related to FND.
Figure 10: Fake News Detection (FND) techniques and approaches.
Figure 11: Different Fake News Detection approaches used for content-based analysis.
Figure 12: Multimodal Fake News Detection process.
Figure 13: Algorithm classification for existing Fake News Detection.
Figure 14: NLP FND framework.
Figure 15: Pre-trained ensemble.
23 pages, 6995 KiB  
Article
Discrete Fourier Transform in Unmasking Deepfake Images: A Comparative Study of StyleGAN Creations
by Vito Nicola Convertini, Donato Impedovo, Ugo Lopez, Giuseppe Pirlo and Gioacchino Sterlicchio
Information 2024, 15(11), 711; https://doi.org/10.3390/info15110711 - 6 Nov 2024
Viewed by 434
Abstract
This study proposes a novel forgery detection method based on the analysis of the frequency components of images using the Discrete Fourier Transform (DFT). In recent years, face manipulation technologies, particularly Generative Adversarial Networks (GANs), have advanced to such an extent that their misuse, such as creating deepfakes indistinguishable from real images to human observers, has become a significant societal concern. We reviewed two GAN architectures, StyleGAN and StyleGAN2, generating synthetic faces that were compared with real faces from the FFHQ and CelebA-HQ datasets. The key results demonstrate classification accuracies above 99%, with F1 scores of 99.94% for Support Vector Machines and 97.21% for Random Forest classifiers. These findings underline that frequency analysis offers a superior approach to deepfake detection compared with traditional spatial methods. It provides insight into subtle manipulation cues in digital images and offers a scalable way to enhance security protocols amid rising digital impersonation threats.
(This article belongs to the Special Issue Computer Vision, Pattern Recognition and Machine Learning in Italy)
Figures:
Graphical abstract
Figure 1: A diagram of the components and main relationships of a classical GAN.
Figure 2: Transposed convolution and nearest neighbor interpolation [23].
Figure 3: The spectrum of a real face (left) and an artificially generated face (right). Note the artifacts highlighted in the zoomed-in sections [15].
Figure 4: An overview of the processing pipeline.
Figure 5: The image of a face (left), its counterpart in the frequency domain maintaining the same dimensions (center), and the one-dimensional representation (right).
Figure 6: The power spectrum obtained by averaging the images of each dataset. The behavior at high frequencies differs between real and fake faces. StyleGAN 2 comes close to reproducing the characteristics of real faces, but differences remain.
Figure 7: CelebA-HQ.
Figure 8: FFHQ.
Figure 9: StyleGAN 1.
Figure 10: StyleGAN 2.
Figure 11: Comparison scheme between the FFHQ dataset and StyleGAN 1, highlighting the distribution of high-frequency components for the classification of real and synthetic images.
Figure 12: FFHQ vs. StyleGAN 1. SVM with polynomial kernel; Random Forest with log2 randomization, bootstrap 0.8 and number of estimators = 30. The graph shows a clear difference between real and fake faces around the high frequencies.
Figure 13: FFHQ vs. StyleGAN 2. SVM with polynomial kernel; Random Forest with randomization log2, bootstrap 0.9 and number of estimators = 20. StyleGAN 2 moves closer to the real faces but still differs at high frequencies.
Figure 14: CelebA vs. StyleGAN 1. SVM with linear kernel; Random Forest with randomization log2, bootstrap 0.8 and number of estimators = 10. Clear differences between real and fake faces appear even before the high frequencies.
Figure 15: CelebA vs. StyleGAN 2. SVM with linear kernel; Random Forest with randomization log2, bootstrap 0.9 and number of estimators = 10. StyleGAN 2 again comes close to the real faces, but with discrepancies in the middle and high frequencies.
Figure 16: A diagram of the two vs. one test: detailed comparison of FFHQ, CelebA, and StyleGAN 2 for deepfake detection, with accuracy metrics surpassing 90%.
Figure 17: Diagram summarizing the steps taken for the FFHQ, CelebA vs. StyleGAN 1 and StyleGAN 2 tests.
Figure 18: Cross-dataset testing. Equal colors refer to equal testing phases (e.g., the testing set consisting of CelebA and StyleGAN2 in green pairs with the model trained on FFHQ and StyleGAN2 in green; the same applies for all other colors).
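The pipeline sketched in Figures 4 and 5 reduces each face to a one-dimensional frequency signature before classification. The following is a hedged reconstruction of that common recipe (2D DFT, log power spectrum, azimuthal averaging into radial bins), not the authors' exact code; the scikit-learn step at the end is a hypothetical usage mirroring the paper's classifiers.

```python
import numpy as np

def radial_power_spectrum(img: np.ndarray, n_bins: int = 128) -> np.ndarray:
    """Azimuthally averaged log power spectrum of a grayscale image.

    The centered 2D DFT power spectrum is averaged over rings of equal
    radius, giving the 1D profile typically fed to an SVM or Random Forest.
    """
    f = np.fft.fftshift(np.fft.fft2(img))
    power = np.log1p(np.abs(f) ** 2)
    h, w = img.shape
    y, x = np.indices((h, w))
    r = np.hypot(x - w / 2, y - h / 2)
    edges = np.linspace(0, r.max(), n_bins + 1)
    idx = np.digitize(r.ravel(), edges) - 1
    sums = np.bincount(idx, weights=power.ravel(), minlength=n_bins)
    counts = np.bincount(idx, minlength=n_bins)
    return sums[:n_bins] / np.maximum(counts[:n_bins], 1)

profile = radial_power_spectrum(np.random.rand(256, 256))
print(profile.shape)  # (128,): one feature vector per image

# Hypothetical downstream step:
# from sklearn.svm import SVC
# X = np.stack([radial_power_spectrum(im) for im in images])
# clf = SVC(kernel="poly").fit(X, labels)  # labels: 1 = GAN-generated
```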
14 pages, 543 KiB  
Article
CSTAN: A Deepfake Detection Network with CST Attention for Superior Generalization
by Rui Yang, Kang You, Cheng Pang, Xiaonan Luo and Rushi Lan
Sensors 2024, 24(22), 7101; https://doi.org/10.3390/s24227101 - 5 Nov 2024
Viewed by 423
Abstract
With the advancement of deepfake forgery technology, highly realistic fake faces have posed serious security risks to sensor-based facial recognition systems. Recent deepfake detection models mainly use binary classification models based on deep learning. Despite achieving high detection accuracy on intra-dataset evaluations, these models lack generalization ability when applied across datasets. We propose a deepfake detection model named the Channel-Spatial-Triplet Attention Network (CSTAN), which focuses on the difference between real and fake features, thereby enhancing the generalizability of the detection model. To enhance the model's feature-learning ability for image forgery regions, we design the Channel-Spatial-Triplet (CST) attention mechanism, which extracts subtle local information by capturing feature channels and the spatial correlation at three different scales. Additionally, we propose a novel feature extraction method, OD-ResNet-34, embedding ODConv into the feature extraction network to enhance its dynamic adaptability to data features. Trained on the FF++ dataset and tested on the Celeb-DF-v1 and Celeb-DF-v2 datasets, the experimental results show that our model has stronger cross-dataset generalization ability than similar models.
(This article belongs to the Special Issue Image Processing and Analysis for Object Detection: 2nd Edition)
Figures:
Figure 1: A deepfake forgery trace detection module with multi-attention mechanism fusion. The specific implementation of the Fake anchor and Extra layer follows Dong et al. [10]. The input image size for the network is 224 × 224, where N represents the number of images. The backbone network employs OD-ResNet-34, and the output feature dimensions after the backbone are 7 × 7. Finally, the designed forgery detection module outputs a 1 × 1 classification result.
Figure 2: OD-ResNet-34 network. The number after the × sign represents the quantity of corresponding convolutional layers; for example, the first ×5 denotes a convolution layer with a 3 × 3 kernel and 64 channels, of which there are five in the network. * is calculated as shown in Formula (1).
Figure 3: Attention computation across four dimensions: (a–d) represent the attention computation graphs for α_si, α_ci, α_fi, and α_wi, respectively.
Figure 4: Deepfake forgery trace detection module with CST attention mechanism.
Figure 5: SENet structure. * is the common multiplication sign.
Figure 6: Plot of loss convergence for different improvement points.
Figure 7: AUC plot obtained from training for 200 epochs on the CSTAN model for the FF++ dataset.
Figure 8: Randomized test results across datasets. The top figure shows the test results on Celeb-DF-V1, and the bottom figure on Celeb-DF-V2.
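The paper's CST module is only described at a high level here, so the sketch below is an assumption-laden stand-in rather than the authors' design: a generic triplet-style attention block (in the spirit of Misra et al.'s triplet attention) that gates the feature map along channel-height, channel-width, and height-width orientations and averages the three branches.

```python
import torch
import torch.nn as nn

class ZPool(nn.Module):
    """Concatenate max- and mean-pooling along the channel dimension."""
    def forward(self, x):
        return torch.cat([x.max(dim=1, keepdim=True).values,
                          x.mean(dim=1, keepdim=True)], dim=1)

class AttentionGate(nn.Module):
    """ZPool -> 7x7 conv -> sigmoid gate over one tensor orientation."""
    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.pool = ZPool()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        return x * torch.sigmoid(self.conv(self.pool(x)))

class TripletAttention(nn.Module):
    """Attend over (C,H), (C,W), and (H,W) interactions, then average."""
    def __init__(self):
        super().__init__()
        self.cw, self.ch, self.hw = AttentionGate(), AttentionGate(), AttentionGate()

    def forward(self, x):                                        # x: (N, C, H, W)
        b1 = self.cw(x.permute(0, 2, 1, 3)).permute(0, 2, 1, 3)  # swap H and C
        b2 = self.ch(x.permute(0, 3, 2, 1)).permute(0, 3, 2, 1)  # swap W and C
        b3 = self.hw(x)                                          # plain spatial
        return (b1 + b2 + b3) / 3.0

feats = torch.randn(2, 64, 56, 56)
print(TripletAttention()(feats).shape)  # torch.Size([2, 64, 56, 56])
```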
19 pages, 1699 KiB  
Article
Deep Speech Synthesis and Its Implications for News Verification: Lessons Learned in the RTVE-UGR Chair
by Daniel Calderón-González, Nieves Ábalos, Blanca Bayo, Pedro Cánovas, David Griol, Carlos Muñoz-Romero, Carmen Pérez, Pere Vila and Zoraida Callejas
Appl. Sci. 2024, 14(21), 9916; https://doi.org/10.3390/app14219916 - 30 Oct 2024
Viewed by 492
Abstract
This paper presents the multidisciplinary work carried out in the RTVE-UGR Chair within the IVERES project, whose main objective is the development of a tool for journalists to verify the authenticity of the audio that reaches newsrooms. In the current context, voice synthesis has both beneficial and detrimental applications, with audio deepfakes being a significant concern in the world of journalism due to their ability to mislead and misinform. This is a multifaceted problem that can only be tackled by adopting a multidisciplinary perspective. In this article, we describe the approach we adopted within the RTVE-UGR Chair to successfully address the challenges derived from audio deepfakes, involving a team with different backgrounds and a specific methodology of iterative co-creation. As a result, we present several outcomes, including the compilation and generation of audio datasets, the development and deployment of several audio fake detection models, and the development of a web audio verification tool addressed to journalists. In conclusion, we highlight the importance of this systematic collaborative work in the fight against misinformation and the future potential of audio verification technologies in various applications.
(This article belongs to the Topic Artificial Intelligence Models, Tools and Applications)
Figures:
Figure 1: Stages followed in our methodology and participating roles.
Figure 2: Datasets used/generated by the RTVE-UGR Chair.
Figure 3: Pipeline of the model creation process.
Figure 4: Interface of the audio deepfake detection tool developed by the RTVE-UGR Chair.
Figure 5: Verification process pipeline.
49 pages, 3154 KiB  
Review
An Investigation into the Utilisation of CNN with LSTM for Video Deepfake Detection
by Sarah Tipper, Hany F. Atlam and Harjinder Singh Lallie
Appl. Sci. 2024, 14(21), 9754; https://doi.org/10.3390/app14219754 - 25 Oct 2024
Viewed by 1132
Abstract
Video deepfake detection has emerged as a critical field within the broader domain of digital technologies, driven by the rapid proliferation of AI-generated media and the increasing threat of its misuse for deception and misinformation. The integration of Convolutional Neural Networks (CNNs) with Long Short-Term Memory (LSTM) has proven to be a promising approach for improving video deepfake detection, achieving near-perfect accuracy. CNNs enable the effective extraction of spatial features from video frames, such as facial textures and lighting, while the LSTM analyses temporal patterns, detecting inconsistencies over time. This hybrid model enhances the ability to detect deepfakes by combining spatial and temporal analysis. However, the existing research lacks systematic evaluations that comprehensively assess their effectiveness and optimal configurations. Therefore, this paper provides a comprehensive review of video deepfake detection techniques utilising hybrid CNN-LSTM models. It systematically investigates state-of-the-art techniques, highlighting common feature extraction approaches and widely used datasets for training and testing. This paper also evaluates model performance across different datasets, identifies key factors influencing detection accuracy, and explores how CNN-LSTM models can be optimised. It also compares CNN-LSTM models with non-LSTM approaches, addresses implementation challenges, and proposes solutions for them. Lastly, open issues and future research directions for video deepfake detection using CNN-LSTM are discussed. By reviewing CNN-LSTM models for video deepfake detection, this paper provides valuable insights for researchers and cyber security professionals, contributing to the advancement of robust and effective deepfake detection systems.
Figures:
Figure 1: The three-layer architecture of CNN.
Figure 2: A standard LSTM block.
Figure 3: A bidirectional LSTM block.
Figure 4: The five stages of the systematic literature review.
Figure 5: The outcome of the three-phase selection process.
Figure 6: Number of publications per year.
Figure 7: Common datasets used in video deepfake detection.
Figure 8: Video deepfake datasets used within the selected publications [2,14,25,32,34,36–46,48–71].
Figure 9: Confusion matrix.
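To make the CNN-LSTM pattern the review surveys concrete, here is a minimal PyTorch skeleton: a small CNN encodes spatial features frame by frame, an LSTM aggregates them over time, and the final hidden state yields a real/fake logit. Layer sizes and names are illustrative, not any surveyed model's configuration.

```python
import torch
import torch.nn as nn

class CnnLstmDetector(nn.Module):
    """Sketch of a hybrid detector: per-frame CNN features + temporal LSTM."""
    def __init__(self, feat_dim: int = 128, hidden: int = 64):
        super().__init__()
        self.cnn = nn.Sequential(                       # spatial encoder
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, feat_dim),
        )
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)                # real/fake logit

    def forward(self, clips):                           # clips: (N, T, 3, H, W)
        n, t = clips.shape[:2]
        feats = self.cnn(clips.flatten(0, 1)).view(n, t, -1)
        _, (h_n, _) = self.lstm(feats)                  # temporal aggregation
        return self.head(h_n[-1]).squeeze(-1)           # (N,) logits

clips = torch.randn(4, 16, 3, 112, 112)  # 4 clips of 16 RGB frames
print(CnnLstmDetector()(clips).shape)    # torch.Size([4])
```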
16 pages, 21131 KiB  
Article
GCS-YOLOv8: A Lightweight Face Extractor to Assist Deepfake Detection
by Ruifang Zhang, Bohan Deng, Xiaohui Cheng and Hong Zhao
Sensors 2024, 24(21), 6781; https://doi.org/10.3390/s24216781 - 22 Oct 2024
Viewed by 449
Abstract
To address the issues of target feature blurring and increased false detections caused by high compression rates in deepfake videos, as well as the high computational resource requirements of existing face extractors, we propose GCS-YOLOv8, a lightweight face extractor to assist deepfake detection. Firstly, we employ the HGStem module for initial downsampling to address the issue of false detections of small non-face objects in deepfake videos, thereby improving detection accuracy. Secondly, we introduce the C2f-GDConv module to mitigate the low-FLOPs pitfall while reducing the model's parameters, thereby lightening the network. Additionally, we add a new P6 large target detection layer to expand the receptive field and capture multi-scale features, solving the problem of detecting large-scale faces in low-compression deepfake videos. We also design a cross-scale feature fusion module called CCFG (CNN-based Cross-Scale Feature Fusion with GDConv), which integrates features from different scales to enhance the model's adaptability to scale variations while reducing network parameters, addressing the high computational resource requirements of traditional face extractors. Furthermore, we improve the detection head by utilizing group normalization and shared convolution, simplifying the process of face detection while maintaining detection performance. The training dataset was also refined by removing low-accuracy and low-resolution labels, which reduced the false detection rate. Experimental results demonstrate that, compared to YOLOv8, this face extractor achieves AP values of 0.942, 0.927, and 0.812 on the WiderFace dataset's Easy, Medium, and Hard subsets, representing improvements of 1.1%, 1.3%, and 3.7%, respectively. The model's parameters and FLOPs are only 1.68 MB and 3.5 G, reflecting reductions of 44.2% and 56.8%, making it more effective and lightweight in extracting faces from deepfake videos.
(This article belongs to the Section Intelligent Sensors)
Figures:
Figure 1: Structure of the YOLOv8.
Figure 2: Structure of the GCS-YOLOv8.
Figure 3: Structure of the HGStem.
Figure 4: Structure of the C2f-GDConv.
Figure 5: Structure of the Detect head and the GSCD.
Figure 6: Comparison of detection effects on WiderFace test sets.
Figure 7: Comparison of detection effects on Celeb-DF-v2 and FF++.
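As a usage illustration of the face-extraction role described above: the ultralytics inference API below is the standard one, but the weight file, video path, and the hand-off of crops to a downstream deepfake classifier are assumptions for the sketch, not the authors' released artifacts.

```python
import cv2
from ultralytics import YOLO

model = YOLO("gcs_yolov8_face.pt")           # hypothetical face-detection weights

cap = cv2.VideoCapture("suspect_video.mp4")  # illustrative input video
faces = []
while True:
    ok, frame = cap.read()
    if not ok:
        break
    result = model(frame, verbose=False)[0]
    for x1, y1, x2, y2 in result.boxes.xyxy.int().tolist():
        faces.append(frame[y1:y2, x1:x2])    # crop for the deepfake classifier
cap.release()
print(f"extracted {len(faces)} face crops")
```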
13 pages, 849 KiB  
Article
Audio Deep Fake Detection with Sonic Sleuth Model
by Anfal Alshehri, Danah Almalki, Eaman Alharbi and Somayah Albaradei
Computers 2024, 13(10), 256; https://doi.org/10.3390/computers13100256 - 8 Oct 2024
Viewed by 1238
Abstract
Information dissemination and preservation are crucial for societal progress, especially in the technological age. While technology fosters knowledge sharing, it also risks spreading misinformation. Audio deepfakes—convincingly fabricated audio created using artificial intelligence (AI)—exacerbate this issue. We present Sonic Sleuth, a novel AI model designed specifically for detecting audio deepfakes. Our approach utilizes advanced deep learning (DL) techniques, including a custom CNN model, to enhance detection accuracy in audio misinformation, with practical applications in journalism and social media. Through meticulous data preprocessing and rigorous experimentation, we achieved a remarkable 98.27% accuracy and a 0.016 equal error rate (EER) on a substantial dataset of real and synthetic audio. Additionally, Sonic Sleuth demonstrated 84.92% accuracy and a 0.085 EER on an external dataset. The novelty of this research lies in its integration of datasets that closely simulate real-world conditions, including noise and linguistic diversity, enabling the model to generalize across a wide array of audio inputs. These results underscore Sonic Sleuth’s potential as a powerful tool for combating misinformation and enhancing integrity in digital communications.
Figures:
Figure 1: Generative AI structure [3].
Figure 2: Deepfake detection approach.
Figure 3: Sonic Sleuth architecture.
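Since the paper reports equal error rate (EER) alongside accuracy, a short sketch of how EER is conventionally computed from detection scores may help. This is the standard formulation, not the authors' evaluation code, and the toy scores are invented.

```python
import numpy as np
from sklearn.metrics import roc_curve

def equal_error_rate(labels: np.ndarray, scores: np.ndarray) -> float:
    """EER: the operating point where false-positive and false-negative rates cross.

    labels: 1 = fake, 0 = real; scores: higher means more likely fake.
    """
    fpr, tpr, _ = roc_curve(labels, scores)
    fnr = 1 - tpr
    i = np.nanargmin(np.abs(fnr - fpr))   # threshold index where FPR ~= FNR
    return float((fpr[i] + fnr[i]) / 2)

labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])
scores = np.array([0.10, 0.20, 0.35, 0.60, 0.55, 0.80, 0.90, 0.95])
print(equal_error_rate(labels, scores))   # small EER: one real/fake overlap
```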
12 pages, 263 KiB  
Article
Possible Health Benefits and Risks of DeepFake Videos: A Qualitative Study in Nursing Students
by Olga Navarro Martínez, David Fernández-García, Noemí Cuartero Monteagudo and Olga Forero-Rincón
Nurs. Rep. 2024, 14(4), 2746-2757; https://doi.org/10.3390/nursrep14040203 - 3 Oct 2024
Viewed by 1348
Abstract
Background: “DeepFakes” are synthetic performances created by AI, using neural networks to exchange faces in images and modify voices. Objective: Due to the novelty and limited literature on its risks/benefits, this paper aims to determine how young nursing students perceive DeepFake technology, its ethical implications, and its potential benefits in nursing. Methods: This qualitative study used thematic content analysis (the Braun and Clarke method) with videos recorded by 50 third-year nursing students, who answered three questions about DeepFake technology. The data were analyzed using ATLAS.ti (version 22), and the project was approved by the Ethics Committee (code UCV/2021–2022/116). Results: Data analysis identified 21 descriptive codes, classified into four main themes: advantages, disadvantages, health applications, and ethical dilemmas. Benefits noted by students include use in diagnosis, patient accompaniment, training, and learning. Perceived risks include cyberbullying, loss of identity, and negative psychological impacts from unreal memories. Conclusions: Nursing students see both pros and cons in DeepFake technology and are aware of the ethical dilemmas it poses. They also identified promising healthcare applications that could enhance nurses’ leadership in digital health, stressing the importance of regulation and education to fully leverage its potential.
Graphical abstract
16 pages, 1482 KiB  
Article
SecureVision: Advanced Cybersecurity Deepfake Detection with Big Data Analytics
by Naresh Kumar and Ankit Kundu
Sensors 2024, 24(19), 6300; https://doi.org/10.3390/s24196300 - 29 Sep 2024
Viewed by 2016
Abstract
SecureVision is an advanced and trustworthy deepfake detection system created to tackle the growing threat of ‘deepfake’ movies that tamper with media, undermine public trust, and jeopardize cybersecurity. We present a novel approach that combines big data analytics with state-of-the-art deep learning algorithms to detect altered information in both audio and visual domains. One of SecureVision’s primary innovations is the use of multi-modal analysis, which improves detection capabilities by concurrently analyzing many media forms and strengthening resistance against advanced deepfake techniques. The system’s efficacy is further enhanced by its capacity to manage large datasets and integrate self-supervised learning, which guarantees its flexibility in the ever-changing field of digital deception. In the end, this study helps to protect digital integrity by providing a proactive, scalable, and efficient defense against the ubiquitous threat of deepfakes, thereby establishing a new benchmark for privacy and security measures in the digital era.
(This article belongs to the Special Issue Cybersecurity Attack and Defense in Wireless Sensors Networks)
Figures:
Figure 1: System architecture for audio deepfake detection.
Figure 2: System architecture for image deepfake detection.
Figure 3: Graphs of training vs. validation based on accuracy and loss.
Figure 4: Confusion matrix of audio detection.
Figure 5: Confusion matrix for image detection.
4 pages, 449 KiB  
Proceeding Paper
Detection of Frauds in Deep Fake Using Deep Learning
by Osipilli Aparna, Pakanati Rani, Tulluri Ramya, Tanneru Priyanka, Neela Sundari, P. G. K. Sirisha, Repudi Ramesh and Dama Anand
Eng. Proc. 2024, 66(1), 48; https://doi.org/10.3390/engproc2024066048 - 23 Sep 2024
Viewed by 577
Abstract
Research on DeepFake detection using deep neural networks (DNNs) has gained more attention in an effort to detect and categorize DeepFakes. In essence, DeepFakes are regenerated content made by changing particular DNN model elements. In this study, a summary of DeepFake detection methods for images and videos involving faces will be given based on their effectiveness, outcomes, methodology, and type of detection method. We will analyze and categorize the many DeepFake-generating techniques now in use into five primary classes. DeepFake datasets are frequently used to train and test DeepFake models. We will also cover the latest developments in DeepFake dataset trends that are currently accessible. We will also examine the problems in building a generalized DeepFake detection model. Lastly, the difficulties in creating and identifying DeepFakes will be covered.
Figures:
Figure 1: The CNN model’s fundamental structure.
19 pages, 964 KiB  
Article
Generalizing Source Camera Identification Based on Integral Image Optimization and Constrained Neural Network
by Yan Wang, Qindong Sun and Dongzhu Rong
Electronics 2024, 13(18), 3630; https://doi.org/10.3390/electronics13183630 - 12 Sep 2024
Viewed by 542
Abstract
Source camera identification can verify whether two videos were shot by the same device, which is of great significance in multimedia forensics. Most existing identification methods use convolutional neural networks to learn sensor noise patterns to identify the source camera in closed forensic scenarios. While these methodologies have achieved remarkable results, they are nonetheless constrained by two primary challenges: (1) the interference of semantic information and (2) the incongruity in feature distributions across different datasets. The former interferes with the model's extraction of effective features. The latter causes the model to fit the feature distribution of the training data and to be sensitive to unseen data features. To address these challenges, we propose a novel source camera identification framework that determines whether a video was shot by the same device by obtaining similarities between source camera features. Firstly, we extract video key frames and use the integral image to optimize the inter-pixel-variance smoothing-block selection algorithm, removing the interference of video semantic information. Secondly, we design a residual neural network fused with a constraint layer to adaptively learn video source features. Thirdly, we introduce a triplet loss metric learning strategy to optimize the network model and improve its discriminability. Finally, we design a multi-dimensional feature vector similarity fusion strategy to achieve highly generalized source camera recognition. Extensive experiments show that our method achieved an AUC value of up to 0.9714 in closed-set forensic scenarios and an AUC value of 0.882 in open-set scenarios, representing an improvement of 5% compared to the best baseline method. Furthermore, our method demonstrates effectiveness in the task of deepfake detection.
Figures:
Figure 1: The overall structure of the proposed forensics system.
Figure 2: The confusion matrix of the proposed method in closed-set forensic scenarios. (a) Patch. (b) Frame. (c) Video.
Figure 3: Forensics accuracy of video source equipment at different block standard deviation thresholds.
Figure 4: The ROC curves of our algorithm compared with the PRNU and MISL algorithms on open, closed, and semi-closed datasets. (a) Closed set. (b) Mixture. (c) Open set (Vision). (d) Open set (SOCRatES).
Figure 5: The ROC curves of our algorithm compared with the PRNU and MISL algorithms on Facebook. (a) Closed set. (b) Mixture. (c) Open set (Vision). (d) Open set (SOCRatES).
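A compact sketch of the triplet-loss metric-learning step the abstract describes: embeddings of clips from the same camera are pulled together and clips from different cameras pushed apart, after which device attribution reduces to a similarity threshold. The network shape, feature dimension, and threshold below are illustrative assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

embed = nn.Sequential(nn.Linear(256, 128), nn.ReLU(), nn.Linear(128, 64))
criterion = nn.TripletMarginLoss(margin=1.0)

anchor = torch.randn(32, 256)    # noise-residual features, camera A
positive = torch.randn(32, 256)  # other clips from camera A
negative = torch.randn(32, 256)  # clips from camera B

loss = criterion(embed(anchor), embed(positive), embed(negative))
loss.backward()                  # one metric-learning step
print(loss.item())

# At test time, two videos are judged same-device when their embedding
# similarity clears a chosen threshold (0.8 here is an assumption).
a, b = embed(torch.randn(1, 256)), embed(torch.randn(1, 256))
print(torch.cosine_similarity(a, b).item() > 0.8)
```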
15 pages, 1376 KiB  
Article
Temporal Feature Prediction in Audio–Visual Deepfake Detection
by Yuan Gao, Xuelong Wang, Yu Zhang, Ping Zeng and Yingjie Ma
Electronics 2024, 13(17), 3433; https://doi.org/10.3390/electronics13173433 - 29 Aug 2024
Viewed by 948
Abstract
The rapid growth of deepfake technology, generating realistic manipulated media, poses a significant threat due to potential misuse. Effective detection methods are therefore urgently needed, as current approaches often focus on single modalities or the simple fusion of audio–visual signals, limiting their accuracy. To solve this problem, we propose a deepfake detection scheme based on bimodal temporal feature prediction, which innovatively introduces the idea of temporal feature prediction into the audio–video bimodal deepfake detection task, aiming to fully exploit the temporal regularities of the audio and visual modalities. First, pairs of adjacent audio–video sequence clips are used to construct input quadruples, and a dual-stream network is employed to extract temporal feature representations from video and audio, respectively. A video prediction module and an audio prediction module are designed to capture the temporal inconsistencies within each single modality by predicting future temporal features and comparing them with reference features. Then, a projection layer network is designed to align the audio–visual features, using contrastive loss functions to perform contrastive learning and maximize the differences between real and fake video modalities. Experiments on the FakeAVCeleb dataset demonstrate superior performance, with an accuracy of 84.33% and an AUC of 89.91%, outperforming existing methods and confirming the effectiveness of our approach in deepfake detection.
(This article belongs to the Special Issue Applied Cryptography and Practical Cryptoanalysis for Web 3.0)
Figures:
Figure 1: An overview of the audio–visual joint deepfake detection model architecture based on temporal feature prediction.
Figure 2: Audio modal preprocessing process.
Figure 3: MFCC feature extraction process.
Figure 4: Illustration of the structure of the visual prediction module based on a 1-block Transformer.
Figure 5: Illustration of the projection layer network aligning audio–visual features for contrastive learning.
Figure 6: Component ablation experiment ROC curve comparison plot.
Figure 7: Comparison of ROC curves for loss function ablation experiments.
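Figures 2 and 3 point to MFCC-based audio preprocessing ahead of the temporal model. Here is a hedged sketch of that step using librosa's standard API; the file name and frame parameters are illustrative, and the adjacent-window pairing only mirrors, rather than reproduces, the paper's input construction.

```python
import librosa

# Load a mono waveform and extract a time-major MFCC sequence.
y, sr = librosa.load("clip.wav", sr=16000)              # illustrative file
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13,
                            n_fft=400, hop_length=160)  # 25 ms / 10 ms frames
mfcc = mfcc.T                                           # (frames, 13)
print(mfcc.shape)

# Adjacent clip pairs for temporal prediction can then be sliced out,
# e.g. a reference window and the window a model should predict next:
T = 100
reference, future = mfcc[:T], mfcc[T:2 * T]
```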
22 pages, 13050 KiB  
Article
A Deep Learning Model for Detecting Fake Medical Images to Mitigate Financial Insurance Fraud
by Muhammad Asad Arshed, Shahzad Mumtaz, Ștefan Cristian Gherghina, Neelam Urooj, Saeed Ahmed and Christine Dewi
Computation 2024, 12(9), 173; https://doi.org/10.3390/computation12090173 - 29 Aug 2024
Viewed by 1042
Abstract
Artificial Intelligence and Deepfake Technologies have brought a new dimension to the generation of fake data, making it easier and faster than ever before; this fake data could include text, images, sounds, videos, etc. This has brought new challenges that require the faster development of tools and techniques to avoid fraudulent activities at pace and scale. Our focus in this research study is to empirically evaluate the use and effectiveness of deep learning models such as Convolutional Neural Networks (CNNs) and Patch-based Neural Networks in the context of successful identification of real and fake images. We chose the healthcare domain as a potential case study where the fake medical data generation approach could be used to make false insurance claims. For this purpose, we obtained publicly available skin cancer data and used recently introduced stable diffusion approaches, a more effective technique than prior approaches such as the Generative Adversarial Network (GAN), to generate fake skin cancer images. To the best of our knowledge, and based on the literature review, this is one of the few research studies to use images generated with stable diffusion alongside real image data. As part of the exploratory analysis, we analyzed histograms of fake and real images using individual color channels, averaged across the training and testing datasets. The histogram analysis demonstrated a clear shift in the mean and overall distribution of both real and fake images (more prominent in the blue and green channels) in the training data, whereas in the test data both means differed from those of the training data, so it appears non-trivial to set a threshold that would give better predictive capability. We also conducted a user study to observe whether the naked eye could identify any patterns for classifying real and fake images; human accuracy on the test data was observed to be 68%. The adoption of deep learning predictive approaches (i.e., patch-based and CNN-based) demonstrated similar accuracy (~100%) in the training and validation subsets of the data, and the same was observed for the test subset with and without StratifiedKFold (k = 3). Our analysis has demonstrated that state-of-the-art exploratory and deep-learning approaches are effective enough to detect images generated from stable diffusion vs. real images.
(This article belongs to the Special Issue Computational Medical Image Analysis—2nd Edition)
Figures:
Figure 1: Abstract diagram of the proposed study for real malignant and fake malignant cancer identification.
Figure 2: Histogram of training dataset (real and fake malignant skin cancer images).
Figure 3: Histogram of testing dataset (real and fake malignant skin cancer images).
Figure 4: Training and validation losses and accuracies of the ViT base model for real (1) and fake (0) malignant skin cancer.
Figure 5: Confusion matrix of the base ViT model for real (1) and fake (0) malignant skin cancer.
Figure 6: Training and validation losses and accuracies of the InceptionV3 model without ImageNet for real (1) and fake (0) malignant skin cancer.
Figure 7: Training and validation losses and accuracies of the Xception model without ImageNet for real (1) and fake (0) malignant skin cancer.
Figure 8: Training and validation losses and accuracies of the ResNet152V2 model without ImageNet for real (1) and fake (0) malignant skin cancer.
Figure 9: ViT base model vs. InceptionV3 (without weights) for real (1) and fake (0) malignant skin cancer.
Figure 10: StratifiedKFold with k = 3 and proposed InceptionV3 (without weights) learning graphs for real (1) and fake (0) malignant skin cancer.
Figure 11: Percentage of accuracy from the total testing images (y-axis) by percentage of users involved (x-axis).
Figure 12: Human accuracy vs. deep learning models’ accuracy.
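The exploratory step above compares per-channel histograms and means of real versus fake images. Here is a small NumPy sketch of that comparison, using synthetic stand-in arrays (the study itself used real and stable-diffusion skin-cancer images); the simulated green/blue shift merely mimics the reported behavior.

```python
import numpy as np

def channel_means(images: np.ndarray) -> np.ndarray:
    """Mean intensity per RGB channel over a batch of (N, H, W, 3) images."""
    return images.reshape(-1, 3).mean(axis=0)

def channel_histograms(images: np.ndarray, bins: int = 256) -> np.ndarray:
    """One normalized histogram per channel, as in Figures 2 and 3."""
    flat = images.reshape(-1, 3)
    return np.stack([np.histogram(flat[:, c], bins=bins, range=(0, 256),
                                  density=True)[0] for c in range(3)])

real = np.random.randint(0, 256, size=(100, 64, 64, 3), dtype=np.uint8)
shift = np.array([0, 15, 20])                    # simulated green/blue shift
fake = np.clip(real.astype(int) + shift, 0, 255).astype(np.uint8)

print("real channel means:", channel_means(real))
print("fake channel means:", channel_means(fake))
print("histograms shape:", channel_histograms(real).shape)  # (3, 256)
```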
31 pages, 12305 KiB  
Article
Living in the Age of Deepfakes: A Bibliometric Exploration of Trends, Challenges, and Detection Approaches
by Adrian Domenteanu, George-Cristian Tătaru, Liliana Crăciun, Anca-Gabriela Molănescu, Liviu-Adrian Cotfas and Camelia Delcea
Information 2024, 15(9), 525; https://doi.org/10.3390/info15090525 - 28 Aug 2024
Viewed by 3262
Abstract
In an era where all information can be reached with one click on the internet, the risk of misinformation has increased significantly. Deepfakes are one of the main threats on the internet, and affect society by influencing and altering information, decisions, and actions. The rise of artificial intelligence (AI) has simplified the creation of deepfakes, allowing even novice users to generate false information in order to create propaganda. One of the most prevalent methods of falsification involves images, as they constitute the most impactful element with which a reader engages. The second most common method pertains to videos, which viewers often interact with. Two major events led to an increase in the number of deepfake images on the internet, namely the COVID-19 pandemic and the Russia–Ukraine conflict. Together with the ongoing “revolution” in AI, deepfake information has expanded at the fastest rate, impacting each of us. In order to reduce the risk of misinformation, users must be aware of the deepfake phenomenon they are exposed to. This also means encouraging users to more thoroughly consider the sources from which they obtain information, leading to a culture of caution regarding any new information they receive. The purpose of this analysis is to extract the most relevant articles related to the deepfake domain. Using specific keywords, a database was extracted from Clarivate Analytics’ Web of Science Core Collection. Given the significant annual growth rate of 161.38% over the relatively brief period between 2018 and 2023, the research community has demonstrated keen interest in the issue of deepfakes, positioning it as one of the most forward-looking subjects in technology. This analysis aims to identify key authors, examine collaborative efforts among them, explore the primary topics under scrutiny, and highlight the major keywords, bigrams, and trigrams utilized. Additionally, this document outlines potential strategies to combat the proliferation of deepfakes in order to preserve trust in information.
Figures:
Graphical abstract
Figure 1: Bradford’s law on source clustering.
Figure 2: Top 10 most relevant sources.
Figure 3: Top 10 sources based on the H-index.
Figure 4: Source production over time.
Figure 5: Top 10 most relevant authors.
Figure 6: Author productivity through Lotka’s law.
Figure 7: Top 10 authors’ production over time.
Figure 8: Top 10 authors’ impact based on the H-index.
Figure 9: Top 10 most relevant affiliations.
Figure 10: Top 10 most cited countries.
Figure 11: Country scientific production.
Figure 12: Top 10 most important corresponding author’s countries.
Figure 13: Country collaboration map.
Figure 14: Thematic map.
Figure 15: Three-field plot of countries (left), authors (middle), and journals (right).
Figure 16: Three-field plot of affiliations (left), authors (middle), and keywords (right).