Early detection of fake news on emerging topics through weak supervision

Serhat Hakki Akdag¹ & Nihan Kesim Cicekli¹

Abstract

In this paper, we present a methodology for the early detection of fake news on emerging topics through weak supervision. Traditional techniques for fake news detection often rely on fact-checkers or on supervised learning with labeled data, which is not readily available for emerging topics. To address this, we introduce the Weakly Supervised Text Classification framework (WeSTeC), an end-to-end solution designed to programmatically label large-scale text datasets within specific domains and to train supervised text classifiers on the assigned labels. The framework automatically generates labeling functions through multiple weak labeling strategies and eliminates underperforming ones. The labels assigned by the remaining labeling functions are then used to fine-tune a pre-trained RoBERTa classifier for fake news detection. Because the weakly labeled dataset contains fake news related to the emerging topic, the trained model becomes specialized for that topic. We explore both a semi-supervised setup, which uses a small amount of labeled data, and a domain adaptation setup, which uses labeled data from other domains. The fake news classification model generated by the proposed framework outperforms all baselines in both setups. In addition, the model trained on weak labels achieves accuracy within 1% of its fully supervised counterpart, which indicates the robustness of the framework's weak labeling capabilities.
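
The general idea of generating labeling functions, discarding the weak ones, and aggregating the rest into training labels can be illustrated with a minimal sketch. This is not the WeSTeC implementation: the keyword-based labeling functions, the accuracy threshold of 0.6, the majority-vote aggregation, the toy development set, and every name in the code are assumptions chosen for illustration only.

```python
from collections import Counter
from typing import Callable, List, Optional, Sequence, Tuple

# Hypothetical label encoding; ABSTAIN means "this function has no opinion".
FAKE, REAL, ABSTAIN = 1, 0, -1

LabelingFunction = Callable[[str], int]
Example = Tuple[str, int]


def keyword_lf(keyword: str, label: int) -> LabelingFunction:
    """Build a labeling function that votes `label` when `keyword` occurs."""
    def lf(text: str) -> int:
        return label if keyword in text.lower() else ABSTAIN
    return lf


def lf_accuracy(lf: LabelingFunction, dev_set: Sequence[Example]) -> Optional[float]:
    """Accuracy on the dev examples where the function does not abstain."""
    fired = [(lf(text), gold) for text, gold in dev_set if lf(text) != ABSTAIN]
    if not fired:
        return None  # the function never fired, so it cannot be scored
    return sum(vote == gold for vote, gold in fired) / len(fired)


def prune(lfs: Sequence[LabelingFunction], dev_set: Sequence[Example],
          min_accuracy: float = 0.6) -> List[LabelingFunction]:
    """Keep only labeling functions that reach `min_accuracy` on the dev set."""
    kept = []
    for lf in lfs:
        acc = lf_accuracy(lf, dev_set)
        if acc is not None and acc >= min_accuracy:
            kept.append(lf)
    return kept


def majority_vote(lfs: Sequence[LabelingFunction], text: str) -> int:
    """Aggregate non-abstaining votes; abstain on no votes or on a tie."""
    votes = [v for v in (lf(text) for lf in lfs) if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    (top, count), *rest = Counter(votes).most_common()
    if rest and rest[0][1] == count:
        return ABSTAIN
    return top


# Toy labeled development set (invented for this sketch).
dev_set = [
    ("Miracle cure shocks doctors", FAKE),
    ("Shocking secret they hide from you", FAKE),
    ("Health ministry officials publish report", REAL),
    ("Officials confirm vaccination schedule", REAL),
    ("Officials hide miracle drug", FAKE),
]

candidates = [
    keyword_lf("miracle", FAKE),
    keyword_lf("shock", FAKE),
    keyword_lf("officials", REAL),
    keyword_lf("report", FAKE),  # wrong on the dev set, so it gets pruned
]

kept = prune(candidates, dev_set)

# Weak labels for unlabeled text; abstentions would be left out of training.
unlabeled = ["Miracle pill shocks officials", "Weather update for Tuesday"]
weak_labels = [majority_vote(kept, text) for text in unlabeled]
```

In a full pipeline, the texts that receive a non-abstaining weak label would form the training set for the downstream classifier (a fine-tuned RoBERTa model in the paper). Majority voting is the simplest aggregation choice and is used here only to keep the sketch short.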

Availability of supporting data

The supporting data and associated programs for this journal submission are available in a dedicated GitHub repository. Access to these resources is available upon request. Interested parties may request access to the data and programs by contacting the corresponding author of this submission.

Acknowledgements

Not applicable

Funding

Not applicable

Author information

Authors and Affiliations

Department of Computer Engineering, Middle East Technical University, Ankara, 06800, Turkey
Serhat Hakki Akdag & Nihan Kesim Cicekli

Contributions

S.A. and N.K.C. jointly contributed to this manuscript. N.K.C. revised the work for intellectual content and approved the final version for publication. S.A. played a pivotal role in the conception, design, data analysis, and software development for the study. Both authors reviewed and finalized the manuscript.

Corresponding author

Correspondence to Nihan Kesim Cicekli.

Ethics declarations

Competing interests

The authors declare that they have no conflicts of interest.

Ethical Approval

Not applicable

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Akdag, S.H., Cicekli, N.K. Early detection of fake news on emerging topics through weak supervision. J Intell Inf Syst 62, 1263–1284 (2024). https://doi.org/10.1007/s10844-024-00852-1

Received: 17 October 2023
Revised: 26 February 2024
Accepted: 27 February 2024
Published: 15 March 2024
Issue Date: October 2024
DOI: https://doi.org/10.1007/s10844-024-00852-1
