Early detection of fake news on emerging topics through weak supervision

  • Research
Journal of Intelligent Information Systems

Abstract

In this paper, we present a methodology for the early detection of fake news on emerging topics through weak supervision. Traditional fake news detection techniques often rely on fact-checkers or on supervised learning with labeled data, which is not readily available for emerging topics. To address this, we introduce the Weakly Supervised Text Classification framework (WeSTeC), an end-to-end solution that programmatically labels large-scale, domain-specific text datasets and trains supervised text classifiers on the assigned labels. The framework automatically generates labeling functions through multiple weak labeling strategies and eliminates the underperforming ones. The labels assigned by the remaining labeling functions are then used to fine-tune a pre-trained RoBERTa classifier for fake news detection. Because the weakly labeled dataset contains fake news related to the emerging topic, the resulting detection model is specialized for that topic. We explore two setups: semi-supervision, which uses a small amount of labeled data, and domain adaptation, which uses labeled data from other domains. The fake news classification model produced by the framework outperforms all baselines in both setups. In addition, the model trained on weak labels achieves accuracy within 1% of its fully supervised counterpart, which indicates the robustness of the framework's weak labeling.
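The labeling pipeline described in the abstract (generate labeling functions, eliminate the underperforming ones, then assign weak labels) can be illustrated with a minimal sketch. It is not the WeSTeC implementation: the keyword-based labeling functions, the 0.7 accuracy threshold, the tiny development set, and the majority vote used to combine votes are all assumptions made for illustration, and every function name below is hypothetical. The final step of fine-tuning RoBERTa on the weak labels is omitted.

```python
from collections import Counter

# Label conventions (assumed for this sketch).
ABSTAIN, REAL, FAKE = -1, 0, 1


def make_keyword_lf(keyword, label):
    """Build a labeling function that votes `label` when `keyword` occurs."""
    def lf(text):
        return label if keyword in text.lower() else ABSTAIN
    lf.__name__ = "lf_" + keyword.replace(" ", "_")
    return lf


def lf_accuracy(lf, dev_set):
    """Accuracy of one labeling function on the dev examples where it votes."""
    fired = [(lf(text), gold) for text, gold in dev_set if lf(text) != ABSTAIN]
    if not fired:
        return 0.0  # a function that never fires gives no evidence of quality
    return sum(vote == gold for vote, gold in fired) / len(fired)


def prune_lfs(lfs, dev_set, min_accuracy=0.7):
    """Keep only labeling functions that reach the (assumed) accuracy threshold."""
    return [lf for lf in lfs if lf_accuracy(lf, dev_set) >= min_accuracy]


def weak_label(text, lfs):
    """Combine votes by simple majority; abstain on no votes or on a tie."""
    votes = [v for v in (lf(text) for lf in lfs) if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN
    return ranked[0][0]


# Small hand-labeled development set (invented examples).
dev_set = [
    ("Miracle cure eliminates the virus overnight", FAKE),
    ("Health officials say the vaccine trial continues", REAL),
    ("Secret miracle cure hidden from the public", FAKE),
    ("Officials confirm new study results", REAL),
]

candidates = [
    make_keyword_lf("miracle cure", FAKE),
    make_keyword_lf("officials", REAL),
    make_keyword_lf("study", FAKE),  # wrong on the dev set, so it is pruned
]

kept = prune_lfs(candidates, dev_set)
unlabeled = [
    "A miracle cure was found, new study claims",
    "Officials publish the report",
    "Weather is nice today",
]
weak_labels = [weak_label(text, kept) for text in unlabeled]
```

Texts labeled `ABSTAIN` would typically be dropped before training. A learned label model could replace the majority vote where labeling functions differ in reliability.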


Availability of supporting data

The supporting data and associated programs for this journal submission are available in a dedicated GitHub repository. Access to these resources is available upon request. Interested parties may request access to the data and programs by contacting the corresponding author of this submission.

Notes

  1. https://www.snorkel.org/

  2. https://simpletransformers.ai/

  3. http://ai.stanford.edu/blog/weak-supervision/

  4. https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.describe.html

  5. https://numpy.org/doc/stable/reference/generated/numpy.percentile.html

  6. https://simpletransformers.ai/

  7. https://wandb.ai/site


Acknowledgements

Not applicable

Funding

Not applicable

Author information

Contributions

S.A. and N.K.C. jointly contributed to this manuscript. N.K.C. revised the work for intellectual content and approved the final version for publication. S.A. played a pivotal role in the conception, design, data analysis, and software development for the study. Both authors reviewed and finalized the manuscript.

Corresponding author

Correspondence to Nihan Kesim Cicekli.

Ethics declarations

Competing interests

The authors declare that they have no conflicts of interest.

Ethical Approval

Not applicable

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Akdag, S.H., Cicekli, N.K. Early detection of fake news on emerging topics through weak supervision. J Intell Inf Syst 62, 1263–1284 (2024). https://doi.org/10.1007/s10844-024-00852-1

  • DOI: https://doi.org/10.1007/s10844-024-00852-1

Keywords