Chinese Cyberbullying Detection Using XLNet and Deep Bi-LSTM Hybrid Model
Figure 1. Proposed model.
Figure 2. Two-stream self-attention and permutation language model training process [33].
Figure 3. Long short-term memory (LSTM) cell architecture.
Figure 4. Comparison of weighted average F1-score for all methods.
Figure 5. The effect of deepening the layers of Bi-LSTM on the results of the proposed model.
Figure 6. Comparison between the proposed model and four advanced methods.
Abstract
1. Introduction
- A hybrid model for cyberbullying detection based on XLNet and a deep Bi-LSTM is proposed. XLNet combines the advantages of autoregressive (AR) and autoencoding (AE) language models while overcoming their limitations; passing its output through a deep Bi-LSTM for bidirectional encoding improves the accuracy of Chinese cyberbullying detection.
- The Chinese offensive language dataset (COLDATASET [11]) was relabeled and expanded. In total, 1.66 k offensive remarks crawled from 10 real cyberbullying incidents of recent years, as well as one-star reviews from the Chinese community website Douban, were added. The additions bring in more features of cyberbullying language, and the classes are kept as balanced as possible to avoid the model bias caused by over- or under-sampling.
- A variety of traditional machine learning methods, deep learning methods and Chinese pre-trained models serve as baselines in experiments on the expanded dataset, each detecting whether a piece of text involves cyberbullying. The detection performance of the different methods is also compared.
2. Related Work
2.1. Detection of Cyberbullying
2.2. Limitations of Existing Research
- Most research approaches the problem from the perspective of bullying vocabulary. Some social media platforms filter and block keywords so that bullying words cannot be displayed. However, many bullying behaviors use implicit remarks such as mockery, innuendo, rhetorical questions and denigration. Although these remarks contain no explicit bullying vocabulary, they may cause serious psychological harm to victims. Keyword filtering alone is therefore insufficient for this kind of behavior, and judging whether a remark is bullying requires deeper semantic analysis.
- There is still no standardized dataset for Chinese cyberbullying detection. Most scholars construct their datasets by crawling social media platforms and, unfortunately, none of the studies above has disclosed the dataset used.
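The first limitation can be illustrated with a minimal sketch of a keyword filter. The blocklist, the helper name `keyword_filter`, and the two English example sentences are hypothetical and chosen only for readability; they are not taken from the dataset or from any platform's actual filter.

```python
import re

# Hypothetical blocklist; a real platform list would be far larger.
BLOCKLIST = {"idiot", "loser"}


def keyword_filter(text: str) -> bool:
    """Return True if any blocklisted word appears in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return any(token in BLOCKLIST for token in tokens)


explicit = "You are such a loser."
implicit = "With a face like that, who wouldn't want to stay home forever?"

flag_explicit = keyword_filter(explicit)  # contains a blocklisted word
flag_implicit = keyword_filter(implicit)  # mocking, but no blocklisted word
print(flag_explicit, flag_implicit)
```

The implicit remark passes the filter because it contains no listed word, which is the case that motivates a semantic model.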
3. Methodology
3.1. Proposed Method
Algorithm 1 Cyberbullying detection model and training process.
Input: training set; initial parameters
Output: trained parameters
Parameters: number of LSTM layers; number of classes
Hyperparameters: learning rate
3.2. Embedding Layer
3.3. Bi-LSTM Layer
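For orientation, the following is a minimal pure-Python sketch of a stacked bidirectional LSTM encoder. It uses the standard LSTM gate equations [38] and the usual forward/backward concatenation of bidirectional recurrent networks [34], with biases omitted and random, untrained weights. The function names (`lstm_step`, `deep_bilstm`), the toy dimensions, and the random vectors standing in for XLNet token embeddings are illustrative assumptions, not the authors' implementation.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_cell(input_dim, hidden_dim, rng):
    """Random weights for the input (i), forget (f), output (o) and candidate (g) gates."""
    def matrix():
        return [[rng.uniform(-0.1, 0.1) for _ in range(input_dim + hidden_dim)]
                for _ in range(hidden_dim)]
    return {gate: matrix() for gate in ("i", "f", "o", "g")}


def lstm_step(cell, x, h, c):
    """One LSTM time step (biases omitted for brevity)."""
    z = x + h  # list concatenation [x_t ; h_{t-1}]

    def affine(gate):
        return [sum(w * v for w, v in zip(row, z)) for row in cell[gate]]

    i = [sigmoid(a) for a in affine("i")]
    f = [sigmoid(a) for a in affine("f")]
    o = [sigmoid(a) for a in affine("o")]
    g = [math.tanh(a) for a in affine("g")]
    c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
    h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
    return h_new, c_new


def run_direction(cell, seq, hidden_dim):
    """Run one LSTM over a sequence and collect the hidden state at each step."""
    h, c = [0.0] * hidden_dim, [0.0] * hidden_dim
    outputs = []
    for x in seq:
        h, c = lstm_step(cell, x, h, c)
        outputs.append(h)
    return outputs


def bilstm_layer(fwd, bwd, seq, hidden_dim):
    """Concatenate forward and backward hidden states for every token."""
    forward = run_direction(fwd, seq, hidden_dim)
    backward = run_direction(bwd, seq[::-1], hidden_dim)[::-1]
    return [hf + hb for hf, hb in zip(forward, backward)]


def deep_bilstm(seq, input_dim, hidden_dim, num_layers, seed=0):
    """Stack Bi-LSTM layers; each layer reads the previous layer's output."""
    rng = random.Random(seed)
    for layer in range(num_layers):
        in_dim = input_dim if layer == 0 else 2 * hidden_dim
        fwd = init_cell(in_dim, hidden_dim, rng)
        bwd = init_cell(in_dim, hidden_dim, rng)
        seq = bilstm_layer(fwd, bwd, seq, hidden_dim)
    return seq


rng = random.Random(1)
tokens = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(5)]  # 5 tokens, dim 8
encoded = deep_bilstm(tokens, input_dim=8, hidden_dim=4, num_layers=4)
print(len(encoded), len(encoded[0]))
```

Each token ends up with a vector of size `2 * hidden_dim`, combining left-to-right and right-to-left context. In practice a deep learning framework would provide the layer and the weights would be trained jointly with the classifier.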
3.4. Output Layer
4. Experiment
4.1. Dataset
4.2. Experimental Settings
4.3. Ablation Study
5. Results and Discussion
6. Conclusions and Future Work
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
1. Kumar, A.; Sachdeva, N. Cyberbullying detection on social multimedia using soft computing techniques: A meta-analysis. Multimed. Tools Appl. 2019, 78, 23973–24010.
2. Smith, P.K.; Mahdavi, J.; Carvalho, M.; Fisher, S.; Russell, S.; Tippett, N. Cyberbullying: Its nature and impact in secondary school pupils. J. Child Psychol. Psychiatry 2008, 49, 376–385.
3. Kwan, I.; Dickson, K.; Richardson, M.; MacDowall, W.; Burchett, H.; Stansfield, C.; Brunton, G.; Sutcliffe, K.; Thomas, J. Cyberbullying and children and young people’s mental health: A systematic map of systematic reviews. Cyberpsychol. Behav. Soc. Netw. 2020, 23, 72–82.
4. Smith, P.K.; Del Barrio, C.; Tokunaga, R.S. Definitions of bullying and cyberbullying: How useful are the terms. In Principles of Cyberbullying Research: Definitions, Measures, and Methodology; Routledge: London, UK, 2013; pp. 26–40.
5. Englander, E.; Donnerstein, E.; Kowalski, R.; Lin, C.A.; Parti, K. Defining cyberbullying. Pediatrics 2017, 140 (Suppl. S2), S148–S151.
6. Pieschl, S.; Porsch, T.; Kahl, T.; Klockenbusch, R. Relevant dimensions of cyberbullying—Results from two experimental studies. J. Appl. Dev. Psychol. 2013, 34, 241–252.
7. Nixon, C.L. Current perspectives: The impact of cyberbullying on adolescent health. Adolesc. Health Med. Ther. 2014, 5, 143–158.
8. Dooley, J.J.; Pyżalski, J.; Cross, D. Cyberbullying versus face-to-face bullying: A theoretical and conceptual review. Z. Psychol. Psychol. 2009, 217, 182–188.
9. Slonje, R.; Smith, P.K.; Frisén, A. The nature of cyberbullying, and strategies for prevention. Comput. Hum. Behav. 2013, 29, 26–32.
10. Zhu, C.; Huang, S.; Evans, R.; Zhang, W. Cyberbullying among adolescents and children: A comprehensive review of the global situation, risk factors, and preventive measures. Front. Public Health 2021, 9, 634909.
11. Deng, J.; Zhou, J.; Sun, H.; Zheng, C.; Mi, F.; Meng, H.; Huang, M. COLD: A benchmark for Chinese offensive language detection. arXiv 2022, arXiv:2201.06025.
12. Yin, D.; Xue, Z.; Hong, L.; Davison, B.D.; Kontostathis, A.; Edwards, L. Detection of harassment on web 2.0. In Proceedings of the Content Analysis in the WEB, Madrid, Spain, 21 April 2009; Volume 2, pp. 1–7.
13. Reynolds, K.; Kontostathis, A.; Edwards, L. Using machine learning to detect cyberbullying. In Proceedings of the 2011 10th International Conference on Machine Learning and Applications and Workshops, Honolulu, HI, USA, 18–21 December 2011; Volume 2, pp. 241–244.
14. Dinakar, K.; Reichart, R.; Lieberman, H. Modeling the detection of textual cyberbullying. Proc. Int. AAAI Conf. Web Soc. Media 2011, 5, 11–17.
15. Sarna, G.; Bhatia, M.P.S. Content based approach to find the credibility of user in social networks: An application of cyberbullying. Int. J. Mach. Learn. Cybern. 2017, 8, 677–689.
16. Islam, M.M.; Uddin, M.A.; Islam, L.; Akter, A.; Sharmin, S.; Acharjee, U.K. Cyberbullying detection on social networks using machine learning approaches. In Proceedings of the 2020 IEEE Asia-Pacific Conference on Computer Science and Data Engineering (CSDE), Gold Coast, Australia, 16–18 December 2020; pp. 1–6.
17. Zhang, A.; Li, B.; Wan, S.; Wang, K. Cyberbullying detection with BiRNN and attention mechanism. In International Conference on Machine Learning and Intelligent Communications; Springer International Publishing: Cham, Switzerland, 2019; pp. 623–635.
18. Dewani, A.; Memon, M.A.; Bhatti, S. Cyberbullying detection: Advanced preprocessing techniques & deep learning architecture for Roman Urdu data. J. Big Data 2021, 8, 160.
19. Eronen, J.; Ptaszynski, M.; Masui, F.; Smywiński-Pohl, A.; Leliwa, G.; Wroczynski, M. Improving classifier training efficiency for automatic cyberbullying detection with feature density. Inf. Process. Manag. 2021, 58, 102616.
20. Kumar, A.; Sachdeva, N. A Bi-GRU with attention and CapsNet hybrid model for cyberbullying detection on social media. World Wide Web 2022, 25, 1537–1550.
21. Yuvaraj, N.; Srihari, K.; Dhiman, G.; Somasundaram, K.; Sharma, A.; Rajeskannan, S.M.G.S.M.A.; Soni, M.; Gaba, G.S.; AlZain, M.A.; Masud, M. Nature-inspired-based approach for automated cyberbullying classification on multimedia social networking. Math. Probl. Eng. 2021, 2021, 6644652.
22. Paul, S.; Saha, S. CyberBERT: BERT for cyberbullying identification. Multimed. Syst. 2022, 28, 1897–1904.
23. Tripathy, J.K.; Chakkaravarthy, S.S.; Satapathy, S.C.; Sahoo, M.; Vaidehi, V. ALBERT-based fine-tuning model for cyberbullying analysis. Multimed. Syst. 2022, 28, 1941–1949.
24. Zinovyeva, E.; Härdle, W.K.; Lessmann, S. Antisocial online behavior detection using deep learning. Decis. Support Syst. 2020, 138, 113362.
25. Jahan, M.S.; Oussalah, M. A systematic review of Hate Speech automatic detection using Natural Language Processing. Neurocomputing 2023, 546, 126232.
26. Li, W. A Content-Based Approach for Analysing Cyberbullying on Sina Weibo. In Proceedings of the 2nd International Conference on Information Management and Management Sciences, Chengdu, China, 23–25 August 2019; pp. 33–37.
27. Zhong, J.; Qiu, J.; Sun, M.; Jin, X.; Zhang, J.; Guo, Y.; Qiu, X.; Xu, Y.; Huang, J.; Zheng, Y. To be ethical and responsible digital citizens or not: A linguistic analysis of cyberbullying on social media. Front. Psychol. 2022, 13, 861823.
28. Zhang, S. From flaming to incited crime: Recognising cyberbullying on Chinese WeChat account. Int. J. Semiot. Law-Rev. Int. Sémiotique Jurid. 2021, 34, 1093–1116.
29. Zhang, A.; Lipton, Z.C.; Li, M.; Smola, A.J. Dive into deep learning. arXiv 2021, arXiv:2106.11342.
30. Devlin, J.; Chang, M.W.; Lee, K.; Toutanova, K. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv 2018, arXiv:1810.04805.
31. Peters, M.E.; Neumann, M.; Iyyer, M.; Gardner, M.; Clark, C.; Lee, K.; Zettlemoyer, L. Deep Contextualized Word Representations. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), New Orleans, LA, USA, 1–6 June 2018; pp. 2227–2237, Association for Computational Linguistics.
32. Radford, A.; Narasimhan, K.; Salimans, T.; Sutskever, I. Improving Language Understanding by Generative Pre-Training. 2018. Available online: https://www.mikecaptain.com/resources/pdf/GPT-1.pdf (accessed on 5 December 2023).
33. Yang, Z.; Dai, Z.; Yang, Y.; Carbonell, J.; Salakhutdinov, R.R.; Le, Q.V. XLNet: Generalized Autoregressive Pretraining for Language Understanding. In Advances in Neural Information Processing Systems; 2019; Volume 32, Available online: https://papers.nips.cc/paper_files/paper/2019/hash/dc6a7e655d7e5840e66733e9ee67cc69-Abstract.html (accessed on 5 December 2023).
34. Schuster, M.; Paliwal, K.K. Bidirectional recurrent neural networks. IEEE Trans. Signal Process. 1997, 45, 2673–2681.
35. Cui, Y.; Che, W.; Liu, T.; Qin, B.; Wang, S.; Hu, G. Revisiting pre-trained models for Chinese natural language processing. arXiv 2020, arXiv:2004.13922.
36. Dai, Z.; Yang, Z.; Yang, Y.; Carbonell, J.; Le, Q.V.; Salakhutdinov, R. Transformer-XL: Attentive language models beyond a fixed-length context. arXiv 2019, arXiv:1901.02860.
37. Zhang, Y.; Wallace, B. A sensitivity analysis of (and practitioners’ guide to) convolutional neural networks for sentence classification. arXiv 2015, arXiv:1510.03820.
38. Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780.
39. Cho, K.; Van Merriënboer, B.; Gulcehre, C.; Bahdanau, D.; Bougares, F.; Schwenk, H.; Bengio, Y. Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv 2014, arXiv:1406.1078.
40. Liu, Y.; Ott, M.; Goyal, N.; Du, J.; Joshi, M.; Chen, D.; Levy, O.; Lewis, M.; Zettlemoyer, L.; Stoyanov, V. RoBERTa: A robustly optimized BERT pretraining approach. arXiv 2019, arXiv:1907.11692.
41. Lan, Z.; Chen, M.; Goodman, S.; Gimpel, K.; Sharma, P.; Soricut, R. ALBERT: A lite BERT for self-supervised learning of language representations. arXiv 2019, arXiv:1909.11942.
42. Sun, Y.; Wang, S.; Feng, S.; Ding, S.; Pang, C.; Shang, J.; Liu, J.; Chen, X.; Zhao, Y.; Lu, Y.; et al. ERNIE 3.0: Large-scale knowledge enhanced pre-training for language understanding and generation. arXiv 2021, arXiv:2107.02137.
43. Cui, Y.; Che, W.; Wang, S.; Liu, T. LERT: A linguistically-motivated pre-trained language model. arXiv 2022, arXiv:2211.05344.
44. Clark, K.; Luong, M.T.; Le, Q.V.; Manning, C.D. ELECTRA: Pre-training text encoders as discriminators rather than generators. arXiv 2020, arXiv:2003.10555.
45. Bender, E.M.; Gebru, T.; McMillan-Major, A.; Shmitchell, S. On the dangers of stochastic parrots: Can language models be too big? In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, Virtual Event, Canada, 3–10 March 2021; pp. 610–623.
46. Hamid, O.H. ChatGPT and the Chinese Room Argument: An Eloquent AI Conversationalist Lacking True Understanding and Consciousness. In Proceedings of the 2023 9th International Conference on Information Technology Trends (ITT), Dubai, United Arab Emirates, 24–25 May 2023; pp. 238–241.
47. Hull, G. Unlearning Descartes: Sentient AI is a Political Problem. J. Soc. Comput. 2023, 4, 193–204.
48. Hamid, O.H. There Is More to AI than Meets the Eye: Aligning Man-made Algorithms with Nature-inspired Mechanisms. In Proceedings of the 2022 IEEE/ACS 19th International Conference on Computer Systems and Applications (AICCSA), Abu Dhabi, United Arab Emirates, 5–8 December 2022; pp. 1–4.

COLDATASET* | Cyberbullying | Non-Cyberbullying | Total | avg#char | min#char | max#char |
---|---|---|---|---|---|---|
Train | 14,488 | 14,503 | 28,991 | 46.7 | 1 | 1217 |
Dev | 2500 | 2500 | 5000 | 46.6 | 1 | 150 |
Test | 2500 | 2500 | 5000 | 47.0 | 1 | 155 |
Total | 19,488 | 19,503 | 38,991 | 46.7 | 1 | 1217 |
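The class balance of the expanded dataset can be checked directly from the counts in the table above. This is a small sketch; the dictionary layout and the helper name `cyberbullying_share` are illustrative choices, while the numbers are copied from the table.

```python
# Counts copied from the dataset table (expanded COLDATASET).
splits = {
    "train": {"cyberbullying": 14488, "non_cyberbullying": 14503},
    "dev": {"cyberbullying": 2500, "non_cyberbullying": 2500},
    "test": {"cyberbullying": 2500, "non_cyberbullying": 2500},
}


def cyberbullying_share(counts):
    """Fraction of samples in a split labeled as cyberbullying."""
    return counts["cyberbullying"] / sum(counts.values())


total_samples = sum(sum(counts.values()) for counts in splits.values())
shares = {name: cyberbullying_share(counts) for name, counts in splits.items()}

for name, share in shares.items():
    print(f"{name}: {share:.4f}")
print("total:", total_samples)
```

All three splits sit at or very near a 50/50 class ratio, so accuracy and F1-based metrics are not dominated by a majority class.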

Identifier | Cyberbullying Incident |
---|---|
Case 1 | Niu Yu, a girl who survived the Wenchuan earthquake, was viciously abused |
Case 2 | Hangzhou girl Zheng Linghua committed suicide after being cyberbullied over her pink hair |
Case 3 | Internet celebrity Guan Guan committed suicide due to cyberbullying |
Case 4 | A girl who gave a "100-day pledge" speech was cyberbullied |
Case 5 | Family-seeking boy Liu Xuezhou was killed by cyberbullying |
Case 6 | Married mother Tang committed suicide due to cyberbullying |
Case 7 | Dr. An committed suicide due to cyberbullying |
Case 8 | Wuhan's "Sugar Water Grandpa", who sold sugar water for 2 yuan, suffered cyberbullying |
Case 9 | A woman jumped from a building after being cyberbullied for giving a delivery worker 200 yuan as thanks |
Case 10 | A mix-up in which a Tsinghua senior falsely accused a junior student of sexual harassment |

Method | 1 layer | 2 layers | 3 layers | 4 layers | 5 layers | 6 layers |
---|---|---|---|---|---|---|
+TextCNN | 0.9020 | - | - | - | - | - |
+Bi-GRU | 0.9012 | 0.9022 | 0.9016 | 0.8997 | 0.9012 | 0.9010 |
+LSTM | 0.8992 | 0.9028 | 0.9018 | 0.9018 | 0.8983 | 0.9016 |
+GRU | 0.9004 | 0.9022 | 0.9006 | 0.9036 | 0.9015 | 0.8994 |
Proposed | 0.8986 | 0.9012 | 0.9017 | 0.9043 | 0.9011 | 0.8990 |
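The depth at which each recurrent variant peaks can be read off the table above programmatically. In this short sketch the scores are copied from the table, and the dictionary and the helper name `best_depth` are illustrative.

```python
# Scores per number of stacked layers (1 to 6), copied from the table above.
scores = {
    "+Bi-GRU": [0.9012, 0.9022, 0.9016, 0.8997, 0.9012, 0.9010],
    "+LSTM": [0.8992, 0.9028, 0.9018, 0.9018, 0.8983, 0.9016],
    "+GRU": [0.9004, 0.9022, 0.9006, 0.9036, 0.9015, 0.8994],
    "Proposed": [0.8986, 0.9012, 0.9017, 0.9043, 0.9011, 0.8990],
}


def best_depth(values):
    """Return (number of layers, score) for the highest score in a row."""
    depth, score = max(enumerate(values, start=1), key=lambda pair: pair[1])
    return depth, score


best = {method: best_depth(values) for method, values in scores.items()}
for method, (depth, score) in best.items():
    print(f"{method}: {depth} layers, {score:.4f}")
```

For the proposed model the score rises up to four layers and falls afterwards, so deeper stacking helps only up to a point on this dataset.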

Method | Non-Cyberbullying Precision | Non-Cyberbullying Recall | Non-Cyberbullying F1-Score | Cyberbullying Precision | Cyberbullying Recall | Cyberbullying F1-Score |
---|---|---|---|---|---|---|
NB | 0.8972 | 0.7328 | 0.8067 | 0.7742 | 0.9160 | 0.8391 |
SVM | 0.8517 | 0.8752 | 0.8623 | 0.8717 | 0.8476 | 0.8595 |
LR | 0.8442 | 0.8712 | 0.8575 | 0.8669 | 0.8392 | 0.8528 |
RF | 0.8545 | 0.8388 | 0.8466 | 0.8417 | 0.8572 | 0.8494 |

Method | Non-Cyberbullying Precision | Non-Cyberbullying Recall | Non-Cyberbullying F1-Score | Cyberbullying Precision | Cyberbullying Recall | Cyberbullying F1-Score |
---|---|---|---|---|---|---|
TextCNN | 0.9027 | 0.8720 | 0.8871 | 0.8762 | 0.9060 | 0.8909 |
RNN | 0.5129 | 0.9392 | 0.6635 | 0.6398 | 0.1080 | 0.1848 |
GRU | 0.9018 | 0.8740 | 0.8877 | 0.8778 | 0.9048 | 0.8911 |
LSTM | 0.9037 | 0.8592 | 0.8809 | 0.8658 | 0.9084 | 0.8866 |
Bi-GRU | 0.9052 | 0.8636 | 0.8839 | 0.8696 | 0.9096 | 0.8891 |
Bi-LSTM | 0.8867 | 0.8736 | 0.8801 | 0.8754 | 0.8884 | 0.8819 |

Method | Non-Cyberbullying Precision | Non-Cyberbullying Recall | Non-Cyberbullying F1-Score | Cyberbullying Precision | Cyberbullying Recall | Cyberbullying F1-Score |
---|---|---|---|---|---|---|
BERT | 0.9015 | 0.8824 | 0.8919 | 0.8848 | 0.9036 | 0.8914 |
XLNet | 0.9147 | 0.8792 | 0.8966 | 0.8837 | 0.9180 | 0.9005 |
RoBERTa | 0.8955 | 0.8944 | 0.8949 | 0.8945 | 0.8956 | 0.8951 |
ALBERT | 0.8685 | 0.8744 | 0.8714 | 0.8735 | 0.8676 | 0.8706 |
ERNIE3.0 | 0.9062 | 0.8732 | 0.8894 | 0.8777 | 0.9096 | 0.8933 |
LERT | 0.9147 | 0.8668 | 0.8901 | 0.8734 | 0.9192 | 0.8957 |
MacBERT | 0.8967 | 0.8992 | 0.8979 | 0.8989 | 0.8964 | 0.8977 |
ELECTRA | 0.9079 | 0.8716 | 0.8894 | 0.8765 | 0.9116 | 0.8937 |
Proposed | 0.9100 | 0.8974 | 0.9037 | 0.8988 | 0.9112 | 0.9050 |
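The F1-scores in the tables above follow from the listed precision and recall, and the weighted average F1-score follows from the per-class F1-scores and class supports. The sketch below recomputes both for the proposed model. It assumes the scores were measured on the test split with 2500 samples per class, as in the dataset table; the function names are illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def weighted_f1(per_class):
    """Support-weighted average of per-class F1; per_class is [(f1, support), ...]."""
    total = sum(support for _, support in per_class)
    return sum(f1 * support for f1, support in per_class) / total


# Precision and recall of the proposed model, copied from the table above.
f1_non_cyberbullying = f1_score(0.9100, 0.8974)
f1_cyberbullying = f1_score(0.8988, 0.9112)

# Assumed supports: 2500 samples per class in the test split.
overall = weighted_f1([(f1_non_cyberbullying, 2500), (f1_cyberbullying, 2500)])

print(f"{f1_non_cyberbullying:.4f} {f1_cyberbullying:.4f} {overall:.4f}")
```

Under these assumptions the recomputed per-class values round to 0.9037 and 0.9050, matching the table, and the weighted average rounds to 0.9043, which agrees with the value listed for the proposed model at four Bi-LSTM layers.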

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Chen, S.; Wang, J.; He, K. Chinese Cyberbullying Detection Using XLNet and Deep Bi-LSTM Hybrid Model. Information 2024, 15, 93. https://doi.org/10.3390/info15020093