Open Text Classification Based on Dynamic Boundary Balance

Ganlin Xu, Jianzhou Feng & Qikai Wei

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14178)

Included in the following conference series:

International Conference on Advanced Data Mining and Applications

Abstract

Open classification is the problem in which the test set contains unseen (unknown) classes, i.e., classes that do not appear when the model is trained. Existing work often maps samples into a high-dimensional space to make decisions, which makes the results hard to observe and explain. To address this issue, we shift to a two-dimensional space and put forward a two-stage learning method built on balancing dynamic decision boundaries, which we refer to as open classification with dynamic boundary balance (OCD2B). First, we train a vanilla classifier on the known classes with a BERT model. Then, we use prior knowledge of the known classes to dynamically determine the decision boundaries between known and unknown classes in the low-dimensional space. We propose a novel boundary loss function as a boundary balance strategy to reduce both open space risk and empirical risk. Experimental results on two standard datasets show that our method outperforms existing methods while providing easily observable results. In particular, the larger the ratio of unseen classes, the more pronounced the performance advantage of the model.
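The idea of a per-class decision boundary in two-dimensional space can be illustrated with a minimal sketch. This is not the OCD2B method: the abstract does not specify the boundary loss, so the sketch swaps in a simple distance-quantile heuristic to set each class radius. The 2-D points stand in for features that a trained encoder such as BERT would produce, and the names `fit_centroids_and_radii`, `predict_open`, `UNKNOWN`, and the `quantile` parameter are all hypothetical.

```python
import math

UNKNOWN = -1  # hypothetical label for samples rejected as an unseen class


def fit_centroids_and_radii(points, labels, quantile=0.9):
    """For each known class, compute a 2-D centroid and a boundary radius.

    Assumption: the radius is the `quantile`-th training distance to the
    centroid. The paper instead learns boundaries with a boundary loss.
    """
    by_class = {}
    for p, y in zip(points, labels):
        by_class.setdefault(y, []).append(p)

    model = {}
    for y, pts in by_class.items():
        centroid = (
            sum(p[0] for p in pts) / len(pts),
            sum(p[1] for p in pts) / len(pts),
        )
        dists = sorted(math.dist(p, centroid) for p in pts)
        idx = min(len(dists) - 1, int(quantile * len(dists)))
        model[y] = (centroid, dists[idx])
    return model


def predict_open(model, point):
    """Return the nearest known class, or UNKNOWN if the point falls
    outside that class's boundary radius."""
    label, (centroid, radius) = min(
        model.items(), key=lambda kv: math.dist(point, kv[1][0])
    )
    return label if math.dist(point, centroid) <= radius else UNKNOWN


if __name__ == "__main__":
    # Toy 2-D features for two known classes (illustrative data only).
    pts = [(0, 0), (0.1, 0), (0, 0.1), (-0.1, 0), (0, -0.1),
           (5, 5), (5.1, 5), (5, 5.1), (4.9, 5), (5, 4.9)]
    ys = [0] * 5 + [1] * 5
    m = fit_centroids_and_radii(pts, ys)
    for q in [(0.05, 0.0), (5.0, 5.05), (2.5, 2.5)]:
        print(q, "->", predict_open(m, q))
```

A tighter radius rejects more known-class samples (higher empirical risk), while a looser one admits more unknown samples (higher open space risk). That is the trade-off the abstract's boundary balance strategy is meant to address.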

Acknowledgements

This work is supported by the Zhongyuanyingcai program, funded to the Central Plains Science and Technology Innovation Leading Talent Program (No. 204200510002), the General Program of the Hebei Natural Science Foundation (No. F2022203028), the Program for Top 100 Innovative Talents in Colleges and Universities of Hebei Province (CXZZSS2023038), the General Program of the National Natural Science Foundation of China (No. 62172352), and the Central Leading Local Science and Technology Development Fund Project (No. 226Z0305G).

Author information

Authors and Affiliations

School of Information Science and Engineering, Yanshan University, Qinhuangdao, 066000, China
Ganlin Xu & Jianzhou Feng
School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, 100000, China
Qikai Wei

Corresponding author

Correspondence to Jianzhou Feng.

Editor information

Editors and Affiliations

Northeastern University, Shenyang, China
Xiaochun Yang
The University of Indonesia, Depok, Indonesia
Heru Suhartanto
Beijing Institute of Technology, Beijing, China
Guoren Wang
Northeastern University, Shenyang, China
Bin Wang
University of Technology Sydney, Sydney, NSW, Australia
Jing Jiang
Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
Bing Li
Sun Yat-sen University, Guangzhou, China
Huaijie Zhu
Anhui University, Hefei, China
Ningning Cui

About this paper

Cite this paper

Xu, G., Feng, J., Wei, Q. (2023). Open Text Classification Based on Dynamic Boundary Balance. In: Yang, X., et al. Advanced Data Mining and Applications. ADMA 2023. Lecture Notes in Computer Science, vol 14178. Springer, Cham. https://doi.org/10.1007/978-3-031-46671-7_10

DOI: https://doi.org/10.1007/978-3-031-46671-7_10
Published: 05 November 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-46670-0
Online ISBN: 978-3-031-46671-7
eBook Packages: Computer Science, Computer Science (R0)