Computer Science > Computation and Language

arXiv:2307.09007 (cs)

[Submitted on 18 Jul 2023 (v1), last revised 11 Dec 2023 (this version, v2)]

Title: On the (In)Effectiveness of Large Language Models for Chinese Text Correction

Authors: Yinghui Li, Haojing Huang, Shirong Ma, Yong Jiang, Yangning Li, Feng Zhou, Hai-Tao Zheng, Qingyu Zhou

Abstract: Recently, the development and progress of Large Language Models (LLMs) have amazed the entire Artificial Intelligence community. Benefiting from their emergent abilities, LLMs have attracted a growing number of researchers to study their capabilities and performance on various downstream Natural Language Processing (NLP) tasks. Beyond their strong performance across many kinds of tasks, we notice that LLMs also have excellent multilingual processing capabilities, including for Chinese. To explore the Chinese processing ability of LLMs, we focus on Chinese Text Correction, a fundamental and challenging Chinese NLP task. Specifically, we evaluate various representative LLMs on the Chinese Grammatical Error Correction (CGEC) and Chinese Spelling Check (CSC) tasks, which are two main Chinese Text Correction scenarios. We also fine-tune LLMs for Chinese Text Correction to better observe their potential capabilities. From extensive analyses and comparisons with previous state-of-the-art small models, we empirically find that current LLMs show both impressive performance and unsatisfactory behavior on Chinese Text Correction. We believe our findings will promote the deployment and application of LLMs in the Chinese NLP community.

Comments:	This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2307.09007 [cs.CL]
	(or arXiv:2307.09007v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2307.09007

Submission history

From: Yinghui Li
[v1] Tue, 18 Jul 2023 06:48:52 UTC (187 KB)
[v2] Mon, 11 Dec 2023 12:39:16 UTC (3,596 KB)
