arXiv:2403.01548 (cs)

[Submitted on 3 Mar 2024 (v1), last revised 12 Mar 2024 (this version, v3)]

Title: In-Context Sharpness as Alerts: An Inner Representation Perspective for Hallucination Mitigation

Authors: Shiqi Chen, Miao Xiong, Junteng Liu, Zhengxuan Wu, Teng Xiao, Siyang Gao, Junxian He

Abstract: Large language models (LLMs) frequently hallucinate and produce factual errors, yet our understanding of why they make these errors remains limited. In this study, we examine the underlying mechanisms of LLM hallucinations from the perspective of inner representations, and discover a salient pattern associated with hallucinations: correct generations tend to have sharper context activations in the hidden states of the in-context tokens than incorrect ones. Leveraging this insight, we propose an entropy-based metric to quantify the "sharpness" among the in-context hidden states and incorporate it into the decoding process to formulate a constrained decoding approach. Experiments on various knowledge-seeking and hallucination benchmarks demonstrate our approach's consistent effectiveness, for example, achieving up to an 8.6 point improvement on TruthfulQA. We believe this study can improve our understanding of hallucinations and serve as a practical solution for hallucination mitigation.
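The abstract gives no formulas, so the following is only an illustrative sketch of the general idea, not the authors' implementation. It assumes each candidate next token comes with one non-negative activation score per in-context token, normalizes those scores into a distribution, and treats low Shannon entropy as "sharp". The function names (`context_entropy`, `adjust_scores`, `pick_token`), the penalty weight `alpha`, and the linear penalty on the log-probability are all hypothetical choices made for this example.

```python
import math


def context_entropy(activations):
    """Shannon entropy (in nats) of activations normalized over in-context tokens.

    Assumption: activations are non-negative and not all zero.
    Lower entropy means the activation mass is concentrated ("sharper").
    """
    total = sum(activations)
    if total <= 0:
        raise ValueError("activations must have a positive sum")
    probs = [a / total for a in activations]
    return -sum(p * math.log(p) for p in probs if p > 0)


def adjust_scores(log_probs, entropies, alpha=1.0):
    """Penalize each candidate's log-probability by alpha times its entropy.

    The linear penalty is a hypothetical choice for illustration only.
    """
    return [lp - alpha * h for lp, h in zip(log_probs, entropies)]


def pick_token(log_probs, activations_per_candidate, alpha=1.0):
    """Return the index of the candidate with the highest adjusted score."""
    entropies = [context_entropy(a) for a in activations_per_candidate]
    scores = adjust_scores(log_probs, entropies, alpha)
    return max(range(len(scores)), key=scores.__getitem__)


if __name__ == "__main__":
    # Two made-up candidates: the first is slightly more probable but its
    # activations are spread evenly; the second is concentrated on one token.
    log_probs = [math.log(0.5), math.log(0.4)]
    activations = [[1.0, 1.0, 1.0, 1.0], [1.0, 0.0, 0.0, 0.0]]
    print(pick_token(log_probs, activations, alpha=1.0))
    print(pick_token(log_probs, activations, alpha=0.0))
```

With `alpha=0.0` the sketch reduces to ordinary greedy selection, which makes the effect of the entropy term easy to isolate when experimenting with toy inputs.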

Comments:	code repo is available at: this https URL
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as:	arXiv:2403.01548 [cs.CL]
	(or arXiv:2403.01548v3 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2403.01548

Submission history

From: Miao Xiong
[v1] Sun, 3 Mar 2024 15:53:41 UTC (1,339 KB)
[v2] Tue, 5 Mar 2024 18:41:07 UTC (1,339 KB)
[v3] Tue, 12 Mar 2024 09:49:28 UTC (1,339 KB)
