arXiv:2407.11009 (cs)

[Submitted on 25 Jun 2024]

Title: CharED: Character-wise Ensemble Decoding for Large Language Models

Authors: Kevin Gu, Eva Tuecke, Dmitriy Katz, Raya Horesh, David Alvarez-Melis, Mikhail Yurochkin

Abstract: Large language models (LLMs) have shown remarkable potential for problem solving, with open-source models achieving increasingly impressive performance on benchmarks measuring areas from logical reasoning to mathematical ability. Ensembling models can further improve capabilities across a variety of domains. However, conventional methods of combining models at inference time, such as shallow fusion, necessitate a shared vocabulary and tokenization, and alternatives like fine-tuning for domain-specific performance are both time-consuming and computationally expensive. We therefore present an inference-time ensembling algorithm aimed at "averaging" outputs from multiple LLMs and illustrate its improved performance across multiple domains compared to its constituent models alone. Character-wise ensemble decoding, CharED, finds the marginal distribution of each character for an individual model and performs a weighted average to generate an output, character by character. In coding, math, and toxicity benchmarks, we find our proposed method able to combine complementary strengths of multiple LLMs, regardless of vocabulary, tokenization, or model size.
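The character-level averaging described in the abstract can be illustrated with a minimal Python sketch of a single decoding step. This is not the authors' implementation: the helper names (`char_marginal`, `weighted_average`), the toy next-token distributions, and the weights are all assumptions made for illustration. The sketch also assumes that a model's character marginal is obtained by summing next-token probabilities over each token's first character. A full decoder would additionally need to carry per-model state from one character to the next, which is omitted here.

```python
from collections import defaultdict


def char_marginal(token_probs):
    """Collapse a next-token distribution into a next-character distribution.

    Hypothetical helper: sums the probability of every token that starts
    with a given character, then renormalizes.
    """
    marginal = defaultdict(float)
    for token, p in token_probs.items():
        if token:  # ignore empty tokens
            marginal[token[0]] += p
    total = sum(marginal.values())
    return {ch: p / total for ch, p in marginal.items()}


def weighted_average(marginals, weights):
    """Combine per-model character distributions with the given weights."""
    combined = defaultdict(float)
    for marginal, w in zip(marginals, weights):
        for ch, p in marginal.items():
            combined[ch] += w * p
    return dict(combined)


# Toy next-token distributions from two models with different vocabularies
# (made-up numbers, not taken from the paper).
model_a = {"he": 0.5, "hello": 0.25, "world": 0.25}
model_b = {"h": 0.25, "wo": 0.5, "w": 0.25}

marginals = [char_marginal(model_a), char_marginal(model_b)]
combined = weighted_average(marginals, weights=[0.75, 0.25])

# Greedy choice of the next character from the averaged distribution.
next_char = max(combined, key=combined.get)
```

Because both models are reduced to distributions over the same character alphabet, the average is well defined even though their token vocabularies differ, which is the property the abstract emphasizes.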

Comments:	9 pages, 4 figures
Subjects:	Computation and Language (cs.CL); Machine Learning (cs.LG)
Cite as:	arXiv:2407.11009 [cs.CL]
	(or arXiv:2407.11009v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2407.11009

Submission history

From: Eva Tuecke
[v1] Tue, 25 Jun 2024 22:35:07 UTC (442 KB)
