Computer Science > Computation and Language

arXiv:2305.16765 (cs)

[Submitted on 26 May 2023]

Title: Backpack Language Models

Authors: John Hewitt, John Thickstun, Christopher D. Manning, Percy Liang

Abstract: We present Backpacks: a new neural architecture that marries strong modeling performance with an interface for interpretability and control. Backpacks learn multiple non-contextual sense vectors for each word in a vocabulary, and represent a word in a sequence as a context-dependent, non-negative linear combination of sense vectors in this sequence. We find that, after training, sense vectors specialize, each encoding a different aspect of a word. We can interpret a sense vector by inspecting its (non-contextual, linear) projection onto the output space, and intervene on these interpretable hooks to change the model's behavior in predictable ways. We train a 170M-parameter Backpack language model on OpenWebText, matching the loss of a GPT-2 small (124M-parameter) Transformer. On lexical similarity evaluations, we find that Backpack sense vectors outperform even a 6B-parameter Transformer LM's word embeddings. Finally, we present simple algorithms that intervene on sense vectors to perform controllable text generation and debiasing. For example, we can edit the sense vocabulary to tend more towards a topic, or localize a source of gender bias to a sense vector and globally suppress that sense.
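
As a rough illustration of the representation described in the abstract, the sketch below forms one word's representation as a non-negative weighted sum of the fixed sense vectors of every word in the sequence. It is a toy under stated assumptions, not the authors' implementation: the vocabulary, the three-dimensional vectors, the name `combine_senses`, and the hand-picked weights are all hypothetical. In a real model the weights would be computed from the context; here they are simply given.

```python
# Toy illustration only: the vocabulary, vectors, and weights are made up.

# Non-contextual sense vectors: each word maps to a fixed list of sense vectors.
SENSE_VECTORS = {
    "the": [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    "cat": [[0.0, 0.0, 1.0], [1.0, 1.0, 0.0]],
}


def combine_senses(words, weights):
    """Return the weighted sum of the sense vectors of every word in `words`.

    weights[j][l] is the non-negative weight given to sense l of words[j]
    when representing one chosen position in the sequence.
    """
    dim = len(next(iter(SENSE_VECTORS.values()))[0])
    out = [0.0] * dim
    for word, word_weights in zip(words, weights):
        for sense, weight in zip(SENSE_VECTORS[word], word_weights):
            if weight < 0:
                raise ValueError("weights must be non-negative")
            for d in range(dim):
                out[d] += weight * sense[d]
    return out


# Hypothetical context-dependent weights for representing "cat" in "the cat".
representation = combine_senses(["the", "cat"], [[0.5, 0.0], [0.25, 0.25]])
print(representation)
```

Because the sense vectors are looked up independently of context and only the weights depend on it, changing one entry of `SENSE_VECTORS` in this toy affects every sequence in which that word appears, which is the property the abstract relies on for global interventions such as suppressing a sense.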

Comments:	ACL 2023 Camera-Ready
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2305.16765 [cs.CL]
	(or arXiv:2305.16765v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2305.16765

Submission history

From: John Hewitt
[v1] Fri, 26 May 2023 09:26:23 UTC (274 KB)
