Computer Science > Artificial Intelligence

arXiv:2502.20379 (cs)

[Submitted on 27 Feb 2025]

Title:Multi-Agent Verification: Scaling Test-Time Compute with Multiple Verifiers

Authors:Shalev Lifshitz, Sheila A. McIlraith, Yilun Du

Abstract:By utilizing more computational resources at test-time, large language models (LLMs) can improve without additional training. One common strategy uses verifiers to evaluate candidate outputs. In this work, we propose a novel scaling dimension for test-time compute: scaling the number of verifiers. We introduce Multi-Agent Verification (MAV) as a test-time compute paradigm that combines multiple verifiers to improve performance. We propose using Aspect Verifiers (AVs), off-the-shelf LLMs prompted to verify different aspects of outputs, as one possible choice for the verifiers in a MAV system. AVs are a convenient building block for MAV since they can be easily combined without additional training. Moreover, we introduce BoN-MAV, a simple multi-agent verification algorithm that combines best-of-n sampling with multiple verifiers. BoN-MAV demonstrates stronger scaling patterns than self-consistency and reward model verification, and we demonstrate both weak-to-strong generalization, where combining weak verifiers improves even stronger LLMs, and self-improvement, where the same base model is used to both generate and verify outputs. Our results establish scaling the number of verifiers as a promising new dimension for improving language model performance at test-time.

Subjects:	Artificial Intelligence (cs.AI)
Cite as:	arXiv:2502.20379 [cs.AI]
	(or arXiv:2502.20379v1 [cs.AI] for this version)
	https://doi.org/10.48550/arXiv.2502.20379

Submission history

From: Shalev Lifshitz [view email]
[v1] Thu, 27 Feb 2025 18:53:30 UTC (395 KB)

Computer Science > Artificial Intelligence

Title:Multi-Agent Verification: Scaling Test-Time Compute with Multiple Verifiers

Submission history

Access Paper:

References & Citations

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators

Computer Science > Artificial Intelligence

Title:Multi-Agent Verification: Scaling Test-Time Compute with Multiple Verifiers

Submission history

Access Paper:

References & Citations

BibTeX formatted citation

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators