arXiv:2203.04274 (cs)

[Submitted on 8 Mar 2022]

Title: Leveraging Initial Hints for Free in Stochastic Linear Bandits

Authors: Ashok Cutkosky, Chris Dann, Abhimanyu Das, Qiuyi (Richard) Zhang

Abstract: We study optimization with bandit feedback when the learner is given additional prior knowledge in the form of an initial hint of the optimal action. We present a novel algorithm for stochastic linear bandits that uses this hint to improve its regret to $\tilde O(\sqrt{T})$ when the hint is accurate, while maintaining a minimax-optimal $\tilde O(d\sqrt{T})$ regret independent of the quality of the hint. Furthermore, we provide a Pareto frontier of tight tradeoffs between best-case and worst-case regret, with matching lower bounds. Perhaps surprisingly, our work shows that leveraging a hint yields provable gains without sacrificing worst-case performance, implying that our algorithm adapts to the quality of the hint for free. We also extend our algorithm to the case of $m$ initial hints, showing that it can achieve $\tilde O(m^{2/3}\sqrt{T})$ regret.

Comments:	ALT 2022
Subjects:	Machine Learning (cs.LG); Data Structures and Algorithms (cs.DS)
Cite as:	arXiv:2203.04274 [cs.LG]
	(or arXiv:2203.04274v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2203.04274

Submission history

From: Qiuyi Zhang
[v1] Tue, 8 Mar 2022 18:48:55 UTC (67 KB)
