Computer Science > Machine Learning

arXiv:2203.14343 (cs)

[Submitted on 27 Mar 2022 (v1), last revised 18 May 2022 (this version, v3)]

Title: Diagonal State Spaces are as Effective as Structured State Spaces

Authors: Ankit Gupta, Albert Gu, Jonathan Berant


Abstract: Modeling long-range dependencies in sequential data is a fundamental step towards attaining human-level performance in many modalities such as text, vision, audio and video. While attention-based models are a popular and effective choice for modeling short-range interactions, their performance on tasks requiring long-range reasoning has been largely inadequate. In an exciting result, Gu et al. (ICLR 2022) proposed the $\textit{Structured State Space}$ (S4) architecture, delivering large gains over state-of-the-art models on several long-range tasks across various modalities. The core proposition of S4 is the parameterization of state matrices via a diagonal plus low-rank structure, allowing efficient computation. In this work, we show that one can match the performance of S4 even without the low-rank correction, that is, by assuming the state matrices to be diagonal. Our $\textit{Diagonal State Space}$ (DSS) model matches the performance of S4 on Long Range Arena tasks and on speech classification on the Speech Commands dataset, while being conceptually simpler and straightforward to implement.
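To make the role of a diagonal state matrix concrete, here is a minimal sketch of a generic discrete-time linear state space model whose state matrix is diagonal. It is not the paper's DSS parameterization or discretization: the recurrence form, the function names (`diagonal_ssm_kernel`, `run_recurrence`, `run_convolution`) and all parameter values are assumptions chosen for illustration. It only shows that, with a diagonal state matrix, each state coordinate evolves independently, so the same output can be computed step by step as a recurrence or as a causal convolution with a kernel built from elementwise powers.

```python
# Hypothetical illustration of a linear state space model with a diagonal
# state matrix. All names and numbers are made up for this sketch; they are
# not taken from the paper.

def diagonal_ssm_kernel(lam, b, c, length):
    """Return K[k] = sum_i c[i] * lam[i]**k * b[i] for k = 0..length-1."""
    kernel = []
    powers = [1.0 + 0j] * len(lam)  # holds lam[i]**k, starting at k = 0
    for _ in range(length):
        kernel.append(sum(ci * pi * bi for ci, pi, bi in zip(c, powers, b)))
        powers = [pi * li for pi, li in zip(powers, lam)]
    return kernel

def run_recurrence(lam, b, c, u):
    """Step x_k = lam * x_{k-1} + b * u_k elementwise; y_k = sum_i c[i] * x_k[i]."""
    x = [0j] * len(lam)
    ys = []
    for uk in u:
        x = [li * xi + bi * uk for li, xi, bi in zip(lam, x, b)]
        ys.append(sum(ci * xi for ci, xi in zip(c, x)))
    return ys

def run_convolution(kernel, u):
    """Causal convolution y_k = sum_{j=0..k} kernel[j] * u[k-j]."""
    return [sum(kernel[j] * u[k - j] for j in range(k + 1)) for k in range(len(u))]

# Assumed toy parameters: three complex diagonal entries and a short input.
lam = [0.9 + 0.1j, 0.5 - 0.3j, -0.2 + 0.0j]
b = [1.0, 1.0, 1.0]
c = [0.3 - 0.2j, -0.1 + 0.4j, 0.7 + 0.0j]
u = [1.0, 0.0, -2.0, 0.5, 3.0]

y_rnn = run_recurrence(lam, b, c, u)
y_conv = run_convolution(diagonal_ssm_kernel(lam, b, c, len(u)), u)

# The two views agree up to floating-point rounding.
max_gap = max(abs(p - q) for p, q in zip(y_rnn, y_conv))
```

In this toy setting the kernel costs one multiplication per state coordinate per time step, with no matrix powers involved, which is the kind of simplification a diagonal state matrix allows.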

Comments:	updated version with simpler DSS variants, RNN view for autoregressive decoding, ablation analysis, analysis of trained model parameters and kernels
Subjects:	Machine Learning (cs.LG); Computation and Language (cs.CL)
Cite as:	arXiv:2203.14343 [cs.LG]
	(or arXiv:2203.14343v3 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2203.14343

Submission history

From: Ankit Gupta
[v1] Sun, 27 Mar 2022 16:30:33 UTC (316 KB)
[v2] Tue, 17 May 2022 15:10:10 UTC (835 KB)
[v3] Wed, 18 May 2022 18:30:07 UTC (835 KB)
