Add rolling window to sklearn.model_selection.TimeSeriesSplit · Issue #22523 · scikit-learn/scikit-learn · GitHub
Add rolling window to sklearn.model_selection.TimeSeriesSplit #22523
Open
@cmp1

Description

Describe the workflow you want to enable

I wanted to ask whether any plans exist to implement a rolling/sliding window method in the TimeSeriesSplit class:

[Image: newplot.png (https://i.ibb.co/KWkfQ7q/newplot.png)]

Currently, we are limited to using the expanding window type. For many financial time series models where a feature experiences a structural break, having a model whose weights are trained on the entire history can prove suboptimal.

I noted from #13204, specifically svenstehle's comments, that this might be on the horizon. Is that still the case?

Describe your proposed solution

Current Implementation

>>> import numpy as np
>>> from sklearn.model_selection import TimeSeriesSplit
>>> x = np.arange(15)
>>> cv = TimeSeriesSplit(n_splits=3, gap=2)
>>> for train_index, test_index in cv.split(x):
...     print("TRAIN:", train_index, "TEST:", test_index)
TRAIN: [0 1 2 3] TEST: [6 7 8]
TRAIN: [0 1 2 3 4 5 6] TEST: [ 9 10 11]
TRAIN: [0 1 2 3 4 5 6 7 8 9] TEST: [12 13 14]

Desired outcome

>>> x = np.arange(10)
>>> cv = TimeSeriesSplit(n_splits='walk_fw', max_train_size=3, max_test_size=1)
>>> for train_index, test_index in cv.split(x):
...     print("TRAIN:", train_index, "TEST:", test_index)
TRAIN: [0 1 2] TEST: [3]
TRAIN: [1 2 3] TEST: [4]
TRAIN: [2 3 4] TEST: [5]
TRAIN: [3 4 5] TEST: [6]
TRAIN: [4 5 6] TEST: [7]
TRAIN: [5 6 7] TEST: [8]
TRAIN: [6 7 8] TEST: [9]
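
For comparison, the same index pattern appears to be reachable with parameters that already exist, assuming scikit-learn 0.24 or later (where `test_size` was added to `TimeSeriesSplit`). This is only a sketch: `n_splits` has to be computed by hand from the data length, which is the inconvenience a `'walk_fw'`-style option would remove. The variable names `train_size`, `test_size`, and `splits` are illustrative.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

x = np.arange(10)
train_size, test_size = 3, 1

# Number of fixed-size windows that fit when stepping by test_size.
n_splits = (len(x) - train_size) // test_size  # 7 for this example

cv = TimeSeriesSplit(
    n_splits=n_splits,
    max_train_size=train_size,  # caps the training window, so it rolls
    test_size=test_size,
)

splits = [(train.tolist(), test.tolist()) for train, test in cv.split(x)]
for train, test in splits:
    print("TRAIN:", train, "TEST:", test)
# First line printed: TRAIN: [0, 1, 2] TEST: [3]
# Last line printed:  TRAIN: [6, 7, 8] TEST: [9]
```

Here the stride is always equal to `test_size`; the existing parameters do not offer a way to step by a different amount.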

Should the 'stride' of the walk forward be proportional to the test set size, or should it step by the max_train_size parameter?
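
To make the two stride options concrete, here is a minimal pure-Python sketch. `rolling_window_split` and its `stride` argument are hypothetical names used for illustration only; they are not part of scikit-learn. By default the window steps by `test_size`, and passing `stride=train_size` gives the other behaviour asked about.

```python
from typing import Iterator, List, Optional, Tuple


def rolling_window_split(
    n_samples: int,
    train_size: int,
    test_size: int = 1,
    stride: Optional[int] = None,
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists for a fixed-size walk-forward window.

    Hypothetical helper, not a scikit-learn API. `stride` defaults to
    `test_size`, so consecutive test sets do not overlap.
    """
    if train_size < 1 or test_size < 1:
        raise ValueError("train_size and test_size must be positive")
    step = test_size if stride is None else stride
    if step < 1:
        raise ValueError("stride must be positive")

    start = 0
    # Stop once a full train window plus a full test window no longer fits.
    while start + train_size + test_size <= n_samples:
        train_end = start + train_size
        yield (
            list(range(start, train_end)),
            list(range(train_end, train_end + test_size)),
        )
        start += step


# Stride equal to the test set size: 7 splits, matching the desired outcome.
by_test = list(rolling_window_split(10, train_size=3, test_size=1))

# Stride equal to the training window size: 3 non-overlapping train windows.
by_train = list(rolling_window_split(10, train_size=3, test_size=1, stride=3))

for train, test in by_train:
    print("TRAIN:", train, "TEST:", test)
# TRAIN: [0, 1, 2] TEST: [3]
# TRAIN: [3, 4, 5] TEST: [6]
# TRAIN: [6, 7, 8] TEST: [9]
```

Stepping by the test size uses every observation after the first window as a test point exactly once; stepping by the training size yields fewer, non-overlapping training windows and leaves some observations untested.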

Describe alternatives you've considered, if relevant

No response

Additional context

No response
