Redesign of the sklearn.datasets API · Issue #13122 · scikit-learn/scikit-learn · GitHub

Redesign of the sklearn.datasets API #13122


Closed
daniel-cortez-stevenson opened this issue Feb 8, 2019 · 2 comments

Comments

@daniel-cortez-stevenson

@rth At the API level I agree the functional programming style is friendlier. The benefit of a more object-oriented design would really come when the sklearn datasets API is extended, allowing for easier maintenance and ensuring similar behavior across loaders. The proposed pull request would not make sense if the datasets API is 'closed' to extension.

@rth I also agree that a Datasets API is more obviously necessary in deep learning libraries. But, taking a leap of faith here, I'm hoping that a thoughtful redesign may identify useful extensions of a Dataset class that support the reproducibility and performance of scikit-learn's more notable functionality (Estimators/Transformers).

For example, could extending datasets to act as generators meaningfully improve the memory use or run time of sklearn?
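
To make the generator question concrete, here is a minimal sketch of what a dataset that yields mini-batches might look like. The `BatchedDataset` name, its `batch_size` parameter, and the toy data are all assumptions made for illustration; nothing like this exists in sklearn.datasets or in the PR. The toy data lives in memory, so this only shows the iteration pattern; any real memory saving would depend on reading each batch from disk on demand.

```python
from typing import Iterator, List, Sequence, Tuple

Row = Sequence[float]


class BatchedDataset:
    """Hypothetical dataset that yields (X, y) mini-batches one at a time.

    Not part of scikit-learn; an illustration of the generator idea only.
    """

    def __init__(self, X: Sequence[Row], y: Sequence[int], batch_size: int = 2) -> None:
        if batch_size < 1:
            raise ValueError("batch_size must be at least 1")
        if len(X) != len(y):
            raise ValueError("X and y must have the same length")
        self.X = X
        self.y = y
        self.batch_size = batch_size

    def __iter__(self) -> Iterator[Tuple[List[Row], List[int]]]:
        # Produce one slice per step, so a consumer handles a single batch
        # at a time instead of the full arrays.
        for start in range(0, len(self.X), self.batch_size):
            stop = start + self.batch_size
            yield list(self.X[start:stop]), list(self.y[start:stop])


# Toy data, assumed for the example only.
dataset = BatchedDataset(
    X=[[0.0, 1.0], [1.0, 0.0], [0.5, 0.5], [0.2, 0.8], [0.9, 0.1]],
    y=[0, 1, 0, 0, 1],
    batch_size=2,
)
batch_sizes = [len(X_batch) for X_batch, _ in dataset]
```

Batches like these could in principle be fed to estimators that support incremental learning through `partial_fit`, though whether that helps in practice is exactly the open question.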

I'll take a look at OpenML, and open up a fresh issue to start a high level conversation. I hope this PR serves to get some brain juices flowing.

Originally posted by @daniel-cortez-stevenson in #13120 (comment)

@daniel-cortez-stevenson
Author

Opening this issue to discuss a potential redesign of the sklearn.datasets API to support a more object-oriented design. A prototype of a redesigned sklearn.datasets.base module can be found in PR #13120. This was also briefly discussed in #10733 and may facilitate closing #10972 and #11818.
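
As a rough illustration of the kind of object-oriented design under discussion, the sketch below puts shared loading behavior in a base class so that each dataset only has to supply its raw data. The names `BaseDataset`, `ToyDataset`, `_read`, and `load` are hypothetical and are not taken from PR #13120; only the `return_X_y` flag mirrors an option that existing loaders such as `load_iris` already offer.

```python
from abc import ABC, abstractmethod
from typing import List, Tuple


class BaseDataset(ABC):
    """Hypothetical base class; not part of scikit-learn or PR #13120."""

    @abstractmethod
    def _read(self) -> Tuple[List[List[float]], List[int]]:
        """Return the raw (data, target) pair for this dataset."""

    def load(self, return_X_y: bool = False):
        # Shared behavior lives here once, so every subclass handles
        # validation and the return_X_y switch in the same way.
        data, target = self._read()
        if len(data) != len(target):
            raise ValueError("data and target must have the same length")
        if return_X_y:
            return data, target
        return {"data": data, "target": target}


class ToyDataset(BaseDataset):
    """Tiny in-memory dataset, assumed for the example only."""

    def _read(self) -> Tuple[List[List[float]], List[int]]:
        return [[0.0, 1.0], [1.0, 0.0]], [0, 1]


X, y = ToyDataset().load(return_X_y=True)
bunch = ToyDataset().load()
```

The trade-off raised in the earlier comment still applies: a function call per dataset is simpler for users, while a shared base class mainly pays off when new datasets are added.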

@daniel-cortez-stevenson
Author

Discussion with a better overview and scope definition has been started at #13123. Closing this issue.
