Add support for sparse input to the Bagging models · Issue #2399 · scikit-learn/scikit-learn · GitHub

Add support for sparse input to the Bagging models #2399


Closed
ogrisel opened this issue Aug 27, 2013 · 7 comments
Labels
Enhancement Moderate Anything that requires some knowledge of conventions and best practices

Comments

@ogrisel
Member
ogrisel commented Aug 27, 2013

Bagging models as implemented in #2375 currently only support dense array-like input. We need to add support for CSR or CSC input.

Depending on whether sample bagging (without weights) and/or feature bagging are enabled, and on the input data representation we receive, some copies or resamples of the input data are likely to be required (by calling tocsr() or tocsc()).
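
A rough sketch of that trade-off, using SciPy directly rather than the bagging code from #2375: selecting rows (samples) is cheap on CSR and selecting columns (features) is cheap on CSC, so a conversion copy may be needed depending on the input format. The helper name `subsample` and the toy sizes are assumptions made for illustration.

```python
import numpy as np
from scipy import sparse


def subsample(X, sample_idx, feature_idx):
    """Return X[sample_idx][:, feature_idx] while keeping the data sparse.

    Hypothetical helper, not part of scikit-learn.
    """
    # Row indexing is efficient on CSR; tocsr() copies only if X is not CSR.
    X_rows = X.tocsr()[sample_idx]
    # Column indexing is efficient on CSC; this conversion makes a copy.
    X_sub = X_rows.tocsc()[:, feature_idx]
    return X_sub.tocsr()


rng = np.random.RandomState(0)
X = sparse.random(20, 8, density=0.25, format="csc", random_state=rng)

# Bootstrap sample of rows (with replacement) and a random subset of columns.
sample_idx = rng.randint(0, X.shape[0], size=X.shape[0])
feature_idx = rng.choice(X.shape[1], size=4, replace=False)

X_sub = subsample(X, sample_idx, feature_idx)
```

With only sample bagging enabled, the `tocsc()` step could be skipped; with only feature bagging, the `tocsr()` step could be.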

@amueller
Member
amueller commented Jan 5, 2014

Any particular reason this feature is tagged for 0.15? Because we don't want to introduce new estimators without sparse support? Usually I wouldn't flag new features for releases.

@GaelVaroquaux
Member

Any particular reason this feature is tagged for 0.15? Because we don't want to
introduce new estimators without sparse support? Usually I wouldn't flag new
features for releases.

Agreed from my side.

@hamsal
Contributor
hamsal commented Mar 11, 2014

Is there anything that needs to be done here that is outside the scope of issue #655? As I understand it, sparse input support needs to be implemented at the decision tree level, so the two issues are analogous.

@glouppe
Contributor
glouppe commented Mar 11, 2014

The bagging module is not tree-specific. It works with any base estimator.
The issue with the current implementation is that it does not check whether
the base estimator supports sparse data. As such, data is always densified
internally before being fed to the base models.
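
One possible way to avoid unconditional densification, sketched with toy estimators: hand the sparse matrix to the base estimator and fall back to a dense copy only if it is rejected. The class names, the `fit_base_estimator` helper and the use of `TypeError` as the rejection signal are all assumptions for illustration, not what the bagging module does.

```python
import numpy as np
from scipy import sparse


class DenseOnlyEstimator:
    """Hypothetical estimator that rejects sparse input."""

    def fit(self, X, y):
        if sparse.issparse(X):
            raise TypeError("dense data is required")
        self.fitted_on_sparse_ = False
        return self


class SparseAwareEstimator:
    """Hypothetical estimator that accepts sparse input as-is."""

    def fit(self, X, y):
        self.fitted_on_sparse_ = sparse.issparse(X)
        return self


def fit_base_estimator(estimator, X, y):
    """Try sparse X first; densify only if the estimator refuses it."""
    try:
        return estimator.fit(X, y)
    except TypeError:
        # Assumed rejection signal; a real estimator may fail differently.
        return estimator.fit(X.toarray(), y)


X = sparse.csr_matrix(np.eye(4))
y = np.array([0, 1, 0, 1])

sparse_model = fit_base_estimator(SparseAwareEstimator(), X, y)
dense_model = fit_base_estimator(DenseOnlyEstimator(), X, y)
```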

On 11 March 2014 03:22, hamsal wrote:

Is there anything that needs to be done here that is outside of the scope
of issue #655?
The sparse input support needs to be implemented at the decision tree level
as I understand so both of these issues are analogous.

@ogrisel ogrisel removed this from the 0.15 milestone Mar 14, 2014
@ogrisel
Member Author
ogrisel commented Mar 14, 2014

I cleared the 0.15 tag.

@msalahi
msalahi commented Apr 16, 2014

Where does the densifying happen? I'm trying to get it to break on sparse input to a classifier that should be able to handle it (namely, KNeighborsClassifier). As far as I can tell, the data gets all the way through BaggingClassifier to KNeighborsClassifier.fit without being densified.

I checked the type of the input data directly before line 113, which appears to be where the hand-off to the KNeighborsClassifier happens, and it's still sparse at that point. Here's my attempt at reproducing this issue. Does anyone know where to look to find the problem?

@arjoly
Member
arjoly commented Apr 19, 2014

closed by #3076

@arjoly arjoly closed this as completed May 11, 2014