Problem fitting LassoLarsCV - broadcasting error #716

Closed
conradlee opened this issue Mar 21, 2012 · 20 comments

@conradlee
Contributor

I'm trying to follow along the example presented here on sparse feature recovery using a randomized lasso.

Part of that example requires fitting a LassoLarsCV estimator. However, when I try to fit it, I get an error; you can see the traceback here.

You can re-create this problem by downloading the data that I'm using here. Then try the following:

from scipy import io
from sklearn.linear_model import LassoLarsCV

matlab_dict = io.loadmat("LassoProblemData.mat")
X, y = matlab_dict["X"], matlab_dict["y"]
lars_cv = LassoLarsCV(cv=3, n_jobs=1)
lars_cv.fit(X, y)
@amueller
Member

Hi Conrad.
The problem is the shape of y, which is (639, 1) in your example,
but should be (639,). You can solve this problem by using ravel:

lars_cv.fit(X, y.ravel())


This is a known issue, but I'm not sure if we reached a consensus on how
to handle this situation and if the current behavior is the "right" one.
I don't like it very much ;)

Cheers,
Andy
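
To illustrate the shape difference described above, here is a minimal NumPy sketch. The array is a stand-in with the same shape as the reported `y`; it is an assumption for illustration and not the actual data from the .mat file.

```python
import numpy as np

# Stand-in for the column vector loaded from the .mat file (not the real data).
y = np.zeros((639, 1))
print(y.shape)        # (639, 1): a 2D column vector

# ravel() flattens it into a 1D array of shape (n_samples,).
y_flat = y.ravel()
print(y_flat.shape)   # (639,)
```

Passing `y_flat` instead of `y` to `fit` is the workaround suggested in this comment.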

@ogrisel
Member
ogrisel commented Mar 21, 2012

LassoLars does not accept multivariate output, hence the shape of your y, (639, 1), is invalid. There is some effort to explicitly support multivariate targets for linear regression in PR #685, but that does not cover LassoLars either. In your case you can just replace y with y.ravel() to make it a strict 1D array with shape (639,).

Any PR to make LassoLars multivariate or to make the error message more explicit is welcomed :)
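
As a rough idea of what a more explicit error message could look like, here is a small sketch. The helper name `check_1d_target` and the message wording are hypothetical; this is not scikit-learn code, only an illustration of checking the target shape up front.

```python
import numpy as np


def check_1d_target(y):
    """Hypothetical helper: return y as an array or raise a clear error."""
    y = np.asarray(y)
    if y.ndim != 1:
        # Name the offending shape and point at the usual fix.
        raise ValueError(
            "y must be 1D with shape (n_samples,), got shape %s; "
            "try y.ravel() for a single-column target" % (y.shape,)
        )
    return y


# A column vector like the one in this report triggers the explicit message.
try:
    check_1d_target(np.zeros((639, 1)))
except ValueError as exc:
    message = str(exc)
    print(message)
```

A check like this reports the shape problem directly instead of letting it surface later as a broadcasting error.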

@conradlee
Contributor Author

Thanks for the quick reply. However, even when I shape the labels correctly, I get another error; see the traceback here.

@ogrisel
Member
ogrisel commented Mar 21, 2012

Are you running mater? I cannot reproduce this.

@conradlee
Contributor Author

Sorry, what do you mean by "running mater"? I am running a version that I checked out from GitHub today.

@ogrisel
Member
ogrisel commented Mar 21, 2012

I meant "master".

@conradlee
Contributor Author

Hmm, I got this working now. I think I was just not correctly following the clean/make/build/install cycle. Once I cleaned, rebuilt, and installed everything again, the problem went away. Sorry for the bother!

@conradlee conradlee reopened this Mar 21, 2012
@conradlee
Contributor Author

OK, I got to the bottom of this. It turns out it wasn't a build problem on my end: the problem arises because the labels are of type int32. You can re-create it with the following:

from scipy import io
import numpy
from sklearn.linear_model import LassoLarsCV

matlab_dict = io.loadmat("LassoProblemData.mat")
X, y = matlab_dict["X"], matlab_dict["y"]
y = numpy.ravel(y)

y = y.astype("i4")

lars_cv = LassoLarsCV(cv=3, n_jobs=1)
lars_cv.fit(X, y)

(you can download the file "LassoProblemData.mat" here)

@agramfort
Member

Can you send us a PR that makes use of utils.as_float_array, which would fix the problem?

thanks for the bug report
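
From the caller's side, casting the labels to a floating-point dtype before calling `fit` should sidestep the int32 issue. The sketch below uses plain NumPy and made-up labels rather than the data from the report; it only illustrates the cast and is not the library change requested above.

```python
import numpy as np

# Made-up integer labels standing in for the int32 targets in the report.
y_int = np.array([0, 1, 1, 0, 2], dtype="i4")
print(y_int.dtype)     # int32

# Cast to float64 before passing the labels to fit(); the values are unchanged.
y_float = y_int.astype(np.float64)
print(y_float.dtype)   # float64
```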

@agramfort
Member

Maybe you can check that the problem does not appear with the other linear models and add a test for it :)

@amueller
Member

I tried adding tests using #893 but I couldn't reproduce the problem there. Using your example, I get a different traceback than you posted. I get ValueError: cannot convert float NaN to integer. Is this also your current error?
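
For reference, that message is the one Python raises when a NaN is converted to an integer. The snippet below only reproduces the message in isolation; where the conversion happens inside the estimator is not established in this thread.

```python
# Converting a NaN to an integer raises a ValueError in plain Python.
try:
    int(float("nan"))
except ValueError as exc:
    message = str(exc)
    print(message)   # cannot convert float NaN to integer
```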

@conradlee
Contributor Author

Sorry Andreas, I'll be away from my development machine for a few weeks, and so won't be able to check what the current error is.

@amueller
Member

@conradlee No problem. It would be great if you had time to look into this at some point. I'll go for the other issues in the meantime ;)

@amueller
Member

@conradlee to me it seems the example is working with current master. Could you please have a look? If you don't have this problem any more, I'll close the issue.

@conradlee
Contributor Author

@amueller I still get a problem, you can see the exception here.

Remember, I'm not running the code mentioned at the beginning of this thread, but the stuff in the middle. Here's exactly what I ran:

from scipy import io
import numpy
from sklearn.linear_model import LassoLarsCV

matlab_dict = io.loadmat("LassoProblemData.mat")
X, y = matlab_dict["X"], matlab_dict["y"]
y = numpy.ravel(y)

y = y.astype("i4")

lars_cv = LassoLarsCV(cv=3, n_jobs=1)
lars_cv.fit(X, y)

The data comes from here.

@amueller
Member

Ok thanks, I can reproduce.
Btw the paste site is pretty annoying, I had to figure out how to disable my ad blocker.

@ogrisel
Member
ogrisel commented Aug 27, 2012

Btw the paste site is pretty annoying, I had to figure out how to disable my ad blocker.

+1, gists are better anyway.

@GaelVaroquaux
Member

On Sun, Aug 26, 2012 at 12:33:48PM -0700, Andreas Mueller wrote:

Ok thanks, I can reproduce.

I had to fix a minor detail (in ca36d73) to get the code running on my
computer, but after that, I cannot reproduce. Can you still
reproduce?

G

@amueller
Member

I could reproduce, but it should be fixed in 06112e5.

@amueller
Member

I guess your modification made mine obsolete? Mine is more upstream in the code, though, which I think is the place the input check should happen.
