Vague Error Message for Linear Regression when X is 1D #4466
Comments
We try our best to salvage what the user gave us. I'd agree we failed here. We want consistent treatment of 1D X, and we don't have that yet. |
Someone pointed out to me that scikit-learn/sklearn/utils/validation.py Line 334 in 69827c4
So this is either a numpy problem (seems unlikely to me) or the |
I agree that the error message is crap, but I disagree that we should
cast the shape to 2D.
|
Why not cast to 2D and print a warning? The x[:,np.newaxis] fix seems cumbersome. Also, it seems like the intention of check_X_y was to cast the shape to 2D... Has the thinking changed? |
> Why not cast to 2D and print a warning? The x[:,np.newaxis] fix seems
> cumbersome.
Because then errors in code will not be caught. Code that is too lax lets
errors go through (i.e., type systems are a good thing).
> Also, it seems like the intention of check_X_y was to cast the shape to 2D...
> Has the thinking changed?
No, it was to raise good errors.
|
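For illustration, the two positions in this exchange (cast and warn versus raise a clear error) could be sketched roughly as below. The helper `ensure_2d` and its `strict` flag are hypothetical names made up for this sketch; this is not scikit-learn's `check_array` or `check_X_y`, only a NumPy-based approximation of the idea.

```python
import warnings

import numpy as np


def ensure_2d(X, strict=True):
    """Hypothetical helper (not part of scikit-learn).

    strict=True  -> reject 1D input with an explicit, actionable error.
    strict=False -> cast 1D input to a single column and warn about it.
    """
    X = np.asarray(X)
    if X.ndim == 1:
        if strict:
            raise ValueError(
                "Expected a 2D array of shape (n_samples, n_features), "
                f"got a 1D array of shape {X.shape}. Use X.reshape(-1, 1) "
                "for a single feature or X.reshape(1, -1) for a single sample."
            )
        warnings.warn(
            "1D input was treated as a single feature column.", UserWarning
        )
        X = X.reshape(-1, 1)
    return X
```

The strict branch keeps the type-checking argument (ambiguous input is refused), while the lax branch shows what "cast to 2D and print a warning" would look like; which default is better is exactly what the thread is debating.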
From looking at the code for check_array(), it looks like the intention is, by default, to cast the input. If |
Actually |
Fixed by #5152. |
I was recently trying to do just a very simple linear regression on x vs. y -- I got this error message:
I wasted 3 minutes trying to figure out if my x and y vectors didn't have the same lengths (they did), whether the problem was that they were lists, not numpy arrays (they were). Then I found the culprit, through stackoverflow:
http://stackoverflow.com/questions/27107057/sklearn-linear-regression-python
The trick is to replace:
With:
I know this is only a small inconvenience for me, but I think it needs to be addressed for the sake of usability. As it stands, it creates an unnecessary barrier to entry for new users.
A related thought/comment: Do the inputs have to be numpy arrays? Why not try to salvage the input if the user passes lists?
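As a minimal sketch of the reshaping workaround mentioned above, assuming a single feature and made-up sample values (the variable names here are illustrative, not taken from the original report):

```python
import numpy as np

# Hypothetical sample data, given as plain Python lists.
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]

x_1d = np.asarray(x)         # one axis only: shape (4,)
X_2d = x_1d[:, np.newaxis]   # 4 samples, 1 feature: shape (4, 1)
X_alt = x_1d.reshape(-1, 1)  # equivalent spelling of the same reshape

print(x_1d.shape, X_2d.shape)  # (4,) (4, 1)
```

The 2D form, with shape `(n_samples, n_features)`, is the layout scikit-learn estimators expect for `X` in `fit(X, y)`; `y` can stay 1D.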