OneHotEncoder throws unhelpful error messages when transform called prior to fit · Issue #12395 · scikit-learn/scikit-learn · GitHub



Closed
dillongardner opened this issue Oct 16, 2018 · 7 comments · Fixed by #12443

Comments

@dillongardner
Contributor

Description

OneHotEncoder throws an AttributeError instead of a NotFittedError when transform is called prior to fit

  • if transform is called before the encoder has been fit, an AttributeError is thrown
  • if categories includes arrays of unicode type

Steps/Code to Reproduce

import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Fixed, pre-sorted category list that is passed to the constructor
categories = sorted(['Dillon', 'Joel', 'Earl', 'Liz'])
X = np.array(['Dillon', 'Dillon', 'Joel', 'Liz', 'Liz', 'Earl']).reshape(-1, 1)

ohe = OneHotEncoder(categories=[categories])
# transform is called without a prior call to fit
ohe.transform(X)
# Throws AttributeError: 'OneHotEncoder' object has no attribute '_legacy_mode'

Expected Results

NotFittedError: This OneHotEncoder instance is not fitted yet. Call 'fit' with appropriate arguments before using this method.

Actual Results

Throws AttributeError: 'OneHotEncoder' object has no attribute '_legacy_mode'
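
The expected behaviour amounts to checking the fitted state at the top of transform, before any other attribute is touched. Below is a minimal, dependency-free sketch of that guard pattern. `ToyOneHotEncoder` and its `_check_is_fitted` helper are hypothetical names used for illustration only, and the local `NotFittedError` is a stand-in for `sklearn.exceptions.NotFittedError`; none of this is scikit-learn's actual implementation or the fix that was merged.

```python
class NotFittedError(ValueError, AttributeError):
    """Local stand-in for sklearn.exceptions.NotFittedError."""


class ToyOneHotEncoder:
    """Hypothetical, simplified encoder for a single column (illustration only)."""

    def __init__(self, categories=None):
        self.categories = categories

    def fit(self, X):
        # Learn the categories from the data unless they were given explicitly.
        if self.categories is None:
            self.categories_ = sorted({row[0] for row in X})
        else:
            self.categories_ = list(self.categories)
        return self

    def _check_is_fitted(self):
        # The trailing-underscore attribute only exists after fit has run.
        if not hasattr(self, "categories_"):
            raise NotFittedError(
                "This %s instance is not fitted yet. Call 'fit' with "
                "appropriate arguments before using this method."
                % type(self).__name__
            )

    def transform(self, X):
        # Guard first, so a missing attribute never surfaces as a bare
        # AttributeError from deeper inside the method.
        self._check_is_fitted()
        index = {c: i for i, c in enumerate(self.categories_)}
        encoded = []
        for row in X:
            vec = [0] * len(index)
            vec[index[row[0]]] = 1
            encoded.append(vec)
        return encoded
```

With this guard, calling `ToyOneHotEncoder(categories=[...]).transform(X)` before `fit` raises the descriptive error, even though categories were supplied to the constructor.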

Versions

System
------
    python: 3.6.3 (default, Oct  4 2017, 06:09:38)  [GCC 4.2.1 Compatible Apple LLVM 9.0.0 (clang-900.0.37)]
executable: /Users/dillon/Envs/mewtwo/bin/python3.6
   machine: Darwin-18.0.0-x86_64-i386-64bit

BLAS
----
    macros: NO_ATLAS_INFO=3, HAVE_CBLAS=None
  lib_dirs:
cblas_libs: cblas

Python deps
-----------
       pip: 18.1
setuptools: 39.0.1
   sklearn: 0.20.0
     numpy: 1.14.2
     scipy: 1.0.1
    Cython: None
    pandas: 0.22.0
@amueller
Member

Thanks for the report. Do you want to work on a fix?

@dillongardner
Contributor Author

Sure, I am happy to.

@amueller
Member

Thanks!

@jnothman added this to the 0.20.1 milestone Oct 17, 2018
@amueller
Member

@dillongardner have you started working on this? We want to release a bugfix version relatively soon.

@dillongardner
Contributor Author

A bit more nuanced than expected. Sorry for the delay.

@ogrisel
Member
ogrisel commented Nov 7, 2018

If a fixed list of categories is passed to the constructor, maybe we could also not raise any exception at all, as the estimator should be stateless in that case, no?
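
To make this suggestion concrete, here is a rough sketch of what a "stateless when categories are given" encoder could look like. `StatelessToyEncoder` and `_resolve_categories` are hypothetical names, the local `NotFittedError` is a stand-in for scikit-learn's exception, and the code only illustrates the idea in the comment above. It does not describe what scikit-learn actually does.

```python
class NotFittedError(ValueError, AttributeError):
    """Local stand-in for sklearn.exceptions.NotFittedError."""


class StatelessToyEncoder:
    """Hypothetical single-column encoder (illustration only)."""

    def __init__(self, categories=None):
        self.categories = categories

    def fit(self, X):
        # Only needed when the categories have to be learned from data.
        if self.categories is None:
            self.categories_ = sorted({row[0] for row in X})
        return self

    def _resolve_categories(self):
        # Assumption: constructor-supplied categories carry all the state
        # transform needs, so fit becomes optional in that case.
        if self.categories is not None:
            return list(self.categories)
        if hasattr(self, "categories_"):
            return self.categories_
        raise NotFittedError(
            "This %s instance is not fitted yet. Call 'fit' with "
            "appropriate arguments before using this method."
            % type(self).__name__
        )

    def transform(self, X):
        cats = self._resolve_categories()
        index = {c: i for i, c in enumerate(cats)}
        return [
            [1 if index[row[0]] == i else 0 for i in range(len(cats))]
            for row in X
        ]
```

In this variant an unfitted encoder raises only when categories were left unset; with explicit categories, transform works immediately.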

@jnothman
Member
jnothman commented Nov 7, 2018 via email

4 participants