SimpleImputer breaks using Pandas 1.0 with Int64 column · Issue #16531 · scikit-learn/scikit-learn · GitHub
Closed
eliasmistler opened this issue Feb 24, 2020 · 2 comments

Describe the bug

SimpleImputer fails on a pandas 1.0 DataFrame when a column has the nullable Int64 dtype and contains an NA value in the training data.

Code to Reproduce

import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer


def test_simple_imputer_with_Int64_column():
    index = pd.Index(['A', 'B', 'C'], name='group')
    df = pd.DataFrame({
        'att-1': [10, 20, np.nan],
        'att-2': [30, 40, 30]
    }, index=index)

    # This cast to the nullable Int64 dtype triggers the failure (np.nan becomes pd.NA).
    # Comment it out and the test passes.
    df = df.astype('Int64')

    imputer = SimpleImputer()  # default strategy: mean
    imputer.fit(df)
    imputed = imputer.transform(df)
    df_imputed = pd.DataFrame(imputed, columns=['att-1', 'att-2'], index=index)

    # Mean of the observed values 10 and 20
    assert df_imputed.loc['C', 'att-1'] == 15

Expected Results

Correct value is imputed
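
For reference, the expected value can be reproduced by casting the frame to float64 before imputing. This is only a sketch of a possible workaround, not the fix tracked upstream; it assumes the cast turns pd.NA into np.nan, which SimpleImputer treats as missing by default. The name `df_float` is introduced here for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

index = pd.Index(['A', 'B', 'C'], name='group')
df = pd.DataFrame({
    'att-1': [10, 20, np.nan],
    'att-2': [30, 40, 30],
}, index=index).astype('Int64')

# Assumption: casting to float64 replaces pd.NA with np.nan,
# the default missing-value marker for SimpleImputer.
df_float = df.astype('float64')

imputer = SimpleImputer()  # default strategy is 'mean'
imputed = imputer.fit_transform(df_float)
df_imputed = pd.DataFrame(imputed, columns=df.columns, index=index)

# 'att-1' has observed values 10 and 20, so the imputed mean is 15.0
print(df_imputed.loc['C', 'att-1'])
```

The cast gives up the integer dtype, which may or may not matter downstream.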

Actual Results

Exception raised:

TypeError: float() argument must be a string or a number, not 'NAType'
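
The message can be reproduced without scikit-learn. The sketch below assumes, without having checked the scikit-learn source, that input validation ends up converting an object array holding pd.NA to float; `conversion_error` is a hypothetical helper written for this illustration.

```python
import numpy as np
import pandas as pd

# A nullable integer column with one missing entry, as in 'att-1'.
col = pd.array([10, 20, None], dtype='Int64')

# Request an object array explicitly; the missing entry stays pd.NA.
as_objects = np.asarray(col, dtype=object)


def conversion_error(values):
    """Return the TypeError message raised by a float cast, or None if it succeeds."""
    try:
        values.astype('float64')
    except TypeError as exc:
        return str(exc)
    return None


message = conversion_error(as_objects)
print(message)
```

pd.NA does not define a float conversion, so the cast fails on the missing element rather than mapping it to np.nan.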

Versions

System:
    python: 3.7.4 (default, Aug 13 2019, 15:17:50)  [Clang 4.0.1 (tags/RELEASE_401/final)]
executable: <path-to-my-project>/.venv/bin/python
   machine: Darwin-19.3.0-x86_64-i386-64bit
Python dependencies:
       pip: 19.3.1
setuptools: 42.0.2
   sklearn: 0.22.1
     numpy: 1.18.1
     scipy: 1.4.1
    Cython: None
    pandas: 1.0.1
matplotlib: None
    joblib: 0.14.1
Built with OpenMP: True
@glemaitre
Member

related to #16498

@rth
Member
rth commented Feb 24, 2020

Thanks for the report @eliasmistler .

Closing as a duplicate of #16498; please comment there if you have anything to add. A fix is being worked on.

@rth rth closed this as completed Feb 24, 2020