SimpleImputer fails on a pandas DataFrame that uses the new nullable integer dtype, specifically when a column is of type Int64 and contains an NA value in the training data.
Code to Reproduce
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer


def test_simple_imputer_with_Int64_column():
    index = pd.Index(['A', 'B', 'C'], name='group')
    df = pd.DataFrame({
        'att-1': [10, 20, np.nan],
        'att-2': [30, 40, 30]
    }, index=index)
    # TODO: This line breaks the test! Comment it out and it works.
    df = df.astype('Int64')
    imputer = SimpleImputer()
    imputer.fit(df)
    imputed = imputer.transform(df)
    df_imputed = pd.DataFrame(imputed, columns=['att-1', 'att-2'], index=index)
    assert df_imputed.loc['C', 'att-1'] == 15
Expected Results
The correct value is imputed: 15, the mean of 10 and 20 under the default strategy.
Actual Results
Exception raised:
TypeError: float() argument must be a string or a number, not 'NAType'
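A possible workaround in the meantime, sketched under the assumption that casting the nullable columns to float64 is acceptable for the data: the cast turns pd.NA into np.nan, which SimpleImputer treats as missing by default. The name df_float is only illustrative, and this does not fix the underlying handling of Int64 in scikit-learn.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

index = pd.Index(['A', 'B', 'C'], name='group')
df = pd.DataFrame({
    'att-1': [10, 20, np.nan],
    'att-2': [30, 40, 30],
}, index=index).astype('Int64')

# Assumed workaround: cast the nullable Int64 columns to float64 so that
# pd.NA becomes np.nan before the data reaches the imputer.
df_float = df.astype('float64')

imputer = SimpleImputer()  # default strategy is the column mean
imputed = imputer.fit_transform(df_float)
df_imputed = pd.DataFrame(imputed, columns=df.columns, index=index)
```

The result holds floats rather than nullable integers, so a cast back to Int64 would be a separate step if the original dtype matters.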
Versions