Closed
Description
check_array in validation doesn't correctly convert the array when:
- dtype is a list or a tuple, and
- all dataframe column types are valid numpy dtypes, and the numpy result type of all the column types is in the dtype list/tuple
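The second condition can be checked directly with NumPy. The following is a minimal sketch, assuming column dtypes of int64, float64 and bool (matching the reproduction below); the names `column_dtypes`, `common` and `in_accepted` are illustrative and do not come from scikit-learn.

```python
import numpy as np

# Column dtypes mirroring the reproduction: int64, float64 and bool.
column_dtypes = [np.dtype("int64"), np.dtype("float64"), np.dtype("bool")]

# The accepted dtypes that would be passed to check_array as a tuple.
FLOAT_DTYPES = (np.float64, np.float32, np.float16)

# NumPy's promotion rules give one common dtype for all the columns.
common = np.result_type(*column_dtypes)

# Membership test: is the common dtype one of the accepted dtypes?
in_accepted = any(common == np.dtype(d) for d in FLOAT_DTYPES)

print(common, in_accepted)  # float64 True
```

Here the promoted dtype is float64, which is in the tuple, so both conditions of the report hold for this dataframe.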
Steps/Code to Reproduce
import numpy as np
import pandas as pd
from sklearn.utils import check_array
example = pd.DataFrame({'id': range(1, 5)})
example['float'] = [0, 0.1, 2.0, 3.1]
example['label'] = [True, False, True, False]
FLOAT_DTYPES = (np.float64, np.float32, np.float16)
a = check_array(example, dtype=FLOAT_DTYPES)
Expected Results
>>> a.dtype
dtype('float64')
Actual Results
>>> a.dtype
dtype('O')
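The object dtype can be reproduced without scikit-learn, which suggests (as an assumption, not something stated in this report) that the dataframe is converted without an explicit target dtype. The sketch below also shows one possible workaround, again not taken from the report: cast the dataframe to float64 explicitly before validation.

```python
import numpy as np
import pandas as pd

# Same dataframe as in the reproduction: int, float and bool columns.
example = pd.DataFrame({"id": range(1, 5)})
example["float"] = [0, 0.1, 2.0, 3.1]
example["label"] = [True, False, True, False]

# Converting the mixed frame with no target dtype: pandas falls back
# to object because a bool column is mixed with numeric columns.
raw = np.asarray(example)

# Possible workaround (an assumption, not the library's fix):
# cast every column to float64 first, then convert.
converted = example.astype(np.float64).to_numpy()

print(raw.dtype, converted.dtype)  # object float64
```

After the explicit cast, the boolean column is stored as 1.0 and 0.0, so the resulting array is a plain float64 matrix.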
Versions
System:
python: 3.7.4 (default, Oct 7 2019, 17:26:17) [Clang 10.0.1 (clang-1001.0.46.4)]
machine: Darwin-18.7.0-x86_64-i386-64bit
Python dependencies:
sklearn: 0.22
numpy: 1.17.4
scipy: 1.3.3
pandas: 0.25.3