FIX raise a TypeError when the values type is not supported in DictVe… · rth/scikit-learn@64d5448

Commit 64d5448

kamiyaa, glemaitre, and jjerphan authored
FIX raise a TypeError when the values type is not supported in DictVectorizer (scikit-learn#19520)
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Co-authored-by: Julien Jerphanion <git@jjerphan.xyz>
1 parent 0b45ac5 commit 64d5448

3 files changed: 30 additions and 8 deletions


doc/whats_new/v1.0.rst

Lines changed: 5 additions & 1 deletion
@@ -340,10 +340,14 @@ Changelog
 :mod:`sklearn.feature_extraction`
 .................................

-- |Fix| Fixed a bug in class:`feature_extraction.HashingVectorizer` where some
+- |Fix| Fixed a bug in :class:`feature_extraction.HashingVectorizer` where some
   input strings would result in negative indices in the transformed data.
   :pr:`19035` by :user:`Liu Yu <ly648499246>`.

+- |Fix| Fixed a bug in :class:`feature_extraction.DictVectorizer` by raising an
+  error with unsupported value type.
+  :pr:`19520` by :user:`Jeff Zhao <kamiyaa>`.
+
 :mod:`sklearn.feature_selection`
 ................................

sklearn/feature_extraction/_dict_vectorizer.py

Lines changed: 7 additions & 7 deletions
@@ -226,13 +226,7 @@ def _transform(self, X, fitting):
                     v = 1
                 elif isinstance(v, Number) or (v is None):
                     feature_name = f
-                elif isinstance(v, Mapping):
-                    raise TypeError(
-                        f"Unsupported value Type {type(v)} "
-                        f"for {f}: {v}.\n"
-                        "Mapping objects are not supported."
-                    )
-                elif isinstance(v, Iterable):
+                elif not isinstance(v, Mapping) and isinstance(v, Iterable):
                     feature_name = None
                     self._add_iterable_element(
                         f,
@@ -244,6 +238,12 @@ def _transform(self, X, fitting):
                         indices=indices,
                         values=values,
                     )
+                else:
+                    raise TypeError(
+                        f"Unsupported value Type {type(v)} "
+                        f"for {f}: {v}.\n"
+                        f"{type(v)} objects are not supported."
+                    )

                 if feature_name is not None:
                     if fitting and feature_name not in vocab:
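
The effect of the reordered branches can be sketched outside scikit-learn. The snippet below is a minimal, standard-library-only illustration and not the library's code: `classify_value` is a hypothetical helper that mirrors the patched `if`/`elif` chain and returns a label instead of building features. It assumes `Number`, `Mapping`, and `Iterable` come from `numbers` and `collections.abc`, which this diff does not show.

```python
from collections.abc import Iterable, Mapping
from numbers import Number


def classify_value(f, v):
    """Return a label for the branch that value ``v`` of feature ``f`` takes.

    Hypothetical helper for illustration only; it is not part of scikit-learn.
    """
    if isinstance(v, str):
        return "string"
    elif isinstance(v, Number) or (v is None):
        return "number"
    elif not isinstance(v, Mapping) and isinstance(v, Iterable):
        return "iterable"
    else:
        # Mappings and arbitrary objects both end up here after the patch.
        raise TypeError(
            f"Unsupported value Type {type(v)} "
            f"for {f}: {v}.\n"
            f"{type(v)} objects are not supported."
        )


class A:
    pass


# Supported values each take one of the first three branches.
branches = [classify_value("foo", v) for v in ("bar", 1.5, None, ["a", "b"])]

# A nested dict and an arbitrary object both raise TypeError.
errors = []
for bad in ({"nested": 1}, A()):
    try:
        classify_value("foo", bad)
    except TypeError as exc:
        errors.append(str(exc))
```

In this sketch `branches` is `["string", "number", "number", "iterable"]` and `errors` holds two messages that start with `Unsupported value Type`. In the removed code only `Mapping` values raised; moving the `raise` into a trailing `else` makes every value that is not a string, number, `None`, or non-mapping iterable fail the same way.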

sklearn/feature_extraction/tests/test_dict_vectorizer.py

Lines changed: 18 additions & 0 deletions
@@ -209,3 +209,21 @@ def test_dictvectorizer_dense_sparse_equivalence():
     expected_inverse = [{"category=thriller": 1.0}]
     assert dense_inverse_transform == expected_inverse
     assert sparse_inverse_transform == expected_inverse
+
+
+def test_dict_vectorizer_unsupported_value_type():
+    """Check that we raise an error when the value associated to a feature
+    is not supported.
+
+    Non-regression test for:
+    https://github.com/scikit-learn/scikit-learn/issues/19489
+    """
+
+    class A:
+        pass
+
+    vectorizer = DictVectorizer(sparse=True)
+    X = [{"foo": A()}]
+    err_msg = "Unsupported value Type"
+    with pytest.raises(TypeError, match=err_msg):
+        vectorizer.fit_transform(X)
