Target Encoder outputs nan in Pipeline (or SimpleImputer+TargetEncoder) · Issue #272 · scikit-learn-contrib/category_encoders · GitHub
@datacubeR

Description

I have been modelling using the ames_housing dataset with the code attached in the following zip file.

rep_example.zip

The weird thing is that, in a dataset with no nulls, chaining a SimpleImputer with a TargetEncoder() makes several null values appear.

I'm not sure if I'm doing something wrong, but when SimpleImputer is given data with no null values, nothing should change. In fact, I ran each step separately and SimpleImputer does not output any null values. But once its output (a NumPy array) goes into TargetEncoder(), the result contains more than 2000 nulls.

Why is that?

Expected Behavior

If no nulls are provided, no nulls should be output. See the attached notebook, for example the case where TargetEncoder is run on its own.

Actual Behavior

With SimpleImputer followed by TargetEncoder, the output contains more than 2000 null values.

Steps to Reproduce the Problem

Refer to the attached notebook (rep_example.zip) for the example code.
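The notebook itself is not reproduced here. As a rough stand-in, the sketch below checks two things on toy data: that SimpleImputer returns a plain NumPy array with no nulls, and that the original DataFrame index is lost in the process. The column name, values, and index labels are hypothetical and are not taken from ames_housing. The last step only illustrates how pandas aligns rows by index label; whether TargetEncoder 2.2.2 performs such an alignment internally is an assumption, not something this report confirms.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Toy data (hypothetical). The non-default index mimics what is left
# behind after rows have been split or filtered out of a larger frame.
X = pd.DataFrame({"neighborhood": ["a", "b", "a", "b"]},
                 index=[10, 11, 12, 13])
y = pd.Series([1.0, 2.0, 3.0, 4.0], index=X.index, name="price")

# Step 1: SimpleImputer alone. The input has no nulls, so none should appear.
imputer = SimpleImputer(strategy="most_frequent")
X_imputed = imputer.fit_transform(X)
nulls_after_imputer = int(pd.isnull(X_imputed).sum())
print(type(X_imputed).__name__, "nulls:", nulls_after_imputer)

# Step 2: the array carries no index, so a DataFrame rebuilt from it
# gets a fresh 0..n-1 index that no longer matches y's labels.
X_rebuilt = pd.DataFrame(X_imputed, columns=X.columns)
print("X index:", list(X_rebuilt.index), "y index:", list(y.index))

# Step 3 (assumption): any operation that aligns X and y by index label
# finds no common labels and fills the gaps with NaN.
aligned = pd.concat([X_rebuilt, y], axis=1)
nulls_after_alignment = int(aligned.isnull().sum().sum())
print("nulls after index alignment:", nulls_after_alignment)
```

If index misalignment does turn out to be involved, one thing to try is passing the target with a matching index (for example `y.reset_index(drop=True)`) or as a NumPy array (`y.to_numpy()`), and checking whether the nulls disappear.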

Specifications

  • Version: 2.2.2
  • Platform: Windows 10
  • Subsystem: Python 3.7.7

Thanks Guys,

Alfonso
