Handle missing values in OrdinalEncoder #11997
Comments
Hi @jnothman, do you mind if I work on this? |
Go for it |
I suppose we might also consider a |
Hi. |
I'm currently working on this issue, but I think the best way to start is to review the contributing guidelines. And when you see an issue no one is working on, ask the member who submitted the issue if you can get started. (I'm fairly new to this project myself.) |
I wish the help wanted tag would disappear once a contributor adopts an issue... |
@jnothman I am new to this project and would like to contribute. Can I start by working on this issue? |
@maxcopeland is working on this issue. It's just taking some time to finish because it's not so easy :) |
A suggestion: assign 0 only to missing values, and start encoding from 1 (not from 0, as is done now), even when there are no missing values in the data set. Such a normalization could make it easier to identify the formerly missing values later (in order to handle them). |
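For illustration, the scheme suggested above could look roughly like the sketch below. It is plain Python, not scikit-learn code; `encode_reserving_zero` is a hypothetical helper, and treating both `None` and NaN as missing is an assumption made for the example.

```python
import math


def encode_reserving_zero(column):
    """Encode categories as 1..K and reserve 0 for missing entries.

    Hypothetical helper: treats None and float NaN as missing.
    """

    def missing(value):
        return value is None or (isinstance(value, float) and math.isnan(value))

    # Categories are learned from non-missing values only.
    categories = sorted({v for v in column if not missing(v)})
    # Codes start at 1 so that 0 is always free for missing values.
    codes = {cat: i for i, cat in enumerate(categories, start=1)}
    return [0 if missing(v) else codes[v] for v in column], categories


codes, cats = encode_reserving_zero(["b", None, "a", float("nan"), "b"])
print(codes)  # [2, 0, 1, 0, 2]
print(cats)   # ['a', 'b']
```

Because 0 is reserved even when no value is missing, the same code always means "missing" across data sets.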
EDIT: Since we will encode the missing values as a category, we will not need |
I am closing this PR because this feature was added in #21988, which added |
A minimal implementation would pass through NaNs from the input to the output of `transform` and make sure the presence of NaN does not affect the categories identified in `fit`.

A `missing_values` parameter might allow the user to configure what object is a placeholder for missingness (e.g. NaN, None, etc.).

See #10465 for background.
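The behaviour described above could be sketched as follows. This is a plain-Python illustration under stated assumptions, not the scikit-learn implementation: `is_missing`, `fit_categories` and `transform` are hypothetical helpers, and only the `missing_values` name comes from the proposal.

```python
import math


def is_missing(value, missing_values=float("nan")):
    """Return True if `value` matches the configured missing-value placeholder."""
    if isinstance(missing_values, float) and math.isnan(missing_values):
        # NaN never compares equal to itself, so it needs an explicit check.
        return isinstance(value, float) and math.isnan(value)
    return value is missing_values or value == missing_values


def fit_categories(column, missing_values=float("nan")):
    """Learn the sorted categories, ignoring missing entries."""
    return sorted({v for v in column if not is_missing(v, missing_values)})


def transform(column, categories, missing_values=float("nan")):
    """Map values to ordinal codes; missing entries pass through as NaN."""
    codes = {cat: float(i) for i, cat in enumerate(categories)}
    return [
        float("nan") if is_missing(v, missing_values) else codes[v]
        for v in column
    ]


column = ["low", float("nan"), "high", "low", "medium"]
categories = fit_categories(column)
encoded = transform(column, categories)
print(categories)  # ['high', 'low', 'medium']
print(encoded)     # [1.0, nan, 0.0, 1.0, 2.0]
```

Passing `missing_values=None` to both helpers treats `None` as the placeholder instead of NaN.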