UnaryEncoder #8628
Comments
So the major difference is that instead of a single bit being active, all bits up to i are active? I've certainly used this before, I did not know it was called a unary encoder! |
I'd be happy with a more common name if you have one. I recall it being
called unary in some scikit-learn forum. Either way, I think it would be
useful to have around alongside OneHotEncoder, because it is a more
powerful representation of ordinal data and requires a little
sophistication to implement elegantly
|
Yes, I agree that it would be nice to have included. I just colloquially referred to it as a cumulative bit encoder, but that seems long. |
CumulativeEncoder gives some sense of it, except that it's not actually
cumulative :)
…On 23 March 2017 at 09:56, Jacob Schreiber ***@***.***> wrote:
Yes, I agree that it would be nice to have included. I just colloquially
referred to it as a cumulative bit encoder, but that seems long.
|
How about StepEncoder, since it sounds very much like the unit step function? Also, if no one is already working on it, I would like to implement it. |
Go ahead. We can figure out a name later. |
Fwiw, I don't see the relevance of the unit step function, and don't think
StepEncoder is an intuitive name
|
OrdinalEncoder might be okay
…On 23 Mar 2017 6:14 pm, "Joel Nothman" ***@***.***> wrote:
Fwiw, I don't see the relevance of the unit step function, and don't think
StepEncoder is an intuitive name
On 23 Mar 2017 4:28 pm, "Jacob Schreiber" ***@***.***>
wrote:
Go ahead. We can figure out a name later.
|
Going ahead with OrdinalEncoder for now. |
Maybe "LeftHotEncoder" and "RightHotEncoder", if the latter scenario makes sense? Has this been taken up, by the way? |
Apologies, I guess it has been taken up and is a WIP. I didn't notice the PR reference; I'm on a mobile device. |
This has an active PR at #8652 |
More recent work in #12893. |
It's not very clear how to implement a unary encoder (also called Thermometer Encoder or RankHot Encoder) as a categorical encoder, since it is designed for ordered values, not arbitrary categories. But it should be OK to add it as an option to KBinsDiscretizer, shouldn't it? |
I'm sure we've discussed this before, but I'm not sure where, and there certainly does not appear to be an active PR. For ordinal (and discretized; see #7668) features, a "unary" encoding (is there a better name for this?) is more informative than a one-hot encoding. For k values 0, ..., k - 1 of the ordinal feature x, this creates k - 1 binary features such that the i-th is active if x > i (for i = 0, ..., k - 2). Below is an initial implementation.
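The encoding described above can be sketched in plain Python as follows. This is only an illustrative sketch under stated assumptions, not the implementation originally posted in this issue and not a scikit-learn API: the helper names `unary_encode` and `one_hot_encode` and the `n_values` parameter are hypothetical, and the input is assumed to already be integer-coded as 0, ..., k - 1.

```python
def unary_encode(values, n_values):
    """Encode ordinal values in 0..n_values-1 as n_values-1 binary features.

    Feature i is 1 when value > i, so a value v produces v leading ones.
    (Hypothetical helper for illustration only.)
    """
    encoded = []
    for v in values:
        if not 0 <= v < n_values:
            raise ValueError(f"value {v} is outside 0..{n_values - 1}")
        encoded.append([1 if v > i else 0 for i in range(n_values - 1)])
    return encoded


def one_hot_encode(values, n_values):
    """One-hot encoding of the same input, shown only for comparison."""
    return [[1 if v == i else 0 for i in range(n_values)] for v in values]


x = [0, 1, 2, 3]  # an ordinal feature with k = 4 levels
print(unary_encode(x, 4))
# [[0, 0, 0], [1, 0, 0], [1, 1, 0], [1, 1, 1]]
print(one_hot_encode(x, 4))
# [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
```

In the unary rows, neighbouring levels differ in exactly one bit and the number of active bits grows with the level, which is the ordering information a one-hot row does not carry.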