Add keyword parameter to scoring functions to support different types of data

@thomasjpfan

Describe the workflow you want to enable

Currently, SelectKBest has no built-in scoring function designed for ordinal data.

Describe your proposed solution

I would like to add a keyword parameter to the current f_regression (or f_classif) that specifies the type of the input data. For example, if X is ordinal and y is continuous, we could call f_regression(X, y, input_type="ordinal"). The function would then compute Spearman's rank correlation coefficient (instead of the Pearson correlation that f_regression currently uses) and return the scores and p-values.

The steps to add support for ordinal data are:

1. Write a wrapper for scipy.stats.spearmanr, or write our own function that computes Spearman's rank correlation.
2. Integrate that wrapper into f_regression and add the keyword parameter input_type.
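The two steps above could look roughly like the sketch below. It is only an illustration, not an existing scikit-learn API: the names `spearman_scores` and `f_regression_with_input_type` are hypothetical, and using the absolute value of Spearman's rho as the score (so that higher always means a stronger monotonic relationship) is my own assumption.

```python
import numpy as np
from scipy.stats import spearmanr
from sklearn.feature_selection import f_regression


def spearman_scores(X, y):
    """Hypothetical wrapper: per-feature Spearman correlation with y.

    Returns (scores, pvalues), the same pair shape that f_regression returns.
    Assumption: the score is |rho|, so larger means a stronger association.
    """
    X = np.asarray(X)
    y = np.asarray(y)
    n_features = X.shape[1]
    scores = np.empty(n_features)
    pvalues = np.empty(n_features)
    for j in range(n_features):
        # spearmanr on two 1-D arrays returns (correlation, p-value).
        rho, pvalue = spearmanr(X[:, j], y)
        scores[j] = abs(rho)
        pvalues[j] = pvalue
    return scores, pvalues


def f_regression_with_input_type(X, y, input_type="continuous"):
    """Hypothetical stand-in for f_regression with an input_type keyword."""
    if input_type == "continuous":
        # Existing behaviour: F-statistic derived from Pearson's correlation.
        return f_regression(X, y)
    if input_type == "ordinal":
        return spearman_scores(X, y)
    raise ValueError(f"Unsupported input_type: {input_type!r}")
```

Calling `f_regression_with_input_type(X, y, input_type="ordinal")` would then return one score and one p-value per column of X. Note that in this sketch the "ordinal" scores are correlations while the "continuous" scores are F-statistics, so the two are not on the same scale.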

Now, I am not sure how to score one-hot encoded data yet, but hopefully by adding the keyword parameter, we can gradually expand the types of input data sklearn's scoring functions can support.

Describe alternatives you've considered, if relevant

Alternatively, we can also write a new function f_regression_ordinal to deal with ordinal X and continuous y.
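As a rough sketch of this alternative, a standalone `f_regression_ordinal` could rank-transform the data and reuse the existing f_regression, since Spearman's correlation is Pearson's correlation computed on ranks. The function below is hypothetical and does not exist in scikit-learn; ranking y as well as X is an assumption taken from the definition of Spearman's coefficient.

```python
import numpy as np
from scipy.stats import rankdata
from sklearn.feature_selection import SelectKBest, f_regression


def f_regression_ordinal(X, y):
    """Hypothetical score function for ordinal X and continuous y.

    Ranks each column of X and y (ties get their average rank), then
    applies f_regression, so the F-statistic is based on Spearman's rho.
    """
    X_ranked = rankdata(np.asarray(X), axis=0)
    y_ranked = rankdata(np.asarray(y))
    return f_regression(X_ranked, y_ranked)


# Toy data: column 0 is monotonically related to y, column 1 is not.
X = np.array([[1, 3], [2, 1], [3, 2], [4, 3], [5, 1], [6, 2]])
y = np.array([1.0, 3.0, 2.0, 8.0, 20.0, 50.0])

# SelectKBest accepts any callable that returns (scores, pvalues).
selector = SelectKBest(score_func=f_regression_ordinal, k=1)
X_selected = selector.fit_transform(X, y)
```

Because SelectKBest already accepts a custom `score_func`, this variant needs no change to the signature of f_regression, at the cost of one extra public function per data type.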

Additional context

This feature request partially addresses #8480. There have also been discussions of the wrapper method, but no consensus has been reached: #6673, #8038.

This feature request was submitted per suggestions by @thomasjpfan and discussion with @yashika51 and @flosincapite.
