ENH Improve efficiency of chi2 by converting the input to float first… · scikit-learn/scikit-learn@e1e8c66 · GitHub

Commit e1e8c66

ENH Improve efficiency of chi2 by converting the input to float first (#22235)
* ENH Improve efficiency of chi2 by converting the input to float first
* CLN Address comments
* Update sklearn/feature_selection/_univariate_selection.py

Co-authored-by: Julien Jerphanion <git@jjerphan.xyz>
1 parent 330881a commit e1e8c66

2 files changed: +9 -1 lines changed

doc/whats_new/v1.1.rst

Lines changed: 6 additions & 0 deletions
@@ -140,6 +140,12 @@ Changelog
   `'auto'` in 1.3. `None` and `'warn'` will be removed in 1.3. :pr:`20145` by
   :user:`murata-yu`.
 
+:mod:`sklearn.feature_selection`
+................................
+
+- |Efficiency| Improve runtime performance of :func:`feature_selection.chi2`
+  with boolean arrays. :pr:`22235` by `Thomas Fan`_.
+
 :mod:`sklearn.datasets`
 .......................

sklearn/feature_selection/_univariate_selection.py

Lines changed: 3 additions & 1 deletion
@@ -210,7 +210,9 @@ def chi2(X, y):
 
     # XXX: we might want to do some of the following in logspace instead for
     # numerical stability.
-    X = check_array(X, accept_sparse="csr")
+    # Converting X to float allows getting better performance for the
+    # safe_sparse_dot call made below.
+    X = check_array(X, accept_sparse="csr", dtype=(np.float64, np.float32))
     if np.any((X.data if issparse(X) else X) < 0):
         raise ValueError("Input X must be non-negative.")
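
The effect of the new `dtype=(np.float64, np.float32)` argument can be checked with a minimal sketch like the one below. It is not part of the commit; the random data, the seed, and the variable names are assumptions made for illustration. It relies on the documented `check_array` behaviour that, when `dtype` is a list or tuple, an input whose dtype is already in the list is left alone and any other dtype is converted to the first entry.

```python
import numpy as np
from sklearn.feature_selection import chi2
from sklearn.utils import check_array

# Illustrative data only (not from the commit): a boolean feature matrix
# and a binary target.
rng = np.random.RandomState(0)
X_bool = rng.rand(200, 5) > 0.5
y = rng.randint(0, 2, size=200)

accepted = (np.float64, np.float32)

# bool is not in the accepted tuple, so it is converted to the first entry.
X_conv = check_array(X_bool, accept_sparse="csr", dtype=accepted)

# float32 is in the accepted tuple, so it is kept without conversion.
X_f32 = check_array(X_bool.astype(np.float32), accept_sparse="csr", dtype=accepted)

print(X_bool.dtype, "->", X_conv.dtype)
print("float32 ->", X_f32.dtype)

# The scores should not depend on whether the caller passes bool or float input.
scores_bool, pvalues_bool = chi2(X_bool, y)
scores_float, pvalues_float = chi2(X_bool.astype(np.float64), y)
print(np.allclose(scores_bool, scores_float))
```

This sketch only shows the dtype handling and that the scores agree; it does not measure the runtime difference that the changelog entry describes.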
