[MRG+2] addresses #8509 improvements to f_regression documentation (#… · scikit-learn/scikit-learn@5210f81

Commit 5210f81

brownsarahm authored and jnothman committed
[MRG+2] addresses #8509 improvements to f_regression documentation (#8548)
* clarify role of the function and streamline introduction
* added feature selection methods to see also
* completed see also
* fixed pep related formatting for flake8checks.
* fixed extra whitespace flake8 problems, remaining failure is a copied see all line from another function, the line is over by a period, does not make sense to newline that.
* one more whitespace
* FIX small pep8 error.
1 parent 02c705e commit 5210f81

1 file changed: +13 -4 lines changed
sklearn/feature_selection/univariate_selection.py

Lines changed: 13 additions & 4 deletions
--- a/sklearn/feature_selection/univariate_selection.py
+++ b/sklearn/feature_selection/univariate_selection.py
@@ -230,17 +230,18 @@ def chi2(X, y):
 def f_regression(X, y, center=True):
     """Univariate linear regression tests.
 
-    Quick linear model for testing the effect of a single regressor,
-    sequentially for many regressors.
+    Linear model for testing the individual effect of each of many regressors.
+    This is a scoring function to be used in a feature seletion procedure, not
+    a free standing feature selection procedure.
 
     This is done in 2 steps:
 
-    1. The cross correlation between each regressor and the target is computed,
+    1. The correlation between each regressor and the target is computed,
        that is, ((X[:, i] - mean(X[:, i])) * (y - mean_y)) / (std(X[:, i]) *
        std(y)).
     2. It is converted to an F score then to a p-value.
 
-    Read more in the :ref:`User Guide <univariate_feature_selection>`.
+    For more on usage see the :ref:`User Guide <univariate_feature_selection>`.
 
     Parameters
     ----------
@@ -261,10 +262,18 @@ def f_regression(X, y, center=True):
     pval : array, shape=(n_features,)
         p-values of F-scores.
 
+
     See also
     --------
+    mutual_info_regression: Mutual information for a continuous target.
     f_classif: ANOVA F-value between label/feature for classification tasks.
     chi2: Chi-squared stats of non-negative features for classification tasks.
+    SelectKBest: Select features based on the k highest scores.
+    SelectFpr: Select features based on a false positive rate test.
+    SelectFdr: Select features based on an estimated false discovery rate.
+    SelectFwe: Select features based on family-wise error rate.
+    SelectPercentile: Select features based on percentile of the highest
+        scores.
     """
     X, y = check_X_y(X, y, ['csr', 'csc', 'coo'], dtype=np.float64)
     n_samples = X.shape[0]
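
The two steps described in the docstring can be sketched by hand and compared against the library function. This is a minimal illustration, not the library's implementation: the synthetic data, the variable names, and the conversion F = r^2 / (1 - r^2) * (n - 2) with (1, n - 2) degrees of freedom (the usual form for centered data) are assumptions made for the example.

```python
import numpy as np
from scipy import stats
from sklearn.feature_selection import f_regression

# Synthetic data (assumed for illustration): only feature 0 drives y.
rng = np.random.RandomState(0)
n_samples, n_features = 100, 3
X = rng.normal(size=(n_samples, n_features))
y = 2.0 * X[:, 0] + rng.normal(size=n_samples)

# Step 1: correlation between each regressor and the target.
Xc = X - X.mean(axis=0)
yc = y - y.mean()
corr = (Xc * yc[:, None]).sum(axis=0) / (
    np.sqrt((Xc ** 2).sum(axis=0)) * np.sqrt((yc ** 2).sum())
)

# Step 2: convert each correlation to an F score, then to a p-value.
# Assumption: n - 2 residual degrees of freedom because the data are centered.
dof = n_samples - 2
F_manual = corr ** 2 / (1.0 - corr ** 2) * dof
p_manual = stats.f.sf(F_manual, 1, dof)

# Compare with the library function using its default centering.
F_sklearn, p_sklearn = f_regression(X, y)
print(np.allclose(F_manual, F_sklearn), np.allclose(p_manual, p_sklearn))
```

Each feature is scored on its own here, which is why the docstring calls this a univariate test.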

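The reworded docstring stresses that f_regression is a scoring function rather than a free-standing selection procedure. A short sketch of that division of labour, pairing it with SelectKBest from the new "See also" list, might look like the following; the synthetic data and the choice of k=2 are assumptions made for the example.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_regression

# Synthetic data (assumed for illustration): y depends on features 1 and 4 only.
rng = np.random.RandomState(0)
X = rng.normal(size=(200, 5))
y = 3.0 * X[:, 1] - 2.0 * X[:, 4] + 0.1 * rng.normal(size=200)

# f_regression only scores features; SelectKBest does the actual selection.
selector = SelectKBest(score_func=f_regression, k=2)
X_selected = selector.fit_transform(X, y)

# Indices of the retained columns, plus the per-feature scores and p-values.
selected = selector.get_support(indices=True)
print(selected)
print(selector.scores_)
print(selector.pvalues_)
```

The other selectors listed under "See also" (SelectFpr, SelectFdr, SelectFwe, SelectPercentile) accept the same `score_func` argument and differ only in how they threshold the scores or p-values.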