update documentation to reflect fit, transform, and predict parameter… by shanglun · Pull Request #7156 · scikit-learn/scikit-learn · GitHub

update documentation to reflect fit, transform, and predict parameter… #7156

Closed
wants to merge 10 commits into from
19 changes: 19 additions & 0 deletions doc/developers/contributing.rst
@@ -798,6 +798,8 @@ The reason for postponing the validation is that the same validation
would have to be performed in ``set_params``,
which is used in algorithms like ``GridSearchCV``.

.. _fitting:
Member

is this used anywhere?

Author

This is linked to in a previous section - line 982.


Fitting
^^^^^^^

@@ -974,6 +976,23 @@ The easiest and recommended way to accomplish this is to
**not do any parameter validation in** ``__init__``.
All logic behind estimator parameters,
like translating string arguments into functions, should be done in ``fit``.
While it is possible to pass parameters to the ``fit`` method, these
should be restricted to data-dependent variables that need to be sliced during
cross-validation. Such variables must be arrays with ``shape[0] == n_samples``
(see :ref:`fitting` above); ``sample_weight`` is one example. This interface
allows all data-dependent parameters to be sliced cleanly during
cross-validation, and supports operations such as grid search and pipelining.
All other parameters should be passed to
``__init__`` or set after construction using ``set_params``.
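
The slicing requirement can be illustrated with a minimal sketch. The
``WeightedMeanRegressor`` class below is hypothetical and deliberately does not
depend on scikit-learn; it only assumes the convention described above, namely
that ``sample_weight`` has one entry per sample and is therefore indexed with
the same fold indices as ``X`` and ``y``.

```python
class WeightedMeanRegressor:
    """Hypothetical estimator that predicts a (shrunken) weighted mean of y."""

    def __init__(self, shrinkage=0.0):
        # Hyperparameter: stored as given, with no validation in __init__.
        self.shrinkage = shrinkage

    def fit(self, X, y, sample_weight=None):
        # sample_weight is data-dependent: one entry per sample, like X and y.
        if sample_weight is None:
            sample_weight = [1.0] * len(y)
        if len(sample_weight) != len(y):
            raise ValueError("sample_weight must have length n_samples")
        total = sum(sample_weight)
        weighted_mean = sum(w * t for w, t in zip(sample_weight, y)) / total
        # Learned attribute: trailing underscore, set only in fit.
        self.mean_ = (1.0 - self.shrinkage) * weighted_mean
        return self

    def predict(self, X):
        # No extra parameters: everything needed was set in __init__ or fit.
        return [self.mean_ for _ in X]


X = [[0.0], [1.0], [2.0], [3.0]]
y = [1.0, 2.0, 3.0, 4.0]
sample_weight = [1.0, 1.0, 2.0, 2.0]

# One training fold: the same indices slice X, y, and sample_weight.
train_idx = [0, 2, 3]
X_train = [X[i] for i in train_idx]
y_train = [y[i] for i in train_idx]
w_train = [sample_weight[i] for i in train_idx]

model = WeightedMeanRegressor(shrinkage=0.0)
model.fit(X_train, y_train, sample_weight=w_train)
```

A hyperparameter such as ``shrinkage`` cannot be sliced this way, which is why
it belongs in ``__init__`` rather than in ``fit``.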

As a general rule, ``transform`` and ``predict`` should not take additional
parameters. In cases where additional parameters would be useful, e.g. when
setting a threshold value for feature selection, the custom parameters
should be passed to ``__init__``. If a parameter needs to be changed after
the call to ``__init__``, this can be done by setting the attribute directly
on the estimator or by calling ``set_params``.
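
As a rough sketch of the threshold case, the hypothetical ``ThresholdSelector``
below keeps the threshold as a constructor parameter and reads it inside
``transform``. The class and its minimal ``set_params`` are illustrative
stand-ins written without scikit-learn; a real estimator would normally inherit
``set_params`` from ``BaseEstimator``.

```python
class ThresholdSelector:
    """Hypothetical selector keeping columns whose variance exceeds threshold."""

    def __init__(self, threshold=0.0):
        # The threshold is a constructor parameter, not a transform argument.
        self.threshold = threshold

    def set_params(self, **params):
        # Minimal stand-in for the set_params convention.
        for name, value in params.items():
            if name != "threshold":
                raise ValueError("Invalid parameter: %s" % name)
            setattr(self, name, value)
        return self

    def fit(self, X, y=None):
        n_samples = len(X)
        n_features = len(X[0])
        # Learned attribute: per-column population variance.
        self.variances_ = []
        for j in range(n_features):
            column = [row[j] for row in X]
            mean = sum(column) / n_samples
            self.variances_.append(
                sum((v - mean) ** 2 for v in column) / n_samples
            )
        return self

    def transform(self, X):
        # transform takes only X; the threshold is read from the estimator.
        keep = [j for j, v in enumerate(self.variances_) if v > self.threshold]
        return [[row[j] for j in keep] for row in X]


X = [[0.0, 1.0, 0.0], [0.0, 2.0, 4.0], [0.0, 3.0, 8.0]]

selector = ThresholdSelector(threshold=0.0).fit(X)
kept_default = selector.transform(X)   # drops only the constant column

# Change the threshold after construction instead of passing it to transform.
selector.set_params(threshold=1.0)
kept_strict = selector.transform(X)    # keeps only the high-variance column
```

Because the threshold lives on the estimator, tools that only call
``transform(X)`` pick up the new value without any change to the call.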

Also it is expected that parameters with trailing ``_`` are **not to be set
inside the** ``__init__`` **method**. All and only the public attributes set by