DOC Fix warnings about references and links by cmarmo · Pull Request #14976 · scikit-learn/scikit-learn

DOC Fix warnings about references and links #14976


Merged · 15 commits · Sep 23, 2019
6 changes: 6 additions & 0 deletions doc/glossary.rst
@@ -1547,6 +1547,12 @@ functions or non-estimator constructors.
picklable. This means, for instance, that lambdas cannot be used
as estimator parameters.

``pos_label``
Value with which positive labels must be encoded in binary
classification problems in which the positive class is not assumed.
This value is typically required to compute asymmetric evaluation
metrics such as precision and recall.
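As a hedged illustration of the new glossary entry (not part of the diff), ``pos_label`` is how these metrics learn which class counts as positive when the labels are strings::

    from sklearn.metrics import precision_score, recall_score

    # String labels: the positive class cannot be assumed, so pass pos_label.
    y_true = ["spam", "ham", "spam", "spam", "ham"]
    y_pred = ["spam", "spam", "spam", "ham", "ham"]

    print(precision_score(y_true, y_pred, pos_label="spam"))  # 2/3
    print(recall_score(y_true, y_pred, pos_label="spam"))     # 2/3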

Member:
Maybe add something like: "This value is typically required to compute asymmetric evaluation metrics such as precision and recall."

Contributor Author:
Added in bdd700f

``random_state``
Whenever randomization is part of a Scikit-learn algorithm, a
``random_state`` parameter may be provided to control the random number
2 changes: 1 addition & 1 deletion doc/modules/computing.rst
@@ -565,7 +565,7 @@ These environment variables should be set before importing scikit-learn.

:SKLEARN_WORKING_MEMORY:

Sets the default value for the :term:`working_memory` argument of
Sets the default value for the `working_memory` argument of
:func:`sklearn.set_config`.

:SKLEARN_SEED:
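For context on the ``SKLEARN_WORKING_MEMORY`` entry above (not part of the diff), a minimal sketch of the in-session equivalent that the environment variable sets a default for; the value is in MiB::

    import sklearn

    # Equivalent in-session setting; SKLEARN_WORKING_MEMORY sets the same
    # default before import instead.
    sklearn.set_config(working_memory=512)
    print(sklearn.get_config()["working_memory"])  # 512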
17 changes: 11 additions & 6 deletions doc/modules/ensemble.rst
@@ -456,7 +456,7 @@ trees.
Scikit-learn 0.21 introduces two new experimental implementations of
gradient boosting trees, namely :class:`HistGradientBoostingClassifier`
and :class:`HistGradientBoostingRegressor`, inspired by
`LightGBM <https://github.com/Microsoft/LightGBM>`__.
`LightGBM <https://github.com/Microsoft/LightGBM>`__ (See [LightGBM]_).

These histogram-based estimators can be **orders of magnitude faster**
than :class:`GradientBoostingClassifier` and
@@ -825,7 +825,7 @@ Histogram-Based Gradient Boosting
Scikit-learn 0.21 introduces two new experimental implementations of
gradient boosting trees, namely :class:`HistGradientBoostingClassifier`
and :class:`HistGradientBoostingRegressor`, inspired by
`LightGBM <https://github.com/Microsoft/LightGBM>`__.
`LightGBM <https://github.com/Microsoft/LightGBM>`__ (See [LightGBM]_).

These histogram-based estimators can be **orders of magnitude faster**
than :class:`GradientBoostingClassifier` and
@@ -996,10 +996,15 @@ Finally, many parts of the implementation of

.. topic:: References

.. [XGBoost] Tianqi Chen, Carlos Guestrin, "XGBoost: A Scalable Tree
Boosting System". https://arxiv.org/abs/1603.02754
.. [LightGBM] Ke et. al. "LightGBM: A Highly Efficient Gradient
BoostingDecision Tree"
.. [F1999] Friedman, Jerome H., 1999, `"Stochastic Gradient Boosting"
<https://statweb.stanford.edu/~jhf/ftp/stobst.pdf>`_
.. [R2007] G. Ridgeway, "Generalized Boosted Models: A guide to the gbm
package", 2007
.. [XGBoost] Tianqi Chen, Carlos Guestrin, `"XGBoost: A Scalable Tree
Boosting System" <https://arxiv.org/abs/1603.02754>`_
.. [LightGBM] Ke et al. `"LightGBM: A Highly Efficient Gradient
Boosting Decision Tree" <https://papers.nips.cc/paper/
6907-lightgbm-a-highly-efficient-gradient-boosting-decision-tree>`_

.. _voting_classifier:

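A hedged usage sketch of the estimators documented above, assuming scikit-learn 0.21+ where they are experimental and need the explicit enabling import::

    from sklearn.experimental import enable_hist_gradient_boosting  # noqa
    from sklearn.ensemble import HistGradientBoostingClassifier
    from sklearn.datasets import make_classification

    X, y = make_classification(n_samples=10_000, random_state=0)
    clf = HistGradientBoostingClassifier(random_state=0).fit(X, y)
    print(clf.score(X, y))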
4 changes: 2 additions & 2 deletions doc/modules/neighbors.rst
@@ -720,5 +720,5 @@ added space complexity in the operation.
J. Goldberger, G. Hinton, S. Roweis, R. Salakhutdinov, Advances in
Neural Information Processing Systems, Vol. 17, May 2005, pp. 513-520.

.. [2] `Wikipedia entry on Neighborhood Components Analysis
<https://en.wikipedia.org/wiki/Neighbourhood_components_analysis>`_
`Wikipedia entry on Neighborhood Components Analysis
<https://en.wikipedia.org/wiki/Neighbourhood_components_analysis>`_
2 changes: 1 addition & 1 deletion doc/modules/partial_dependence.rst
@@ -125,5 +125,5 @@ which the trees were trained.
Statistical Learning <https://web.stanford.edu/~hastie/ElemStatLearn//>`_,
Second Edition, Section 10.13.2, Springer, 2009.

.. [Mol2019] C. Molnar, `Interpretable Machine Learning
C. Molnar, `Interpretable Machine Learning
<https://christophm.github.io/interpretable-ml-book/>`_, Section 5.1, 2019.
2 changes: 1 addition & 1 deletion examples/decomposition/plot_faces_decomposition.py
@@ -3,7 +3,7 @@
Faces dataset decompositions
============================

This example applies to :ref:`olivetti_faces` different unsupervised
This example applies to :ref:`olivetti_faces_dataset` different unsupervised
matrix decomposition (dimension reduction) methods from the module
:py:mod:`sklearn.decomposition` (see the documentation chapter
:ref:`decompositions`) .
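A minimal sketch of what the example does (fetch the Olivetti faces, fit one decomposition); the component count is illustrative::

    from sklearn.datasets import fetch_olivetti_faces
    from sklearn.decomposition import PCA

    faces = fetch_olivetti_faces(shuffle=True, random_state=0).data
    pca = PCA(n_components=6, whiten=True).fit(faces)
    print(pca.components_.shape)  # (6, 4096): one component per row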
2 changes: 1 addition & 1 deletion examples/inspection/plot_permutation_importance.py
@@ -20,7 +20,7 @@

.. topic:: References:

.. [1] L. Breiman, "Random Forests", Machine Learning, 45(1), 5-32,
[1] L. Breiman, "Random Forests", Machine Learning, 45(1), 5-32,
2001. https://doi.org/10.1023/A:1010933404324
"""
print(__doc__)
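For reference, a hedged sketch of the API this example demonstrates (``permutation_importance`` lives in ``sklearn.inspection`` as of 0.22)::

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.inspection import permutation_importance

    X, y = make_classification(random_state=0)
    clf = RandomForestClassifier(random_state=0).fit(X, y)
    result = permutation_importance(clf, X, y, n_repeats=5, random_state=0)
    print(result.importances_mean)  # one mean importance per feature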
2 changes: 1 addition & 1 deletion examples/multioutput/plot_classifier_chain_yeast.py
@@ -10,7 +10,7 @@
data point has at least one label. As a baseline we first train a logistic
regression classifier for each of the 14 labels. To evaluate the performance of
these classifiers we predict on a held-out test set and calculate the
:ref:`jaccard score <jaccard_score>` for each sample.
:ref:`jaccard score <jaccard_similarity_score>` for each sample.

Next we create 10 classifier chains. Each classifier chain contains a
logistic regression model for each of the 14 labels. The models in each
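A hedged sketch of the per-sample Jaccard score the example's baseline reports (multilabel indicator input with ``average='samples'``)::

    import numpy as np
    from sklearn.metrics import jaccard_score

    y_true = np.array([[1, 0, 1], [0, 1, 0]])
    y_pred = np.array([[1, 1, 1], [0, 1, 0]])
    # Mean over samples of |intersection| / |union|: (2/3 + 1/1) / 2
    print(jaccard_score(y_true, y_pred, average="samples"))  # 0.833...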
4 changes: 2 additions & 2 deletions sklearn/decomposition/online_lda.py
@@ -274,8 +274,8 @@ class LatentDirichletAllocation(TransformerMixin, BaseEstimator):

References
----------
[1] "Online Learning for Latent Dirichlet Allocation", Matthew D. Hoffman,
David M. Blei, Francis Bach, 2010
.. [1] "Online Learning for Latent Dirichlet Allocation", Matthew D.
Hoffman, David M. Blei, Francis Bach, 2010
Member:
If we make such bibliographic references with number indices, we get identifier conflicts that sphinx automatically resolves using a hash of the text of the reference:

[Screenshot from 2019-09-16 09-33-14]

Which leads to:

[Screenshot from 2019-09-16 09-33-02]

in the body of the docstring when the reference is cited.

So we have two options:

  • either we keep using numbers as reference indices in such docstrings, but remove the ``_`` suffix from mentions such as ``[1]_`` and the ``..`` prefix from the entries, so that they are no longer actual references; the user then scrolls down and scans the docstring manually instead of following a generated link.

  • alternatively, we stop using integer indices in such references in docstrings and instead use unique identifiers such as ``[Hoffman2010]``.

For short class / function docstrings both options are possible. Whenever the text is long enough that the reader has to scroll more than one screen length, I believe the second option makes the most sense.

Updating all references to follow the second option (explicit identifiers) would make for a large PR, so if we want to do this I would do it in several small, localized PRs that can be merged progressively, starting with the files that actually cause sphinx warnings.
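A minimal sketch of the second option applied to a docstring like this one (the identifier and surrounding wording are illustrative, not the final patch)::

    def example_docstring():
        """Online variational Bayes for LDA [Hoffman2010]_.

        References
        ----------
        .. [Hoffman2010] "Online Learning for Latent Dirichlet Allocation",
           Matthew D. Hoffman, David M. Blei, Francis Bach, 2010
        """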

Contributor Author:
I prefer the second option. But then all refs (even those not linked in the text) must be updated for consistency (and future use).

Contributor Author:
@ogrisel, do you mind if I address the citation problem in a new issue? Just to focus here on the sphinx warnings... and close this one ASAP?

Contributor Author:
Note that the core of the referencing problem is also reported in #4344.

Member:
I agree, let's focus on fixing the sphinx warnings first and leave the general consistency of references across the full codebase for later PRs.


[2] "Stochastic Variational Inference", Matthew D. Hoffman, David M. Blei,
Chong Wang, John Paisley, 2013
8 changes: 4 additions & 4 deletions sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py
@@ -751,13 +751,13 @@ class HistGradientBoostingRegressor(RegressorMixin, BaseHistGradientBoosting):
n_trees_per_iteration_ : int
The number of trees that are built at each iteration. For regressors,
this is always 1.
train_score_ : ndarray, shape (n_iter_ + 1,)
train_score_ : ndarray, shape (n_iter_+1,)
The scores at each iteration on the training data. The first entry
is the score of the ensemble before the first iteration. Scores are
computed according to the ``scoring`` parameter. If ``scoring`` is
not 'loss', scores are computed on a subset of at most 10 000
samples. Empty if no early stopping.
validation_score_ : ndarray, shape (n_iter_ + 1,)
validation_score_ : ndarray, shape (n_iter_+1,)
The scores at each iteration on the held-out validation data. The
first entry is the score of the ensemble before the first iteration.
Scores are computed according to the ``scoring`` parameter. Empty if
@@ -932,13 +932,13 @@ class HistGradientBoostingClassifier(BaseHistGradientBoosting,
The number of trees that are built at each iteration. This is equal to 1
for binary classification, and to ``n_classes`` for multiclass
classification.
train_score_ : ndarray, shape (n_iter_ + 1,)
train_score_ : ndarray, shape (n_iter_+1,)
The scores at each iteration on the training data. The first entry
is the score of the ensemble before the first iteration. Scores are
computed according to the ``scoring`` parameter. If ``scoring`` is
not 'loss', scores are computed on a subset of at most 10 000
samples. Empty if no early stopping.
validation_score_ : ndarray, shape (n_iter_ + 1,)
validation_score_ : ndarray, shape (n_iter_+1,)
The scores at each iteration on the held-out validation data. The
first entry is the score of the ensemble before the first iteration.
Scores are computed according to the ``scoring`` parameter. Empty if
2 changes: 1 addition & 1 deletion sklearn/ensemble/partial_dependence.py
@@ -261,7 +261,7 @@ def plot_partial_dependence(gbrt, X, features, feature_names=None,
Dict with keywords passed to the ``matplotlib.pyplot.plot`` call.
For two-way partial dependence plots.

**fig_kw : dict
``**fig_kw`` : dict
Dict with keywords passed to the figure() call.
Note that all keywords not recognized above will be automatically
included here.
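A hedged sketch of the call whose docstring is escaped above; this module-level helper was already on a deprecation path at the time, and anything not matching a named parameter (such as ``figsize`` here) lands in ``**fig_kw`` and is forwarded to ``figure()``::

    from sklearn.datasets import make_friedman1
    from sklearn.ensemble import GradientBoostingRegressor
    from sklearn.ensemble.partial_dependence import plot_partial_dependence

    X, y = make_friedman1(random_state=0)
    est = GradientBoostingRegressor(random_state=0).fit(X, y)
    # figsize is not a named parameter, so it travels through **fig_kw.
    fig, axs = plot_partial_dependence(est, X, features=[0, 1], figsize=(8, 3))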
2 changes: 1 addition & 1 deletion sklearn/impute/_knn.py
@@ -49,7 +49,7 @@ class KNNImputer(TransformerMixin, BaseEstimator):

- 'nan_euclidean'
- callable : a user-defined function which conforms to the definition
of _pairwise_callable(X, Y, metric, **kwds). The function
of ``_pairwise_callable(X, Y, metric, **kwds)``. The function
accepts two arrays, X and Y, and a `missing_values` keyword in
`kwds` and returns a scalar distance value.

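A hedged sketch of the two metric flavours described above; the custom callable is illustrative and relies only on the contract quoted in the docstring (two sample arrays plus a ``missing_values`` keyword, returning a scalar distance)::

    import numpy as np
    from sklearn.impute import KNNImputer

    X = np.array([[1.0, 2.0, np.nan],
                  [3.0, 4.0, 3.0],
                  [np.nan, 6.0, 5.0],
                  [8.0, 8.0, 7.0]])

    # Default 'nan_euclidean' metric.
    print(KNNImputer(n_neighbors=2).fit_transform(X))

    def masked_l1(x, y, missing_values=np.nan):
        # Scalar distance over coordinates observed in both samples.
        mask = ~(np.isnan(x) | np.isnan(y))
        return np.abs(x[mask] - y[mask]).sum()

    print(KNNImputer(n_neighbors=2, metric=masked_l1).fit_transform(X))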
2 changes: 1 addition & 1 deletion sklearn/linear_model/bayes.py
@@ -108,7 +108,7 @@ class BayesianRidge(RegressorMixin, LinearModel):
sigma_ : array-like of shape (n_features, n_features)
Estimated variance-covariance matrix of the weights

scores_ : array-like of shape (n_iter_ + 1,)
scores_ : array-like of shape (n_iter_+1,)
If ``compute_score`` is True, value of the log marginal likelihood (to be
maximized) at each iteration of the optimization. The array starts
with the value of the log marginal likelihood obtained for the initial
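A minimal sketch of where ``scores_`` comes from; it is only populated when ``compute_score=True``::

    import numpy as np
    from sklearn.linear_model import BayesianRidge

    rng = np.random.RandomState(0)
    X = rng.rand(50, 3)
    y = X @ np.array([1.0, 2.0, 3.0])
    reg = BayesianRidge(compute_score=True).fit(X, y)
    # One log marginal likelihood per iteration, plus the initial value.
    print(reg.n_iter_, reg.scores_.shape)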
4 changes: 2 additions & 2 deletions sklearn/metrics/ranking.py
@@ -1188,7 +1188,7 @@ def dcg_score(y_true, y_score, k=None,
References
----------
`Wikipedia entry for Discounted Cumulative Gain
<https://en.wikipedia.org/wiki/Discounted_cumulative_gain>`_
<https://en.wikipedia.org/wiki/Discounted_cumulative_gain>`_

Jarvelin, K., & Kekalainen, J. (2002).
Cumulated gain-based evaluation of IR techniques. ACM Transactions on
@@ -1336,7 +1336,7 @@ def ndcg_score(y_true, y_score, k=None, sample_weight=None, ignore_ties=False):
References
----------
`Wikipedia entry for Discounted Cumulative Gain
<https://en.wikipedia.org/wiki/Discounted_cumulative_gain>`_
<https://en.wikipedia.org/wiki/Discounted_cumulative_gain>`_

Jarvelin, K., & Kekalainen, J. (2002).
Cumulated gain-based evaluation of IR techniques. ACM Transactions on
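A hedged sketch of the two ranking metrics whose reference blocks are fixed above; both expect 2D arrays of per-query relevances and predicted scores::

    import numpy as np
    from sklearn.metrics import dcg_score, ndcg_score

    # True relevance of each document (one query), and predicted scores.
    y_true = np.asarray([[10, 0, 0, 1, 5]])
    y_score = np.asarray([[0.1, 0.2, 0.3, 4.0, 70.0]])

    print(dcg_score(y_true, y_score))
    print(ndcg_score(y_true, y_score))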
13 changes: 8 additions & 5 deletions sklearn/svm/classes.py
@@ -827,11 +827,14 @@ class NuSVC(BaseSVC):
Scalable linear Support Vector Machine for classification using
liblinear.

Notes
-----
**References:**
`LIBSVM: A Library for Support Vector Machines
<http://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf>`__
References
----------
.. [1] `LIBSVM: A Library for Support Vector Machines
<http://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf>`_

.. [2] `Platt, John (1999). "Probabilistic outputs for support vector
machines and comparison to regularized likelihood methods."
<http://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.41.1639>`_
"""

_impl = 'nu_svc'
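And a minimal usage sketch of the estimator whose references section is rebuilt above::

    import numpy as np
    from sklearn.svm import NuSVC

    X = np.array([[-1, -1], [-2, -1], [1, 1], [2, 1]])
    y = np.array([1, 1, 2, 2])
    clf = NuSVC(gamma="scale").fit(X, y)
    print(clf.predict([[-0.8, -1]]))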