DOC Fix warnings about references and links #14976
Conversation
CircleCI failure is my bad... I will fix it as soon as possible.
``pos_label``
    Value with which positive labels must be encoded in binary
    classification problems in which the positive class is not assumed.
Maybe add something like. "This value is typically required to compute asymmetric evaluation metrics such as precision and recall".
Added in bdd700f
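For reference, a minimal sketch of where such a sentence would sit in a numpydoc parameter entry. The function name, parameter type and default below are placeholders; the wording that actually landed is in bdd700f.

```python
def _toy_scorer(y_true, y_pred, pos_label=1):
    """Hypothetical scorer used only to illustrate the amended entry.

    Parameters
    ----------
    pos_label : int or str, default=1
        Value with which positive labels must be encoded in binary
        classification problems in which the positive class is not assumed.
        This value is typically required to compute asymmetric evaluation
        metrics such as precision and recall.
    """
```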
[1] "Online Learning for Latent Dirichlet Allocation", Matthew D. Hoffman,
    David M. Blei, Francis Bach, 2010
.. [1] "Online Learning for Latent Dirichlet Allocation", Matthew D.
    Hoffman, David M. Blei, Francis Bach, 2010
If we make such bibliographic references with number indices, we get identifier conflicts that sphinx automatically resolves by using a hash of the text of the reference, which leads to hash-based link labels in the body of the docstring when the reference is cited.
So we have two options:

- either we keep using numbers as reference indices in such docstrings, but then we remove the `_` suffix from mentions such as `[1]_` and the `..` prefix from the entries, so as not to make those actual references, and let the user scroll down and scan the docstring manually instead of following a generated link;
- alternatively, we stop using integer indices in such references in docstrings and instead use unique identifiers such as `[Hoffman2010]`.
For short class / function docstrings both options are possible. Whenever the text is long enough that the reader has to scroll by more than one screen, I believe the second option makes the most sense.
Updating all references to follow the second option (explicit identifiers) would lead to a large PR, so if we want to do this I would do it in several small localized PRs that can be merged progressively, starting with the PRs that actually cause sphinx warnings.
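To make the two options concrete, here is a minimal sketch: a toy docstring (not actual scikit-learn code) contrasting the two citation styles in numpydoc format.

```python
def _toy_lda():
    """Toy docstring illustrating the two citation styles.

    Option 1: keep the number but make it plain text, with no trailing ``_``
    on the mention and no ``..`` prefix on the entry, so sphinx creates
    neither a link nor a conflicting target::

        Implements the online variational Bayes algorithm [1].

        References
        ----------
        [1] "Online Learning for Latent Dirichlet Allocation",
            Matthew D. Hoffman, David M. Blei, Francis Bach, 2010

    Option 2: use a unique identifier, which keeps a real sphinx reference
    while making label clashes with other docstrings much less likely::

        Implements the online variational Bayes algorithm [Hoffman2010]_.

        References
        ----------
        .. [Hoffman2010] "Online Learning for Latent Dirichlet Allocation",
           Matthew D. Hoffman, David M. Blei, Francis Bach, 2010
    """
```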
I prefer the second option. But then all refs (even those not linked in the text) must be updated for consistency (and future use).
@ogrisel, do you mind if I address the citing problem in a new issue? Just to focus here on the sphinx warnings... and close that one ASAP?
Note that the core of the referencing problem is also reported in #4344.
I agree let's focus on fixing the sphinx warnings first and see about the general consistency of references across the full code-base for later PRs.
@@ -932,13 +932,13 @@ class HistGradientBoostingClassifier(BaseHistGradientBoosting,
    The number of tree that are built at each iteration. This is equal to 1
    for binary classification, and to ``n_classes`` for multiclass
    classification.
train_score_ : ndarray, shape (n_iter_ + 1,)
train_score_ : ndarray, shape (`n_iter_ + 1`,)
We don't code-escape shapes in general, see #14744. Was this producing a warning?
Yes: `n_iter_` was interpreted as the label of a link, producing the warning
WARNING: Unknown target name: "n_iter".
Maybe another fix is possible...
It turns out that simply removing spaces solved the problem (765e0fe)
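For context, a sketch of the three spellings discussed above, written against a hypothetical attribute rather than the exact PR diff; the notes on docutils behaviour reflect my understanding of its inline-markup recognition rules.

```python
"""Sketch of the three spellings of a shape containing ``n_iter_``.

train_score_ : ndarray, shape (n_iter_ + 1,)
    ``n_iter_`` ends with an underscore followed by a space, so docutils
    parses it as a named reference and sphinx warns
    'Unknown target name: "n_iter"'.

train_score_ : ndarray, shape (`n_iter_ + 1`,)
    Code-escaping keeps the text out of reference parsing, but shapes are
    normally not code-escaped (see #14744).

train_score_ : ndarray, shape (n_iter_+1,)
    With the spaces removed the trailing underscore is immediately followed
    by '+', which docutils does not accept as the end of a reference, so no
    reference (and no warning) is generated.
"""
```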
sklearn/linear_model/bayes.py
Outdated
@@ -108,7 +108,7 @@ class BayesianRidge(RegressorMixin, LinearModel):
    sigma_ : array-like of shape (n_features, n_features)
        Estimated variance-covariance matrix of the weights

scores_ : array-like of shape (n_iter_ + 1,)
scores_ : array-like of shape (`n_iter_ + 1`,)
same
doc/glossary.rst
Outdated
operations involve using a large amount of temporary memory.
Where computations can be performed in fixed-memory chunks the user is
allowed to hint at the maximum size of this working memory (defaulting
to 1GB).
I don't think we use this outside of `set_config` and `pairwise_distances_chunked`? I don't know if we want to add it to the glossary.
If we do add it, we would need to reference the glossary from those docstrings, but then the behavior is not consistent, as functions outside of `set_config` document it as:

    When None (default), the value of `sklearn.get_config()['working_memory']` is used.
So postponing this until we have more occurrences might be simplest.
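As a side note, a small sketch of the two existing entry points that read `working_memory` today (both are existing scikit-learn APIs; the sizes below are purely illustrative):

```python
import numpy as np
import sklearn
from sklearn.metrics import pairwise_distances_chunked

# Global default, in MiB, read by chunked computations.
sklearn.set_config(working_memory=512)
print(sklearn.get_config()["working_memory"])  # 512

X = np.random.RandomState(0).rand(1000, 10)

# With working_memory=None the global value from get_config() is used;
# passing it explicitly overrides the global setting for this call only.
for chunk in pairwise_distances_chunked(X, working_memory=64):
    pass  # each chunk of the distance matrix fits in roughly 64 MiB
```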
Sorry, I'm not sure I understand... Anyway, I have added `working_memory` to the glossary because of this warning:

    scikit-learn/doc/modules/computing.rst:568: WARNING: term not in glossary: working_memory
After re-reading... understood. Let's do it the other way around: I have removed the link to `working_memory` and deleted `working_memory` from the glossary (bdd700f).
Some more comments but otherwise, LGTM.
    The scores at each iteration on the training data. The first entry
    is the score of the ensemble before the first iteration. Scores are
    computed according to the ``scoring`` parameter. If ``scoring`` is
    not 'loss', scores are computed on a subset of at most 10 000
    samples. Empty if no early stopping.
validation_score_ : ndarray, shape (n_iter_ + 1,)
validation_score_ : ndarray, shape (`n_iter_ + 1`,)
Suggested change:
- validation_score_ : ndarray, shape (`n_iter_ + 1`,)
+ validation_score_ : ndarray, shape (n_iter_+1,)
sorry... just fixed
@@ -932,13 +932,13 @@ class HistGradientBoostingClassifier(BaseHistGradientBoosting,
    The number of tree that are built at each iteration. This is equal to 1
    for binary classification, and to ``n_classes`` for multiclass
    classification.
train_score_ : ndarray, shape (n_iter_ + 1,)
train_score_ : ndarray, shape (`n_iter_ + 1`,)
Suggested change:
- train_score_ : ndarray, shape (`n_iter_ + 1`,)
+ train_score_ : ndarray, shape (n_iter_+1,)
Thanks @ogrisel! Maybe someone else could check?
> Updating all references to follow the second option (explicit identifiers) would lead to a large PR...

Having the references use a `[NAMEYEAR]` style throughout the repo would be great.
Thank you @cmarmo for working on reducing the sphinx warnings! The next step would be to get the CI to fail when there are any warnings, so we do not need to keep fixing them manually.
We were discussing this IRL. We should be able to quickly introspect the warnings created by a PR somehow.
Agreed. It would most likely involve some fun bash/text processing, so one does not need to download the complete stdout to get the warnings.
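Something along these lines, for instance: a rough sketch (in Python rather than bash, for the sake of the example) that assumes the docs are built with a plain `sphinx-build` call from the repository root, which is not exactly what the CI does, and then keeps only the warning lines:

```python
import subprocess

# Build the docs; passing "-W" would instead turn every warning into an error
# and fail the build outright.
result = subprocess.run(
    ["sphinx-build", "-b", "html", "doc", "doc/_build/html"],
    capture_output=True, text=True,
)

# Sphinx reports warnings on stderr, one per line, containing "WARNING:".
warnings = [line for line in result.stderr.splitlines() if "WARNING:" in line]
print("\n".join(warnings) or "no sphinx warnings")
```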
Reference Issues/PRs
Close #14975
What does this implement/fix? Explain your changes.
Fix warnings on references and links in the documentation.
Add referenced but missing entries to the Glossary.