DOC move notebooks to sphinx user guides and generated docs #336

adrinjalali · 2020-03-18T14:02:46Z

Our notebooks are examples for users, and they are conceptually a part of the documentation.

This PR proposes to move them to a sphinx gallery generated example. You can call make doc (or run the command that it runs) to generate the examples locally.

@philippjfr @cmarmo I'm trying to point to the examples from the main index.rst, not sure how to do that. The auto_generated examples doesn't exist when the index.rst is processed, and therefore gives an error.

Signed-off-by: adrinjalali <adrin.jalali@gmail.com>

philippjfr · 2020-03-18T14:03:31Z

Thanks, I'll take a look as soon as I get a chance.

docs/examples/binary_classification_COMPAS.rst

docs/fairlearn.exceptions.rst

docs/examples/binary_classification_COMPAS.rst

docs/index.rst

cmarmo · 2020-03-19T11:57:56Z

@adrinjalali , may I suggest to add sphinx_gallery in the requirements.txt in this PR? Cheers.

adrinjalali · 2020-03-19T15:52:17Z

How do we keep this in sync with the notebook itself?

So the idea is that there are a few ways of having this content on the docs:

on the notebooks
in a python file, as an example, have it processed with sphinx-gallery which generates an html page for it, and gives a link to the .py and the .ipynb file. The notebook is then generated by the sphinx-gallery and there's no need to have a notebook in the repo.
have the content, or some of it, as a user manual, in an .rst file. That'll include the output in the .rst file, and sphinx will run the code, and then check the output against the one put in the file, and error if not the same (AFAIK).

We usually have a combination of the user guide .rst file and the .py examples in sklearn. Here in this PR I've included both, for us to see how they look, and decide which path to move forward. But both of them will remove the notebooks from the repo. I've just not removed it from the repo yet in this PR.

riedgar-ms · 2020-03-19T16:51:46Z

OK, but I think we will want to have example notebooks for users as well. So is there a way of automatically generating the notebooks from these changes, and having them appear on a webpage?

adrinjalali · 2020-03-19T19:11:35Z

Yes, the examples are used to automatically generate the notebook. As an example you can see this one. There's a notebook download button on the bottom of the page.

adrinjalali · 2020-03-19T19:11:53Z

sphinx-gallery is the one doing it.

Signed-off-by: adrinjalali <adrin.jalali@gmail.com>

adrinjalali · 2020-03-20T17:31:00Z

The tests are failing due to some pep8 issues (line too long). In sklearn we find it hard to maintain the like length in examples, especially with math lines or links. Should we relax on the linter on the docs/ folder and exclude them from the linter?

adrinjalali · 2020-03-20T17:31:23Z

I've removed the linter, and now we have the link to the example on the main index page. Let me know what you think.

romanlutz · 2020-03-21T06:02:45Z

The tests are failing due to some pep8 issues (line too long). In sklearn we find it hard to maintain the like length in examples, especially with math lines or links. Should we relax on the linter on the docs/ folder and exclude them from the linter?

Works for me! @riedgar-ms ?

riedgar-ms · 2020-03-21T17:22:17Z

I'm OK with upping the line length limit in the docs directory

Signed-off-by: adrinjalali <adrin.jalali@gmail.com>

adrinjalali · 2020-04-01T16:31:11Z

@riedgar-ms
the generated stuff will be under doc/_build/html/auto_examples, and you can use make doc or what that command does (a sphinx call) to generate the docs locally. It won't look the same as what you see on readthedocs because of the default sphinx theme on the readthedocs side, but gives you a good idea. Then you can open the doc/_build/html/index.html and you'll have a link to the generated html page, which has links to the .py and the .ipynb files.

@romanlutz
The generated index.html now has a link to the generated example, because of the auto_examples/plot_binary_classification_COMPAS line in the doc/index.rst.

We can also have a quick chat about the whole sphinx, sphinx-gallery, etc if it helps.

riedgar-ms · 2020-04-01T18:22:58Z

docs/index.rst

-    fairlearn.reductions
-    fairlearn.widget
-
+   auto_examples/plot_binary_classification_COMPAS


Could we have an 'examples' entry at this level of the table of contents, and the notebook beneath it?

Does this look better?

On the landing page, I'd suggest not opening up the TOC for each example. Just list the examples. Well, 'example' for now.

Could there just be an 'examples' link in the sidebar TOC too. Right now, the Notebook itself is listed there:

How does it look now?

riedgar-ms · 2020-04-01T18:23:32Z

examples/plot_binary_classification_COMPAS.py

+Binary Classification on COMPAS dataset
+=======================================
+"""
+print(__doc__)


Was this supposed to show up on the rendered webpage?

yes. It's there to show up when you run the .py version of the example.

What does it print? All of the (former) notebook text?

Yes. Could remove it if you want.

Maybe leave it in for now; I think it looks a little odd, but I don't have the only vote here :-)

When you push your changes, I'll bring them over to my branch, and get the updated sample documentation.

riedgar-ms · 2020-04-01T18:35:51Z

I've generated the docs locally as suggested, and apart from the two small layout issues I've mentioned, it looks OK to me.

I wouldn't do more than this right now, though. We need to figure out how to fit this into our build pipeline (both for testing the notebooks and autopublishing the docs), and there may also be other integration points with other resources we're putting together.

However, I doubt any of that can happen until this is in the main Fairlearn repo (unless I can suck the branch from the fork into the main repo.... I imagine git can do that somehow).

Any other comments @romanlutz ?

romanlutz · 2020-04-01T19:17:30Z

Everything looks good to me. I'd like to proceed since we're not breaking anything existing, get it into master, so that we can see what it looks like, and then step-by-step replace existing ones.

@adrinjalali I would love to have that chat on sphinx, sphinx-gallery, etc. and I'm sure @riedgar-ms is also interested. Feel free to DM me/us on Gitter or Teams or wherever you like with possible times. I'm willing to compromise on timing since I'm 9h away from you ;-)

MiroDudik · 2020-04-01T20:34:54Z

I like the proposal here and I think we should go ahead and merge this PR. I really like the ability to more naturally include examples in the documentation.

However, we shouldn't yet remove the current notebooks--and I also don't think we should quite yet put them into the new format. My main concern is the ease of editing the notebooks. I like developing and editing the notebook in jupyter. I see that sphinx-gallery can generate .ipynb from .py--does it also support going in the opposite direction or is there some other tool / suggested workflow that handles that without too much pain?

adrinjalali · 2020-04-01T21:21:43Z

The issue is that .ipynb does not play very nice with git, and editing them are really not that easy if you don't have a notebook server up all the time (and some of us really don't, like me). But that doesn't mean I don't do exploratory analysis or coding. For that, I have two points:

Exploratory code IMHO should not be in the main repo, and it's worth going through the hassle of converting it into a python file before putting it in the repo.
Not having notebooks doesn't mean you can't do notebook style coding. For instance, in vscode you can have a .py file, and as soon as you have a #%% line, it assumes it's a new cell. You can have your cells and run them on the notebook server in the IDE and still have a .py file. I believe other IDEs have a similar feature.
Looking up usage instances when there are notebooks in the repo is not trivial, and changing them mostly requires running a notebook server. I ended up having to go through that pain when working on ThresholdOptimizer to make it more sklearn compatible. And since that PR was touching on the API, needed to fix the usage examples in the notebooks, and it took me quite a few CI cycles more than it should have.

So I believe once we get used to it, it shouldn't affect our productivity while working on the .py files.

* adding sphinx-gallery Signed-off-by: adrinjalali <adrin.jalali@gmail.com> * add example as user guide Signed-off-by: adrinjalali <adrin.jalali@gmail.com> * remove exceptions Signed-off-by: adrinjalali <adrin.jalali@gmail.com> * cleanups Signed-off-by: adrinjalali <adrin.jalali@gmail.com> * fix link to doc Signed-off-by: adrinjalali <adrin.jalali@gmail.com> * remove user guide format Signed-off-by: adrinjalali <adrin.jalali@gmail.com> * improve on pep8 Signed-off-by: adrinjalali <adrin.jalali@gmail.com> * fix after merge upstream/master Signed-off-by: adrinjalali <adrin.jalali@gmail.com> * change flake8 settings and fix some pep8 issues Signed-off-by: adrinjalali <adrin.jalali@gmail.com> * pep8, blacked the example Signed-off-by: adrinjalali <adrin.jalali@gmail.com> * ignore docstring warnings Signed-off-by: adrinjalali <adrin.jalali@gmail.com> * fix inline math Signed-off-by: adrinjalali <adrin.jalali@gmail.com> * minor change Signed-off-by: adrinjalali <adrin.jalali@gmail.com> Co-authored-by: adrinjalali <adrin.jalali@gmail.com>

riedgar-ms · 2020-04-02T14:15:30Z

FYI, this is what the docs will look like after the change:
https://fairlearn.readthedocs.io/en/riedgar-ms-adrin-nb-copy/
(I stole @adrinjalali 's branch into this repo as a workaround)

romanlutz · 2020-04-02T22:11:31Z

The issue is that .ipynb does not play very nice with git, and editing them are really not that easy if you don't have a notebook server up all the time (and some of us really don't, like me). But that doesn't mean I don't do exploratory analysis or coding. For that, I have two points:

* Exploratory code IMHO should not be in the main repo, and it's worth going through the hassle of converting it into a python file before putting it in the repo.

* Not having notebooks doesn't mean you can't do notebook style coding. For instance, in vscode you can have a `.py` file, and as soon as you have a `#%%` line, it assumes it's a new _cell_. You can have your cells and run them on the notebook server in the IDE and still have a `.py` file. I believe other IDEs have a similar feature.

* Looking up usage instances when there are notebooks in the repo is not trivial, and changing them mostly requires running a notebook server. I ended up having to go through that pain when working on `ThresholdOptimizer` to make it more sklearn compatible. And since that PR was touching on the API, needed to fix the usage examples in the notebooks, and it took me quite a few CI cycles more than it should have.

So I believe once we get used to it, it shouldn't affect our productivity while working on the .py files.

I share all your concerns! The process you outline here sounds promising and I'll go ahead and try that. It seems to address lots of the issues I have with notebooks.

I'll count Miro's statement above as an endorsement to complete the PR. We'll have to have a broader discussion about notebooks since some of them are examples on how to use the package, others are more like case studies.

romanlutz · 2020-04-02T22:17:23Z

Oh I missed the two outstanding comments by @riedgar-ms . Richard, please complete the PR when you're happy with them.

Signed-off-by: adrinjalali <adrin.jalali@gmail.com>

adrinjalali · 2020-04-03T13:07:57Z

The DCO is wrong in failing, it's failing because of your commit to my branch @riedgar-ms (I so much dislike this DCO thing)

riedgar-ms · 2020-04-03T13:55:18Z

The DCO is wrong in failing, it's failing because of your commit to my branch @riedgar-ms (I so much dislike this DCO thing)

I committed to your branch? In your fork? Sorry :-(

I don't even know how I managed that :-(

adrinjalali · 2020-04-03T13:58:57Z

haha, yes you did. It's a good tool. It's a nice one for maintainers. Basically, if the author of the PR checks the "allow changes from maintainers", then maintainers can commit to the relevant branch of the author directly ;) and we do it pretty often in sklearn to do last fixes on people's PRs. It's odd that DCO can't understand that.

riedgar-ms · 2020-04-03T14:02:33Z

Yes, but I thought I was updating my copy of your branch.... I was doing a PR from your branch of your fork into my branch of Fairlearn itself (so that we could view in ReadTheDocs). And I thought the conflict was on that PR, and that I'd merged into my branch.

riedgar-ms · 2020-04-03T15:35:33Z

I've just shown this to @mesameki and @vingu . They like the basic flow, but aren't sure about converting our big 'scenario' notebooks over to this. For smaller code samples (I think that the Metrics notebook is a good candidate), they think it looks great and a valuable addition to our documentation.

I've expanded on that a bit in the Fairlearn Teams channel (or are we moving to Gitter?).

MiroDudik · 2020-04-03T19:58:43Z

@adrinjalali : I think you make an excellent point re. .ipynb not playing nice with git. I suggest that for the time being, let's keep both examples and notebooks, but let's not duplicate the content between the two--except for the current example that you worked out. I'll try to see how easily I can play with examples/plot_binary_classification_COMPAS.py to get a sense of what makes sense to cover under examples and what we should cover under notebooks (if anything).

One of the issues that we are running into here is that there is no established way to talk about fairness in AI, so we are developing standards as we go and that's why our examples are more exploratory than one would like. Also--as soon as we discover that something might have bad unintented consequences, we want to be able to easily edit things--any any barrier to that will slow things down.

Given all these disclaimers: I'm good to go ahead with this PR.

@riedgar-ms : I don't think we need to move this conversation elsewhere given that the bulk of it is here (and we are reaching a consensus).

adrinjalali · 2020-04-06T08:54:29Z

I think it'd be best if we could have a call on this one. Any day this week 17:30-19:30 would work for me.

riedgar-ms · 2020-04-06T11:21:10Z

I think it'd be best if we could have a call on this one. Any day this week 17:30-19:30 would work for me.

I'm assuming that's European time? Have your clocks gone forward, @adrinjalali . What were your particular concerns?

With @MiroDudik being happy with this, we can complete the PR (after that ever-fun DCO issue is fixed). However, I don't think we'd go ahead and convert other notebooks, while we decide how best to present things. It's great to have this new option, though.

adrinjalali · 2020-04-06T11:37:40Z

I'm assuming that's European time?

Yes, sorry, CEST time.

after that ever-fun DCO issue is fixed

that'll mean force pushing and losing the history here, which I'm not super fond of, do we have to? You can still merge I guess w/o.

However, I don't think we'd go ahead and convert other notebooks, while we decide how best to present things. It's great to have this new option, though.

I'd like to be a part of that conversation if you don't mind. It doesn't give the contributor (in this case me) a feeling of inclusion when you come and say we talked and we decided we won't do it for the other ones. I think it's better to have a discussion, hear your concerns, and here the pro side, and see what we can do. I agree that the larger notebooks shouldn't be converted to an example, but those are perfect examples of something which should go to a user guide. At the end of the day, the website is where people find information, and those user guides are important for people to understand how the package works.

riedgar-ms · 2020-04-06T17:43:28Z

Yes, sorry, CEST time.

No problem. Invitation set.

after that ever-fun DCO issue is fixed

that'll mean force pushing and losing the history here, which I'm not super fond of, do we have to? You can still merge I guess w/o.

There's a suggested fix from the DCO bot:
https://github.com/fairlearn/fairlearn/pull/336/checks?check_run_id=558031246
However, I believe that that only works if the troublesome commit was the last one.

I'd like to be a part of that conversation if you don't mind. It doesn't give the contributor (in this case me) a feeling of inclusion when you come and say we talked and we decided we won't do it for the other ones. I think it's better to have a discussion, hear your concerns, and here the pro side, and see what we can do. I agree that the larger notebooks shouldn't be converted to an example, but those are perfect examples of something which should go to a user guide. At the end of the day, the website is where people find information, and those user guides are important for people to understand how the package works.

Understood, and my apologies for the lack of organisation on this my end.

romanlutz · 2020-04-06T20:21:23Z

Yes, sorry, CEST time.

No problem. Invitation set.

after that ever-fun DCO issue is fixed
that'll mean force pushing and losing the history here, which I'm not super fond of, do we have to? You can still merge I guess w/o.

There's a suggested fix from the DCO bot:
https://github.com/fairlearn/fairlearn/pull/336/checks?check_run_id=558031246
However, I believe that that only works if the troublesome commit was the last one.

I'd like to be a part of that conversation if you don't mind. It doesn't give the contributor (in this case me) a feeling of inclusion when you come and say we talked and we decided we won't do it for the other ones. I think it's better to have a discussion, hear your concerns, and here the pro side, and see what we can do. I agree that the larger notebooks shouldn't be converted to an example, but those are perfect examples of something which should go to a user guide. At the end of the day, the website is where people find information, and those user guides are important for people to understand how the package works.

Understood, and my apologies for the lack of organisation on this my end.

DCO can be overridden so don't worry. I don't see your sign-off, @riedgar-ms, though, so I'll wait.

riedgar-ms

Thanks for showing us how to do this!

MiroDudik · 2020-04-08T18:52:14Z

[i'm playing with jupytext to try things out a bit]

adrinjalali added 4 commits March 6, 2020 15:01

adding sphinx-gallery

36d9ded

Signed-off-by: adrinjalali <adrin.jalali@gmail.com>

add example as user guide

736a8ac

Signed-off-by: adrinjalali <adrin.jalali@gmail.com>

remove exceptions

dc30d66

Signed-off-by: adrinjalali <adrin.jalali@gmail.com>

cleanups

8acab5b

Signed-off-by: adrinjalali <adrin.jalali@gmail.com>

romanlutz reviewed Mar 19, 2020

View reviewed changes

docs/examples/binary_classification_COMPAS.rst Outdated Show resolved Hide resolved

romanlutz reviewed Mar 19, 2020

View reviewed changes

docs/fairlearn.exceptions.rst Show resolved Hide resolved

romanlutz requested a review from riedgar-ms March 19, 2020 03:09

romanlutz assigned adrinjalali Mar 19, 2020

romanlutz added the documentation label Mar 19, 2020

romanlutz linked an issue Mar 19, 2020 that may be closed by this pull request

python tests instead of notebooks #248

Closed

romanlutz added this to the Improve Testing milestone Mar 19, 2020

riedgar-ms reviewed Mar 19, 2020

View reviewed changes

docs/examples/binary_classification_COMPAS.rst Outdated Show resolved Hide resolved

cmarmo reviewed Mar 19, 2020

View reviewed changes

docs/index.rst Outdated Show resolved Hide resolved

adrinjalali added 4 commits March 20, 2020 17:47

fix link to doc

121bd03

Signed-off-by: adrinjalali <adrin.jalali@gmail.com>

remove user guide format

4a48977

Signed-off-by: adrinjalali <adrin.jalali@gmail.com>

Merge remote-tracking branch 'upstream/master' into doc/sphinx

88f7bfb

Signed-off-by: adrinjalali <adrin.jalali@gmail.com>

improve on pep8

8000

8b03200

Signed-off-by: adrinjalali <adrin.jalali@gmail.com>

adrinjalali added 3 commits March 24, 2020 11:09

Merge remote-tracking branch 'upstream/master' into doc/sphinx

98711f9

Signed-off-by: adrinjalali <adrin.jalali@gmail.com>

fix after merge upstream/master

1c3d699

Signed-off-by: adrinjalali <adrin.jalali@gmail.com>

change flake8 settings and fix some pep8 issues

d3abe3f

Signed-off-by: adrinjalali <adrin.jalali@gmail.com>

riedgar-ms reviewed Apr 1, 2020

View reviewed changes

adrinjalali and others added 4 commits April 3, 2020 14:18

move example to example section

a056265

Signed-off-by: adrinjalali <adrin.jalali@gmail.com>

Merge branch 'riedgar-ms/adrin-nb-copy' into doc/sphinx

d9b726e

point to the examples index file instead

cc7fdbe

Signed-off-by: adrinjalali <adrin.jalali@gmail.com>

merge remote

9d644b5

Signed-off-by: adrinjalali <adrin.jalali@gmail.com>

riedgar-ms approved these changes Apr 6, 2020

View reviewed changes

romanlutz merged commit 681389e into fairlearn:master Apr 6, 2020

adrinjalali deleted the doc/sphinx branch April 7, 2020 08:07

DOC move notebooks to sphinx user guides and generated docs #336

DOC move notebooks to sphinx user guides and generated docs #336

Uh oh!

Conversation

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!