8000 DOC: Update multiple category bar chart examples by kostyafarber · Pull Request #24498 · matplotlib/matplotlib · GitHub
[go: up one dir, main page]

Skip to content

DOC: Update multiple category bar chart examples #24498

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 16 commits into from
Dec 11, 2022

Conversation

kostyafarber
Copy link
Contributor
@kostyafarber kostyafarber commented Nov 18, 2022

PR Summary

The purpose of this PR is to update the multiple category bar chart examples in the documentation; so they are more simple and use non-arbitrary data (penguin dataset). This helps address the problem discussed in #23465 in which the examples were not semantically meaningful.

The examples have also been refactored using loops, to demonstrate general use cases, also in #23465.

PR Checklist

Documentation and Tests

  • [N/A] Has pytest style unit tests (and pytest passes)
  • Documentation is sphinx and numpydoc compliant (the docs should build without error).
  • [N/A] New plotting related features are documented with examples.

Release Notes

  • [N/A] New features are marked with a .. versionadded:: directive in the docstring and documented in doc/users/next_whats_new/
  • [N/A] API changes are marked with a .. versionchanged:: directive in the docstring and documented in doc/api/next_api_changes/
  • [N/A] Release notes conform with instructions in next_whats_new/README.rst or next_api_changes/README.rst

8000
@story645
Copy link
Member

I think if we're gonna go for this, then one of the major reasons would be to provide examples of creating the groupings/stackings using loops 'cause I know that's a common question and I think would help these examples feel more general.

@kostyafarber
Copy link
Contributor Author
kostyafarber commented Nov 18, 2022

Sure thing did you want to pick an example to do it for? Using the penguin dataset you can't really extract that many groups (from what I can think of... maybe you can). Unless we want to regress back to one of the original plots and refactor it to use a loop?

(is there any best practise way we generate the groups using loops that I can have a look as well or is it up to me to write it)

@story645
Copy link
Member
story645 commented Nov 20, 2022

I think looping over your list is fine? for i, penguin_means in enumerate([adelie, gentoo, chinstrap]):

@kostyafarber
Copy link
Contributor Author

@story645 done

Copy link
Member
@timhoffm timhoffm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the bar label demo 👍.

I'm less convinced on the data of the other two plots because they contain a lot of missing data. I think we do not need to show missing data here - full datasets is the standard case and thus a bit simpler. It would also be ok to have one missing data point, but I feel that there is too much missing data for a simple example.

@kostyafarber
Copy link
Contributor Author

Thanks 👍🏻👍🏻

As for the other examples, I agree there are too many missing data points and it does clutter example.

I'm unsure on how to wrangle the actual dataset to make it "clean" for these particular graphs. That's the best I could do for the penguin dataset.

We could augment the data and just add some dummy values in and make a comment to say "hey this isn't all real data anymore".

Or we could just revert those examples back to the previous graphs and I can refactor those to use for loops.

I guess the question is what's more important for these examples; to demonstrate generalities (using loops) or to make the underlying data less arbitrary (in which case I'd try to work with the penguin dataset again).

Let me know 🤔

@story645
Copy link
Member

Demo generalities. I like penguins but the issue w/ the old datasets was that the labels/titles didn't make any sense - basically arbitrary is okay if the visualization is semantically meaningful. Please don't change the penguin data.

I don't know if this would be useful or confusing as all out, so I can mess around w/ it rather than you, but I'm kind of curious about how a self annotating image shakes out. Something like: PXL_20221122_010518877~2.jpg

This example might tilt way into teaching viz, especially since the transpose can produce the exact same graph depending on how you set up the loop, but what I'm after is that level of this cell in the table goes to this part of the bar chart.

Co-authored-by: Tim Hoffmann <2836374+timhoffm@users.noreply.github.com>
@kostyafarber
Copy link
Contributor Author

Thanks for the clarification @story645. Sure I understand, I won't modify the penguin data.

I like your idea and thank you for the drawn out graph viz! I get what your trying to do but I don't think I get it enough to implement it into these examples right now.

I guess from me what are the next steps. I can continue and play around the with penguin dataset and try to make it more "complete", or leave it and let you try work your idea into these graphs.

Just let me know 👍🏻

Copy link
Member
@story645 story645 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

try to make it more general (no if statements in the loops) and lets see if that cleans up the other two examples?

@kostyafarber kostyafarber requested review from timhoffm and story645 and removed request for timhoffm November 24, 2022 15:46
@kostyafarber kostyafarber requested a review from story645 December 2, 2022 09:29
@kostyafarber
Copy link
Contributor Author

@story645

kostyafarber and others added 2 commits December 6, 2022 08:51
Co-authored-by: Elliott Sales de Andrade <quantum.ana
6D40
lyst@gmail.com>
Co-authored-by: Elliott Sales de Andrade <quantum.analyst@gmail.com>
kostyafarber and others added 3 commits December 6, 2022 08:51
Co-authored-by: Elliott Sales de Andrade <quantum.analyst@gmail.com>
Co-authored-by: Elliott Sales de Andrade <quantum.analyst@gmail.com>
Co-authored-by: Elliott Sales de Andrade <quantum.analyst@gmail.com>
@kostyafarber kostyafarber requested review from QuLogic and story645 and removed request for story645 and QuLogic December 6, 2022 12:56
story645
story645 previously approved these changes Dec 6, 2022
Copy link
Member
@story645 story645 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@story645 story645 dismissed their stale review December 6, 2022 18:47

accidentally clicked approve

@kostyafarber
Copy link
Contributor Author

@story645 should be good to go

@kostyafarber kostyafarber requested a review from story645 December 9, 2022 12:44
Copy link
Member
@story645 story645 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's good to go except for the space before the plt.subplots() call - the other suggestions are more personal preferences.

I'll merge tomorrow if there are no other comments.


fig, ax = plt.subplots()
bottom = np.zeros(3)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
bottom = np.zeros(3)
bottom = np.zeros(len(species))

if you want to be even more generalt

ax.set_title('Scores by group and their beverage choices')
ax.set_xticks(ind, labels=['G1', 'G2', 'G3', 'G4', 'G5'])
ax.legend()
ax.bar_label(p, label_type='center')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is white or a gray easier to read?


fig, ax = plt.subplots()
bottom = np.zeros(3)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above for generalization

Co-authored-by: hannah <story645@gmail.com>
@story645 story645 added this to the v3.7.0 milestone Dec 11, 2022
@story645
Copy link
Member

This has to be rebased/squashed - do you want to do it or should I?

@kostyafarber
Copy link
Contributor Author

Happy for you to do it, thanks!

@story645 story645 merged commit e710123 into matplotlib:main Dec 11, 2022
@kostyafarber kostyafarber deleted the issue-23465 branch December 11, 2022 21:41
@jklymak
Copy link
Member
jklymak commented Dec 11, 2022

Thanks for your persistence @kostyafarber !

@kostyafarber
Copy link
Contributor Author

Glad I could help!

melissawm pushed a commit to melissawm/matplotlib that referenced this pull request Dec 19, 2022
refactor category bar chart examples to use loops and penguin data
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants
0