-
-
Notifications
You must be signed in to change notification settings - Fork 7.9k
Remove some remnants of hist{,2d}(normed=...). #16572
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
lib/matplotlib/axes/_axes.py
Outdated
If ``True``, draw and return a probability density: each bin | ||
will display the bin's raw count *divided by the bin width* | ||
(``n = n0 / sum(n0) * np.diff(bins)``), so that the area under | ||
the histogram integrates to 1 (``np.sum(n * np.diff(bins)) == 1``). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there is a mistake that slipped in via #16149 (where the text was changed after the first approving review).
So initially it said "This is achieved by dividing the count by the number of observations times the bin width [..]", which is correct, but ambiguous. But that turned into n = n0 / sum(n0) * np.diff(bins)
, which is incorrect and should rather be n = n0 / (sum(n0) * np.diff(bins))
I also find n0
used here adds to confusion. So n
already has two meanings in this docstring, namely the histogram counts, which are returned, as well as the dimensionality of the input data. n0
is then then histogram counts in case of multiple data input, but also the "raw counts".
Hence ideally the complete docstring is revised to use one letter for the return, and another for the dimensionality.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair point. I just renamed them "density" and "counts" :)
- Remove mentions of it from docstrings, example. - np.histogram2d also supports density since np1.15 eric-wieser/numpy@081050a - Reword docs for `density` in a hopefully(?) clearer manner? In particular, few users will care about the "first element of the return tuple"; what mostly matters is what gets drawn. - Refer to hist's `density` in hist2d.
If ``True``, draw and return a probability density: each bin | ||
will display the bin's raw count divided by the total number of | ||
counts *and the bin width* | ||
(``density = counts / (sum(counts) * np.diff(bins))``), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't right either, is it? It should be density = counts / sum(counts * np.diff(bins))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sum(counts) is a scalar and diff(bins) a vector, aren't they?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess both are the same in case of constant bin width. But in general it needs to be
density = counts / (sum(counts) * np.diff(bins))
. So the way it's currently written is correct. (Maybe it helps thinking of sum(counts)
as the total number of observations N
and write it down as 1/N * counts / diff(bins)
?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My apologies. I was confusing myself. This is correct.
eric-wieser/numpy@081050a
density
in a hopefully(?) clearer manner? Inparticular, few users will care about the "first element of the return
tuple"; what mostly matters is what gets drawn.
density
in hist2d.PR Summary
PR Checklist