Add plotting module with heatmap function #8082
print(title)
I'm not entirely sure if this function is still helpful for copy & paste. It mostly labels the axis and does normalization, which might be helpful... I left it for now.
ax.set_xlabel(xlabel)
ax.set_ylabel(ylabel)

ax.set_xlim(0, values.shape[1])
I don't know why this is necessary, but otherwise the grid-search plot adds another row / column
plt.pcolor(np.random.uniform(size=(13, 13)), snap=True)
gives a 14x14 plot... huh
This should not be necessary, or at least on master, it isn't necessary.
Yeah, it's not necessary on 2.0.0rc2, so we only have to support it for like 8 1/2 more years or something (the requirements are pinned to two Ubuntu LTS releases ago).
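For anyone reproducing this, here is a minimal sketch of the workaround discussed in this thread: pin the axis limits to the array shape after calling pcolor. The `set_ylim` call is an addition for symmetry (the diff only shows `set_xlim`), and the Agg backend is only there so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, an assumption so this runs anywhere
import matplotlib.pyplot as plt
import numpy as np

values = np.random.uniform(size=(13, 13))

fig, ax = plt.subplots()
ax.pcolor(values)

# Pin both limits to the array shape so that no extra, empty
# row or column is drawn past the last cell.
ax.set_xlim(0, values.shape[1])
ax.set_ylim(0, values.shape[0])  # not in the diff; added here for symmetry
```

On matplotlib versions where the padding does not occur, these two calls should simply be no-ops.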
Also, pcolor and pcolormesh seem to have different opinions on where the center of a pixel / square is, so I'm using the slower pcolor for now.
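A rough sketch of the cell-center question, assuming a recent matplotlib where a 2D array passed without explicit X/Y coordinates is drawn with cell (i, j) spanning [j, j+1] x [i, i+1]. Under that assumption the center of each cell is at (j + 0.5, i + 0.5). `annotate_cells` is a hypothetical helper name, not the function from this PR, and older matplotlib versions may behave differently.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, an assumption so this runs anywhere
import matplotlib.pyplot as plt
import numpy as np


def annotate_cells(ax, values, fmt="{:.2f}"):
    """Write each value at the assumed center of its cell (hypothetical helper)."""
    texts = []
    for (i, j), value in np.ndenumerate(values):
        # Column index maps to x, row index maps to y.
        texts.append(ax.text(j + 0.5, i + 0.5, fmt.format(value),
                             ha="center", va="center"))
    return texts


values = np.arange(6, dtype=float).reshape(2, 3)
fig, ax = plt.subplots()
ax.pcolormesh(values)
texts = annotate_cells(ax, values)
```

If the text lands on cell corners instead of centers, the installed version is using a different coordinate convention and the 0.5 offsets need adjusting.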
try:
    import matplotlib
except ImportError:
    raise SkipTest("Not testing plot_heatmap, matplotlib not installed.")
should this be module-level?
how would that work with nose?
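One possible middle ground, sketched under assumptions: factor the guard into a small helper that each plotting test calls first. This does not answer the module-level question for nose; it only keeps the per-test skip in one place. `require_module` is a hypothetical name, and `unittest.SkipTest` stands in for whichever `SkipTest` the test suite actually imports.

```python
import importlib
from unittest import SkipTest


def require_module(name):
    """Return the imported module, or skip the calling test (hypothetical helper)."""
    try:
        return importlib.import_module(name)
    except ImportError:
        raise SkipTest("Not testing, %s not installed." % name)


def test_plot_heatmap_smoke():
    # Skipped, rather than failed, when matplotlib is missing.
    require_module("matplotlib")
```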
My initial thoughts:
How do we name and organise these things? How specific should they be to ML? To scikit-learn? Really, this is just a plot for a continuous function of two categorical inputs. Is that essentially identical to a heatmap?
Do we tie these more closely to API and applications, e.g. by taking a cv_results_ as input, together with a pair of parameter names?
How do we scale up this interface to the case where we want to produce a matrix of all bivariate and univariate parameter-score plots for some grid search?
Also, every plot_ function should be used in at least one example.
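To make the `cv_results_` idea concrete, here is a minimal sketch of turning a results dict plus a pair of parameter names into the 2D array a heatmap needs. `scores_to_grid` is a hypothetical name, and the dict below is hand-written to mimic the `param_<name>` / `mean_test_score` keys rather than taken from a real grid search. If more than two parameters vary, later entries overwrite earlier ones, so a real implementation would need to aggregate.

```python
import numpy as np


def scores_to_grid(cv_results, param_x, param_y, score_key="mean_test_score"):
    """Pivot flat grid-search results into a 2D score array (hypothetical helper)."""
    xs = np.asarray(cv_results["param_" + param_x]).tolist()
    ys = np.asarray(cv_results["param_" + param_y]).tolist()
    scores = np.asarray(cv_results[score_key], dtype=float)

    x_values = sorted(set(xs))
    y_values = sorted(set(ys))

    # Rows follow param_y, columns follow param_x; missing cells stay NaN.
    grid = np.full((len(y_values), len(x_values)), np.nan)
    for x, y, score in zip(xs, ys, scores):
        grid[y_values.index(y), x_values.index(x)] = score
    return grid, x_values, y_values


# Hand-written stand-in for GridSearchCV.cv_results_ (an assumption).
cv_results = {
    "param_C": [0.1, 0.1, 1.0, 1.0],
    "param_gamma": [0.01, 0.1, 0.01, 0.1],
    "mean_test_score": [0.6, 0.7, 0.8, 0.9],
}
grid, c_values, gamma_values = scores_to_grid(cv_results, "C", "gamma")
```

The returned `x_values` and `y_values` can then serve as tick labels for the heatmap.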
All good questions.
What are your thoughts?
This is taken from my book with some improvements. It started out as a function for confusion matrices and then I generalized it to grid search parameters.
I didn't make it confusion matrix specific here, though we could add the thin wrapper for that, which lives in the example for now.
I started with the most general functions that I have. Arguably we could aim at moving this to matplotlib and create application specific wrappers here.
Another function along these lines that I wrote for the book is a "discrete_scatter", which calls plot repeatedly for all unique values of y to allow creating a legend.
This also comes up very often for sklearn but could arguably be in matplotlib.
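A rough sketch of the "discrete scatter" idea as described: one `plot` call per unique value of `y`, so each class gets its own legend entry. This is written from the description only and is not the function from the book; the signature and marker choice are assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, an assumption so this runs anywhere
import matplotlib.pyplot as plt
import numpy as np


def discrete_scatter(x1, x2, y, ax=None):
    """Scatter plot with one legend entry per class (sketch, not the book's code)."""
    if ax is None:
        ax = plt.gca()
    x1, x2, y = np.asarray(x1), np.asarray(x2), np.asarray(y)
    lines = []
    for label in np.unique(y):
        mask = y == label
        # "o" draws markers only; each call becomes its own legend handle.
        lines.extend(ax.plot(x1[mask], x2[mask], "o", label=str(label)))
    return lines


fig, ax = plt.subplots()
lines = discrete_scatter([0.0, 1.0, 2.0, 3.0], [1.0, 0.0, 1.0, 0.0],
                         [0, 1, 0, 1], ax=ax)
legend = ax.legend()
```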
(In reply to the review comments above, posted by Joel Nothman on Dec 20, 2016, 8:39 AM.)
also, working for general
Any further thoughts? I think having more "high-level" functions is probably good. There was just a submission to scikit-learn-contrib that did something similar.
If not really integrated with the estimator interface, is having a separate
skmlplot library a better way to go?
(In reply to Andreas Mueller's comment of 10 January 2017, 12:11, which also asked: "@GaelVaroquaux any thoughts?")
I think I actually want more integrated functions, so I'd add one function for plotting a 2D grid search and one for the confusion matrix.
Note to self: the alignment is messed up, it should use pcolormesh, I need to double-check the flipping of the axes, and vmin and vmax are not passed along.
Continued here - #9173. Please close this PR.
Closing this pull request.
This adds a plotting module and a first function to plot heatmaps with values inside.
This is a slight generalization of the confusion matrix plot.
One of the improvements is that the color of the text takes the colormap into account, which is actually somewhat tricky to do as you can see.
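To illustrate the idea (not necessarily how this PR does it), one common approach is to look up the cell's color in the colormap, estimate its brightness, and pick black or white text accordingly. The luminance weights and the 0.5 threshold below are assumptions, and `text_color_for` is a hypothetical name.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, an assumption so this runs anywhere
import matplotlib.pyplot as plt
from matplotlib.colors import Normalize


def text_color_for(value, cmap, norm):
    """Pick a readable text color for a cell (hypothetical helper)."""
    # Map the value to [0, 1], then to the RGBA color the cell is drawn with.
    r, g, b, _ = cmap(float(norm(value)))
    # Rough perceived brightness; the weights are a common approximation.
    luminance = 0.299 * r + 0.587 * g + 0.114 * b
    return "black" if luminance > 0.5 else "white"


cmap = plt.get_cmap("gray")
norm = Normalize(vmin=0.0, vmax=1.0)
dark_cell_text = text_color_for(0.0, cmap, norm)
light_cell_text = text_color_for(1.0, cmap, norm)
```

The tricky part is that the lookup has to use the same normalization as the plot itself, otherwise the text color is computed for a different color than the one actually drawn.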
Here's what the grid-search plot looks like before:


and after:
I could add another keyword for the direction of the y-axis but I'm not sure if that's overkill. The current direction makes sense for confusion matrices, with the origin in the top left.
Slight caveat: I'm not sure it's easy to add a colorbar to a plot like this.
I think that's less important given how explicit the plot is, though.
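For what it's worth, here is a minimal sketch of attaching a colorbar, assuming the heatmap function would return (or expose) the mappable created by pcolormesh. It also passes `vmin` and `vmax` explicitly and inverts the y-axis to put the origin at the top; none of this is taken from the PR's code.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, an assumption so this runs anywhere
import matplotlib.pyplot as plt
import numpy as np

values = np.array([[0.1, 0.9],
                   [0.4, 0.6]])

fig, ax = plt.subplots()
# Keep the returned mappable; the colorbar needs it to know the color scale.
mesh = ax.pcolormesh(values, vmin=0.0, vmax=1.0)
ax.invert_yaxis()  # first row at the top, as in a confusion matrix
colorbar = fig.colorbar(mesh, ax=ax)
```

Whether this composes well with the text annotations and custom tick labels would still need checking.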