-
-
Notifications
You must be signed in to change notification settings - Fork 5.4k
DOC/TST: cluster.hierarchy
: use xp_capabilities
#22960
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
f443522
to
8773ce9
Compare
cluster
: use xp_capabilitiescluster
: use xp_capabilities
fb3bf78
to
9cbcf60
Compare
cluster
: use xp_capabilitiescluster.hierarchy
: use xp_capabilities
9cbcf60
to
003d311
Compare
cluster.hierarchy
: use xp_capabilities
cluster.hierarchy
: use xp_capabilities
@@ -299,9 +267,10 @@ def test_mlab_linkage_conversion_multiple_rows(self, xp): | |||
rtol=1e-15) | |||
|
|||
|
|||
@skip_xp_backends(cpu_only=True) | |||
@make_xp_test_case(fclusterdata) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think I noticed the addition of make_xp_test_case
, but FWIW I find this much harder to parse intuitively than skip_xp_backends
.
Things are getting pretty convoluted IMO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're coming at the back end of 6 months worth of iterations to make it as user-friendly as possible, given the requirements.
skip_xp_backends
was no longer fit for purpose given the necessity to
- keep documentation and tests aligned
- have a central repository of which functions work on which backends
- test lazy backends (
jax.jit
and Dask materialization)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough, just pointing out my own difficulty in parsing what is going on FWIW.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I was going to say, I've spent time this week studying xp_capabilities
to try to understand all of the ins and outs, and it solves very real problems. @tylerjereddy don't think of it as skip_xp_backends
becoming convoluted or bloated through poor planning or something, it's more that this is bringing to attention real complexity in the problem space. Maintaining a single source of truth for which functions work on which backends and having the test cases drawn from that source should make things simpler as a whole.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do think things are getting a bit complex and we should be cautious here WRT maintenance burden. It does seem like the promise of a single code path for all backends has melted away for the time being and the complexity creep is something we should keep an eye on.
IIRC adding support tables to every function individually vs. a single summary location is something else that I may have missed the discussion on. Anyway, I do think that the volume of work and its complexity mean that it can be hard for folks to keep up and chime in without being simply told they don't understand so just want to make sure we're cautious.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough, just pointing out my own difficulty in parsing what is going on FWIW.
Noted. I agree there's room to document and explain more thoroughly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC adding support tables to every function individually vs. a single summary location is something else that I may have missed the discussion on.
I'd missed the discussion too. It looks like it took place here, #21255. I wasn't able to find a related thread on the discourse but seems like there probably should have been one.
I had a lot of bandwidth to look into this over the past week, and following along with all of the subsequent PRs, I was able to come to grips on why things have taken the form they did, but most maintainers don't have that luxury and it must be a lot to have sprung on you at once. I think the general idea is not so complicated as it appears. There's a decorator which let's you specify what is and isn't supported, which takes functions and adds entries for them to a central capabilities_table
, and also adds a table to the docs. The pytest decorator for selecting different backends then draws from this central table. There are also changes can be seen as orthogonal that address Jax and Dask's capabilities for lazy computation, which probably contribute to making the changes seem a little overwhelming at first look.
I don't think it's good for you to simply be told you don't understand either. Sorry for doing that. It's unfortunate that the discussion may have gone under the radar of more than a few maintainers. Having looked into it recently though, I've been convinced this is good stuff.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does seem like the promise of a single code path for all backends has melted away for the time being
Is this referring to paths which are added or skipped based on is_lazy_array(x)
? Or are there other path splits?
We really need to update https://scipy.github.io/devdocs/dev/api-dev/array_api.html, but I haven't insisted on keeping it up to date at every stage due to the high iteration frequency. Now that we seem to be settling down to a pattern which covers what we need, it would be great to see it well documented for longevity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We really need to update https://scipy.github.io/devdocs/dev/api-dev/array_api.html, but I haven't insisted on keeping it up to date at every stage due to the high iteration frequency. Now that we seem to be settling down to a pattern which covers what we need, it would be great to see it well documented for longevity.
💯 on this. I think it's great how much we collectively got done over the past 6-12 months, and also that now (after the last PRs go into 1.16.x) is a great time to let the dust settle and focus for a bit on docs and review of code patterns, to ensure we all get back on the same page and are happy with how this has shaped up. I personally also had a hard time keeping up with all the changes, even though I had a lot of context.
I'll try and write a summary on gh-18286 and then post it on Discourse after I'm done with 1.16.x review duties.
EDIT: done in #18867 (comment) and this Discourse post
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks quite good, the tables docs in the docs render well and look correct. Some local testing results from pixi r test-cpu -t scipy.cluster.tests.test_hierarchy
to verify we didn't lose test coverage or performance:
- CPU on
main
: 770 passed, 41 skipped in 42.12s - CPU on this PR: 770 passed, 41 skipped in 42.40s
- GPU on
main
: 437 passed, 374 skipped in 29.92s - GPU on this PR: 437 passed, 374 skipped in 31.81s
So all works as advertised. Thanks @crusaderky & reviewers.
CI failures are unrelated, so in it goes. |
Add
xp_capabilities
toscipy.cluster.hierarchy
.See Also
cluster.vq
: usexp_capabilities
#23000Demo render