-
-
Notifications
You must be signed in to change notification settings - Fork 25.9k
Follow-up after mean_poisson_deviance array API PR #29549
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
We should probably add a check to our array API tests that raises an error/exception with a good message if the environment variable is not already set. |
I assume you mean scikit-learn/sklearn/utils/_array_api.py Lines 111 to 121 in 156ef1b
I think it is fine to have a warning for everything. Especially as scipy plan to remove the need for setting this environment variable. So adding logic to our code only when needed feels like a lot of effort. Did you see the warning when you ran the tests? Or is it lost in a sea of warnings? Asked differently, why did the warning we already have not prevent your negative experience when running the tests? One thing we should add IMHO is a check on the minimum version of scipy that the user has installed. If your scipy version is too old, enabling this environment variable won't do anything for you. The newer the installed scipy version is, the more of it will support the array API. So I think this means we require a minimum version for scikit-learn to work. Which means we need a CI setup with that version to make sure we don't rely on things in scipy that are "too new". Should we pick scipy 1.14 as the minimum?
For these tests array API support in scikit-learn would be turned off right?
I think we should pick a minimum scipy version that you need to use array API. Mostly because "who knows what will happen" if people use an older one. We should implement a check like we have for Numpy in |
No I did not see the warnings because there wasn't any, see below for more details why. I guess as discussed during the meeting, reducing the number of moving pieces would be a great start, so raising an exception when any of the following conditions hold seems probably like a good idea?
Why did I not see any warning?
|
I agree that the warning is essentially invisible. I knew it was somewhere in your screenshot but it took me two or three scans to actually find it :-/ Should we make the warning an error for tests? |
This is certainly an option to make it an error only for tests. I guess you are implying that making it an error in the general case seems too aggressive? Also unrelated, but as a developer I find it slightly annoying that now I have to know that I need to set Edit: for now I have found a work-around which is to have a untracked import os
try:
import array_api_strict
os.environ["SCIPY_ARRAY_API"] = "1"
except ImportError:
pass |
I am fine with this. This would mean catching |
Having to know to set the environment variable is annoying indeed. Some might suggest that I wonder if there is some Python magic (but not too magic) we can sprinkle onto scipy to enable array API support after having imported it? If we skip all tests that require array API support in scipy, won't we end up skipping a lot of tests in the future? Basically, I am wondering what is going to be more annoying: having to setup things like the env variable or learning that your work breaks lots of tests only after opening a PR? |
I opened an issue upstream in scipy to ask for a programmatic way to do it: scipy/scipy#21529. |
Closing this because in #29814. |
As a follow-up for #29227.
The following command fails locally:
see full error in #29227 (comment)
but setting
SCIPY_ARRAY_API=1
works:Honestly I think this is fair to say that as a newcomer to the array API work, my feeling is that it can break way too easily in very mysterious ways. There are probably a few unrelated reasons for this but I guess since at the last bi-weekly meetings we agreed that we should try to pay attention to this kind of breakages and improve the situation, here are a few comments on this #29277:
SCIPY_ARRAY_API
unset. This seems something that could easily happen in practice and we should have tests that make sure that nothing breaks. It's not clear to me why only array-api-strict breaks and not pytorch example, maybe as the name implies array-api-strict is even stricter than other namespaces.array_api_dispatch=True
with scipy<1.14. To be fair it is not clear to me what happens in this case. I guess scipy will transform the array into a numpy array, and maybe the code afterwards has issues because some arrays are not in the same namespace? Anyway, we should make sure that the user code does not break. If this is really too hard to achieve the alternative is to raise an exception saying something like "to enable array API dispatch with this particular function you need scipy 1.14 andSCIPY_ARRAY_API=1
".SCIPY_ARRAY_API
seems too broad (as soon asset_config(array_api_dispatch=True)
is called?). Could we not have warnings only in a few (at least for now) places where we know we call scipy functions?Originally posted by @lesteve in #29227 (comment)
The text was updated successfully, but these errors were encountered: