-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
TST Add option to run tests on 32bit data #22680
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
I'm not sure there's a need to run all or almost all tests against float32.
I think having these tests should be enough, in the sense that if we check that the result is the same between float32 and float64 then by transitivity, testing some property of an estimator on float64 is guaranteed to apply to float32 as well. |
Is the plan to convert all tests that parameterize over Lets say that a contributor is adjusting an algorithm to fix a bug that would fail a test for |
The fixture is already enabled on 1 CI job so we should know in the PR (except in rare situations where there's a numerical instability only visible on another platform).
At first I didn't think so, but then I thought why not. Transformers that have received the preserves_dtype flag recently have been granted new tests with the fixture. I don't see a reason to treat the other transformers differently. Maybe we shouldn't do that for tests added for a bug fix though (maybe it's already the case :/ ). Ideally I would always run tests for both precisions but parametrizing more and more tests over float64/float32 makes the CI take more and more time, and it's already too long (without mentioning energy consumption). I think having the fixture on 1 job for the PR and all jobs for the nightly builds is a good compromise. Wdyt ? |
It's a trade off between developer time and CI time. If there is a float32 issue in a PR that is not caught by the one PR job, then it requires more developer time to fix it after it is caught on nightly builds. Let's enable the "1 job for the PR and all jobs for nightly" plan now. In the long term, I think it depends on how often does the above situation happens. If it's rare, then I'm okay with the compromise. |
Addressed by #22690. |
1 similar comment
Addressed by #22690. |
Context
Currently most implementations are tested against 64bit datasets only. The re-factoring of some internals for computations on 32bit datasets brought the need to test user-facing interfaces on 32bit datasets (see #22590 (comment)). A few group of PRs (such as #22663) introduced a simple parametrisation on
dtype
:Problem
Yet, parametrising tests on
dtype
might just make running the test suite (much) longer. Hence, we could add an option to be able to run such tests for 32bit data optionally. In this case, we also need to have a way to specify this option so that it can be triggered on the CI.Proposed solution
We can keep the tests parametrisation on
dtype
as proposed in PRs (such as #22663) but have the value ofDTYPES
be specified globally (e.g.in sklearn/utils/_testing.py
) and adaptively on the value environement variable. Naively:As for the CI, we could have a new
[test 32bit]
commit message marker which would define this environement variable.The text was updated successfully, but these errors were encountered: