@@ -163,14 +163,16 @@ Python runtime
163
163
164
164
:func:`sklearn.set_config` controls the following behaviors:
165
165
166
- :assume_finite:
166
+ `assume_finite`
167
+ ~~~~~~~~~~~~~~~
167
168
168
- used to skip validation, which enables faster computations but may
169
- lead to segmentation faults if the data contains NaNs.
169
+ Used to skip validation, which enables faster computations but may lead to
170
+ segmentation faults if the data contains NaNs.
170
171
171
- :working_memory:
172
+ `working_memory`
173
+ ~~~~~~~~~~~~~~~~
172
174
173
- the optimal size of temporary arrays used by some algorithms.
175
+ The optimal size of temporary arrays used by some algorithms.
174
176
175
177
.. _environment_variable:
176
178
@@ -179,83 +181,91 @@ Environment variables
179
181
180
182
These environment variables should be set before importing scikit-learn.
181
183
182
- :SKLEARN_ASSUME_FINITE:
184
+ `SKLEARN_ASSUME_FINITE`
185
+ ~~~~~~~~~~~~~~~~~~~~~~~
186
+
187
+ Sets the default value for the `assume_finite` argument of
188
+ :func:`sklearn.set_config`.
189
+
190
+ `SKLEARN_WORKING_MEMORY`
191
+ ~~~~~~~~~~~~~~~~~~~~~~~~
192
+
193
+ Sets the default value for the `working_memory` argument of
194
+ :func:`sklearn.set_config`.
195
+
196
+ `SKLEARN_SEED`
197
+ ~~~~~~~~~~~~~~
198
+
199
+ Sets the seed of the global random generator when running the tests, for
200
+ reproducibility.
201
+
202
+ Note that scikit-learn tests are expected to run deterministically with
203
+ explicit seeding of their own independent RNG instances instead of relying on
204
+ the numpy or Python standard library RNG singletons to make sure that test
205
+ results are independent of the test execution order. However, some tests might
206
+ forget to use explicit seeding and this variable is a way to control the initial
207
+ state of the aforementioned singletons.
208
+
209
+ `SKLEARN_TESTS_GLOBAL_RANDOM_SEED`
210
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
211
+
212
+ Controls the seeding of the random number generator used in tests that rely on
213
+ the `global_random_seed` fixture.
214
+
215
+ All tests that use this fixture accept the contract that they should
216
+ deterministically pass for any seed value from 0 to 99 included.
217
+
218
+ If the `SKLEARN_TESTS_GLOBAL_RANDOM_SEED` environment variable is set to
219
+ `"any"` (which should be the case on nightly builds on the CI), the fixture
220
+ will choose an arbitrary seed in the above range (based on the BUILD_NUMBER or
221
+ the current day) and all fixtured tests will run for that specific seed. The
222
+ goal is to ensure that, over time, our CI will run all tests with different
223
+ seeds while keeping the test duration of a single run of the full test suite
224
+ limited. This will check that the assertions of tests written to use this
225
+ fixture are not dependent on a specific seed value.
226
+
227
+ The range of admissible seed values is limited to [0, 99] because it is often
228
+ not possible to write a test that can work for any possible seed and we want to
229
+ avoid having tests that randomly fail on the CI.
230
+
231
+ Valid values for `SKLEARN_TESTS_GLOBAL_RANDOM_SEED`:
232
+
233
+ - `SKLEARN_TESTS_GLOBAL_RANDOM_SEED="42"`: run tests with a fixed seed of 42
234
+ - `SKLEARN_TESTS_GLOBAL_RANDOM_SEED="40-42"`: run the tests with all seeds
235
+   between 40 and 42 included
236
+ - `SKLEARN_TESTS_GLOBAL_RANDOM_SEED="any"`: run the tests with an arbitrary
237
+   seed selected between 0 and 99 included
238
+ - `SKLEARN_TESTS_GLOBAL_RANDOM_SEED="all"`: run the tests with all seeds
239
+   between 0 and 99 included. This can take a long time: only use for individual
240
+   tests, not the full test suite!
241
+
242
+ If the variable is not set, then 42 is used as the global seed in a
243
+ deterministic manner. This ensures that, by default, the scikit-learn test
244
+ suite is as deterministic as possible to avoid disrupting our friendly
245
+ third-party package maintainers. Similarly, this variable should not be set in
246
+ the CI config of pull-requests to make sure that our friendly contributors are
247
+ not the first people to encounter a seed-sensitivity regression in a test
248
+ unrelated to the changes of their own PR. Only the scikit-learn maintainers who
249
+ watch the results of the nightly builds are expected to be annoyed by this.
250
+
251
+ When writing a new test function that uses this fixture, please use the
252
+ following command to make sure that it passes deterministically for all
253
+ admissible seeds on your local machine:
183
254
184
- Sets the default value for the `assume_finite` argument of
185
- :func:`sklearn.set_config`.
186
-
187
- :SKLEARN_WORKING_MEMORY:
188
-
189
- Sets the default value for the `working_memory` argument of
190
- :func:`sklearn.set_config`.
191
-
192
- :SKLEARN_SEED:
193
-
194
- Sets the seed of the global random generator when running the tests,
195
- for reproducibility.
196
-
197
- Note that scikit-learn tests are expected to run deterministically with
198
- explicit seeding of their own independent RNG instances instead of relying
199
- on the numpy or Python standard library RNG singletons to make sure that
200
- test results are independent of the test execution order. However some
201
- tests might forget to use explicit seeding and this variable is a way to
202
- control the intial state of the aforementioned singletons.
203
-
204
- :SKLEARN_TESTS_GLOBAL_RANDOM_SEED:
205
-
206
- Controls the seeding of the random number generator used in tests that
207
- rely on the `global_random_seed`` fixture.
208
-
209
- All tests that use this fixture accept the contract that they should
210
- deterministically pass for any seed value from 0 to 99 included.
211
-
212
- If the SKLEARN_TESTS_GLOBAL_RANDOM_SEED environment variable is set to
213
- "any" (which should be the case on nightly builds on the CI), the fixture
214
- will choose an arbitrary seed in the above range (based on the BUILD_NUMBER
215
- or the current day) and all fixtured tests will run for that specific seed.
216
- The goal is to ensure that, over time, our CI will run all tests with
217
- different seeds while keeping the test duration of a single run of the full
218
- test suite limited. This will check that the assertions of tests
219
- written to use this fixture are not dependent on a specific seed value.
220
-
221
- The range of admissible seed values is limited to [0, 99] because it is
222
- often not possible to write a test that can work for any possible seed and
223
- we want to avoid having tests that randomly fail on the CI.
224
-
225
- Valid values for SKLEARN_TESTS_GLOBAL_RANDOM_SEED:
226
-
227
- - SKLEARN_TESTS_GLOBAL_RANDOM_SEED="42": run tests with a fixed seed of 42
228
- - SKLEARN_TESTS_GLOBAL_RANDOM_SEED="40-42": run the tests with all seeds
229
- between 40 and 42 included
230
- - SKLEARN_TESTS_GLOBAL_RANDOM_SEED="any": run the tests with an arbitrary
231
- seed selected between 0 and 99 included
232
- - SKLEARN_TESTS_GLOBAL_RANDOM_SEED="all": run the tests with all seeds
233
- between 0 and 99 included
234
-
235
- If the variable is not set, then 42 is used as the global seed in a
236
- deterministic manner. This ensures that, by default, the scikit-learn test
237
- suite is as deterministic as possible to avoid disrupting our friendly
238
- third-party package maintainers. Similarly, this variable should not be set
239
- in the CI config of pull-requests to make sure that our friendly
240
- contributors are not the first people to encounter a seed-sensitivity
241
- regression in a test unrelated to the changes of their own PR. Only the
242
- scikit-learn maintainers who watch the results of the nightly builds are
243
- expected to be annoyed by this.
244
-
245
- When writing a new test function that uses this fixture, please use the
246
- following command to make sure that it passes deterministically for all
247
- admissible seeds on your local machine:
255
+ .. prompt:: bash $
248
256
249
- SKLEARN_TESTS_GLOBAL_RANDOM_SEED="all" pytest -v -k test_your_test_name
257
+     SKLEARN_TESTS_GLOBAL_RANDOM_SEED="all" pytest -v -k test_your_test_name
250
258
251
- :SKLEARN_SKIP_NETWORK_TESTS:
259
+ `SKLEARN_SKIP_NETWORK_TESTS`
260
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
252
261
253
- When this environment variable is set to a non zero value, the tests
254
- that need network access are skipped. When this environment variable is
255
- not set then network tests are skipped.
262
+ When this environment variable is set to a non-zero value, the tests that need
263
+ network access are skipped. When this environment variable is not set then
264
+ network tests are also skipped.
256
265
257
- :SKLEARN_ENABLE_DEBUG_CYTHON_DIRECTIVES:
266
+ `SKLEARN_ENABLE_DEBUG_CYTHON_DIRECTIVES`
267
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
258
268
259
- When this environment variable is set to a non zero value, the `Cython`
260
- derivative, `boundscheck` is set to `True`. This is useful for finding
261
- segfaults.
269
+ When this environment variable is set to a non-zero value, the `Cython`
270
+ directive `boundscheck` is set to `True`. This is useful for finding
271
+ segfaults.
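
The valid values documented in this change for `SKLEARN_TESTS_GLOBAL_RANDOM_SEED` can be read as a small grammar: unset, a single seed, an inclusive range, `"any"`, or `"all"`. The sketch below illustrates one way to interpret that grammar. It is not scikit-learn's actual implementation: `parse_seed_spec`, its `build_number` and `today` parameters, and the modulo-based choice for `"any"` are assumptions made for illustration only.

```python
import datetime
import os

# Admissible seeds, per the documented contract: 0 to 99 included.
SEED_RANGE = range(100)


def parse_seed_spec(spec, build_number=None, today=None):
    """Return the list of seeds selected by a seed specification string.

    Hypothetical helper for illustration; not part of scikit-learn.
    """
    if spec is None:
        # Variable not set: fall back to the fixed default seed.
        return [42]
    if spec == "all":
        return list(SEED_RANGE)
    if spec == "any":
        # Assumption: derive one seed from a build number if given,
        # otherwise from the current day, folded into the admissible range.
        if build_number is None:
            today = today or datetime.date.today()
            build_number = today.toordinal()
        return [int(build_number) % len(SEED_RANGE)]
    if "-" in spec:
        # Inclusive range such as "40-42".
        start, stop = (int(part) for part in spec.split("-"))
    else:
        # Single fixed seed such as "42".
        start = stop = int(spec)
    if start > stop or start not in SEED_RANGE or stop not in SEED_RANGE:
        raise ValueError(f"seed specification out of range [0, 99]: {spec!r}")
    return list(range(start, stop + 1))


if __name__ == "__main__":
    # Read the variable from the environment, as a test fixture might.
    print(parse_seed_spec(os.environ.get("SKLEARN_TESTS_GLOBAL_RANDOM_SEED")))
```

Under these assumptions, a range such as `"40-42"` expands to three seeds, so a test using the fixture would run three times, while an unset variable always yields the single seed 42.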