error_score=nan issues hidden warnings in model selection utilities when n_jobs>1 #20475

Closed
ogrisel opened this issue Jul 6, 2021 · 6 comments · Fixed by #20619

@ogrisel
Member
ogrisel commented Jul 6, 2021

By default, many model selection tools such as cross_validate, validation_curve and the *SearchCV estimators catch exceptions raised during fit, issue a warning and score the model with nan. This can be a convenient behavior, especially for *SearchCV estimators that may explore invalid hyper-parameter combinations.
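A minimal reproducer of that default behavior (illustrative only; the exact warning text and how failures are reported depend on the scikit-learn version):

```python
# An invalid hyper-parameter makes every fit for that candidate fail, so with
# the default error_score=np.nan each split is scored as NaN and a warning is
# emitted by the worker that ran the failing fit.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

X, y = make_classification(random_state=0)

# C must be positive; C=-1 raises ValueError inside fit.
search = GridSearchCV(
    LogisticRegression(),
    param_grid={"C": [-1, 1]},
    error_score=np.nan,  # the default
    n_jobs=2,            # fits run in loky worker processes
)
search.fit(X, y)
print(search.cv_results_["mean_test_score"])  # NaN for the C=-1 candidate
```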

However, the warnings can be hidden when they are issued on the stderr of the loky workers (Python subprocesses), for instance in Jupyter interactive environments:

[screenshot: Jupyter notebook cell output where the warnings issued by the workers are not displayed]

This is a big usability bug. I think we should refactor the model selection tools to raise the warning from the main process instead of the workers (see the sketch after this list). This would make it possible to:

  • display the warning on the main process stderr, so it is no longer hidden when computing in parallel across several Python processes (e.g. with the default loky backend of joblib) or across several machines (e.g. with the dask or ray cluster backends of joblib);
  • raise the warning only once (e.g. only for the first failure) to avoid the "wall of warnings" effect that currently happens when n_jobs=1;
  • inform the user in the warning message that they can set error_score="raise" if they want their code to raise an exception instead of getting nan-valued scores.
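For illustration, a rough sketch of how this could work (not scikit-learn's actual code; the helper names `_fit_and_score_with_warnings` and `run_parallel` and the `fit_and_score` callable are made up): capture the warnings inside the workers and re-issue them, deduplicated, from the main process.

```python
import warnings

from joblib import Parallel, delayed


def _fit_and_score_with_warnings(fit_and_score, *args, **kwargs):
    # Record warnings inside the worker instead of letting them go to the
    # worker's stderr, where interactive frontends may not show them.
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        result = fit_and_score(*args, **kwargs)
    return result, [(w.category, str(w.message)) for w in caught]


def run_parallel(fit_and_score, tasks, n_jobs=2):
    out = Parallel(n_jobs=n_jobs)(
        delayed(_fit_and_score_with_warnings)(fit_and_score, *task) for task in tasks
    )
    results, warning_lists = zip(*out) if out else ((), ())
    # Re-issue the captured warnings from the main process so they are
    # visible, deduplicating identical messages to avoid a "wall of warnings".
    seen = set()
    for category, message in (w for ws in warning_lists for w in ws):
        if message not in seen:
            seen.add(message)
            warnings.warn(message, category)
    return list(results)
```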

This problem has been causing a lot of confusion for MOOC participants (INRIA/scikit-learn-mooc#377), so it is probably hurting scikit-learn's usability significantly.

@ogrisel ogrisel added the Bug label Jul 6, 2021
@ogrisel ogrisel changed the title from "error_score=nan issue hidden warning in model selection utilities when n_jobs>1" to "error_score=nan issues hidden warnings in model selection utilities when n_jobs>1" on Jul 6, 2021
@lesteve lesteve self-assigned this Jul 6, 2021
@lesteve
Member
lesteve commented Jul 6, 2021

I'll try to take a look at this one.

@ogrisel
Member Author
ogrisel commented Jul 6, 2021

It probably means we need to wrap joblib.Parallel into some kind of ModelEvaluation result object that can hold both the traditional attributes and any warning messages together with their traceback info.
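For illustration, such a container could look roughly like this (the class name and fields are hypothetical, not an actual scikit-learn class):

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple, Type


@dataclass
class ModelEvaluationResult:
    """Hypothetical per-fit result returned by each worker."""

    test_scores: Dict[str, float]
    fit_time: float
    score_time: float
    # Warnings captured inside the worker, stored as (category, formatted
    # message including traceback) so the main process can re-issue them.
    caught_warnings: List[Tuple[Type[Warning], str]] = field(default_factory=list)
```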

@lesteve
Member
lesteve commented Jul 9, 2021

Looking a bit more at it, the first thing I am going to do is the simple case: when all the test scores are NaN, raise a warning in the main process saying that something is probably wrong with the model configuration and that `error_score='raise'` is a good way to debug the error.
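Roughly along these lines (an illustrative sketch with a made-up helper name, not the code that eventually landed in #20619):

```python
import warnings

import numpy as np


def warn_if_all_fits_failed(test_scores, error_score):
    # test_scores: array of per-split test scores gathered in the main process.
    if np.isnan(test_scores).all():
        warnings.warn(
            "All the fits failed and were scored with "
            f"error_score={error_score!r}. The model configuration is "
            "probably invalid; set error_score='raise' to get the full "
            "exception and traceback.",
            UserWarning,
        )
```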

I will leave the more complicated cases for later:

  • find a way to collect the warnings from the subprocesses and raise them in the main process
  • avoid the "wall of warnings" effect, i.e. reduce the number of warnings since they are all the same

@lesteve
Member
lesteve commented Jul 9, 2021

Hmmm, actually there is another subtlety here: ipykernel 6 (released quite recently, on June 30 2021) fixed subprocess stderr not being captured, so you do see the warnings inside the Jupyter notebook with ipykernel 6: https://github.com/ipython/ipykernel/blob/master/CHANGELOG.md#600

All outputs to stdout/stderr should now be captured, including subprocesses and output of compiled libraries (blas, lapack....). In notebook server, some outputs that would previously go to the notebooks logs will now both head to notebook logs and in notebooks outputs. In terminal frontend like Jupyter Console, Emacs or other, this may ends up as duplicated outputs.

ipykernel 5.5.5 (no warnings shown)


ipykernel 6.1 (all warnings shown)


@senisioi

This problem seems related to #12939, and overall it stems from the usage of the warnings library, which is not thread-safe. The solution would be to use the logging library and set the correct logging level when running the application.
I would gladly make a pull request with changes that refactor warnings.warn to logger.warning from the logging library.
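For illustration, the suggested refactor would look roughly like this (a hypothetical sketch of the proposal, not something merged for this issue):

```python
import logging

logger = logging.getLogger("sklearn.model_selection")


def report_fit_failure(exc, error_score):
    # A logger configured once in the main process (e.g. via
    # logging.basicConfig) is easier to route than warnings printed on a
    # worker's stderr, although logging handlers are still per-process.
    logger.warning("Estimator fit failed, scoring with %r: %s", error_score, exc)
```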

@lesteve
Member
lesteve commented Jul 20, 2021

Thanks for your input, I am planning to work on this issue as I mentioned above.

Don't worry though, if you are looking to contribute to scikit-learn, there should be plenty of other issues to work on 😉. This part of the doc should help you get started as well.
