ENH: provide monitoring of model selection searches by stsievert · Pull Request #528 · dask/dask-ml

Merged · 78 commits · Apr 17, 2020

Conversation

@stsievert (Member) commented on Jun 24, 2019

What does this PR implement?
It lets the user monitor the model selection search: it prints information on the scores received and allows the user to stop the search cleanly via Ctrl-C.

  • If verbose is False (the default): always log, but don't print to stdout.
  • If verbose is True: always log and print.
  • If verbose is an int/float with 0 < verbose <= 1: log and print a verbose fraction of the messages.
  • If verbose is zero: do not log past initialization.

This allows:

  • The user to see logs in Jupyter notebooks by specifying verbose=True
  • Advanced users to configure logging themselves by default with verbose=False.
  • Advanced users to turn off logging with verbose=0.

All types except bool are cast to float; valid values include 0.0, 1, True, False, and 0.5. An error is raised unless 0 <= verbose <= 1.
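A minimal sketch of how these casting and validation rules might be written (the helper name _normalize_verbose is illustrative, not the PR's actual code):

def _normalize_verbose(verbose):
    # Booleans pass through: True means "always log and print",
    # False means "always log, never print".
    if isinstance(verbose, bool):
        return verbose
    # Everything else is cast to float and must lie in [0, 1].
    verbose = float(verbose)
    if not 0 <= verbose <= 1:
        raise ValueError(
            "verbose must satisfy 0 <= verbose <= 1, got {}".format(verbose)
        )
    return verbose

Under these rules, verbose=0.5 would log and print roughly half of the messages.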

@stsievert (Member, Author)

I'm not sure this PR has the best method of printing/logging information:

def _show_msg(msg, verbose):
    if verbose:
        print(msg)
    logger.info(msg)
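For comparison, a common logging-only pattern (just a sketch; the handler setup and the "[CV]" prefix are illustrative) attaches a stdout handler when verbose output is requested, so library code only ever calls logger.info:

import logging
import sys

logger = logging.getLogger(__name__)

def _configure_verbosity(verbose):
    # Attach a stdout handler only when the user asked for printed output;
    # everything else in the module then just calls logger.info(...).
    if verbose and not logger.handlers:
        logger.setLevel(logging.INFO)
        handler = logging.StreamHandler(sys.stdout)
        handler.setFormatter(logging.Formatter("[CV] %(message)s"))
        logger.addHandler(handler)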

I'm also unsure how to test KeyboardInterrupts.

@TomAugspurger (Member) left a comment

You may be able to write a test for the KeyboardInterrupt using signal.SIGINT. LMK if you want help with that.
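One way such a test could be written (a sketch only: make_search is a hypothetical fixture standing in for the real IncrementalSearchCV setup, and delivering SIGINT via os.kill assumes a Unix-like platform):

import os
import signal
import threading

def _send_sigint_after(delay):
    # Deliver SIGINT to this process after `delay` seconds, simulating the
    # user pressing Ctrl-C while fit() is running.
    timer = threading.Timer(delay, lambda: os.kill(os.getpid(), signal.SIGINT))
    timer.start()
    return timer

def test_fit_survives_keyboard_interrupt():
    search, X, y = make_search()  # hypothetical fixture
    timer = _send_sigint_after(0.5)
    try:
        search.fit(X, y)  # should return cleanly despite the interrupt
    finally:
        timer.cancel()
    assert hasattr(search, "best_estimator_")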

@stsievert (Member, Author)

I'm having a difficult time with an implementation that catches the KeyboardInterrupt. When I test it manually in a notebook, the code raises a KeyboardInterrupt when I send a single keyboard interrupt, and it does not stop the training cleanly. Here's the traceback of the error I get:

KeyboardInterrupt               Traceback (most recent call last)
<ipython-input-5-3748455ece57> in <module>()
----> 9 df = _test_keyboardinterrupt()
<ipython-input-4-135866acf811> in _test_keyboardinterrupt()
---> 13     search.fit(X, y)
~/Developer/stsievert/dask-ml/dask_ml/model_selection/_incremental.py in fit(self, X, y, **fit_params)
--> 641         return default_client().sync(self._fit, X, y, **fit_params)
~/anaconda3/lib/python3.6/site-packages/distributed/client.py in sync(self, func, *args, **kwargs)
--> 753             return sync(self.loop, func, *args, **kwargs)
~/anaconda3/lib/python3.6/site-packages/distributed/utils.py in sync(loop, func, *args, **kwargs)
--> 329             e.wait(10)
~/anaconda3/lib/python3.6/threading.py in wait(self, timeout)
--> 551                 signaled = self._cond.wait(timeout)
~/anaconda3/lib/python3.6/threading.py in wait(self, timeout)
--> 299                     gotit = waiter.acquire(True, timeout)

KeyboardInterrupt: 

@TomAugspurger (Member)

I'm learning a bit about this for a distributed issue. Will take a look later today.

@TomAugspurger (Member)

Can you talk a bit about what the expected behavior of a KeyboardInterrupt halfway through is? As a user, I might expect the following:

  1. All currently scheduled fits are cancelled (cluster should become idle)
  2. The usual attributes like best_estimator_ are present, and reflect the state of the world when the training was interrupted.
  3. I have some way of knowing that the training was stopped early? An attribute on the model?

Overall, this looks a bit more complicated than adding better logging. It may be worth splitting into its own PR.

@stsievert (Member, Author)
  1. All currently scheduled fits are cancelled (cluster should become idle)
  2. The usual attributes like best_estimator_ are present, and reflect the state of the world when the training was interrupted.
  3. I have some way of knowing that the training was stopped early? An attribute on the model?

I would certainly expect 1 and 2. I intend the KeyboardInterrupt to be a mechanism for the busy data scientist to get on with their workflow (using the best estimator, etc).
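A rough sketch of that intended behaviour (expectations 1 and 2), where _fit_loop and _pending_futures are purely illustrative names rather than the actual implementation:

def fit_with_interrupt(search, X, y):
    # On Ctrl-C: cancel outstanding work, keep whatever has been computed.
    try:
        search._fit_loop(X, y)  # stands in for the real training loop
    except KeyboardInterrupt:
        for future in getattr(search, "_pending_futures", []):  # hypothetical bookkeeping
            future.cancel()  # distributed futures support .cancel()
        # best_estimator_, cv_results_, etc. keep their last-updated values,
        # so the user can carry on with the best model found so far.
    return search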

@TomAugspurger (Member)

Looks like a merge conflict.

@stsievert (Member, Author) commented on Apr 14, 2020

verbose is a float as before, but now I always cast it to a float (unless a bool is given). I've updated #528 (comment) to reflect this change.

@stsievert (Member, Author)

The CI is failing because contextlib.nullcontext is new in Python 3.7 (source). Do we want to upgrade the minimum Python version or use a NullContext in dask_ml._utils.py?

@TomAugspurger (Member)

Ah, I didn't realize it was that new. I think defining something like this in dask_ml._compat

@contextlib.contextmanager
def nullcontext():
    yield

should suffice.
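Put together, the shim might look something like this (a sketch; branching on the Python version is one option, not something decided in this thread):

import contextlib
import sys

if sys.version_info >= (3, 7):
    from contextlib import nullcontext
else:
    @contextlib.contextmanager
    def nullcontext():
        # No-op stand-in for contextlib.nullcontext on Python < 3.7.
        yield

Callers can then use nullcontext() (or swap in a real context manager) regardless of the Python version.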

@stsievert (Member, Author)

I've also deprecated the scores_per_fit parameter in IncrementalSearchCV; the name should be fits_per_score.

Review thread on this diff hunk (in IncrementalSearchCV.__init__):

):
    if scores_per_fit is not None and fits_per_score != 1:
@TomAugspurger (Member):

Sorry, didn't realize this deprecation was in the __init__. This should be done in the fit. __init__ isn't supposed to do any validation.

This will also fix the issue you saw with the warnings in the doc build.
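A minimal sketch of what moving the check into fit could look like (the warning text and structure are illustrative, not the PR's exact code):

import warnings

def fit(self, X, y=None, **fit_params):
    # Deprecation handled in fit rather than __init__, since __init__ should
    # only store parameters and do no validation.
    fits_per_score = self.fits_per_score
    if self.scores_per_fit is not None:
        warnings.warn(
            "scores_per_fit is deprecated; use fits_per_score instead.",
            FutureWarning,
        )
        fits_per_score = self.scores_per_fit
    # ... proceed with fitting, using fits_per_score ...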

@stsievert (Member, Author):

D'oh, of course. I've made that change, and removed the :okwarning:s. This PR should be ready to merge now.

@TomAugspurger (Member)

Thanks!
