Description
import sklearn creates a StreamHandler and attaches it to the sklearn logger:
scikit-learn/sklearn/__init__.py, line 24 in 0eebade:
    logger.addHandler(logging.StreamHandler())
I'm not sure what the motivation for this is, but it's a deviation from the normal "best practices" for logging, namely that libraries should restrict themselves to issuing log messages, but let the application do all logging configuration (setting up handlers, changing logger levels, and the like). There's lots written about this elsewhere, but here's one relevant blog post: http://pieces.openpolitics.com/2012/04/python-logging-best-practices/
In practice, this caused a hard-to-diagnose bug in our IPython- and sklearn-using application (actually, in more than one such application):
- At application start time, we start an IPython kernel. That kernel swaps out sys.stdout and sys.stderr for its own custom streams, which rely on a lot of fairly complicated machinery (extra threads, ZMQ streams, the asyncio event loop, etc.).
- sklearn was imported while that IPython kernel was running.
- The log handler created at import time then picked up IPython's custom sys.stderr stream instead of the usual one.
- At application stop time, the IPython kernel and associated machinery were stopped.
- At process exit time, the stream associated to the handler was flushed (by the logging module's shutdown function, which is registered as an atexit handler). Because the IPython machinery was no longer active, we got a hard-to-understand traceback.
If the intent of the handler is to suppress the "No logger configured ..." messages from the standard library, perhaps a logging.NullHandler could be used for that purpose instead? I'm happy to create a PR for this if the proposed change sounds acceptable.
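
For reference, the proposed pattern might look roughly like the sketch below. The logger name "mylib" and the function do_work are hypothetical placeholders, not sklearn's actual code. The library attaches only a NullHandler, which holds no stream, and the application decides where the output goes.

```python
import io
import logging

# Library side ("mylib" is a hypothetical library name).
lib_logger = logging.getLogger("mylib")
lib_logger.addHandler(logging.NullHandler())  # no stream is captured here


def do_work():
    """Hypothetical library function that only issues a log message."""
    lib_logger.info("doing work")
    return 42


# Application side: the application owns handlers and levels.
buffer = io.StringIO()
app_handler = logging.StreamHandler(buffer)
root = logging.getLogger()
root.addHandler(app_handler)
lib_logger.setLevel(logging.INFO)

result = do_work()
captured = buffer.getvalue()

# Detach the handler again so the sketch leaves no global state behind.
root.removeHandler(app_handler)
```

Because the NullHandler never touches sys.stderr, importing the library while the streams are swapped out leaves nothing stale to flush at exit.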