[MRG+2] catch more expected warnings in common tests #11151
Conversation
Thanks for this PR @ShangwuYao !
You could use it as a context manager.
Thanks for the clarification @rth, I have tried several different ways of using it.
Yes, I wonder if pytest warning capture interferes with it. You could also try to use the pytest warnings filter mechanism, approximately:

```python
ignore_common_warnings = pytest.mark.filterwarnings(
    'ignore:.*:UserWarning',
    'ignore:.*:ConvergenceWarning',
    ...)

@ignore_common_warnings
def some_test_function():
    pass
```

(I think; the pytest filterwarnings arguments are not too well documented.)
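As a side note, the ignore-by-category behavior these spec strings describe can be sketched with the stdlib `warnings` module alone, no pytest required (the warning messages below are made up for illustration):

```python
import warnings

# Stdlib sketch of what a spec like 'ignore:.*:UserWarning' does:
# install an ignore filter for one warning category.
with warnings.catch_warnings(record=True) as rec:
    warnings.simplefilter('always')                          # record everything...
    warnings.filterwarnings('ignore', category=UserWarning)  # ...except UserWarning
    warnings.warn('hidden', UserWarning)    # filtered out
    warnings.warn('kept', RuntimeWarning)   # still recorded
assert [str(r.message) for r in rec] == ['kept']
```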
Yeah, I tried that, it didn't work either. The next thing I tried was comparing this file (test_common.py) with other test files to see what differs, and I found a difference. Applying it did help, but I think I should only ignore those warnings in the common tests, so I am still looking into how to deal with it.
Can you help me reopen it, @rth? I was trying to roll back and commit. Thanks!
I'm not sure what's happened, but the Reopen button is disabled.
```python
def controlled_warning_context_manager():
    with ignore_warnings():
        warnings.warn('yo', UserWarning)
    print(warnings.filters)

@ignore_warnings
def controlled_warning_decorator():
    warnings.warn('yo', UserWarning)
    print(warnings.filters)
```

Using the context manager:

```python
warnings.resetwarnings()
warnings.simplefilter('ignore', UserWarning)
print(warnings.filters)
controlled_warning_context_manager()
print(warnings.filters)
```

Using the decorator:

```python
warnings.resetwarnings()
warnings.simplefilter('ignore', UserWarning)
print(warnings.filters)
controlled_warning_decorator()
print(warnings.filters)
```

Both are working as expected now; the old erroneous behavior can be seen in #11109.
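The restore-on-exit behavior being exercised above can be reproduced with the stdlib alone; a minimal sketch of the expected semantics (not the scikit-learn implementation itself):

```python
import warnings

# Filters added inside a catch_warnings block must not leak into the
# global filter list after the block exits.
warnings.resetwarnings()
before = list(warnings.filters)
with warnings.catch_warnings():
    warnings.simplefilter('ignore', UserWarning)
    warnings.warn('yo', UserWarning)  # silently ignored inside the block
after = list(warnings.filters)
assert before == after == []  # global filters restored on exit
```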
On Travis with Python 2.7: there are 762 warnings in master, but 5555 warnings in this PR.
On Travis with Python 3.4: there are 8718 warnings in master, but 6743 in this PR.
On Travis with Python 3.6: there are 8105 warnings in master, but 6004 in this PR.
So something is going wrong in Python 2.7. Perhaps this is one of the Python bugs that I noted was motivation for resetting warning filters in some contexts...
Search for "warnings in" in each of the respective logs to confirm these results. (Please report similar results in future updates to this PR.)
sklearn/utils/testing.py (Outdated)

```python
@@ -137,6 +137,7 @@ def assert_warns(warning_class, func, *args, **kw):

    """
    # very important to avoid uncontrolled state propagation
    warnings.resetwarnings()
```
Elsewhere there was a suggestion that we should never need to resetwarnings.
Yes, please see #11109 (comment), if you have not seen it yet.
Ah, actually looking at the diff, `warnings.resetwarnings` has been removed from `clean_warning_registry`, which is the change I was advocating.

It would be great if you could give us some reason why you need `warnings.resetwarnings` here; I am hoping we don't need to use it anywhere. In case we do need to use `warnings.resetwarnings`, it should be used inside a `warnings.catch_warnings` context manager so that the global `warnings.filters` is not affected.
I had looked at #11109 before I did this. After I removed `warnings.resetwarnings()` from `clean_warning_registry`, I couldn't pass one test case, `test_warn` in `test_testing`:
```python
# This class is inspired from numpy 1.7 with an alteration to check
# the reset warning filters after calls to assert_warns.
# This assert_warns behavior is specific to scikit-learn because
# `clean_warning_registry()` is called internally by assert_warns
# and clears all previous filters.
class TestWarns(unittest.TestCase):
    def test_warn(self):
        def f():
            warnings.warn("yo")
            return 3

        # Test that assert_warns is not impacted by externally set
        # filters and is reset internally.
        # This is because `clean_warning_registry()` is called internally by
        # assert_warns and clears all previous filters.
        warnings.simplefilter("ignore", UserWarning)
        assert_equal(assert_warns(UserWarning, f), 3)

        # Test that the warning registry is empty after assert_warns
        assert_equal(sys.modules['warnings'].filters, [])

        assert_raises(AssertionError, assert_no_warnings, f)
        assert_equal(assert_no_warnings(lambda x: x, 1), 1)
```
It looks like the former developers expected `warnings.resetwarnings` to be used in `assert_warns`, while I think we should store the filters and restore them upon exiting. But that seems like a separate issue, so I intended to create another PR for it. Right now I am just making it behave the same as it did when `warnings.resetwarnings` was still in `clean_warning_registry`. If you could tell me what you want for `assert_warns`, I can do it.
I commented somewhere else that you should remove these two lines:

```python
# Test that the warning registry is empty after assert_warns
assert_equal(sys.modules['warnings'].filters, [])
```

Expecting an `assert_warns` call to affect the global `warnings.filters` is not something I would expect. I can't see a good reason for it.
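To illustrate the behavior being asked for, here is a hypothetical sketch (the helper name `assert_warns_clean` is made up, and this is not scikit-learn's actual implementation) of an assert-warns helper that leaves the global filters untouched:

```python
import warnings

def assert_warns_clean(category, func, *args, **kw):
    # catch_warnings snapshots and restores warnings.filters, so the
    # caller never observes a side effect on the global filter list.
    with warnings.catch_warnings(record=True) as records:
        warnings.simplefilter('always')
        result = func(*args, **kw)
    if not any(issubclass(r.category, category) for r in records):
        raise AssertionError('%s not raised' % category.__name__)
    return result

def f():
    warnings.warn('yo', UserWarning)
    return 3

snapshot = list(warnings.filters)
assert assert_warns_clean(UserWarning, f) == 3
assert list(warnings.filters) == snapshot  # no global side effect
```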
sklearn/utils/estimator_checks.py (Outdated)

```diff
@@ -304,7 +304,7 @@ def check_estimator(Estimator):
     except SkipTest as message:
         # the only SkipTest thrown currently results from not
         # being able to import pandas.
-        warnings.warn(message, SkipTestWarning)
+        warnings.warn(str(message), SkipTestWarning)
```
For the record, I bumped into the same problem. This is clearly a bug in our code: passing an exception object to `warnings.warn` only works if none of the warnings filters use a regex. Here is a snippet to show what I mean:

```python
import warnings

exc = ValueError()
warnings.resetwarnings()
warnings.warn(exc)  # works fine

warnings.filterwarnings('always', message='regex_goes_here')
warnings.warn(exc)  # raises TypeError: expected string or bytes-like object
```
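A sketch of why the `str(message)` change in the diff sidesteps this, using only the stdlib: once the message is a plain string, a regex-based filter can match it safely.

```python
import warnings

exc = ValueError('boom')
with warnings.catch_warnings(record=True) as rec:
    warnings.resetwarnings()
    warnings.filterwarnings('always', message='.*')  # regex-based filter
    warnings.warn(str(exc), UserWarning)  # str() makes regex matching safe
assert str(rec[0].message) == 'boom'
```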
While you are at it, you may as well rename `message` to `exception` to improve readability.
Does anyone have an idea why warnings would increase on py2?
Not off the top of my head, no. I would be quite happy with just reducing the warnings on Python 3 to a manageable number, to be perfectly honest. More generally, something quite useful for reducing the number of warnings would be to break it down by warning class, as I did in #10158 (comment) for example.
Hmm, I compared the warning logs before and after the change; it seems like the new warnings after my change are pretty legitimate. Maybe because I fixed those bugs, we actually get to see the warnings that we are supposed to see.
Yes, I suppose that's a reasonable explanation.
LGTM
Thank you all a lot! Should I do anything to merge?
Wait for a second reviewer to approve. :)
Can I have a breakdown of warnings by warning class just for test_common.py, as I asked above? Without this I feel this is a bit of a shot in the dark. I have looked at this closely and, based on what I found, I think these changes make sense to me, but (1) I am not 100% sure of the subtle impacts of the changes and (2) the Py27 increase is a bit weird.
And yes, I agree, the reduction is not a lot; the warnings in Py3 are still not manageable.
This PR was initially intended to catch warnings in the common tests, which are now reduced from thousands to 5-6. And during the process, I fixed the bugs in `ignore_warnings` and `clean_warning_registry`.
I agree. Why is "4403 /home/lesteve/dev/alt-scikit-learn/sklearn/utils/deprecation.py:77: DeprecationWarning: Function squared_exponential is deprecated; The function squared_exponential of correlation_models is deprecated in version 0.19.1 and will be removed in 0.22." apparently appearing after this PR and not before?
Well, if it is deprecated and no filters are set, it should give the warnings. As far as I checked, that is the case.
sklearn/utils/testing.py (Outdated)

```python
if self._entered:
    raise RuntimeError("Cannot enter %r twice" % self)
self._entered = True
self._filters = self._module.filters
self._module.filters = self._filters[:]
self._showwarning = self._module.showwarning
warnings.simplefilter("ignore", self.category)
```
Can you explain why you moved it here? Should there not be an equivalent `warnings.filters.pop()` in `__exit__`?

Sorry for the noise; this is not needed, as `warnings.filters` are restored in `__exit__`.
Suppose there are some external warning filters and some local warning filters. This change means that inside the local scope both the external and the local filters are active, and upon exiting, the external filters are restored to the way they were before entering the local scope. This is intended to match the behavior of the context manager used as a decorator, since the decorator uses `with warnings.catch_warnings():` to do so.

I have proposed this in previous posts several times; I think it makes sense, but no one confirmed it, and then I found that numpy's `suppress_warnings` has the same behavior.

Without moving this here, exiting would discard both the local filters and the external filters.
While you are at it, I think you can remove this.
I don't quite understand why the build failed.
They were due to flake8 errors that you introduced. I pushed some tweaks in the last commit; let me know if you have comments about them. I think this is looking good: being able to remove 4k warnings in test_gaussian_process.py in a straightforward manner makes me confident that the way we are handling warnings is a lot less hacky than before.
This is the only thing I can get to work under Python 2.7.

LGTM. I am strongly in favour of creating separate PRs to investigate how to reduce the number of warnings further. FYI, there were quirks on Python 2.7 due to the interaction of pytest.mark.parametrize and ignore_warnings/pytest.mark.filterwarnings which I did not investigate further; this is why my last commit reverts to using an ignore_warnings context manager, as was the case originally.
LGTM (apart from the minor comment below), thanks for working on this!
I am strongly in favour of creating separate PRs to investigate how to reduce the number of warnings further.
+1
```diff
@@ -143,7 +144,6 @@ def test_batch_size():


-
 def test_random_starts():
     # Test that an increasing number of random-starts of GP fitting only
```
Is this removal intentional?
Alright, then I will create another PR to further reduce warnings.
Merging this one, thanks a lot for working on this @ShangwuYao!
This is exciting, thank you all for reviewing!
```diff
@@ -15,6 +15,7 @@
 from sklearn.datasets import make_regression
 from sklearn.utils.testing import assert_greater, assert_true, assert_raises

+pytestmark = pytest.mark.filterwarnings('ignore', category=DeprecationWarning)
```
What is the reason that DeprecationWarnings are ignored here? (this is not part of the common tests)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR fixed bugs in `ignore_warnings` and `clean_warning_registry` (see here), not just the common tests. Before this, some warnings that should be raised in `test_gaussian_process` were somehow suppressed; after this change they are correctly raised, and this file alone raises more than 4000 new DeprecationWarnings, so we ignored them to check that the fix is really working (see here). We probably shouldn't ignore this warning, as you said.
What is the reason that DeprecationWarnings are ignored here? (this is not part of the common tests)
Unless I am missing something, sklearn/gaussian_process/gaussian_process.py is deprecated and scheduled to be removed in 0.20.
Ah, that's a good reason :-)
Ideally I would say that deprecation warnings should be asserted once and then explicitly ignored for the other tests, but since this deprecates a full submodule and not just a method or so, it probably makes sense to just ignore them for the full submodule.
Reference Issues/PRs

Fixes #11109. See also #7006.

What does this implement/fix? Explain your changes.

- Removed `warnings.resetwarnings` in `clean_warning_registry()`.
- Fixed `ignore_warnings` to ignore both local warnings and external warnings, like `suppress_warnings` in numpy.
- Made `ignore_warnings` as a context manager restore warning filters upon exiting, to avoid affecting non-local filters.

Any other comments?