[MRG+2] Pytest parametrize unit tests #11074
Conversation
still broken?
Was waiting for #11075 to reach consensus before updating this PR with the changes there, which should also help fix CI.
(branch pytest-parametrize-non-common-tests force-pushed from 554902b to fbfb976)
Tests are passing now, but it might still be easier to review the smaller chunks of this linked in the second part of the description #11074 (comment) (or linked above this comment). Leaving the WIP status even though this PR is mostly done, to indicate that the diff will be updated as its chunks / sub-PRs are merged.
Curious about the purpose of this PR? If it contains all the remaining things except part1-part3 (after part3 is merged), I think it might be better to review this PR directly (at least I'm willing to do so).
Initially, I was trying to split it into smaller pieces to make review easier. After part 3 (#11143) is merged, I agree that the remaining diff here should be manageable. Marking this as MRG. Thanks @qinhanmin2014!
LGTM, just a small side-comment about the unstable test that you found.
@@ -153,6 +156,8 @@ def test_ridge_regression_convergence_fail():

def test_ridge_sample_weights():
    # TODO: loop over sparse data as well
    # Note: parametrizing this test with pytest results in failed
Weird, maybe you can create an issue about this, if you haven't already.
Opened #11200
@rth I've merged part3 and will give this a review in the coming days. I think it's important to get this in quickly to avoid conflicts.
I have to admit I have not followed closely the subdivision of this work. Ping me if you need another review to get this completed. |
Thanks for the review @lesteve - will open the issue! @qinhanmin2014 I realized there was a PR duplication issue, but hadn't thought about it in terms of PR approval transitivity :) Thanks, everything should be in order now; this is the only PR that remains for this parametrization.
You mean how I arrived at the diff in this PR? Well, some of it was already done in #8321. I then searched for tests using

    find sklearn -iname "*test_*py" -type f -exec grep -H --color -i yield {} \;

Then I realized that there were a lot of tests that would benefit from straightforward parametrization, so I manually went through the tests, file by file, and made the changes.
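To give an idea of the kind of "straightforward parametrization" mentioned above, here is a minimal sketch with hypothetical names (a made-up `SOLVERS` list and `fit_and_score` helper, not code from the actual diff):

```python
import pytest

SOLVERS = ["svd", "cholesky", "sag"]  # hypothetical parameter list

def fit_and_score(solver):
    # hypothetical check standing in for a real fit/score assertion
    return 1.0 if solver in SOLVERS else 0.0

# Before: one test body looping over all parameters; the first failure
# stops the loop and the remaining parameters are never reported.
def test_solvers_loop():
    for solver in SOLVERS:
        assert fit_and_score(solver) == 1.0

# After: one reported test per parameter, each selectable individually.
@pytest.mark.parametrize("solver", SOLVERS)
def test_solvers(solver):
    assert fit_and_score(solver) == 1.0
```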
Couple of questions, LGTM.
@@ -126,7 +123,7 @@ def test_normalized_output(metric_name):
# that is when 0 and 1 exchanged.
@pytest.mark.parametrize(
    "metric_name",
    [name for name in dict(SUPERVISED_METRICS, **UNSUPERVISED_METRICS)]
Someone has pointed out that `dict(x, **y)` is not recommended in part1. But I don't think we're going to modify parametrized tests too much, so LGTM.
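As a side note, the `dict(x, **y)` merge can be avoided without changing what gets parametrized; a small sketch with placeholder metric registries (not the real `SUPERVISED_METRICS` / `UNSUPERVISED_METRICS` contents):

```python
# Placeholder dicts standing in for the metric registries in the real test module
SUPERVISED_METRICS = {"adjusted_rand_score": None, "v_measure_score": None}
UNSUPERVISED_METRICS = {"silhouette_score": None}

# The spelling used in the diff: merge the dicts, then keep the keys
names_merged = [name for name in dict(SUPERVISED_METRICS, **UNSUPERVISED_METRICS)]

# An arguably clearer equivalent: concatenate the key lists directly
names_concat = list(SUPERVISED_METRICS) + list(UNSUPERVISED_METRICS)

assert sorted(names_merged) == sorted(names_concat)
```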
sklearn/mixture/tests/test_gmm.py
Outdated
assert_raises(ValueError, mixture.GMM, n_components=20,
              covariance_type='badcovariance_type')
with pytest.raises(ValueError):
Why get rid of `assert_raises` here? I can't find any explanation. Also, I think there are many other places using `assert_raises`. Maybe revert it and open an issue if needed?
You are right - better to be systematic. Reverted this.
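For reference, the two spellings check the same thing; a minimal sketch with a hypothetical `make_model` constructor standing in for `mixture.GMM` (not the actual test code):

```python
import pytest
from numpy.testing import assert_raises  # the nose-style helper used by the test suite of that era

def make_model(n_components=1, covariance_type="full"):
    # hypothetical constructor that validates its arguments
    if covariance_type not in ("full", "tied", "diag", "spherical"):
        raise ValueError("bad covariance_type: %r" % covariance_type)
    return (n_components, covariance_type)

def test_bad_covariance_type_nose_style():
    # nose/unittest style: pass the callable and its arguments
    assert_raises(ValueError, make_model, n_components=20,
                  covariance_type="badcovariance_type")

def test_bad_covariance_type_pytest_style():
    # pytest-native equivalent (the spelling that was reverted here for consistency)
    with pytest.raises(ValueError):
        make_model(n_components=20, covariance_type="badcovariance_type")
```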
sklearn/tree/tests/test_tree.py
Outdated
def test_1d_input():
    for name in ALL_TREES:
        yield check_raise_error_on_1d_input, name
# XXX
What's happening here for the XXX?
Hmm, @rth I still get these in AppVeyor. I think they're valid.
I won't be strict about the completeness of parametrization, but I think we should remove all the yield tests here.
Thanks for the detailed review @qinhanmin2014 !
You are right, looks like I missed parametrizations in
I added the missing parametrizations for metrics in dd89184, @qinhanmin2014. Had to split a few tests to remove all yields.
ping @lesteve to double check your +1 since the new diff is large.
    y_true, y_pred,
    beta=beta, average=average)
assert_almost_equal(fbeta, 0)
p, r, f, s = assert_warns(UndefinedMetricWarning,
Seems a bit awkward since this part will run multiple times. I'd rather have another test here (e.g., test_precision_recall_f1_no_labels_average_none), or some better way to avoid running the same checks multiple times.
Good point, done.
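A rough sketch of the kind of split being asked for (data and names are illustrative, not the exact diff): the `average=None` assertion, which does not depend on the parametrized arguments, moves into its own test so it runs once instead of once per combination.

```python
import numpy as np
import pytest
from sklearn.exceptions import UndefinedMetricWarning
from sklearn.metrics import fbeta_score, precision_recall_fscore_support

# Multilabel targets with no labels at all, so precision/recall are undefined
y_true = np.zeros((20, 3), dtype=int)
y_pred = np.zeros_like(y_true)

@pytest.mark.parametrize("beta", [0.5, 1, 2])
@pytest.mark.parametrize("average", ["macro", "micro", "weighted", "samples"])
def test_precision_recall_f1_no_labels(beta, average):
    # runs once per (beta, average) combination
    with pytest.warns(UndefinedMetricWarning):
        fbeta = fbeta_score(y_true, y_pred, beta=beta, average=average)
    assert fbeta == pytest.approx(0)

def test_precision_recall_f1_no_labels_average_none():
    # parameter-independent check, split out so it only runs once
    with pytest.warns(UndefinedMetricWarning):
        p, r, f, s = precision_recall_fscore_support(y_true, y_pred, average=None)
    assert np.allclose(f, 0)
```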
sklearn/metrics/tests/test_common.py
Outdated
'name',
(set(ALL_METRICS) - set(REGRESSION_METRICS)
 - set(METRICS_WITHOUT_SAMPLE_WEIGHT)
 - METRIC_UNDEFINED_BINARY_MULTICLASS))
Adding `set` to keep things consistent? I'm fine with the current version though.
Yes, this is a bit awkward because `ALL_METRICS` and `REGRESSION_METRICS` are dicts, while `METRICS_WITHOUT_SAMPLE_WEIGHT` and all the rest are lists, and finally `METRIC_UNDEFINED_BINARY_MULTICLASS` is a set.

I changed the ones that were lists to sets, which removes the need to cast them to `set` in the parametrizations (since we are doing a number of set operations here).
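A small sketch of what that buys in the parametrization itself (placeholder registries, not the real contents of test_common.py): once everything that is not a dict is a set, the argvalues expression is plain set arithmetic with no casts.

```python
import pytest

# Placeholder collections standing in for the registries in test_common.py
ALL_METRICS = {"accuracy_score": None, "r2_score": None, "roc_auc_score": None}
REGRESSION_METRICS = {"r2_score": None}
METRICS_WITHOUT_SAMPLE_WEIGHT = {"some_metric"}          # already a set ...
METRIC_UNDEFINED_BINARY_MULTICLASS = {"roc_auc_score"}   # ... so no cast is needed below

@pytest.mark.parametrize(
    "name",
    sorted(set(ALL_METRICS) - set(REGRESSION_METRICS)
           - METRICS_WITHOUT_SAMPLE_WEIGHT
           - METRIC_UNDEFINED_BINARY_MULTICLASS))
def test_sample_weight_invariance(name):
    # the real test checks sample_weight invariance; here we only check membership
    assert name in ALL_METRICS
```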
@lesteve could you please have a look at the last 2 commits? Thanks!
LGTM, ping @lesteve
LGTM, merging! Thanks a lot for working on this, this is really appreciated!
if name in METRIC_UNDEFINED_BINARY_MULTICLASS:
    continue

with ignore_warnings():
Just curious, did you find a problem with the interaction of ignore_warnings and parametrize? I bumped into some quirks as I indicated in #11151 (comment).
Yes, I think the same things as what you mentioned in your comment, if I recall correctly:

- `ignore_warnings` (as decorator) + parametrize failed on Python 2
- `ignore_warnings` (as decorator or not) + `pytest.mark.filterwarnings` didn't seem to work (but that might be fixed in [MRG+2] catch more expected warnings in common tests #11151)
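To illustrate the two mechanisms being compared (a hedged sketch, not code from this diff; the Python 2 decorator failure is not reproduced here, and `ignore_warnings` lives in `sklearn.utils._testing` in current releases, `sklearn.utils.testing` at the time):

```python
import warnings
import pytest
from sklearn.utils._testing import ignore_warnings

def noisy(x):
    # hypothetical helper that always emits a DeprecationWarning
    warnings.warn("old API", DeprecationWarning)
    return x

# Option 1: ignore_warnings as a context manager inside the test body, as in the
# snippet above; this combination composes fine with @pytest.mark.parametrize.
@pytest.mark.parametrize("x", [0, 1])
def test_noisy_context_manager(x):
    with ignore_warnings(category=DeprecationWarning):
        assert noisy(x) == x

# Option 2: the pytest-native marker; at the time of this PR it did not always
# compose well with ignore_warnings used as a decorator (see #11151).
@pytest.mark.filterwarnings("ignore::DeprecationWarning")
@pytest.mark.parametrize("x", [0, 1])
def test_noisy_filterwarnings(x):
    assert noisy(x) == x
```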
Thanks for the review @qinhanmin2014 and @lesteve! In this case reviewing was probably more work than making the changes :)
@rth thanks for picking up my branch; if I had known someone wanted to work with it I would have left it in a better state :)
Reference Issues/PRs
Continues #8321
Fixes #10728, related to #7319, #11063
What does this implement/fix? Explain your changes.
This PR,

- replaces tests using `yield` with pytest parameterization
- also parametrizes tests that do not use `yield` but where it would be helpful (e.g. a test is a for loop over estimators, solvers etc). This applies to,
- `yield func, arg` replaced by `func(arg)` (see the sketch below)
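A tiny sketch of the last bullet, with hypothetical names (the real changes touch various modules): where parametrization was not used, the nose-style yield simply becomes a direct call inside the loop.

```python
PARAMS = ["a", "b"]  # hypothetical parameter list

def check_something(param):
    # hypothetical check shared by all parameters
    assert param in PARAMS

# Before: nose-style generator test (deprecated under pytest)
def test_something_yield():
    for param in PARAMS:
        yield check_something, param

# After: the yield is replaced by a plain call, keeping a single test
def test_something():
    for param in PARAMS:
        check_something(param)
```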
Beyond removing deprecation warnings in pytest, the motivation is somewhat similar to #11063: better verbosity in test logs and the ability to select tests with the `-k` argument.
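To make the `-k` point concrete, a hedged sketch (hypothetical module and estimator names): each parameter becomes part of the pytest node ID, so single cases show up in the logs and can be selected individually.

```python
import pytest

@pytest.mark.parametrize("name", ["DecisionTreeClassifier", "ExtraTreeClassifier"])
def test_1d_input(name):
    # Reported as, e.g.
    #   test_example.py::test_1d_input[DecisionTreeClassifier]
    #   test_example.py::test_1d_input[ExtraTreeClassifier]
    # and a single case can be run with
    #   pytest test_example.py -k ExtraTreeClassifier
    assert name.endswith("Classifier")
```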
Any other comments?
The changes are fairly mechanical but there are a lot of them. For review, I think it would make sense to split this in smaller pieces,

- `sklearn.ensemble`: PR [MRG+1] Pytest parametrize unit tests (part 1) - ensemble module #11075
- `sklearn.{cluster, datasets, decomposition}`: [MRG+1] Pytest parametrization part2 - cluster, datasets and decomposition modules #11142
- `sklearn.{feature_extraction, gaussian_process}`: [MRG+1] Pytest parametrize part3 - feature_extraction, gaussian_process modules #11143

Please add general comments here, but review the specific PRs linked above. The diff in this PR will be updated in the future. Once the above PRs are merged, the remaining diff can be directly reviewed here.

Because a lot of these changes consist in taking a loop over some list of parameters and writing it with `@pytest.mark.parametrize`, the diff contains a lot of lines where only the indentation changes. As a workaround, I find the option to ignore whitespace in the GitHub diff helpful (by appending the `&w=1` string to the URL, or using the Refined GitHub extension, which adds a UI button for it). Please let me know if I can do anything to make the review easier.