[MRG] scorer: add sample_weight support (+test) by vene · Pull Request #3401 · scikit-learn/scikit-learn · GitHub

Merged 4 commits into scikit-learn:master on Jul 19, 2014

Conversation

@vene (Member) commented on Jul 16, 2014

Wraps up #3098 (a part of #1574), ready for review.

Initial description by @ndawe:

This is part of the larger #1574 and adds support for sample weights in the scorer interface.

@vene (Member, Author) commented on Jul 16, 2014

The test is simple, but the change is equally simple: it just passes the sample_weight parameter through.
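That pass-through can be sketched roughly like this (a simplified illustration of the scorer call path, not scikit-learn's actual implementation; the class name and structure here are made up for the example):

```python
import numpy as np

class PredictScorer:
    """Minimal sketch of a predict-based scorer that forwards
    sample_weight to the underlying metric (illustrative only)."""

    def __init__(self, score_func, sign=1, **kwargs):
        self._score_func = score_func
        self._sign = sign
        self._kwargs = kwargs

    def __call__(self, estimator, X, y_true, sample_weight=None):
        y_pred = estimator.predict(X)
        if sample_weight is not None:
            # the whole change: hand the weights through to the metric
            return self._sign * self._score_func(
                y_true, y_pred, sample_weight=sample_weight, **self._kwargs)
        return self._sign * self._score_func(y_true, y_pred, **self._kwargs)
```

Any metric that accepts sample_weight, e.g. a weighted accuracy like `lambda y, p, sample_weight=None: np.average(y == p, weights=sample_weight)`, then receives the weights unchanged.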


```python
for name, scorer in SCORERS.items():
    try:
        weighted = scorer(estimator[name], X_test, y_test,
                          sample_weight=sample_weight)
```
@amueller (Member) commented on the inline diff:

I would rather have a more explicit test. For example, give the regressor [0, 0, 1, 1, 2, 2] as the target, use weights=[0, 0, 1, 1, 0, 0] and then 1 - weights, and check that the expected thing happens in each case.
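For what it's worth, the explicit check proposed here could be sketched as follows (a hypothetical test, not the one that was merged; it assumes the modern scorer name `neg_mean_squared_error` and that scorers accept a `sample_weight` argument, which is exactly what this PR adds):

```python
import numpy as np
from sklearn.dummy import DummyRegressor
from sklearn.metrics import get_scorer

X = np.zeros((6, 1))
y = np.array([0.0, 0.0, 1.0, 1.0, 2.0, 2.0])
w = np.array([0.0, 0.0, 1.0, 1.0, 0.0, 0.0])

reg = DummyRegressor(strategy="mean").fit(X, y)  # always predicts 1.0
scorer = get_scorer("neg_mean_squared_error")

# With w, only the perfectly predicted middle samples count: error 0.
assert scorer(reg, X, y, sample_weight=w) == 0.0
# With 1 - w, only the mispredicted samples count: squared error 1 each.
assert scorer(reg, X, y, sample_weight=1 - w) == -1.0
```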

@vene (Member, Author) replied:

@amueller The difficulty is that, unlike when testing the metrics directly, for testing the scorer it's tricky to anticipate the output of the classifier / regressor.

@amueller replied:

Not if you use the dummies ;) What is wrong with my suggestion?

@vene replied:

The 'expected thing' would need to be manually computed for each different scorer, right?

That would be just like coming up with a different hand-crafted test for each scorer, which I could do, if you insist.

@amueller replied:

You are right: if you are testing all score functions, that doesn't make any sense.

Quoting the test under review in sklearn/metrics/tests/test_score_objects.py:

```python
sample_weight = np.ones_like(y_test)
sample_weight[:10] = 0

# get sensible estimators for each metric
sensible_regr = DummyRegressor(strategy='median')
sensible_regr.fit(X_train, y_train)
sensible_clf = DecisionTreeClassifier()
sensible_clf.fit(X_train, y_train)
estimator = dict([(name, sensible_regr)
                  for name in REGRESSION_SCORER_NAMES] +
                 [(name, sensible_clf)
                  for name in CLF_SCORER_NAMES])

for name, scorer in SCORERS.items():
    try:
        weighted = scorer(estimator[name], X_test, y_test,
```

@arjoly (Member) commented on Jul 16, 2014

LGTM when travis is green.

@amueller (Member) commented:
lgtm

@ndawe (Member) commented on Jul 16, 2014

Thanks a lot for taking this over @vene! I've had no time to work on this lately.

@ndawe mentioned this pull request on Jul 16, 2014.
```python
ignored = scorer(estimator[name], X_test[10:], y_test[10:])
unweighted = scorer(estimator[name], X_test, y_test)
assert_not_equal(weighted, unweighted,
                 "scorer {} behaves identically when called with "
```
An inline review comment (Member):

Support for Python 2.6 means we can't use {}. Use {0}, {1}, {2}, ...
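For context: auto-numbered fields ({}) were only added in Python 2.7; under 2.6, str.format raises an error for them, so the field index must be spelled out (the message text below is illustrative, not the exact string from the test):

```python
name = "accuracy"
# Python 2.6-compatible: explicit index {0} instead of auto-numbered {}
msg = ("scorer {0} behaves identically when called with "
       "and without sample weights").format(name)
```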

@arjoly (Member) commented on Jul 19, 2014

Not much left to do to get this merged.

@coveralls commented:

Coverage increased (+0.0%) when pulling 6a4aa1d on vene:scorer_weights into 4ec8630 on scikit-learn:master.

@arjoly (Member) commented on Jul 19, 2014

Travis is happy! Merging.

arjoly added a commit referencing this pull request on Jul 19, 2014: "[MRG] scorer: add sample_weight support (+test)"
@arjoly merged commit 70f4d47 into scikit-learn:master on Jul 19, 2014.
@ndawe (Member) commented on Jul 19, 2014

Thanks! 🍻

@jnothman (Member) commented:
I'm glad to see this moving through. Thanks Noel and Vlad. I look forward
to the cross-validation support.

