Move all data-independent fit-params to ``__init__``. · Issue #4107 · scikit-learn/scikit-learn · GitHub

Move all data-independent fit-params to __init__. #4107


Closed
amueller opened this issue Jan 16, 2015 · 12 comments · Fixed by #5294
Labels
API Easy Well-defined and straightforward way to resolve

@amueller
Member

I thought we had already done that, but QDA still has tol in fit.
I'm pretty sure we can list all fit params using a small snippet and then go over them by hand.

@amueller amueller added Easy Well-defined and straightforward way to resolve API labels Jan 16, 2015
@amueller amueller added this to the 0.16 milestone Jan 16, 2015
@raghavrv
Member

These need to be done with a deprecation warning, right?

@amueller
Member Author

Yes, indeed.
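A minimal sketch of the migration pattern being discussed here (the `Estimator` class below is hypothetical, not a real scikit-learn class): the parameter moves to `__init__`, while `fit` keeps accepting it for a deprecation cycle and emits a `DeprecationWarning` when it is passed.

```python
import warnings


class Estimator:
    """Hypothetical estimator illustrating the fit-param migration.

    `tol` now lives in `__init__`; passing it to `fit` still works
    during the deprecation period but triggers a warning.
    """

    def __init__(self, tol=1e-4):
        self.tol = tol

    def fit(self, X, y, tol=None):
        if tol is not None:
            warnings.warn("Passing 'tol' to fit() is deprecated; "
                          "set it in __init__ instead.",
                          DeprecationWarning)
            self.tol = tol
        # actual fitting would happen here, using self.tol
        return self
```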

@akshayah3
Contributor

Can I take this up?

@raghavrv
Member

Sure! :)

@trevorstephens
Contributor

Per @amueller's comment, I made a quick & dirty little script that may help track them down, @akshayah3:

from sklearn.utils.testing import all_estimators
import inspect
for name, Estimator in all_estimators():
    methods = inspect.getmembers(Estimator, predicate=inspect.ismethod)
    for m_name, method in methods:
        if m_name == 'fit':
            args, _, _, _ = inspect.getargspec(method)
            if len(set(args) - set(['self', 'X', 'y', 'sample_weight'])) > 0:
                print('%30s' % name, args)

yields:

                           CCA ['self', 'X', 'Y']
               CountVectorizer ['self', 'raw_documents', 'y']
        DecisionTreeClassifier ['self', 'X', 'y', 'sample_weight', 'check_input']
         DecisionTreeRegressor ['self', 'X', 'y', 'sample_weight', 'check_input']
           ExtraTreeClassifier ['self', 'X', 'y', 'sample_weight', 'check_input']
            ExtraTreeRegressor ['self', 'X', 'y', 'sample_weight', 'check_input']
                        GMMHMM ['self', 'obs']
                   GaussianHMM ['self', 'obs']
    GradientBoostingClassifier ['self', 'X', 'y', 'sample_weight', 'monitor']
     GradientBoostingRegressor ['self', 'X', 'y', 'sample_weight', 'monitor']
                KernelCenterer ['self', 'K', 'y']
                           LDA ['self', 'X', 'y', 'store_covariance', 'tol']
                          Lars ['self', 'X', 'y', 'Xy']
                     LassoLars ['self', 'X', 'y', 'Xy']
                   LassoLarsIC ['self', 'X', 'y', 'copy_X']
              LinearRegression ['self', 'X', 'y', 'n_jobs']
                           MDS ['self', 'X', 'init', 'y']
                MultinomialHMM ['self', 'obs']
                  PLSCanonical ['self', 'X', 'Y']
                 PLSRegression ['self', 'X', 'Y']
                        PLSSVD ['self', 'X', 'Y']
   PassiveAggressiveClassifier ['self', 'X', 'y', 'coef_init', 'intercept_init']
    PassiveAggressiveRegressor ['self', 'X', 'y', 'coef_init', 'intercept_init']
                    Perceptron ['self', 'X', 'y', 'coef_init', 'intercept_init', 'class_weight', 'sample_weight']
                           QDA ['self', 'X', 'y', 'store_covariances', 'tol']
                 SGDClassifier ['self', 'X', 'y', 'coef_init', 'intercept_init', 'class_weight', 'sample_weight']
                  SGDRegressor ['self', 'X', 'y', 'coef_init', 'intercept_init', 'sample_weight']
               TfidfVectorizer ['self', 'raw_documents', 'y']
                      _BaseHMM ['self', 'obs']

@amueller
Member Author

Slight modification:

from sklearn.utils.testing import all_estimators
import inspect
for name, Estimator in all_estimators():
    methods = inspect.getmembers(Estimator, predicate=inspect.ismethod)
    for m_name, method in methods:
        if m_name == 'fit':
            args, _, _, _ = inspect.getargspec(method)
            others = set(args) - set(['self', 'X', 'y', 'Y', 'sample_weight'])
            if others:
                print('%30s' % name, list(others))
which yields:

               CountVectorizer ['raw_documents']
        DecisionTreeClassifier ['check_input']
         DecisionTreeRegressor ['check_input']
           ExtraTreeClassifier ['check_input']
            ExtraTreeRegressor ['check_input']
                        GMMHMM ['obs']
                   GaussianHMM ['obs']
    GradientBoostingClassifier ['monitor']
     GradientBoostingRegressor ['monitor']
                KernelCenterer ['K']
                           LDA ['store_covariance', 'tol']
                          Lars ['Xy']
                     LassoLars ['Xy']
                   LassoLarsIC ['copy_X']
              LinearRegression ['n_jobs']
                           MDS ['init']
                MultinomialHMM ['obs']
   PassiveAggressiveClassifier ['intercept_init', 'coef_init']
    PassiveAggressiveRegressor ['intercept_init', 'coef_init']
                    Perceptron ['intercept_init', 'class_weight', 'coef_init']
                           QDA ['store_covariances', 'tol']
                 SGDClassifier ['intercept_init', 'class_weight', 'coef_init']
                  SGDRegressor ['intercept_init', 'coef_init']
               TfidfVectorizer ['raw_documents']
                      _BaseHMM ['obs']
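As an aside for anyone running these snippets on modern Python: `inspect.getargspec` was removed in Python 3.11, so the same check would use `inspect.signature` instead. A rough sketch (the `DummyEstimator` below is a stand-in for illustration, not a real scikit-learn estimator):

```python
import inspect

# Arguments considered "standard" data arguments of fit().
STANDARD = {'self', 'X', 'y', 'Y', 'sample_weight'}


class DummyEstimator:
    """Stand-in for an estimator whose fit() has an extra parameter."""

    def fit(self, X, y, tol=None, sample_weight=None):
        return self


def extra_fit_params(estimator_cls):
    """Return fit() parameters beyond the standard data arguments."""
    sig = inspect.signature(estimator_cls.fit)
    return sorted(set(sig.parameters) - STANDARD)


print(extra_fit_params(DummyEstimator))  # -> ['tol']
```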

@amueller
Member Author

check_input is OK and Xy is OK (they are optimizations); monitor is a matter of discussion AFAIK.
I'm not sure about coef_init and intercept_init in the SGD classes; maybe @larsmans or @mblondel can comment.
class_weight, the QDA parameters, n_jobs, ... should go.

@amueller
Member Author

Also thanks @trevorstephens :)

@trevorstephens
Contributor

NP :)

BTW, class_weight in SGDClassifier was already deprecated in #3931. I believe Perceptron is simply inheriting this deprecated arg. Though based on your comment #4112 (comment), @amueller, perhaps that needs to be adjusted to 0.18 too?

@amueller
Member Author

Maybe. It was more of a bug-fix, though, as it was ignored.

@amueller amueller modified the milestones: 0.16, 0.17 Sep 11, 2015
@amueller
Member Author

Update:

        DecisionTreeClassifier ['check_input']
         DecisionTreeRegressor ['check_input']
                    ElasticNet ['check_input']
           ExtraTreeClassifier ['check_input']
            ExtraTreeRegressor ['check_input']
    GradientBoostingClassifier ['monitor']
     GradientBoostingRegressor ['monitor']
                KernelCenterer ['K']
                           LDA ['store_covariance', 'tol']
                          Lars ['Xy']
                         Lasso ['check_input']
                     LassoLars ['Xy']
                   LassoLarsIC ['copy_X']
                           MDS ['init']
   PassiveAggressiveClassifier ['intercept_init', 'coef_init']
    PassiveAggressiveRegressor ['intercept_init', 'coef_init']
                    Perceptron ['intercept_init', 'coef_init']
                           QDA ['store_covariances', 'tol']
                 SGDClassifier ['intercept_init', 'coef_init']
                  SGDRegressor ['intercept_init', 'coef_init']

@amueller
Member Author

QDA and LDA need deprecations.
