Closed
Description
Many estimators take a parameter named sample_weight
. Pipeline
does not, since it wants its fit
parameters to be prefixed by the step name with a __
delimiter:
>>> from sklearn.pipeline import make_pipeline
>>> from sklearn.linear_model import LogisticRegression
>>> clf = make_pipeline(LogisticRegression())
>>> clf.fit([[0], [0]], [0, 1], logisticregression__sample_weight=[1, 1])
Pipeline(memory=None,
steps=[('logisticregression', LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,
intercept_scaling=1, max_iter=100, multi_class='ovr', n_jobs=1,
penalty='l2', random_state=None, solver=
6236
'liblinear', tol=0.0001,
verbose=0, warm_start=False))])
>>> clf.fit([[0], [0]], [0, 1], sample_weight=[1, 1])
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/n/schwafs/home/joel/miniconda3/envs/scipy3k/lib/python3.6/site-packages/sklearn/pipeline.py", line 248, in fit
Xt, fit_params = self._fit(X, y, **fit_params)
File "/n/schwafs/home/joel/miniconda3/envs/scipy3k/lib/python3.6/site-packages/sklearn/pipeline.py", line 197, in _fit
step, param = pname.split('__', 1)
ValueError: not enough values to unpack (expected 2, got 1)
This error message is not friendly enough. It should explicitly describe the correct format for passing sample_weight
to a step in a Pipeline.