Document the optional y argument in the developer docs and check it for fit_predict and fit_transform.
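
For illustration, a minimal sketch of the convention this commit documents: every implemented method among fit, score, partial_fit, fit_predict and fit_transform takes y as its second argument, even when y is ignored. The class MeanCenterer and the helper methods_missing_y are hypothetical names that are not part of scikit-learn. The helper only approximates the idea behind check_fit_score_takes_y, using the standard-library inspect module.

```python
import inspect


class MeanCenterer:
    """Hypothetical unsupervised transformer, used only for illustration."""

    def fit(self, X, y=None):
        # y is accepted for pipeline compatibility and then ignored.
        n_samples = len(X)
        self.means_ = [sum(column) / n_samples for column in zip(*X)]
        return self

    def transform(self, X, y=None):
        # Subtract the per-column mean learned in fit.
        return [[value - mean for value, mean in zip(row, self.means_)]
                for row in X]

    def fit_transform(self, X, y=None):
        return self.fit(X, y).transform(X)


def methods_missing_y(estimator,
                      names=("fit", "score", "partial_fit",
                             "fit_predict", "fit_transform")):
    """Return the implemented methods whose second argument is not y.

    Assumption: this mirrors the intent of the estimator check, not its
    actual implementation.
    """
    missing = []
    for name in names:
        func = getattr(estimator, name, None)
        if func is None:
            continue  # method not implemented, nothing to check
        # Bound methods drop ``self``, so index 1 is the second argument.
        params = list(inspect.signature(func).parameters)
        if len(params) < 2 or params[1] not in ("y", "Y"):
            missing.append(name)
    return missing
```

An estimator that follows the convention yields an empty list, while one that defines fit_predict(self, X) without y would be reported under that method name.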

amueller · amueller · commit f91dee723c95 · 2015-01-28T11:05:06.000-05:00
diff --git a/doc/developers/index.rst b/doc/developers/index.rst
@@ -716,8 +716,11 @@ is not met, an exception of type ``ValueError`` should be raised.
 ``y`` might be ignored in the case of unsupervised learning. However, to
 make it possible to use the estimator as part of a pipeline that can
 mix both supervised and unsupervised transformers, even unsupervised
-estimators are kindly asked to accept a ``y=None`` keyword argument in
+estimators need to accept a ``y=None`` keyword argument in
 the second position that is just ignored by the estimator.
+For the same reason, ``fit_predict``, ``fit_transform``, ``score``,
+``transform`` and ``partial_fit`` methods need to accept a ``y`` argument in
+the second position if they are implemented.
 
 The method should return the object (``self``). This pattern is useful
 to be able to implement quick one liners in an IPython session such as::
@@ -857,9 +860,10 @@ last step, it needs to provide a ``fit`` or ``fit_transform`` function.
 To be able to evaluate the pipeline on any data but the training set,
 it also needs to provide a ``transform`` function.
 There are no special requirements for the last step in a pipeline, except that
-it has a ``fit`` function.  All ``fit`` and ``fit_transform`` functions must
-take arguments ``X, y``, even if y is not used.
-
+it has a ``fit`` function. All ``fit`` and ``fit_transform`` functions must
+take arguments ``X, y``, even if ``y`` is not used. Similarly, for ``score``
+to be usable, the last step of the pipeline needs to have a ``score`` function
+that accepts an optional ``y``.
 
 Working notes
 -------------
diff --git a/sklearn/cluster/k_means_.py b/sklearn/cluster/k_means_.py
@@ -795,7 +795,7 @@ def fit(self, X, y=None):
                 n_jobs=self.n_jobs)
         return self
 
-    def fit_predict(self, X):
+    def fit_predict(self, X, y=None):
         """Compute cluster centers and predict cluster index for each sample.
 
         Convenience method; equivalent to calling fit(X) followed by
diff --git a/sklearn/utils/estimator_checks.py b/sklearn/utils/estimator_checks.py
@@ -81,6 +81,10 @@ def set_fast_parameters(estimator):
         # K-Means
         estimator.set_params(n_init=2)
 
+    if estimator.__class__.__name__ == "SelectFdr":
+        # avoid not selecting any features
+        estimator.set_params(alpha=.5)
+
     if isinstance(estimator, BaseRandomProjection):
         # Due to the jl lemma and often very few samples, the number
         # of components of the random matrix projection will be probably
@@ -253,7 +257,7 @@ def check_fit_score_takes_y(name, Estimator):
     estimator = Estimator()
     set_fast_parameters(estimator)
     set_random_state(estimator)
-    funcs = ["fit", "score", "partial_fit"]
+    funcs = ["fit", "score", "partial_fit", "fit_predict", "fit_transform"]
 
     for func_name in funcs:
         func = getattr(estimator, func_name, None)