MultiOutputClassifier.predict_proba fails if targets have different number of values

Description

If two target columns are categorical and have a different number of unique values, MultiOutputClassifier.predict_proba raises a ValueError when it tries to dstack the per-output probability matrices.
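The shape mismatch can be shown with NumPy alone. This is a minimal sketch, not the scikit-learn code itself: the two arrays are made-up stand-ins for the probability matrices of a 2-class and a 3-class output, and the variable names are only illustrative.

```python
import numpy as np

# Hypothetical stand-ins for per-output probability matrices:
# 100 samples, 2 classes for the first output and 3 for the second.
proba_first = np.full((100, 2), 0.5)
proba_second = np.full((100, 3), 1.0 / 3.0)

try:
    # dstack concatenates along a third axis, so the first two
    # dimensions of every input must match; 2 != 3 here.
    np.dstack([proba_first, proba_second])
    stacked_ok = True
except ValueError:
    stacked_ok = False

# A plain list has no such constraint: each matrix keeps its own width.
as_list = [proba_first, proba_second]
shapes = [p.shape for p in as_list]
```
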

Steps/Code to Reproduce

Example:

from sklearn.linear_model import LogisticRegression
from sklearn.multioutput import MultiOutputClassifier

import numpy as np

# random features
X = np.random.normal(size=(100, 100))

# random labels
Y = np.concatenate([
    np.random.choice(['a', 'b'], (100, 1)),       # first column has 2 possible values
    np.random.choice(['d', 'e', 'f'], (100, 1)),  # second column has 3 possible values
], axis=1)

clf = MultiOutputClassifier(LogisticRegression())

clf.fit(X, Y)

clf.predict_proba(X)

Expected Results

No error is thrown. RandomForestClassifier appears to handle targets of this shape by returning a list of NumPy arrays, one per output. I would expect the same behavior here.
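Until the behavior changes, something like the following might serve as a workaround. It is only a sketch of the list-returning behavior described above: predict_proba_per_output is a hypothetical helper, not part of scikit-learn, and it assumes the fitted per-output estimators are available as clf.estimators_ (the attribute that appears in the traceback).

```python
def predict_proba_per_output(fitted_estimators, X):
    """Return one probability matrix per output, as a list.

    Hypothetical helper (not a scikit-learn function). Each element has
    shape (n_samples, n_classes_i), so outputs with different numbers of
    classes do not need to be stacked into a single array.
    """
    return [estimator.predict_proba(X) for estimator in fitted_estimators]
```

With the example above, the call would be predict_proba_per_output(clf.estimators_, X) after clf.fit(X, Y), which should give a list of two arrays with 2 and 3 columns respectively.
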

Actual Results

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-40-84c34a558c92> in <module>()
     18 clf.fit(X, Y)
     19 
---> 20 clf.predict_proba(X)

anaconda/lib/python3.5/site-packages/sklearn/multioutput.py in predict_proba(self, X)
    224 
    225         results = np.dstack([estimator.predict_proba(X) for estimator in
--> 226                             self.estimators_])
    227         return results
    228 

anaconda/lib/python3.5/site-packages/numpy/lib/shape_base.py in dstack(tup)
    366 
    367     """
--> 368     return _nx.concatenate([atleast_3d(_m) for _m in tup], 2)
    369 
    370 def _replace_zero_by_x_arrays(sub_arys):

ValueError: all the input array dimensions except for the concatenation axis must match exactly

Versions

Darwin-15.6.0-x86_64-i386-64bit
Python 3.5.2 |Anaconda custom (x86_64)| (default, Jul  2 2016, 17:52:12) 
[GCC 4.2.1 Compatible Apple LLVM 4.2 (clang-425.0.28)]
NumPy 1.11.1
SciPy 0.18.1
Scikit-Learn 0.18.1

If returning a list is the right fix, I'm happy to submit a PR for this.
