8000 Broken fill color generation for tree.export_graphviz · Issue #6352 · scikit-learn/scikit-learn · GitHub
[go: up one dir, main page]

Skip to content

Broken fill color generation for tree.export_graphviz #6352

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
tracer0tong opened this issue Feb 13, 2016 · 3 comments
Closed

Broken fill color generation for tree.export_graphviz #6352

tracer0tong opened this issue Feb 13, 2016 · 3 comments

Comments

@tracer0tong
Copy link
Contributor

Example:
from sklearn import tree
clf = tree.DecisionTreeClassifier()
clf.fit([[0]],[1])
tree.export_graphviz(clf, filled=True)

We'll get error in line 160 of sklearn/tree/export.py:
alpha = int(np.round(255 * ((value - colors['bounds'][0]) / (colors['bounds'][1] - colors['bounds'][0])), 0))

ValueError: cannot convert float NaN to integer

Basically we will get zero division, because colors['bounds'][0] equals colors['bounds'][1].

@yenchenlin
Copy link
Contributor

Hello @tracer0tong ,
I am not an expert regarding this, but from the traceback seems like there are two problems in current implementation.

1.
In here:
https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/tree/export.py#L313
These lines:

elif tree.n_classes[0] == 1:
    # Find max and min values in leaf nodes for regression
    colors['bounds'] = (np.min(tree.value),
                                  np.max(tree.value))

indicate that export_graphviz use elif tree.n_classes[0] == 1: to judge if current tree is a regressor tree or not. However, if users only pass 1 class label to be fitted in DecisionTreeClassifier, such as example provided by @tracer0tong , export_graphviz will misunderstand and execute this elif-block which is dedicated to regression problem.

2.
And here:
https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/tree/export.py#L161-L162

If there's only one kind of label provided by data, it will cause zero division.

@tracer0tong
Copy link
Contributor Author

You are right, because there is undefined state. We couldn't decide is it
classifier or regression. I think we shouldn't guess, just add condition check in this place.

tracer0tong added a commit to tracer0tong/scikit-learn that referenced this issue Feb 17, 2016
@tracer0tong
Copy link
Contributor Author

Finally, I've decided to make fix for this issue. Hope this fix will be accepted.

@TomDLT TomDLT closed this as completed Mar 24, 2016
TomDLT added a commit that referenced this issue Mar 24, 2016
jeff1evesque pushed a commit to jeff1evesque/scikit-learn that referenced this issue Apr 5, 2016
MAINT: Simplify n_features_to_select in RFECV

Make dump_svmlight_file support sparse y

Fix for issue scikit-learn#6352

Fixed codestyle

Merge PR scikit-learn#6037: copy_X in KernelPCA

Ensuring consistent transforms for KernelPCA

Taking @vene's changes into account, thanks!

Taking @jakevdp's comment into account

Added more verbose documentation to kernel_pca.py

Specifying that X_fit_ will not be None. @jakevdp

KernelPCA: Fixing more formatting of docstring

Addressed @vene's documentation comments

Addressing that dual_coef_ might not be present in model in docs. @vene

Make assign_rows_csr support Cython fused types

Use fused type in inplace normalize

Test normalize function in data.py

DOC: mark weights as optional

[DOC] Fix broken links
mannby pushed a commit to mannby/scikit-learn that referenced this issue Apr 22, 2016
TomDLT pushed a commit to TomDLT/scikit-learn that referenced this issue Oct 3, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants
0