-
-
Notifications
You must be signed in to change notification settings - Fork 25.9k
ENH add fontname argument in export_graphviz for non-English characters #18959
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before reviewing, could you revert all cosmetic changes linked to style issues? We intend to not add any of these changes and only focus on the code that need real changes because it makes it difficult to review otherwise.
.gitignore
Outdated
@@ -79,3 +79,6 @@ _configtest.o.d | |||
sklearn/utils/_seq_dataset.pyx | |||
sklearn/utils/_seq_dataset.pxd | |||
sklearn/linear_model/_sag_fast.pyx | |||
|
|||
# history codes | |||
.history |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you revert this part
Please add an entry to the change log at |
fine, I will do that later. But when I open the |
Yes this is possible. But we only check on the diff so you can ignore those. |
I reset the branch and reopen pr this again |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since, fontname
is created in _BaseTreeExporter
, it could be shared with plot_tree
(using matplotlib) and export_graphviz
(using graphviz).
Could you update the plot_tree
signature as well and pass the parameter to the underlying tree exported. It would also required to update the docstring of plot_tree
and to change the tests.
doc/whats_new/v0.24.rst
Outdated
@@ -785,6 +785,10 @@ Changelog | |||
- |Enhancement| :func:`tree.plot_tree` now uses colors from the matplotlib | |||
configuration settings. :pr:`17187` by `Andreas Müller`_. | |||
|
|||
- |Fix| :func:`tree.export_graphviz` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add the entry in v1.0.rst
(you might need to merge master into your branch first). We already on our way to release 0.24 and we will probably be a bit too late to get this PR in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The v1.0.rst
file is not in my branch. The v1.0.rst
was committed to master two days ago.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes you need to merge master into your branch to synchronize with master
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fine
doc/whats_new/v0.24.rst
Outdated
@@ -785,6 +785,10 @@ Changelog | |||
- |Enhancement| :func:`tree.plot_tree` now uses colors from the matplotlib | |||
configuration settings. :pr:`17187` by `Andreas Müller`_. | |||
|
|||
- |Fix| :func:`tree.export_graphviz` | |||
Add fontname argument in plot_tree for non-English characters. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add fontname argument in plot_tree for non-English characters. | |
Add `fontname` argument in :func:`tree.export_graphviz` for non-English characters. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
My test script as follows: import graphviz
from sklearn import tree
from sklearn.datasets import load_iris
clf = tree.DecisionTreeClassifier()
iris = load_iris()
names = ['中文', 'にほんご', '조선말', 'ру́сский язы́к']
clf = clf.fit(iris.data, iris.target)
dot_data = tree.export_graphviz(
clf, feature_names=names,
filled=True, leaves_parallel=True, node_ids=True,
rounded=True, special_characters=True, fontname='sans')
graph = graphviz.Source(dot_data)
graph.render("tree") As for |
Using A related issue, matplotlib/matplotlib#15062 plt.rcParams['font.family'] = [fontname]
plt.rcParams['font.sans-serif'] = [fontname] |
doc/whats_new/v0.24.rst
Outdated
@@ -792,6 +792,10 @@ Changelog | |||
- |Enhancement| :func:`tree.plot_tree` now uses colors from the matplotlib | |||
configuration settings. :pr:`17187` by `Andreas Müller`_. | |||
|
|||
- |Fix| Add `fontname` argument in :func:`tree.export_graphviz` and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can remove this entry now.
Regarding your test scrip in #18959 (comment), I think that we should include it and check the content as for the other test. It will allow us to make sure that we overwrite the fontname. |
I think that for from sklearn.datasets import load_iris
from sklearn import tree
clf = tree.DecisionTreeClassifier(random_state=0)
iris = load_iris()
clf = clf.fit(iris.data, iris.target)
tree_mpl = tree.plot_tree(clf) Then, we can check for annotation in tree_mpl:
assert x.get_family() == ["fontname"] I did not tested it but it should something around that line. |
For this, there are four different kinds of languages. I couldn't find a font to render them all. |
What I meant is to check the |
When I tested it, I found So the And my testing codes as follow: import matplotlib.pyplot as plt
from sklearn import tree
from sklearn.datasets import load_iris
fontname = 'Microsoft YaHei'
plt.rcParams['font.family'] = [fontname]
plt.rcParams['font.sans-serif'] = [fontname]
iris = load_iris()
clf = tree.DecisionTreeClassifier(random_state=0)
clf = clf.fit(iris.data, iris.target)
names = ['中文', 'にほんご', '조선말', 'ру́сский язы́к']
plt.figure()
tree.plot_tree(
clf, feature_names=names,
filled=True, node_ids=True,
rounded=True)
plt.show() |
doc/whats_new/v1.0.rst
Outdated
................... | ||
|
||
- |Fix| Add `fontname` argument in :func:`tree.export_graphviz` and | ||
:func:`tree.plot_tree` for non-English characters. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So we should remove plot_tree
then?
sklearn/tree/_export.py
Outdated
precision=3, ax=None, fontsize=None): | ||
impurity=True, node_ids=False, proportion=False, | ||
rotate='deprecated', rounded=False, precision=3, | ||
ax=None, fontsize=None, fontname='helvetica'): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And we should revert the changes here as well for plot_tree
This seems to need to update keywords to fix failing checks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks. We will need a second review.
@thomasjpfan @rth @jnothman Do you want to have a look since you looked at the former PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the PR @Zeroto521 !
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This LGTM
@glemaitre The most recent commits makes it so that fontname
changes the font regardness of rounded
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM as well. Thanks!
Reference Issues/PRs
See also #14290, #14275
What does this implement/fix? Explain your changes.
Add an arg called
fontname
replace using the freezing Helvetica font.Any other comments?
Well, this bug is very mad at me.
I have to handle it by changing the
dot
file font name and then using cmddot -Tpdf tree -o dot_data.pdf
again and again.So just do it.