Checking for length of categories before doing string conversion. fix… #11306
can you update |
Coming up. Had to read up where/how to do that. |
The categorical asv benchmark file violates PEP8 E302 (two blank lines expected between functions). Should I correct it? I searched but did not find a list of PEP8 errors that pandas has decided to ignore. |
self.data = df[df.C == '20'] | ||
|
||
def time_rendering(self): | ||
str(data.C) |
data -> self.data ?
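For reference, a minimal sketch of why the bare name fails: asv calls `setup` on the benchmark instance before each `time_*` method, so the data has to travel through `self`. The class below uses a plain list as a stand-in for the DataFrame slice in the diff, and `run_once` is a hypothetical helper that mimics that call order; it is not part of asv.

```python
import timeit


class categorical_rendering_sketch(object):
    """Stand-in for the benchmark in the diff; a list replaces the DataFrame."""

    def setup(self):
        # Stored on the instance so the timed method can reach it.
        self.data = [str(i) for i in range(1000)]

    def time_rendering(self):
        # A bare `str(data)` would raise NameError here, because `data`
        # is neither a local nor a global name.
        str(self.data)


def run_once(bench):
    """Hypothetical helper: call setup, then time the method once."""
    bench.setup()
    return timeit.timeit(bench.time_rendering, number=1)


elapsed = run_once(categorical_rendering_sketch())
```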
Yeah, sorry. I can't run asv on my machine. First it complained about a missing config file when I tried to run only the categorical benchmark. Apparently, when I don't restrict it with a -b flag, it creates a new environment, but that seems to be incompatible with conda: it creates a py2.7 environment even though it is being run from a conda python3.4 environment.
Or maybe that's meant to happen, since it tests every environment that is configured to be tested? I don't know; I have never used asv before.
@michaelaye that's ok for the PEP8 changes |
@@ -55,7 +55,7 @@ Bug Fixes | |||
|
|||
- Bug in ``.to_latex()`` output broken when the index has a name (:issue: `10660`) | |||
- Bug in ``HDFStore.append`` with strings whose encoded length exceded the max unencoded length (:issue:`11234`) | |||
|
|||
- Performance bug in ``Categorical._repr_categories`` was rendering string before chopping them for display (:issue: `11305`) |
move to Performance section
I was able to start an asv run:
$ asv continuous master HEAD -b categorical
· Creating environments
· Discovering benchmarks
·· Uninstalling from py2.7-Cython-matplotlib-numexpr-numpy-openpyxl-pytables-scipy-sqlalchemy-xlrd-xlwt
·· Building for py2.7-Cython-matplotlib-numexpr-numpy-openpyxl-pytables-scipy-sqlalchemy-xlrd-xlwt...................................
·· Installing into py2.7-Cython-matplotlib-numexpr-numpy-openpyxl-pytables-scipy-sqlalchemy-xlrd-xlwt..
· Running 12 total benchmarks (2 commits * 1 environments * 6 benchmarks)
[ 0.00%] · For pandas commit hash cdff5bce:
[ 0.00%] ·· Building for py2.7-Cython-matplotlib-numexpr-numpy-openpyxl-pytables-scipy-sqlalchemy-xlrd-xlwt.........................................
[ 0.00%] ·· Benchmarking py2.7-Cython-matplotlib-numexpr-numpy-openpyxl-pytables-scipy-sqlalchemy-xlrd-xlwt
[ 8.33%] ··· Running categoricals.categorical_constructor.time_fastpath 4.85ms
[ 16.67%] ··· Running categoricals.categorical_constructor.time_regular_constructor 226.53ms
[ 25.00%] ··· Running categoricals.categorical_rendering.time_rendering 2.31ms
[ 33.33%] ··· Running categoricals.categorical_value_counts.time_value_counts 18.63ms
[ 41.67%] ··· Running categoricals.categorical_value_counts.time_value_counts_dropna 24.74ms
[ 50.00%] ··· Running categoricals.concat_categorical.time_concat_categorical 46.91ms
[ 50.00%] · For pandas commit hash c2aa6a23:
[ 50.00%] ·· Building for py2.7-Cython-matplotlib-numexpr-numpy-openpyxl-pytables-scipy-sqlalchemy-xlrd-xlwt......................................
[ 50.00%] ·· Benchmarking py2.7-Cython-matplotlib-numexpr-numpy-openpyxl-pytables-scipy-sqlalchemy-xlrd-xlwt
[ 58.33%] ··· Running categoricals.categorical_constructor.time_fastpath 4.93ms
[ 66.67%] ··· Running categoricals.categorical_constructor.time_regular_constructor 210.66ms
[ 75.00%] ··· Running categoricals.categorical_rendering.time_rendering 17.63ms
[ 83.33%] ··· Running categoricals.categorical_value_counts.time_value_counts 13.60ms
[ 91.67%] ··· Running categoricals.categorical_value_counts.time_value_counts_dropna 17.59ms
[100.00%] ··· Running categoricals.concat_categorical.time_concat_categorical 50.98ms
    before      after     ratio
  [c2aa6a23]  [cdff5bce]
-   17.63ms     2.31ms     0.13  categoricals.categorical_rendering.time_rendering
SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY. |
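To read the summary row: the ratio column is the "after" time divided by the "before" time, so values below 1 are improvements. A small sketch of that arithmetic, using the two `time_rendering` numbers from the output above (the variable names are only illustrative):

```python
before_ms = 17.63  # time_rendering on c2aa6a23
after_ms = 2.31    # time_rendering on cdff5bce

ratio = after_ms / before_ms    # the value asv reports in the ratio column
speedup = before_ms / after_ms  # the same change expressed as a speedup factor

print("ratio %.2f, speedup %.1fx" % (ratio, speedup))
```

That works out to a ratio of about 0.13, or rendering that is roughly 7.6 times faster.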
@@ -50,6 +50,8 @@ Performance Improvements | |||
|
|||
.. _whatsnew_0171.bug_fixes: | |||
|
|||
- Performance bug in ``Categorical._repr_categories`` was rendering string before chopping them for display (:issue: `11305`) |
move to Performance section
say Performance issue in rendering a large number of categories when printing a Categorical or Series of category dtype
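A rough sketch of the idea behind the fix, as described in the PR title: check the number of categories first and convert only the ones that will actually be displayed. This is not the pandas implementation; both function names and the `max_items` parameter are hypothetical, and `max_items` is assumed to be at least 2.

```python
def format_all_then_chop(categories, max_items=10):
    """Slow path: converts every category to a string, then keeps a few."""
    formatted = [repr(c) for c in categories]
    if len(formatted) > max_items:
        half = max_items // 2
        return formatted[:half] + ["..."] + formatted[-half:]
    return formatted


def chop_then_format(categories, max_items=10):
    """Fast path: checks the length first, so few values are converted."""
    if len(categories) > max_items:
        half = max_items // 2
        head = [repr(c) for c in categories[:half]]
        tail = [repr(c) for c in categories[-half:]]
        return head + ["..."] + tail
    return [repr(c) for c in categories]


# Both paths give the same display; only the amount of work differs.
cats = list(range(100000))
assert chop_then_format(cats) == format_all_then_chop(cats)
```

With many categories the slow path does work proportional to the number of categories, while the fast path does work proportional to `max_items`.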
comments, then pls squash |
merged via 4777800 thanks! |
closes #11305