Sunburst/treemap path by emmanuelle · Pull Request #2006 · plotly/plotly.py · GitHub

Sunburst/treemap path #2006

Merged
34 commits merged on Jan 22, 2020

Commits
4ac1efb
proof of concept
emmanuelle Dec 13, 2019
c619e28
first version
emmanuelle Dec 16, 2019
10668b6
tests
emmanuelle Dec 16, 2019
edfcced
black
emmanuelle Dec 16, 2019
1f3b8da
added test with missing values
emmanuelle Dec 18, 2019
8cb9d99
examples for sunburst tutorial
emmanuelle Dec 18, 2019
cd500a5
added type check and corresponding test
emmanuelle Dec 18, 2019
c233220
corrected bug
emmanuelle Dec 18, 2019
edefabf
treemap branchvalues
emmanuelle Dec 18, 2019
41c8d30
Merge branch 'master' into sunburst-path
emmanuelle Jan 17, 2020
2952fe6
path is now from root to leaves
emmanuelle Jan 17, 2020
c6b7243
removed EPS hack
emmanuelle Jan 18, 2020
be3b622
working version for continuous color
emmanuelle Jan 20, 2020
7f2920b
new tests and more readable code, also added hover support
emmanuelle Jan 20, 2020
8519302
updated docs
emmanuelle Jan 20, 2020
437bbd7
removed named agg which is valid only starting from pandas 0.25
emmanuelle Jan 20, 2020
fb9d992
version hopefully compatible with older pandas
emmanuelle Jan 20, 2020
a57b027
still debugging
emmanuelle Jan 21, 2020
bf8da4b
do not use lambdas
emmanuelle Jan 21, 2020
9e23890
removed redundant else
emmanuelle Jan 21, 2020
f67602f
discrete color
emmanuelle Jan 22, 2020
6b6a105
always add a count column when no values column is passed
emmanuelle Jan 22, 2020
9996731
removed if which is not required any more
emmanuelle Jan 22, 2020
f3e7e27
nicer labels with /
emmanuelle Jan 22, 2020
8cd227a
simplified code
emmanuelle Jan 22, 2020
8b66c90
better id labels
emmanuelle Jan 22, 2020
19b81ac
discrete colors
emmanuelle Jan 22, 2020
ba6ec19
raise ValueError for non-leaves with None
emmanuelle Jan 22, 2020
c0cbce0
other check
emmanuelle Jan 22, 2020
57503b4
discrete color other comes first
emmanuelle Jan 22, 2020
0ab2afd
fixed tests
emmanuelle Jan 22, 2020
0d86998
hover
emmanuelle Jan 22, 2020
d63d4bd
fixed pandas API pb
emmanuelle Jan 22, 2020
9b217f8
pandas stuff
emmanuelle Jan 22, 2020
version hopefully compatible with older pandas
emmanuelle committed Jan 20, 2020
commit fb9d9922915ed6257681addb2da377db61fab8ae
16 changes: 4 additions & 12 deletions packages/python/plotly/plotly/express/_core.py
@@ -1007,14 +1007,6 @@ def build_dataframe(args, attrables, array_attrables):
  return args


- def _named_agg(colname, aggfunc, mode="old_pandas"):
- if mode == "old_pandas":
- return (colname, aggfunc)
- else:
- # switch to this mode when tuples become deprecated
- return pd.NamedAgg(colname, aggfunc)
-
-
  def process_dataframe_hierarchy(args):
  """
  Build dataframe for sunburst or treemap when the path argument is provided.
@@ -1054,15 +1046,15 @@ def process_dataframe_hierarchy(args):
  aggfunc_color = lambda x: np.average(
  x, weights=df.loc[x.index, count_colname]
  )
- agg_f[args["color"]] = _named_agg(colname=args["color"], aggfunc=aggfunc_color)
+ agg_f[args["color"]] = aggfunc_color
  if args["color"] or args["values"]:
- agg_f[count_colname] = _named_agg(colname=count_colname, aggfunc="sum")
+ agg_f[count_colname] = "sum"

  # Other columns (for color, hover_data, custom_data etc.)
  cols = list(set(df.columns).difference(path))
  for col in cols: # for hover_data, custom_data etc.
  if col not in agg_f:
- agg_f[col] = _named_agg(colname=col, aggfunc=lambda_discrete)
+ agg_f[col] = lambda_discrete
  # ----------------------------------------------------------------------------

  df_all_trees = pd.DataFrame(columns=["labels", "parent", "id"] + cols)
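
The `aggfunc_color` lambda in the hunk above computes a count-weighted average of the color column within each group. The following is a minimal sketch of that idea on toy data; the frame, the column names and the `weighted_mean` helper are illustrative assumptions, not code from the PR.

```python
import numpy as np
import pandas as pd

# Toy data (assumed for illustration only).
df = pd.DataFrame(
    {
        "sector": ["Tech", "Tech", "Energy"],
        "score": [1.0, 3.0, 5.0],
        "count": [1, 3, 2],
    }
)
count_colname = "count"  # hypothetical name for the weights column


def weighted_mean(x):
    # x is the "score" Series of one group; its index selects the
    # matching weights from the original frame.
    return np.average(x, weights=df.loc[x.index, count_colname])


# Dict-style aggregation: column name -> aggregation function.
agg_f = {"score": weighted_mean, count_colname: "sum"}
dfg = df.groupby(["sector"]).agg(agg_f).reset_index()
print(dfg)
```

Because the function looks weights up through `x.index`, it relies on the grouped frame keeping its original index, which is why it closes over `df` rather than receiving the weights as an argument.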
@@ -1074,7 +1066,7 @@ def process_dataframe_hierarchy(args):
  if not agg_f:
  dfg = df.groupby(path[i:]).sum(numerical_only=True)
Contributor

under what circumstances do we do this, and what's the reasoning?

Contributor Author

It's for the case where neither values nor color was passed. I think I need some sort of aggregation function because I call reset_index afterwards to get the groups, but maybe it's possible to do this in a more elegant/efficient way.

Contributor Author

I could get rid of this part with the count_colname logic.
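
One way to read the `count_colname` suggestion, sketched on toy data (the column name `count` and the frame are assumptions for illustration): if every row gets a count of 1 and that column is always summed, `agg_f` is never empty, so the separate `.sum(...)` branch is no longer needed.

```python
import pandas as pd

# Toy data with only path columns, i.e. no values or color column.
df = pd.DataFrame(
    {
        "region": ["North", "North", "South"],
        "city": ["A", "B", "C"],
    }
)

# Hypothetical column name; the diff only refers to it as `count_colname`.
count_colname = "count"
df[count_colname] = 1

# With the count column present there is always something to aggregate.
agg_f = {count_colname: "sum"}
dfg = df.groupby(["region"]).agg(agg_f).reset_index()
print(dfg)
```

After `reset_index`, the group keys are ordinary columns again and the summed count gives the number of rows per group.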

  else:
- dfg = df.groupby(path[i:]).agg(**agg_f)
+ dfg = df.groupby(path[i:]).agg(agg_f)
  dfg = dfg.reset_index()
  df_tree["labels"] = dfg[level].copy().astype(str)
  df_tree["parent"] = ""
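
The last hunk swaps `.agg(**agg_f)` for `.agg(agg_f)`. A minimal sketch of the difference, using made-up data and column names (not taken from the PR): the dict form maps existing column names to aggregations, while the keyword form relies on `pd.NamedAgg`, which the commit list notes is only valid from pandas 0.25.

```python
import pandas as pd

# Toy data (assumed for illustration only).
df = pd.DataFrame(
    {
        "continent": ["Europe", "Europe", "Asia"],
        "country": ["France", "Spain", "Japan"],
        "pop": [67, 47, 126],
    }
)

# Dict form, as used after this commit: keys are existing column names,
# values are the aggregation applied to that column.
agg_f = {"pop": "sum"}
dfg_dict = df.groupby(["continent"]).agg(agg_f).reset_index()

# Keyword ("named aggregation") form, as used before this commit.
# Requires pandas >= 0.25.
named = {"pop": pd.NamedAgg(column="pop", aggfunc="sum")}
dfg_named = df.groupby(["continent"]).agg(**named).reset_index()

print(dfg_dict)
```

On a recent pandas both calls should give the same frame here; the dict form simply avoids depending on the newer API.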