feat: make plotly-express dataframe agnostic via narwhals #4790
Merged
Changes from 1 commit
131 commits

9873e97 non core changes (FBruzzesi)
0389591 _core overhaul (FBruzzesi)
ba93236 some _core fixes (FBruzzesi)
421fc1d tests replace sort_index(axis=1) (FBruzzesi)
ca5c820 reset_index in concat and allow any object to pandas (FBruzzesi)
a6aab24 trendline prep (FBruzzesi)
7665f10 WIP Index (FBruzzesi)
ec4f250 clean from breakpoints (FBruzzesi)
7e0d4c2 some tests fix (FBruzzesi)
5543638 hotfix and tests output to pandas (FBruzzesi)
cd0dab7 FIX: columns never as index (FBruzzesi)
f334b32 getting there with the tests (FBruzzesi)
e5eb949 get_column instead of pandas slicing, unix to seconds (FBruzzesi)
7747e30 bump narhwals, hierarchy fastpath (FBruzzesi)
ac00b36 fix to_unindexed_series (FBruzzesi)
da80c5b fix trendline (FBruzzesi)
8a72ba1 rm numpy dep in _core (FBruzzesi)
aeff203 fix: _check_dataframe_all_leaves (FBruzzesi)
2041bef (maybe) fix to_unindexed_series (FBruzzesi)
71473f1 (maybe) fix to_unindexed_series (FBruzzesi)
9f74c38 started tests with constructor (FBruzzesi)
28587c9 added constructor to all tests (FBruzzesi)
1bb2448 added some comments for fixme (FBruzzesi)
f45addf to_py_scalar and more tests (FBruzzesi)
5341759 dealing with exceptions and tests (FBruzzesi)
dfc957c bump version, sort(...,nulls_last=True) (FBruzzesi)
90f2667 We did it: no more dups in group by :D (FBruzzesi)
fb58d1b concat_str (FBruzzesi)
ddb3b35 fix test_several_dataframes (FBruzzesi)
37ce302 dedups customdata (FBruzzesi)
4da8768 getting there (FBruzzesi)
210e01a xfail pyarrow chunked-array because name-less (FBruzzesi)
c00525e all green with edge narhwals (FBruzzesi)
3486a3e add pandas nullable constructors in tests (FBruzzesi)
c0ce093 bump narwhals and address todos (FBruzzesi)
0eb6951 check narwhals installation (FBruzzesi)
844a6a9 rm unused comments (FBruzzesi)
0c27789 rm unused code (FBruzzesi)
0e6ff78 add pyarrow and narwhals to requirements_39_pandas_2_optional (FBruzzesi)
c2337c9 requirements, test requirements optional (FBruzzesi)
2cc5d7b refactor tests (FBruzzesi)
1b27487 address feedbacks (FBruzzesi)
23a23be typos (FBruzzesi)
7968cff conftest (FBruzzesi)
cf76721 merge master (FBruzzesi)
91db84b mock interchange (FBruzzesi)
5c6772e optional requirements (FBruzzesi)
9ec3f9e move conftest in express folder (FBruzzesi)
400a624 hotfix and figure_factory hexbin (FBruzzesi)
1aa5163 old versions, polars[timezone], hotfix (FBruzzesi)
594ded0 fix frame value in hexbin (FBruzzesi)
6676061 copy numpy array (FBruzzesi)
d7d2884 hotfix hexbin mapbox (FBruzzesi)
d6ee676 Merge branch 'master' into plotly-with-narwhals (FBruzzesi)
82c114d fix test (FBruzzesi)
0ceabc1 Merge branch 'plotly:master' into plotly-with-narwhals (FBruzzesi)
c9b626e use lazy in process_dataframe_hierarchy (FBruzzesi)
87841d1 fix custom sort in process_dataframe_pie (FBruzzesi)
ffa7b3b Merge branch 'master' into plotly-with-narwhals (archmoj)
3ba19ae bump version and adjust core (FBruzzesi)
a70146b use dtype.is_numeric (FBruzzesi)
1fa9fe4 Merge branch 'master' into plotly-with-narwhals (FBruzzesi)
0103aa6 revert test (FBruzzesi)
673d141 Merge branch 'plotly-with-narwhals' of https://github.com/FBruzzesi/p… (FBruzzesi)
b858ed8 feedback adjustments (FBruzzesi)
bbcf438 Merge branch 'master' into plotly-with-narwhals (FBruzzesi)
49efae2 raise if numpy is missing, conftest fix, typo (FBruzzesi)
a36bc24 __plotly_n_unique__ (FBruzzesi)
c119153 Merge branch 'master' into plotly-with-narwhals (FBruzzesi)
7416407 format (FBruzzesi)
1867f6f format (FBruzzesi)
d3a28c0 feedback adjustments (FBruzzesi)
e6e9994 use drop_null_keys, some pandas fastpaths (MarcoGorelli)
64b8c70 bump narwhals version (MarcoGorelli)
3f6b383 some improvements by Marco (FBruzzesi)
755aea8 format and pyspark path (FBruzzesi)
6f18021 add narwhals to requirements core (FBruzzesi)
4d62e73 Update packages/python/plotly/plotly/express/_core.py (FBruzzesi)
a770fd8 refactor checking for df (MarcoGorelli)
7d6f7d6 pushdown only for interchange libraries, sort out test (MarcoGorelli)
b8c10ec Update packages/python/plotly/plotly/express/_core.py (MarcoGorelli)
490b64a fixup (MarcoGorelli)
f7fd4c9 Merge remote-tracking branch 'origin/plotly-with-narwhals' into plotl… (MarcoGorelli)
8753acb lint (MarcoGorelli)
1429e6f bump narwhals version (MarcoGorelli)
878d4db refactor checking for df and bump version (FBruzzesi)
192e0a8 use token in process_dataframe_hierarchy (FBruzzesi)
de6761c Range(label=...) for px.funnel (FBruzzesi)
bcfef68 improve error message and in-line comments (FBruzzesi)
519cc68 better comments (FBruzzesi)
e5520a7 rm unused import and fix typo (FBruzzesi)
b855352 Merge branch 'master' into plotly-with-narwhals (FBruzzesi)
51e2b23 make sure column + token is unique, replace **{} with .alias() (FBruzzesi)
7ef9f28 WIP (FBruzzesi)
e9a367d WIP (FBruzzesi)
12fed31 Merge branch 'master' into plotly-with-narwhals (FBruzzesi)
27b2996 use nw.get_native_namespace (FBruzzesi)
f27f959 Merge branch 'plotly-with-narwhals' of https://github.com/FBruzzesi/p… (FBruzzesi)
126a79d Merge branch 'master' into feat/dataframe-agnostic-data (FBruzzesi)
7735366 add narwhals in various requirements (FBruzzesi)
b6516b4 docstrings (FBruzzesi)
6f1389f rm type hints, change post_agg to use alias (FBruzzesi)
db22268 feedback adjustments (FBruzzesi)
b514c01 move imports out, fix pyarrow (FBruzzesi)
ce8fb9a rm unused narwhals wrapper (FBruzzesi)
e47827e comment about stable api (FBruzzesi)
9a9283a update changelog (FBruzzesi)
2630a5a fixup time zone handling (MarcoGorelli)
fef6dbe modin and cudf (FBruzzesi)
48c7f62 defensive from_native call (FBruzzesi)
18cc11c typo (FBruzzesi)
d94cbf7 fixup timezones (FBruzzesi)
c320c46 move from object to datetime dtype in _plotly_utils/test/validators (FBruzzesi)
afdb31f simplify ecdfnorm (MarcoGorelli)
68ab52a Merge pull request #4 from MarcoGorelli/ecdf-mode-perf (FBruzzesi)
b8ccec4 Merge branch 'master' into plotly-with-narwhals (FBruzzesi)
f102998 rm to_py_scalar call in for loop -> fix Pie performances (FBruzzesi)
2df0427 Merge branch 'plotly-with-narwhals' of https://github.com/FBruzzesi/p… (FBruzzesi)
55a0178 Merge branch 'master' into feat/dataframe-agnostic-data (FBruzzesi)
bb327d5 merge feat/dataframe-agnostic-data (FBruzzesi)
7d611fb use return_type directly when building datasets (FBruzzesi)
a22a7be stocks date to string and test_trendline_on_timeseries fix (FBruzzesi)
44a52e5 merge master and rm FIXME comment (FBruzzesi)
fc74b2e do not repeat new_series unnecessarely (FBruzzesi)
499e2fa bump version, use numpy for range (FBruzzesi)
d2e1008 trigger ci now that new version is published (FBruzzesi)
742b2ec add narwhals to np2_optional.txt (FBruzzesi)
269dea6 version (FBruzzesi)
b1dc48d Merge branch 'master' into plotly-with-narwhals (MarcoGorelli)
17fb96f Merge branch 'master' into plotly-with-narwhals (FBruzzesi)
9f2c55b Merge branch 'master' into plotly-with-narwhals (FBruzzesi)
make sure column + token is unique, replace **{} with .alias()
commit 51e2b23b0e617c151e742f0d1f63952ba5fc5cbf
@@ -163,6 +163,50 @@ def _is_continuous(df: nw.DataFrame, col_name: str) -> bool:
     return df.get_column(col_name).dtype.is_numeric()
 
 
+def _to_unix_epoch_seconds(s: nw.Series) -> nw.Series:
+    dtype = s.dtype
+    if dtype == nw.Date:
+        return s.dt.timestamp("ms") / 1_000
+    if dtype == nw.Datetime:
+        if dtype.time_unit in ("s", "ms"):
+            return s.dt.timestamp("ms") / 1_000
+        elif dtype.time_unit == "us":
+            return s.dt.timestamp("us") / 1_000_000
+        elif dtype.time_unit == "ns":
+            return s.dt.timestamp("ns") / 1_000_000_000
+        else:
+            msg = "Unexpected dtype, please report a bug"
+            raise ValueError(msg)
+    else:
+        msg = f"Expected Date or Datetime, got {dtype}"
+        raise TypeError(msg)
+
+
+def _generate_temporary_column_name(n_bytes: int, columns: list[str]) -> str:
+    """Wraps Narwhals' generate_temporary_column_name to generate a token
+    which is guaranteed to not be in columns, nor in [col + token for col in columns]
+    """
+    counter = 0
+    while True:
+        # This is guaranteed to not be in columns by Narwhals
+        token = nw.generate_temporary_column_name(n_bytes, columns=columns)
+
+        # Now check that it is not in the [col + token for col in columns] list
+        if token not in {f"{c}{token}" for c in columns}:
+            return token
+
+        counter += 1
+        if counter > 100:
+            msg = (
+                "Internal Error: Plotly was not able to generate a column name with "
+                f"{n_bytes=} and not in {columns}.\n"
+                "Please report this to "
+                "https://github.com/plotly/plotly.py/issues/new and we will try to "
+                "replicate and fix it."
+            )
+            raise AssertionError(msg)
+
+
 def get_decorated_label(args, column, role):
     original_label = label = get_label(args, column)
     if "histfunc" in args and (
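The retry loop above can be sketched without narwhals. The following is a minimal stand-in, assuming `secrets.token_hex` in place of `nw.generate_temporary_column_name`, and reading the intent as: the token is not an existing column name, and appending it to any existing column name does not collide with another column (the helper names here are hypothetical, not part of the PR):

```python
import secrets

def temporary_column_name(n_bytes: int, columns: list[str]) -> str:
    """Return a random token safe to use (alone or as a suffix) among columns."""
    for _ in range(100):
        # stand-in for nw.generate_temporary_column_name
        token = secrets.token_hex(n_bytes)
        taken = set(columns)
        if token not in taken and all(f"{c}{token}" not in taken for c in columns):
            return token
    raise AssertionError("could not generate a unique column name")

tok = temporary_column_name(16, ["labels", "parent", "id"])
```

With 16 random bytes a collision is astronomically unlikely, so the loop almost always returns on the first pass; the counter is purely defensive.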
@@ -443,7 +487,7 @@ def make_trace_kwargs(args, trace_spec, trace_data, mapping_labels, sizeref):
             # dict.fromkeys(customdata_cols) allows to deduplicate column
             # names, yet maintaining the original order.
             trace_patch["customdata"] = trace_data.select(
-                [nw.col(c) for c in dict.fromkeys(customdata_cols)]
+                *[nw.col(c) for c in dict.fromkeys(customdata_cols)]
             )
         elif attr_name == "color":
             if trace_spec.constructor in [
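The `dict.fromkeys(customdata_cols)` idiom in this hunk relies on Python dicts preserving insertion order (guaranteed since 3.7), which gives an order-stable deduplication before the columns are splatted into `select(...)`:

```python
# sample column list with duplicates (illustrative values, not from the PR)
customdata_cols = ["country", "pop", "country", "year", "pop"]

# dict keys are unique and keep first-seen order, unlike set()
deduped = list(dict.fromkeys(customdata_cols))
print(deduped)  # ['country', 'pop', 'year']
```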
@@ -1693,7 +1737,7 @@ def build_dataframe(args, constructor):
             other_dim = "x" if missing_bar_dim == "y" else "y"
             if not _is_continuous(df_output, args[other_dim]):
                 args[missing_bar_dim] = count_name
-                df_output = df_output.with_columns(**{count_name: nw.lit(1)})
+                df_output = df_output.with_columns(nw.lit(1).alias(count_name))
             else:
                 # on the other hand, if the non-missing dimension is continuous, then we
                 # can use this information to override the normal auto-orientation code

@@ -1760,7 +1804,7 @@ def build_dataframe(args, constructor):
         else:
             args["x" if orient_v else "y"] = value_name
             args["y" if orient_v else "x"] = count_name
-            df_output = df_output.with_columns(**{count_name: nw.lit(1)})
+            df_output = df_output.with_columns(nw.lit(1).alias(count_name))
             args["color"] = args["color"] or var_name
     elif constructor in [go.Violin, go.Box]:
         args["x" if orient_v else "y"] = wide_cross_name or var_name

@@ -1773,12 +1817,12 @@ def build_dataframe(args, constructor):
                 args["histfunc"] = None
                 args["orientation"] = "h"
                 args["x"] = count_name
-                df_output = df_output.with_columns(**{count_name: nw.lit(1)})
+                df_output = df_output.with_columns(nw.lit(1).alias(count_name))
             else:
                 args["histfunc"] = None
                 args["orientation"] = "v"
                 args["y"] = count_name
-                df_output = df_output.with_columns(**{count_name: nw.lit(1)})
+                df_output = df_output.with_columns(nw.lit(1).alias(count_name))
 
     if no_color:
         args["color"] = None
@@ -1789,10 +1833,10 @@ def build_dataframe(args, constructor):
 def _check_dataframe_all_leaves(df: nw.DataFrame) -> None:
     cols = df.columns
     df_sorted = df.sort(by=cols, descending=False, nulls_last=True)
-    null_mask = df_sorted.select(*[nw.col(c).is_null() for c in cols])
-    df_sorted = df_sorted.with_columns(nw.col(*cols).cast(nw.String()))
+    null_mask = df_sorted.select(nw.all().is_null())
+    df_sorted = df_sorted.select(nw.all().cast(nw.String()))
 
     null_indices_mask = null_mask.select(
-        null_mask=nw.any_horizontal(nw.col(cols))
+        null_mask=nw.any_horizontal(nw.all())
     ).get_column("null_mask")
 
     for row_idx, row in zip(
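`nw.any_horizontal(nw.all())` reduces the per-column null masks row-wise into a single boolean column. A rough pure-Python equivalent of that mask logic, using hypothetical row data rather than a narwhals frame:

```python
# three rows of a two-column frame; None plays the role of a null cell
rows = [
    {"region": "EU", "country": "FR"},
    {"region": "EU", "country": None},
    {"region": None, "country": None},
]

# per-row "is any cell null?" flag, mirroring any_horizontal over is_null()
null_mask = [any(v is None for v in row.values()) for row in rows]
```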
@@ -1854,26 +1898,15 @@ def process_dataframe_hierarchy(args):
 
     new_path = [col_name + "_path_copy" for col_name in path]
     df = df.with_columns(
-        **{
-            new_col_name: nw.col(col_name)
-            for new_col_name, col_name in zip(new_path, path)
-        }
+        nw.col(col_name).alias(new_col_name)
+        for new_col_name, col_name in zip(new_path, path)
     )
     path = new_path
     # ------------ Define aggregation functions --------------------------------
     agg_f = {}
     if args["values"]:
         try:
-            if isinstance(args["values"], Sequence) and not isinstance(
-                args["values"], str
-            ):
-                df = df.with_columns(
-                    **{c: nw.col(c).cast(nw.Float64()) for c in args["values"]}
-                )
-            else:
-                df = df.with_columns(
-                    **{args["values"]: nw.col(args["values"]).cast(nw.Float64())}
-                )
+            df = df.with_columns(nw.col(args["values"]).cast(nw.Float64()))
 
         except Exception:  # pandas, Polars and pyarrow exception types are different
             raise ValueError(
@@ -1883,7 +1916,7 @@ def process_dataframe_hierarchy(args):
 
         if args["color"] and args["color"] == args["values"]:
             new_value_col_name = args["values"] + "_sum"
-            df = df.with_columns(**{new_value_col_name: nw.col(args["values"])})
+            df = df.with_columns(nw.col(args["values"]).alias(new_value_col_name))
             args["values"] = new_value_col_name
         count_colname = args["values"]
     else:

@@ -1894,7 +1927,7 @@ def process_dataframe_hierarchy(args):
             "count" if "count" not in columns else "".join([str(el) for el in columns])
         )
         # we can modify df because it's a copy of the px argument
-        df = df.with_columns(**{count_colname: nw.lit(1)})
+        df = df.with_columns(nw.lit(1).alias(count_colname))
         args["values"] = count_colname
 
     # Since count_colname is always in agg_f, it can be used later to normalize color

@@ -1904,8 +1937,8 @@ def process_dataframe_hierarchy(args):
     discrete_aggs = []
     continuous_aggs = []
 
-    n_unique_token = nw.generate_temporary_column_name(
-        n_bytes=16, columns=[*path, count_colname]
+    n_unique_token = _generate_temporary_column_name(
+        n_bytes=16, columns=df.collect_schema().names()
     )
 
     # In theory, for discrete columns aggregation, we should have a way to do

@@ -1941,10 +1974,10 @@ def process_dataframe_hierarchy(args):
 
             discrete_aggs.append(args["color"])
             agg_f[args["color"]] = nw.col(args["color"]).max()
-            agg_f[f'{args["color"]}_{n_unique_token}__'] = (
+            agg_f[f'{args["color"]}{n_unique_token}'] = (
                 nw.col(args["color"])
                 .n_unique()
-                .alias(f'{args["color"]}_{n_unique_token}__')
+                .alias(f'{args["color"]}{n_unique_token}')
             )
         else:
             # This first needs to be multiplied by `count_colname`

@@ -1954,16 +1987,15 @@ def process_dataframe_hierarchy(args):
 
     # Other columns (for color, hover_data, custom_data etc.)
     cols = list(set(df.collect_schema().names()).difference(path))
-    df = df.with_columns(
-        **{c: nw.col(c).cast(nw.String()) for c in cols if c not in agg_f}
-    )
+    df = df.with_columns(nw.col(c).cast(nw.String()) for c in cols if c not in agg_f)
 
     for col in cols:  # for hover_data, custom_data etc.
         if col not in agg_f:
             # Similar trick as above
             discrete_aggs.append(col)
             agg_f[col] = nw.col(col).max()
-            agg_f[f"{col}_{n_unique_token}__"] = (
-                nw.col(col).n_unique().alias(f"{col}_{n_unique_token}__")
+            agg_f[f"{col}{n_unique_token}"] = (
+                nw.col(col).n_unique().alias(f"{col}{n_unique_token}")
             )
     # Avoid collisions with reserved names - columns in the path have been copied already
     cols = list(set(cols) - set(["labels", "parent", "id"]))

@@ -1972,7 +2004,7 @@ def process_dataframe_hierarchy(args):
 
     if args["color"] and not discrete_color:
         df = df.with_columns(
-            **{args["color"]: nw.col(args["color"]) * nw.col(count_colname)}
+            (nw.col(args["color"]) * nw.col(count_colname)).alias(args["color"])
        )
 
     def post_agg(dframe: nw.LazyFrame, continuous_aggs, discrete_aggs) -> nw.LazyFrame:
@@ -1981,14 +2013,14 @@ def post_agg(dframe: nw.LazyFrame, continuous_aggs, discrete_aggs) -> nw.LazyFrame:
         - discrete_aggs is either [args["color"], <rest_of_cols>] or [<rest_of cols>]
         """
         return dframe.with_columns(
-            **{c: nw.col(c) / nw.col(count_colname) for c in continuous_aggs},
+            **{col: nw.col(col) / nw.col(count_colname) for col in continuous_aggs},
             **{
-                c: nw.when(nw.col(f"{c}_{n_unique_token}__") == 1)
-                .then(nw.col(c))
+                col: nw.when(nw.col(f"{col}{n_unique_token}") == 1)
+                .then(nw.col(col))
                 .otherwise(nw.lit("(?)"))
-                for c in discrete_aggs
+                for col in discrete_aggs
             },
-        ).drop([f"{c}_{n_unique_token}__" for c in discrete_aggs])
+        ).drop([f"{col}{n_unique_token}" for col in discrete_aggs])
 
     for i, level in enumerate(path):
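The `n_unique` companion columns built earlier exist so that, after aggregation, a discrete value is kept only when it was constant within its group and is replaced by the `"(?)"` placeholder otherwise. A minimal stand-alone sketch of that rule, with plain dict grouping instead of narwhals and made-up sample data:

```python
from collections import defaultdict

# (group key, discrete value) pairs; Europe is not unanimous
records = [("Asia", "blue"), ("Asia", "blue"), ("Europe", "red"), ("Europe", "green")]

groups = defaultdict(list)
for key, color in records:
    groups[key].append(color)

# keep the value if the group is unanimous (n_unique == 1), else show "(?)"
agg = {k: (v[0] if len(set(v)) == 1 else "(?)") for k, v in groups.items()}
```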
@@ -2006,30 +2038,26 @@ def post_agg(dframe: nw.LazyFrame, continuous_aggs, discrete_aggs) -> nw.LazyFrame:
             id=nw.col(level).cast(nw.String()),
         )
         if i < len(path) - 1:
-            _concat_str_token = nw.generate_temporary_column_name(
-                n_bytes=8, columns=[*cols, "labels", "parent", "id"]
+            _concat_str_token = _generate_temporary_column_name(
+                n_bytes=16, columns=[*cols, "labels", "parent", "id"]
             )
             df_tree = (
                 df_tree.with_columns(
-                    **{
-                        _concat_str_token: nw.concat_str(
-                            [
-                                nw.col(path[j]).cast(nw.String())
-                                for j in range(len(path) - 1, i, -1)
-                            ],
-                            separator="/",
-                        )
-                    }
+                    nw.concat_str(
+                        [
+                            nw.col(path[j]).cast(nw.String())
+                            for j in range(len(path) - 1, i, -1)
+                        ],
+                        separator="/",
+                    ).alias(_concat_str_token)
                 )
                 .with_columns(
-                    **{
-                        "parent": nw.concat_str(
-                            [nw.col(_concat_str_token), nw.col("parent")], separator="/"
-                        ),
-                        "id": nw.concat_str(
-                            [nw.col(_concat_str_token), nw.col("id")], separator="/"
-                        ),
-                    }
+                    parent=nw.concat_str(
+                        [nw.col(_concat_str_token), nw.col("parent")], separator="/"
+                    ),
+                    id=nw.concat_str(
+                        [nw.col(_concat_str_token), nw.col("id")], separator="/"
+                    ),
                 )
                 .drop(_concat_str_token)
             )
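The `concat_str(..., separator="/")` calls assemble the sunburst/treemap `id` and `parent` strings by joining path levels. A simplified sketch of the resulting structure for one row, using hypothetical path columns (the real code builds these per level over the whole frame):

```python
path = ["continent", "country", "city"]
row = {"continent": "Europe", "country": "France", "city": "Paris"}

# level i: id is the "/"-joined prefix down to that level; parent is one level up
ids = ["/".join(row[p] for p in path[: i + 1]) for i in range(len(path))]
parents = [""] + ids[:-1]
```

Each node's `parent` being a strict prefix of its `id` is what lets the renderer reconstruct the hierarchy from flat rows.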
@@ -2049,7 +2077,7 @@ def post_agg(dframe: nw.LazyFrame, continuous_aggs, discrete_aggs) -> nw.LazyFrame:
         while sort_col_name in df_all_trees.columns:
             sort_col_name += "0"
         df_all_trees = df_all_trees.with_columns(
-            **{sort_col_name: nw.col(args["color"]).cast(nw.String())}
+            nw.col(args["color"]).cast(nw.String()).alias(sort_col_name)
         ).sort(by=sort_col_name, nulls_last=True)
 
     # Now modify arguments

@@ -2080,10 +2108,8 @@ def process_dataframe_timeline(args):
     try:
         df: nw.DataFrame = args["data_frame"]
         df = df.with_columns(
-            **{
-                args["x_start"]: nw.col(args["x_start"]).str.to_datetime(),
-                args["x_end"]: nw.col(args["x_end"]).str.to_datetime(),
-            }
+            nw.col(args["x_start"]).str.to_datetime().alias(args["x_start"]),
+            nw.col(args["x_end"]).str.to_datetime().alias(args["x_end"]),
         )
     except Exception:
         raise TypeError(

@@ -2092,11 +2118,9 @@ def process_dataframe_timeline(args):
 
     # note that we are not adding any columns to the data frame here, so no risk of overwrite
     args["data_frame"] = df.with_columns(
-        **{
-            args["x_end"]: (
-                nw.col(args["x_end"]) - nw.col(args["x_start"])
-            ).dt.total_milliseconds()
-        }
+        (nw.col(args["x_end"]) - nw.col(args["x_start"]))
+        .dt.total_milliseconds()
+        .alias(args["x_end"])
     )
     args["x"] = args["x_end"]
     args["base"] = args["x_start"]
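The timeline hunk turns each bar into a base (`x_start`) plus a width in milliseconds (`x_end - x_start`). The stdlib equivalent of that subtraction for a single row, with made-up timestamps:

```python
from datetime import datetime

x_start = datetime(2009, 1, 1)
x_end = datetime(2009, 1, 2, 12)  # a day and a half later

# timedelta -> milliseconds, as dt.total_milliseconds() does per row
width_ms = (x_end - x_start).total_seconds() * 1_000
```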
@@ -2594,20 +2618,22 @@ def make_figure(args, constructor, trace_patch=None, layout_patch=None):
                 group_sum = group.get_column(
                     var
                 ).sum()  # compute here before next line mutates
-                group = group.with_columns(**{var: nw.col(var).cum_sum()})
+                group = group.with_columns(nw.col(var).cum_sum().alias(var))
                 if not ascending:
                     group = group.sort(by=base, descending=False, nulls_last=True)
 
                 if args.get("ecdfmode", "standard") == "complementary":
                     group = group.with_columns(
-                        **{var: (nw.col(var) - nw.lit(group_sum)) * (-1)}
+                        ((nw.col(var) - nw.lit(group_sum)) * (-1)).alias(var)
                     )
 
                 if args["ecdfnorm"] == "probability":
-                    group = group.with_columns(**{var: nw.col(var) / nw.lit(group_sum)})
+                    group = group.with_columns(
+                        (nw.col(var) / nw.lit(group_sum)).alias(var)
+                    )
                 elif args["ecdfnorm"] == "percent":
                     group = group.with_columns(
-                        **{var: nw.col(var) / nw.lit(group_sum) * nw.lit(100.0)}
+                        (nw.col(var) / nw.lit(group_sum) * nw.lit(100.0)).alias(var)
                     )
 
                 patch, fit_results = make_trace_kwargs(
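The ECDF branch above cum-sums the counts, flips them for complementary mode, and divides by the group total for `probability` normalization. A pure-Python sketch of those three transforms over hypothetical counts:

```python
from itertools import accumulate

counts = [1, 1, 2, 1]
group_sum = sum(counts)  # total observations in the group

ecdf = list(accumulate(counts))                       # standard: running total
complementary = [(v - group_sum) * -1 for v in ecdf]  # ecdfmode="complementary"
probability = [v / group_sum for v in ecdf]           # ecdfnorm="probability"
```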
@@ -2835,22 +2861,3 @@ def _spacing_error_translator(e, direction, facet_arg):
             annot.update(font=None)
 
     return fig
-
-
-def _to_unix_epoch_seconds(s: nw.Series) -> nw.Series:
-    dtype = s.dtype
-    if dtype == nw.Date:
-        return s.dt.timestamp("ms") / 1_000
-    if dtype == nw.Datetime:
-        if dtype.time_unit in ("s", "ms"):
-            return s.dt.timestamp("ms") / 1_000
-        elif dtype.time_unit == "us":
-            return s.dt.timestamp("us") / 1_000_000
-        elif dtype.time_unit == "ns":
-            return s.dt.timestamp("ns") / 1_000_000_000
-        else:
-            msg = "Unexpected dtype, please report a bug"
-            raise ValueError(msg)
-    else:
-        msg = f"Expected Date or Datetime, got {dtype}"
-        raise TypeError(msg)
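`_to_unix_epoch_seconds` (removed here, having been moved near the top of the module) normalizes every time unit to float seconds: ns/1e9, us/1e6, ms/1e3. That unit arithmetic can be checked with a stdlib timestamp; this sketch uses a hypothetical value and no narwhals:

```python
from datetime import datetime, timezone

dt = datetime(2024, 1, 1, tzinfo=timezone.utc)
ns = int(dt.timestamp()) * 1_000_000_000  # nanosecond representation

# each branch of the conversion lands on the same epoch-seconds value
seconds_from_ns = ns / 1_000_000_000
seconds_from_us = (ns // 1_000) / 1_000_000
seconds_from_ms = (ns // 1_000_000) / 1_000
```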