Ability to Pass Dask Arrays as data in DataArray Creation #4650
Comments
Duplicate of #3929
As stated there and in dask/dask#6058, we would need to add a
Thanks, I saw dask/dask#6058 but missed #3929. If I'm understanding you correctly, there should be no problem passing a dask array for the data parameter; it's just the dims/coords. If the
Have started to implement this but will continue the discussion in #3929.
Is your feature request related to a problem? Please describe.
I'm trying to convert a dask dataframe into a dask xarray without having to load the data fully into memory.
I was hoping I'd be able to pass `df.values`, which is a Dask array, to the `data` parameter in `xr.DataArray`. However, this raises the error:
ValueError: conflicting sizes for dimension 'datetime': length nan on the data but length 90386 on coordinate 'datetime'
Describe the solution you'd like
An ability to create DataArrays from dask dataframes, similar to the existing reverse method for converting Datasets to dask dataframes: `Dataset.to_dask_dataframe`
Describe alternatives you've considered
I tried using `xr.Dataset.from_dataframe(df)`, but it required the dataframe to be fully loaded into memory. Additionally, unlike the standard Pandas dataframe, the Dask dataframe does not have a `.to_xarray` method.
Additional context
This is in part made necessary by the decision of the Zarr developers not to support saving dask dataframes to zarr, instead suggesting that you convert to an xarray object and then save that to zarr.