Fixes to apply_parallel for functions working with multichannel data by grlee77 · Pull Request #4927 · scikit-image/scikit-image · GitHub

Fixes to apply_parallel for functions working with multichannel data #4927

Merged: 5 commits merged into scikit-image:master on Sep 5, 2020

grlee77 (Contributor) commented Aug 18, 2020

Description

closes #4900

The multichannel argument helps give sensible default chunks and expands scalar depth arguments appropriately for multichannel data.
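
To make the behaviour described above concrete, here is a minimal pure-Python sketch. The helper names `expand_depth` and `default_chunks`, and the `nblocks` parameter, are hypothetical and only illustrate the idea; this is not the code added by the PR. It assumes the channels are on the last axis, that a scalar depth should apply only to the non-channel axes, and that the channels axis should stay in a single chunk.

```python
def expand_depth(depth, ndim, multichannel):
    """Hypothetical helper: turn a scalar depth into a per-axis tuple."""
    if not isinstance(depth, int):
        # A tuple supplied by the user is taken as-is.
        return tuple(depth)
    if multichannel:
        # No overlap along the (last) channels axis.
        return (depth,) * (ndim - 1) + (0,)
    return (depth,) * ndim


def default_chunks(shape, nblocks, multichannel):
    """Hypothetical helper: split each spatial axis into about `nblocks`
    pieces, keeping the channels axis in one chunk."""
    spatial = shape[:-1] if multichannel else shape
    # Ceiling division so the chunks cover the whole axis.
    chunks = tuple(max(1, -(-n // nblocks)) for n in spatial)
    if multichannel:
        chunks += (shape[-1],)
    return chunks


depth = expand_depth(2, ndim=3, multichannel=True)
chunks = default_chunks((512, 512, 3), nblocks=2, multichannel=True)
print(depth, chunks)  # (2, 2, 0) (256, 256, 3)
```

With these assumptions an RGB image is never split across its colour channels, so a function that needs all three channels sees complete pixels in every block.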

A concrete example where the dtype argument is needed for this to pass was added as a test case. There is an explanation of the reason it is needed in #4900 (comment).
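
A hedged illustration of why a user-supplied dtype can matter: when no dtype is given, Dask tries to infer it by calling the function on a tiny probe array (the scikit-image docstring describes a shape of `(1,) * ndim`). A function that expects three channels on the last axis cannot process such a probe. The NumPy-only sketch below uses a toy function `to_gray`, which is an assumption made for illustration and is not the test case added in this PR.

```python
import numpy as np


def to_gray(rgb):
    """Toy stand-in for a function that expects RGB data on the last axis."""
    weights = np.array([0.2125, 0.7154, 0.0721])
    return rgb @ weights  # only valid when rgb.shape[-1] == 3


# Works on a real RGB image: the channels axis is reduced away.
gray = to_gray(np.zeros((4, 4, 3)))

# Fails on a probe array whose last axis has length 1 instead of 3.
try:
    to_gray(np.zeros((1, 1, 1)))
    probe_failed = False
except ValueError:
    probe_failed = True

print(gray.shape, probe_failed)
```

Passing the dtype explicitly avoids the need for that probe call.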

Checklist

For reviewers

  • Check that the PR title is short, concise, and will make sense 1 year
    later.
  • Check that new functions are imported in corresponding __init__.py.
  • Check that new features, API changes, and deprecations are mentioned in
    doc/release/release_dev.rst.

grlee77 (Contributor, Author) commented Aug 18, 2020

One subtlety: specifying dtype does not guarantee the specific output dtype that will be returned by apply_parallel. For floating point inputs, the dtype of the output seems to match the precision of the input and not that of the specified dtype argument.

The following example demonstrates via direct use of map_blocks that the output dtype is determined via a combination of the input data-type and the dtype argument to map_blocks.

import numpy as np
import dask.array as da
for dtype_in, dtype_map_blocks in [
    (np.uint8, np.float32),
    (np.uint16, np.float32),
    (np.uint32, np.float32),
    (np.float32, np.float16),
    (np.float32, np.float32),
    (np.float32, np.float64),
    (np.float64, np.float16),
    (np.float64, np.float32),
    (np.float64, np.float64)
]:
    x = da.from_array(np.arange(64, dtype=dtype_in))

    out = da.map_blocks(np.sqrt, x, chunks=(8,), dtype=dtype_map_blocks).compute()
    print(f"dtype_in={np.dtype(dtype_in).name}, "
          f"dtype_map_blocks={np.dtype(dtype_map_blocks).name} -> "
          f"dtype_out={out.dtype.name}")
dtype_in=uint8, dtype_map_blocks=float32 -> dtype_out=float16
dtype_in=uint16, dtype_map_blocks=float32 -> dtype_out=float32
dtype_in=uint32, dtype_map_blocks=float32 -> dtype_out=float64
dtype_in=float32, dtype_map_blocks=float16 -> dtype_out=float32
dtype_in=float32, dtype_map_blocks=float32 -> dtype_out=float32
dtype_in=float32, dtype_map_blocks=float64 -> dtype_out=float32
dtype_in=float64, dtype_map_blocks=float16 -> dtype_out=float64
dtype_in=float64, dtype_map_blocks=float32 -> dtype_out=float64
dtype_in=float64, dtype_map_blocks=float64 -> dtype_out=float64
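
One plausible reading of the output above is that the dtype passed to map_blocks acts as metadata only, while the computed result keeps whatever dtype the function returns for each block. The NumPy-only sketch below (no Dask involved, so it checks only the NumPy half of that reading) shows that `np.sqrt` on its own produces the same output dtypes as in the table.

```python
import numpy as np

# Output dtype of np.sqrt for each input dtype, independent of Dask.
results = {}
for dtype_in in (np.uint8, np.uint16, np.uint32, np.float32, np.float64):
    out = np.sqrt(np.arange(64, dtype=dtype_in))
    results[np.dtype(dtype_in).name] = out.dtype.name
    print(f"dtype_in={np.dtype(dtype_in).name} -> dtype_out={out.dtype.name}")
```

If a specific output dtype is required, casting the result explicitly (for example with `astype`) is the safer choice.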

@sciunto sciunto added this to the 0.18 milestone Aug 19, 2020
emmanuelle (Member) commented, quoting grlee77:

One subtlety: specifying dtype does not guarantee the specific output dtype that will be returned by apply_parallel. For floating point inputs, the dtype of the output seems to match the precision of the input and not that of the specified dtype argument.

Should the docstring be modified to suggest that this parameter is here to help dask, but is by no means a guarantee of the output dtype?

emmanuelle (Member) left a comment

Thanks @grlee77 ! I just left a small comment.

Your PR also reminds me that we should incorporate the doc on apply_parallel of #4214 and #3386 but this is independent of this PR!

sciunto (Member) commented Sep 5, 2020

thank you @grlee77 !

@sciunto sciunto merged commit 3a0ac03 into scikit-image:master Sep 5, 2020
@grlee77 grlee77 deleted the apply_parallel_rgb branch July 8, 2021 20:47
Successfully merging this pull request may close these issues.

IndexError with util.apply_parallel (RGB images)