-
-
Notifications
You must be signed in to change notification settings - Fork 2.3k
Speed up skimage.feature.peak_local_max #3984
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Speed up skimage.feature.peak_local_max #3984
Conversation
@clementkng Thanks for submitting. Is it still a WIP? I see that all the tests are passing... |
@sciunto Hi, I believe it may still be a WIP b/c the algorithm has changed a bit since to pass the tests, so I'm not sure if the performance gain is still there. Also, I'm considering putting in new tests that @brownmk used initially to test the code. Or do you think there's enough coverage w/ the existing tests? |
Hi, The performance gain is still there. If you want to add my test case, use the method below:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left several comments (sorry if I raised some on which you already wanted to work on before reviews). They are all minor and intended to improve the code readability.
skimage/feature/peak.py
Outdated
else: | ||
return out | ||
|
||
threshold_abs = threshold_abs if threshold_abs is not None else image.min() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor: I would move the block "if type(exclude_border)..." few lines above just before or after this line, to gather these lines which have the same spirit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sounds good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only this one to fix, and it will be good for me.
skimage/feature/peak.py
Outdated
|
||
for i, obj in enumerate(ndi.find_objects(labels)): | ||
label = i + 1 | ||
# print("Label>", label) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line can be removed.
skimage/feature/peak.py
Outdated
|
||
def _exclude_border(mask, footprint, exclude_border): | ||
""" | ||
Remove peaks round the borders |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove peaks round the borders | |
Remove peaks round the borders. |
skimage/feature/peak.py
Outdated
def _get_peak_mask(image, min_distance, footprint, threshold_abs, | ||
threshold_rel): | ||
""" | ||
Return the mask containing all peak candidates above thresholds |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Return the mask containing all peak candidates above thresholds | |
Return the mask containing all peak candidates above thresholds. |
skimage/feature/peak.py
Outdated
footprint, exclude_border) | ||
|
||
for i, obj in enumerate(ndi.find_objects(labels)): | ||
label = i + 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This variable seems to be unused... (Am i correct?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The variable label is used in the next line:
img=image[obj]*(labels[obj]==label)
skimage/feature/peak.py
Outdated
return out | ||
|
||
threshold_abs = threshold_abs if threshold_abs is not None else image.min() | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment just below " # In the case of labels, recursively build and return an output
# operating on each label separately" seems to be uncorrect to me now. There is no more recursive call and instead, it's a call to ndi on each label. Can you fix this please?
skimage/feature/peak.py
Outdated
inner_mask = _exclude_border(np.ones_like(labels, dtype=bool), | ||
footprint, exclude_border) | ||
|
||
for i, obj in enumerate(ndi.find_objects(labels)): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps a comment describing what this for loop is doing would help the reader.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@brownmk How would you describe this loop?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For each label, extract a smaller image enclosing the object of interest, identify num_peaks_per_label peaks and mark them in variable out.
skimage/feature/peak.py
Outdated
for i, obj in enumerate(ndi.find_objects(labels)): | ||
label = i + 1 | ||
# print("Label>", label) | ||
img = image[obj] * (labels[obj] == label) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use a less generic variable name here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@brownmk What would you recommend here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
call it image_object? Up to you.
Hello @clementkng! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found: There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻 Comment last updated at 2019-07-25 02:51:42 UTC |
skimage/feature/peak.py
Outdated
mask = mask.swapaxes(0, i) | ||
remove = (footprint.shape[i] if footprint is not None | ||
else 2 * exclude_border) | ||
mask[:remove // 2] = mask[-remove // 2:] = False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really dislike swapping axes
mask[(slice(None),) * i + slice(None, remove // 2)] = False
mask[(slice(None),) * i + slice(-remove // 2, None)] = False
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, are you referring to the line above or below or both?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replace
mask = mask.swapaxes(0, i)
remove = (footprint.shape[i] if footprint is not None
else 2 * exclude_border)
mask[:remove // 2] = mask[-remove // 2:] = False
mask = mask.swapaxes(0, i)
with what @hmaarrfk provided:
remove = (footprint.shape[i] if footprint is not None
else 2 * exclude_border)
mask[(slice(None),) * i + slice(None, remove // 2)] = False
mask[(slice(None),) * i + slice(-remove // 2, None)] = False
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
10000The code actually should be:
def _exclude_border(mask, footprint, exclude_border):
"""
Remove peaks round the borders
"""
# zero out the image borders
for i in range(mask.ndim):
remove = (footprint.shape[i] if footprint is not None
else 2 * exclude_border)
mask[(slice(None),) * i + (slice(None, remove // 2),)] = False
mask[(slice(None),) * i + (slice(-remove // 2, None),)] = False
return mask
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for making that super explicit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment should be ("round" was a typo):
Remove peaks near the borders
out = np.zeros_like(image, dtype=np.bool) | ||
|
||
# no peak for a trivial image | ||
if np.all(image == image.flat[0]): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
trivial images are definitely an "edge case", is this check really necessary? the call to np.all
and checking equality for large images is actually quite expensive. Since this is an "unlikely" scenerio, is it better to move this logic to the end or remove it all together.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moving the code to the end does not help, as it will still need to be checked before final results are returned (potentially overwrite peak candidates).
I am okay with removing the logic, just we might need to somehow documente that there could be an incompatibility for edge cases.
Here is an example, where there was no peak before, it will become 25 peaks. I agree this is a case probably users should check, instead of checking it every time within the library.
image = np.ones((5, 5))
# return 0, will return 25 if we remove the trivial case check
print(len(peak.peak_local_max(image, min_distance=0, threshold_abs=0, indices=True)))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, thanks for the explanation. I guess this can be left as a secondary issue.
|
||
coordinates = _get_high_intensity_peaks(image, out, num_peaks) | ||
if indices is True: | ||
return coordinates | ||
else: | ||
nd_indices = tuple(coordinates.T) | ||
out[nd_indices] = True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
coordinates is obtained from out
. Do you need to set out
again?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. If num_peaks is provided, out may contain more than num_peaks candidates. It needs to be regenerated in that case.
I actually see a bug here. It was missing a statement out.fill(False).
The code review really helps!!
It should be:
if not indices and np.isinf(num_peaks):
return out
coordinates = _get_high_intensity_peaks(image, out, num_peaks)
if indices:
return coordinates
else:
out.fill(False)
nd_indices = tuple(coordinates.T)
out[nd_indices] = True
return out
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where did the first two lines of code come from?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The first two lines is to deal with the case where user does not provide num_peaks (defaults to all, np.inf) and indices is False, out is already the answer that can be returned.
The code I provided (bold) in within the context below:
for i,obj in enumerate(ndi.find_objects(labels)):
label=i+1
#print("Label>", label)
img=image[obj]*(labels[obj]==label)
mask=_get_peak_mask(img, min_distance, footprint, threshold_abs, threshold_rel)
if exclude_border:
# remove peaks fall in the exclude region
mask &= inner_mask[obj]
coordinates =_get_high_intensity_peaks(img, mask, num_peaks_per_label)
nd_indices = tuple(coordinates.T)
mask.fill(False)
mask[nd_indices] = True
out[obj] += mask
if not indices and np.isinf(num_peaks):
return out
coordinates = _get_high_intensity_peaks(image, out, num_peaks)
if indices:
return coordinates
else:
out.fill(False)
nd_indices = tuple(coordinates.T)
out[nd_indices] = True
return out
# Non maximum filter
mask = _get_peak_mask(image, min_distance, footprint, threshold_abs, threshold_rel)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@brownmk are you sure you don't want to open your own PR and take it over from here? You seem to be going through this quite extensively. Github Desktop
, and editors like Atom
or VS Code
make git much less painful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's less of git, it's the 500-line guideline seems rather intimidating. All issues are addressed now, I will try to learn it next time.
https://github.com/scikit-image/scikit-image/blob/master/CONTRIBUTING.txt
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is a really fair point. I'll open an issue about this.
Most of that is a reference for us to point people to when they hit problems. We often forget how we setup our own environments....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@brownmk If you want, I can try guiding you through a setup process in exchange for allowing me to create a PR w/ your code.
Anyway to add teh benchmark code to the benchmarks dir? |
@hmaarrfk I can add the benchmark code (assuming you mean the code sample under Way to Produce in the original issue). I can extrapolate based off of other benchmarks how the benchmark for |
i don't think you need conda for asv do you? |
According to the install link, I need either virtualenv or conda. For now, the virtualenv solution is not working for me, possibly b/c I'm working of the Windows Subsystem for Linux. |
Having never used WSL it seems like a potential challenge. I feel like you are always going to hit these problems and never know if your code is wrong, or if it is a bug in WSL. |
|
Seems to work too
|
I think the test for the trivial case should be added back. Unless we decide that this is a behavioural change we want to make. Which I don't think this PR is advocating for. You can test for it in a different way, that isn't O(N) in cost. $ git diff skimage/
diff --git a/skimage/feature/peak.py b/skimage/feature/peak.py
index 7736bf70c..7aa6ae20a 100644
--- a/skimage/feature/peak.py
+++ b/skimage/feature/peak.py
@@ -190,6 +190,14 @@ def peak_local_max(image, min_distance=1, threshold_abs=None,
out[obj] += mask
coordinates = _get_high_intensity_peaks(image, out, num_peaks)
+
+ # Test for the case where the image is a constant
+ # Here, we decide to return to the user that there is no local maximum
+ if coordinates.shape[0] == image.size:
+ if indices is True:
+ return np.empty((0, 2), np.int)
+ else:
+ return np.zeros_like(image, dtype=bool)
if indices is True:
return coordinates
else:
@@ -205,6 +213,14 @@ def peak_local_max(image, min_distance=1, threshold_abs=None,
# Select highest intensities (num_peaks)
coordinates = _get_high_intensity_peaks(image, mask, num_peaks)
+ # Test for the case where the image is a constant
+ # Here, we decide to return to the user that there is no local maximum
+ if coordinates.shape[0] == image.size:
+ if indices is True:
+ return np.empty((0, 2), np.int)
+ else:
+ return np.zeros_like(image, dtype=bool)
+
if indices is True:
return coordinates
else: A test like the In [1]: from skimage.feature.peak import peak_local_max
i
In [2]: import numpy as np
In [3]: image = np.ones((5, 5))
In [4]: print(len(peak_local_max(image, min_distance=0, threshold_abs=0, indices=True)))
0 Should also be added to test suite so that this behaviour doesn't magically disappear |
The change is not right. One cannot determine if the image is trivial by relying on the number of peaks detected. The returned number of peaks depends on the settings of threshold and num_peaks. For instance one can set num_peaks to 1 for a trivial image. We should stick with the original code here.
|
Ah got it. Sorry for misunderstanding that. Yeah, for this PR. the original check should be kept. |
background? I think you had the PR mostly complete. I don't think the failures are due to your code. |
skimage/feature/tests/test_peak.py
Outdated
dist = ndi.distance_transform_edt(mask) | ||
local_max = peak.peak_local_max(dist, min_distance=20, indices=True, | ||
exclude_border=False, labels=labels) | ||
assert (len(local_max) == 625) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assert (len(local_max) == 625) | |
assert len(local_max) == 625 |
just triggering a build
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe just apply this to trigger a fresh new build. I think builds are fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Builds should be passing these days.
I think you can likely remove your WIP too. |
Thanks @clementkng That's a great contribution! |
scikit-image#3984 adds an optimization where a flat image immediately returns. However, it returns an array where the shape is incorrectly set to always be 2D when indices=True. A test is added to catch this scenario, and a fix is introduced where we set the output array's shape based on the input array's shape.
* Fix peak finding for empty images when indices=True #3984 adds an optimization where a flat image immediately returns. However, it returns an array where the shape is incorrectly set to always be 2D when indices=True. A test is added to catch this scenario, and a fix is introduced where we set the output array's shape based on the input array's shape.
Description
@brownmk's proposal to modify the
peak_local_max
algorithm to recursively pass a small image enclosing each object rather than an image of the same size, improving algorithm performance.Fixes gh-3974
Checklist
- [ ] Gallery example in./doc/examples
(new features only)./benchmarks
, if your changes aren't covered by anexisting benchmark
For reviewers
later.
__init__.py
.doc/release/release_dev.rst
.@meeseeksdev backport to v0.14.x