8000 allow passing extra measurement functions to regionprops and regionprops_table by VolkerH · Pull Request #4810 · scikit-image/scikit-image · GitHub
[go: up one dir, main page]

Skip to content

allow passing extra measurement functions to regionprops and regionprops_table #4810

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 35 commits into from
Jul 7, 2020
Merged

allow passing extra measurement functions to regionprops and regionprops_table #4810

merged 35 commits into from
Jul 7, 2020

Conversation

VolkerH
Copy link
Contributor
@VolkerH VolkerH commented Jun 30, 2020

Description

This is a WIP implements the feature request discussed here: #4466

It extends regionprops and regionprops_table by adding an extra argument, which is an iterable of callables/functions.

A user might want to do this if a measurment they require is not included in the built-in properties. By specifying these functions users won't have to duplicate the functionality that loops over the regions. Possible use cases:

  • provide a measurement function that is too specifically focused to include in the skimage code base
  • provide measurement function that is under an incompatible license
  • provide a faster implementation for an existing measurement function

Each provided function must either take one (region) or two (region, intensity_image) (names for illustration, this is by order) arguments.

In the issue, there was some discussion on whether the name and return dtype of the measurement function need to be passed in addition, as these are needed to build the regionprops table. Both the name of new property and the dtype are now inferred.
The name is inferred simply as the function name, and a warning is issued if there is a name clash. The return dtype is determined by passing a tiny test image to the function and inspecting the return value.

At this stage, the functionality is working but tests and examples still need to be added.

Code contributions:
Much of code in this PR was co-authored by @jni during a VS Code LiveShare session (somehow VS code only appended the co-authorship note to one of the commit messages).

Checklist

  • DONE Docstrings for all functions
  • TODO Gallery example in ./doc/examples (new features only)
  • SKIP Benchmark in ./benchmarks, if your changes aren't covered by an
    existing benchmark
  • TODO Unit tests
  • DONE Clean style in the spirit of PEP8

For reviewers

  • Check that the PR title is short, concise, and will make sense 1 year
    later.
  • Check that new functions are imported in corresponding __init__.py.
  • Check that new features, API changes, and deprecations are mentioned in
    doc/release/release_dev.rst.

@pep8speaks
Copy link
pep8speaks commented Jun 30, 2020

Hello @VolkerH! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

Line 262:80: E501 line too long (85 > 79 characters)

Comment last updated at 2020-07-06 15:21:14 UTC

@jni
Copy link
Member
jni commented Jun 30, 2020

@scikit-image/core before we keep moving on this, it would be great to get some consensus on the API. Currently, it looks like this:

rp = regionprops(
    label_image,
    intensity_image,
    extra_properties=(mesh_surface, porosity, average_surface_curvature),
)

Those properties can then be accessed with rp[0].mesh_surface etc. I think it's pretty nice!

@stefanv
Copy link
Member
stefanv commented Jun 30, 2020

@scikit-image/core before we keep moving on this, it would be great to get some consensus on the API. Currently, it looks like this:

rp = regionprops(
    label_image,
    intensity_image,
    extra_properties=(mesh_surface, porosity, average_surface_curvature),
)

Those properties can then be accessed with rp[0].mesh_surface etc. I think it's pretty nice!

Shouldn't that be a dict instead of a tuple?

@jni
Copy link
Member
jni commented Jul 1, 2020

@stefanv

Shouldn't that be a dict instead of a tuple?

dict mapping what to what? It's a sequence of functions. We can make a dict mapping name -> function, but Python makes that unnecessary/redundant since we can use func.__name__.

@stefanv
Copy link
Member
stefanv commented Jul 1, 2020 via email

@jni
Copy link
Member
jni commented Jul 2, 2020

I guess I was thinking about lambdas

Yes, I would leave this concern for future PRs. We could also accept both a sequence and a dict, but I fear that complicates things.

VolkerH and others added 3 commits July 3, 2020 22:55
Co-authored-by: Juan Nunez-Iglesias <juan.nunez-iglesias@monash.edu>
Co-authored-by: Juan Nunez-Iglesias <juan.nunez-iglesias@monash.edu>
@VolkerH
Copy link
Contributor Author
VolkerH commented Jul 3, 2020

Thanks for those changes. Writing code in comments presents different challenges from just writing code.

@jni
Copy link
Member
jni commented Jul 3, 2020

Everything is passing! @scikit-image/core this is ready for review!

the dtype is inferred by calling the function on a small sample.
If the name of an extra property clashes with the name of an existing
property the extra property wil not be visible and a UserWarning is
issued.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we specify that the first argument of the property function must be the mask of the considered region and the optional second argument must be the corresponding image?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, we should. So far that could only be derived from the examples. I just added this, happy for suggestions about how to phrase it better.

f"Extra property '{name}' is shadowed by existing "
"property and will be inaccessible. Consider renaming it."
)
warn(msg)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't we dynamically edit COL_DTYPES to avoid the handling of dtype for extra properties in _props_to_dict?

Suggested change
warn(msg)
warn(msg)
else:
COL_DTYPES[func.__name__] = _infer_regionprop_dtype(
func,
intensity=intensity_image is not None,
ndim=self._ndim,
)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rfezzani here we chose to preserve existing behaviour rather than try to improve it, which could have unintended consequences. I suggest leaving dynamic inference of existing properties for a future PR.

23DA

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jni was faster than me. In general, I don't like modifying a global (in the namespace) state. That state change will persist in COL_DTYPES pote F438 ntially leading to the "unintended consequences" that @jni mentioned.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The suggestion is not an improvement to existing behavior but the modification of the extra properties (new feature) dtype handeling ^^. Avoiding namespace variable modification is a good point, no problem ;)

the dtype is inferred by calling the function on a small sample.
If the name of an extra property clashes with the name of an existing
property the extra property wil not be visible and a UserWarning is
issued.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question?

VolkerH and others added 2 commits July 4, 2020 11:56
@VolkerH
Copy link
Contributor Author
VolkerH commented Jul 4, 2020

Thanks very much for the review and the suggestions @rfezzani, I made the suggested changes apart from the dynamic addition to COL_DTYPES (see comments there).

Copy link
Member
@rfezzani rfezzani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you again @VolkerH

Co-authored-by: Riadh Fezzani <rfezzani@gmail.com>
Co-authored-by: Riadh Fezzani <rfezzani@gmail.com>
@rfezzani
Copy link
Member
rfezzani commented Jul 7, 2020

🎉 The CI is green again (almost :p). Merging!
Thank you again @VolkerH ^^

@rfezzani rfezzani merged commit 3a4dc58 into scikit-image:master Jul 7, 2020
@VolkerH
Copy link
Contributor Author
VolkerH commented Jul 7, 2020

Thank you for fixing the Sphinx build and the review @rfezzani . Credit should also go to @jni !

@refack
Copy link
refack commented Apr 2, 2021

Hello @VolkerH,

Firstly I would like to say that I love this feature, thank you for this 🙏.
Secondly I'd like to ask if there's a reason why you chose not to pass the RegionProperties object to the functions? (a.k.a. self in the context of their evaluation)

return func(self.image, self.intensity_image)

Just for background, for my use case I need to know the label of the region in order to cross corelate it.

@VolkerH
Copy link
Contributor Author
VolkerH commented Apr 6, 2021

Hi @refack ,

glad you like it, appreciate the feedback that it is useful.

As for your second question: it simply did not occur to me at the time and I wanted a simple function signature for the extra_properties function. Someone who implements such a function, should not need to know about any other input other than the mask and potentially the intensity image for the region.

If your extra property depends on other properties calculated for the region, I am not sure whether passing self will be so good as I don't think it can be guarantueed that those other properties will already have been calculated. Also, passing the object might allow an incorrectly implemented extra_properties function to modify the object (and thus other properties).

If your desired result depends on several properties, is it possible to calculate if from several columns of the regionsprops_table?

I have not looked at the regionprops code since last July (and that is the only time I looked at it) so I might be missing something. It may also be possible to add what you want without breaking things. But @jni (who deserves at least equal credit for implementing this feature) will be better positioned to answer that.

@jni
Copy link
Member
jni commented Apr 7, 2021

it simply did not occur to me at the time

Me neither! But just today I found that I could have used it!

I don't think it can be guaranteed that those other properties will already have been calculated.

Actually this is not a concern — all the properties are dynamically calculated. It would be a concern if you wanted to have chains of custom functions that depended on each other. Then you would have to make sure that you provided them in the order they are needed.

@refack I think this is certainly worth considering. Would you mind creating an issue specifically for this (e.g. "Proposal: optionally pass Regionprops object to custom properties") so that we can discuss the best way to do this? imho the argument should not be called "self" as that is a reserved-by-convention term in Python. Perhaps simply props?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants
0