horizontally threaded labeled regions #4060
-
Hi @andreas-kupries,

That's a very cool idea! I suppose the difficulty (if any) would be that the memory use could be extremely high (much larger than the source image) in pathological cases. I think your worst case would be a checkerboard pattern (is this right?), where the sparse range data would be number-of-pixels * (3 * sizeof(uint32) + 1), or 9 times larger than the original image.

Perhaps this doesn't matter, but I wonder if a tile-based (rather than scanline-based) approach could get the worst-case memory use down? Cut the image into e.g. 128x128 tiles, find connected components in parallel in each tile with a conventional flood-fill approach (libvips has one of these), and write the separate tiles out.

A second phase, running in parallel as tiles are computed, could find a rename array for each tile. It could look along the tile edges where they join, and if the new tile had a different label for the same pixel value, it could add an entry to a rename array for the new tile. You'd have to wait for the whole of phase 2 to complete and write back to e.g. disc, but a final stage could then push the image through the set of rename arrays to create a single unified label image.

When you compute the rename arrays, the pathological case is a rename for every pixel along the joining edges, so 2 * 128 * sizeof(uint32), or 1024 bytes. A tile is 64 KB, so peak memory use will be 1/64th of the image size.
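A minimal sketch (not libvips code, and much simplified) of the tile-based scheme described above: label each tile independently with a conventional flood fill, then treat the rename arrays as a union-find table built by scanning the tile seams, and finally push every pixel through the renames. The binary-image assumption, the tiny tile size, and all names here are illustrative.

```python
from collections import deque

def label_tiles(img, tile=4):
    """Label 4-connected components of a binary image (list of 0/1 rows)."""
    h, w = len(img), len(img[0])
    labels = [[0] * w for _ in range(h)]
    next_label = 1

    # Phase 1: flood fill each tile on its own; labels are unique per tile.
    for ty in range(0, h, tile):
        for tx in range(0, w, tile):
            for y in range(ty, min(ty + tile, h)):
                for x in range(tx, min(tx + tile, w)):
                    if img[y][x] and not labels[y][x]:
                        labels[y][x] = next_label
                        q = deque([(y, x)])
                        while q:
                            cy, cx = q.popleft()
                            for ny, nx in ((cy-1,cx),(cy+1,cx),(cy,cx-1),(cy,cx+1)):
                                if (ty <= ny < min(ty + tile, h) and
                                        tx <= nx < min(tx + tile, w) and
                                        img[ny][nx] and not labels[ny][nx]):
                                    labels[ny][nx] = next_label
                                    q.append((ny, nx))
                        next_label += 1

    # Phase 2: "rename arrays" as union-find. Where two tiles join and both
    # sides of the seam are foreground, the two labels name the same region.
    parent = list(range(next_label))
    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a
    for y in range(h):
        for x in range(w):
            if not img[y][x]:
                continue
            if y % tile == 0 and y > 0 and img[y-1][x]:   # horizontal seam
                parent[find(labels[y][x])] = find(labels[y-1][x])
            if x % tile == 0 and x > 0 and img[y][x-1]:   # vertical seam
                parent[find(labels[y][x])] = find(labels[y][x-1])

    # Final pass: push every pixel through the renames, compacting labels.
    compact = {}
    for y in range(h):
        for x in range(w):
            if labels[y][x]:
                root = find(labels[y][x])
                labels[y][x] = compact.setdefault(root, len(compact) + 1)
    return labels
```

Phase 2 only ever touches the seam rows and columns, which is where the small worst-case bound on the rename data comes from.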
-
See CC dev docs for a description of my take on calculating labeled regions / connected components in a horizontally threaded manner.
Example results at:
Hopefully this inspires ideas for how VIPS can do this as well.
My primary function (See 1 above) actually does not return an image. It instead returns a data structure describing the regions: for each region its area, bounding box, and the range pieces it consists of.
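A rough sketch, assuming a binary input, of such a run (range) based connected-components pass: the result is not a label image but a list of region records, each with its area, bounding box, and the horizontal ranges it consists of. The record layout and names are my own illustration, not the aktive implementation.

```python
def regions_from_runs(img):
    """Group horizontal foreground runs of a binary image into regions."""
    runs = []                                 # (y, x_start, x_end), row order
    for y, row in enumerate(img):
        x, w = 0, len(row)
        while x < w:
            if row[x]:
                start = x
                while x < w and row[x]:
                    x += 1
                runs.append((y, start, x))    # half-open [start, x)
            else:
                x += 1

    # Union-find over run ids: runs in adjacent rows whose x-ranges
    # overlap belong to the same region (4-connectivity).
    parent = list(range(len(runs)))
    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a
    for i, (y1, s1, e1) in enumerate(runs):
        for j in range(i + 1, len(runs)):
            y2, s2, e2 = runs[j]
            if y2 > y1 + 1:
                break                         # runs are sorted by row
            if y2 == y1 + 1 and s2 < e1 and s1 < e2:
                parent[find(i)] = find(j)

    # Aggregate each root's runs into one region record.
    # bbox is [x_min, y_min, x_max, y_max], half-open.
    regions = {}
    for i, (y, s, e) in enumerate(runs):
        r = regions.setdefault(find(i), {
            "area": 0, "bbox": [s, y, e, y + 1], "ranges": []})
        r["area"] += e - s
        b = r["bbox"]
        b[0], b[1] = min(b[0], s), min(b[1], y)
        b[2], b[3] = max(b[2], e), max(b[3], y + 1)
        r["ranges"].append((y, s, e))
    return list(regions.values())
```

Note the memory trade-off discussed above: in the checkerboard worst case every run is a single pixel, so the run list grows to several words per pixel.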
The secondary function (See 2 above) uses (1), transforms the data structure it gets a bit, then feeds the result into from sparse ranges, a virtual image that takes range information to describe its content. The other, non-range information is placed into the image metadata.
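A toy version of that second step, to make the idea concrete: given per-region range lists, materialize an image in which every pixel of region k reads back as k + 1. The function name and region format are illustrative only, not aktive's actual from sparse ranges operator (which is virtual and never materializes the whole image at once).

```python
def image_from_ranges(width, height, regions):
    """Render region range lists back into a label image (lists of ints)."""
    img = [[0] * width for _ in range(height)]
    for k, region in enumerate(regions, start=1):
        for y, x_start, x_end in region["ranges"]:   # half-open x ranges
            for x in range(x_start, x_end):
                img[y][x] = k
    return img
```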
See https://core.tcl-lang.org/akupries/aktive/file?ci=7215f3713794bae4&name=etc/transformer/cc.tcl&ln=34-45 for the implementation.