-
Notifications
You must be signed in to change notification settings - Fork 3
Description
Description
The function src/distance/edge.hpp::found::ConnectedComponentsAlgorithm takes a long time to run on a standard image. This is partly due to:
- Use of standard template objects (
std::map,std::unordered_map, etc.) that both allocate heap data AND also do complex things with it. - The nature of the return type, which is an
std::vectorwhose size does not need to be added to after. - Each element of the return type is a group of connected pixels. It is a
structthat contains:
a) The "top-left" and "bottom-right", or the pixels that define a box that contains all the pixels in the group. These are 2 Vec2, which is inefficient because all pixel locations can be defined viauint64_tas the x and y coordinates, and notdecimal(Floating Point types take a lot more math to store and do calculations around).
b) Anunordered_set(hash set) of all the pixels asVec2. Same inefficiency.
Note
For context, componentPoints used to be a std::map that mapped from pixel index to its group number (L). Of course, not every pixel will have a value (because not all pixels belong to a group), but by instead using a size-constant heap array as the map (much bigger than the internal memory used by std::map, the runtime decreased by half, even though memory usage went up.
Potential Solutions
1) Standard Template Objects
At the beginning of the algorithm we define 3 types:
Components ConnectedComponentsAlgorithm(const Image &image, std::function<bool(uint64_t, const Image &)> Criteria) {
// Step 0: Setup the Problem
std::unordered_map<int, Component> components;
std::unordered_map<int, int> equivalencies;
std::unique_ptr<int[]> componentPoints(new int[image.width * image. Height]{});
...
}components: Maps from group number to group (from return type)equivalencies: Maps group numbers to equivalent group numbers.componentPoints: Maps from pixel index to group number (already optimized).
The first two fields can be optimized. You will notice that their maximum capacity would be the number of possible groups, and the key is always the group number (at the moment, the group number starts at 1, but that can always be changed to be zero based. I don't recommend changing this so you take advantage of zero initialization). Thus, you can convert both fields to arrays + size trackers if you know their maximum capacity. You may conduct a proof on this, but the maximum size would be the ceiling of image area / 4, or
where
Tip
You don't need to invoke std::ceil for this. Think of how you can natively use / and % operators together.
2) Returning an std::vector with non-increasing size
This eliminates need for a vector really, and if the user doesn't change the size of the array, we should just return a struct with the array and its size (still has to be heap-allocated). This could be integrated with the previous step!
3) Return Type's Element Type
A few optimizations can be performed here:
- Make an internal
structdefinition for aVec2(do not call it that) that usesuint32_tas the underlying type and substitute it throughout the return type. Also provide the maximum size tostd::unordered_set(the image's area) viastd::unordered_set::reserve(size_t count) - Eliminate the
std::unordered_setfrom the return type and just return thestd::moveedcomponentPointsalong with the metadata on each group. You will need to modify the return type in this case to be astruct, and properly set equivalent groups in a parallel manner that's being done in the algorithm within thecomponentPoints.