I assume that you are not pursuing GPU computation just for its own sake. In other words, you are willing to consider CPU-based or "traditional" techniques, compare multiple approaches, and ultimately choose whichever gives the highest performance, rather than being committed to the GPU for the sake of a thesis.
I find that the chain code (contour tracing) algorithm is sufficient for the task, provided you smooth the contour coordinates afterwards.
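For concreteness, here is a minimal sketch of one common chain-code tracer, Moore-neighbor tracing with a simplified stopping rule. The class name, the boolean-matrix image representation, and the clockwise direction table are illustrative assumptions, not a prescribed implementation.

```java
import java.util.ArrayList;
import java.util.List;

public class MooreTrace {
    // Clockwise neighbor order starting from West (image coords, row grows downward).
    private static final int[][] DIRS = {
        {0, -1}, {-1, -1}, {-1, 0}, {-1, 1}, {0, 1}, {1, 1}, {1, 0}, {1, -1}
    };

    private static boolean fg(boolean[][] img, int r, int c) {
        return r >= 0 && r < img.length && c >= 0 && c < img[0].length && img[r][c];
    }

    private static int dirIndex(int dr, int dc) {
        for (int i = 0; i < 8; i++)
            if (DIRS[i][0] == dr && DIRS[i][1] == dc) return i;
        throw new IllegalArgumentException("not a neighbor offset");
    }

    // Trace the contour starting at (r0, c0), which should be the first
    // foreground pixel found by a row-major raster scan, so that its West
    // neighbor is guaranteed to be background.
    public static List<int[]> trace(boolean[][] img, int r0, int c0) {
        List<int[]> contour = new ArrayList<>();
        contour.add(new int[]{r0, c0});
        int cr = r0, cc = c0;
        int back = 0;  // direction from current pixel to its backtrack (background) neighbor; 0 = West
        while (true) {
            int d = -1;
            for (int i = 1; i <= 8; i++) {  // scan clockwise, starting just after the backtrack neighbor
                int cand = (back + i) % 8;
                if (fg(img, cr + DIRS[cand][0], cc + DIRS[cand][1])) { d = cand; break; }
            }
            if (d < 0) break;  // isolated single pixel: no foreground neighbor at all
            // The neighbor checked just before d is background; it becomes the new
            // backtrack pixel, re-expressed relative to the new current pixel.
            int br = cr + DIRS[(d + 7) % 8][0], bc = cc + DIRS[(d + 7) % 8][1];
            cr += DIRS[d][0];
            cc += DIRS[d][1];
            back = dirIndex(br - cr, bc - cc);
            if (cr == r0 && cc == c0 && back == 0) break;  // returned to start, same backtrack
            contour.add(new int[]{cr, cc});
        }
        return contour;
    }
}
```

Note that the tracer only reads the image; it never writes to it, which is what makes running many tracers concurrently safe.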
In terms of concurrency and parallelism.
If a collection of starting points for contour tracing is supplied, then contour tracing can be run independently by each agent working on one starting point. This is possible because contour tracing is non-destructive on the image or label matrix - it only reads neighbor values around the current position.
However, contour tracing is not data-parallel. In other words, it cannot be vectorized on a SIMD or SPMD machine.
- Contours can have drastically different lengths. So, some agents will finish early, some will finish late. They will not have coherent flow-control patterns.
- Also, the memory access patterns will be scattered all over the image, which is very wasteful of CPU caches - from each cache line, perhaps only up to three pixel values will actually be meaningful to the contour tracing algorithm.
Is there a divide-and-conquer algorithm for contour tracing of a single blob?
I don't know. Please leave a comment if you find one, because I'd like to learn about that too. I haven't done a serious literature search so I may be ignorant. (Of course you should conceal it if it will form the backbone of your thesis.)
On the other hand, I have implemented a tile-based algorithm for connected-component labeling, based on labeling each tile independently, following up with a "seam stitching" process, and finishing off with a final label assignment. It is not data-parallel (not SIMD/SPMD), but it is highly parallelizable - the image can be divided into hundreds or thousands of tiles.
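A compact sketch of that scheme, simplified to horizontal strips instead of 2D tiles; the per-strip flood fill, the union-find "seam stitching", and 4-connectivity are illustrative assumptions, not the exact implementation described above.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.stream.IntStream;

public class StripLabeling {
    public static int[][] label(boolean[][] img, int stripHeight) {
        int h = img.length, w = img[0].length;
        int[][] labels = new int[h][w];                // 0 = background
        int strips = (h + stripHeight - 1) / stripHeight;
        int perStrip = stripHeight * w;                // private label range per strip

        // Phase 1: label each strip independently. Strips touch disjoint rows,
        // so they can run in parallel without synchronization.
        IntStream.range(0, strips).parallel().forEach(s -> {
            int r0 = s * stripHeight, r1 = Math.min(r0 + stripHeight, h);
            int next = s * perStrip + 1;               // this strip's private label counter
            for (int r = r0; r < r1; r++)
                for (int c = 0; c < w; c++)
                    if (img[r][c] && labels[r][c] == 0)
                        floodFill(img, labels, r, c, r0, r1, next++);
        });

        // Phase 2: "seam stitching" - union labels across strip boundaries.
        int[] parent = new int[strips * perStrip + 1];
        for (int i = 0; i < parent.length; i++) parent[i] = i;
        for (int s = 1; s < strips; s++) {
            int r = s * stripHeight;
            for (int c = 0; c < w && r < h; c++)
                if (img[r][c] && img[r - 1][c])
                    union(parent, labels[r][c], labels[r - 1][c]);
        }

        // Phase 3: final label assignment - replace every label by its root.
        for (int r = 0; r < h; r++)
            for (int c = 0; c < w; c++)
                if (labels[r][c] != 0) labels[r][c] = find(parent, labels[r][c]);
        return labels;
    }

    // Depth-first flood fill (4-connectivity) confined to rows [r0, r1).
    private static void floodFill(boolean[][] img, int[][] labels,
                                  int r, int c, int r0, int r1, int lbl) {
        Deque<int[]> stack = new ArrayDeque<>();
        stack.push(new int[]{r, c});
        labels[r][c] = lbl;
        int[][] d4 = {{-1, 0}, {1, 0}, {0, -1}, {0, 1}};
        while (!stack.isEmpty()) {
            int[] p = stack.pop();
            for (int[] d : d4) {
                int nr = p[0] + d[0], nc = p[1] + d[1];
                if (nr >= r0 && nr < r1 && nc >= 0 && nc < img[0].length
                        && img[nr][nc] && labels[nr][nc] == 0) {
                    labels[nr][nc] = lbl;
                    stack.push(new int[]{nr, nc});
                }
            }
        }
    }

    private static int find(int[] parent, int x) {
        while (parent[x] != x) { parent[x] = parent[parent[x]]; x = parent[x]; }
        return x;
    }

    private static void union(int[] parent, int a, int b) {
        int ra = find(parent, a), rb = find(parent, b);
        if (ra != rb) parent[ra] = rb;
    }
}
```

Giving each strip a private label range avoids a shared atomic counter in phase 1; the seams and the final relabeling are the only steps that see labels from more than one strip.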
Regarding performance.
In a previous project where I used contour tracing, I was pleasantly surprised to find that contour tracing is in general much faster than connected-component labeling, for the type of images I processed, because contour tracing does not perform as many memory operations as connected-component labeling requires.
Note that in general, contour-tracing and connected-component labeling aren't direct substitutes for each other - they give outputs in different representations, despite the outputs corresponding to the same blob. You may have to run both algorithms - label the whole image first, sample the "contour starting points", and trace out the contours of each.
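For example, sampling one starting point per label (the first pixel of each label in row-major order, which always lies on the outer contour and has a non-matching West neighbor) might look like this sketch:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ContourStarts {
    // One contour starting point per label: the first pixel of each label
    // in row-major order. That pixel is always on the blob's outer contour,
    // and the pixel to its West is guaranteed not to carry the same label.
    public static Map<Integer, int[]> startingPoints(int[][] labels) {
        Map<Integer, int[]> starts = new LinkedHashMap<>();
        for (int r = 0; r < labels.length; r++)
            for (int c = 0; c < labels[r].length; c++) {
                int lbl = labels[r][c];
                if (lbl != 0 && !starts.containsKey(lbl))
                    starts.put(lbl, new int[]{r, c});
            }
        return starts;
    }
}
```

Each entry of the resulting map can then be handed to an independent contour-tracing agent.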
Finding nested connected components.
There is an enhanced contour tracing algorithm which resolves nested connected component relationships.
- Given the contour of the outer connected component, convert the chain code sequence into a sequence of markers.
- There are two types of markers:
- One marks the leftmost edge (begin X-position) of a horizontal run of pixels.
- One marks the rightmost edge (end X-position) of a horizontal run of pixels.
- Which marker is generated for the current chain code depends on the orientation of that chain code.
- Sort the marker sequence by vertical Y-position (major key), then by X-position (minor key).
- The sorted marker sequence then becomes a run-length raster descriptor of the blob enclosed by the contour.
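The final sort-and-pair step might look like the following sketch; marker generation from the chain codes is omitted, and the Marker/Run record layouts are assumptions for illustration:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class MarkerRuns {
    // A marker: y = row, x = column, begin = true for a leftmost-edge
    // marker, false for a rightmost-edge marker.
    public record Marker(int y, int x, boolean begin) {}

    // A horizontal run of pixels: row y, columns xBegin..xEnd inclusive.
    public record Run(int y, int xBegin, int xEnd) {}

    public static List<Run> toRuns(List<Marker> markers) {
        List<Marker> sorted = new ArrayList<>(markers);
        // Sort by Y (major key), then by X (minor key).
        sorted.sort(Comparator.comparingInt(Marker::y).thenComparingInt(Marker::x));
        // After sorting, markers pair up as (begin, end), (begin, end), ... per row,
        // assuming the marker generation produced equal counts of both types.
        List<Run> runs = new ArrayList<>();
        for (int i = 0; i + 1 < sorted.size(); i += 2) {
            Marker b = sorted.get(i), e = sorted.get(i + 1);
            runs.add(new Run(b.y(), b.x(), e.x()));
        }
        return runs;
    }
}
```

The resulting run list is exactly the run-length raster descriptor described above, ready for filling, area computation, or nesting tests.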
Smoothing the contour coordinates into floating point.
The smoothing of contour coordinates will cause them to change from discrete (integer) to floating point values. After that, you will find that summing the pairwise Euclidean distances is good enough for most purposes.
The exact detail of smoothing isn't important - for example, you can use Gaussian smoothing, applying it to the sequence of contour X and Y coordinates just like a 1D convolution. Keep in mind the circular (periodic / wraparound) nature of the sequence.
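A sketch of both steps - circular Gaussian smoothing of one coordinate sequence, then the perimeter as the sum of pairwise distances; the kernel radius of 3 sigma is a common but arbitrary truncation choice:

```java
public class ContourSmoothing {
    // Gaussian-smooth one coordinate sequence (X or Y) of a closed contour,
    // treating it as circular: indices wrap around at both ends.
    public static double[] smoothCircular(double[] values, double sigma) {
        int radius = (int) Math.ceil(3 * sigma);
        double[] kernel = new double[2 * radius + 1];
        double sum = 0;
        for (int i = -radius; i <= radius; i++) {
            kernel[i + radius] = Math.exp(-(i * i) / (2 * sigma * sigma));
            sum += kernel[i + radius];
        }
        for (int i = 0; i < kernel.length; i++) kernel[i] /= sum;  // normalize to unit sum

        int n = values.length;
        double[] out = new double[n];
        for (int i = 0; i < n; i++) {
            double acc = 0;
            for (int j = -radius; j <= radius; j++)
                acc += kernel[j + radius] * values[Math.floorMod(i + j, n)];  // wraparound
            out[i] = acc;
        }
        return out;
    }

    // Perimeter estimate: sum of pairwise Euclidean distances, including
    // the closing segment from the last point back to the first.
    public static double perimeter(double[] xs, double[] ys) {
        double p = 0;
        for (int i = 0; i < xs.length; i++) {
            int j = (i + 1) % xs.length;
            p += Math.hypot(xs[j] - xs[i], ys[j] - ys[i]);
        }
        return p;
    }
}
```

You would run `smoothCircular` once on the X sequence and once on the Y sequence before computing the perimeter.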
You're going to have to loop through all the points and calculate the distance. There is a great question on Stack Overflow about how to calculate it: Shortest distance between a point and a line segment
Some of the work can be precalculated, given that you have to do this more than once for a given line segment. You also don't need the smallest distance itself, only the smallest squared distance (since squaring is strictly increasing for non-negative values). I converted the top-voted answer from that question into a Java version that performs this precalculation (in the constructor) and is easier to read and follow. Unfortunately I don't know Python well enough to give you Python code, but you should be able to translate it.
import java.util.Arrays;
import java.util.List;

public class LineSegment {

    public static class Point {
        public final double x;
        public final double y;

        public Point(double x, double y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public String toString() {
            return "(" + x + "," + y + ")";
        }
    }

    public static void main(String[] args) {
        LineSegment segment = new LineSegment(new Point(0, 3), new Point(2, 0));
        List<Point> pointList =
                Arrays.asList(new Point(-5, 3), new Point(1, 1),
                              new Point(2, 3), new Point(0, 5));
        Point answer = segment.closestPoint(pointList);
        System.out.println("The closest point is: " + answer);
    }

    private static double sqr(double x) {
        return x * x;
    }

    private static double distanceSquared(Point v, Point w) {
        return sqr(v.x - w.x) + sqr(v.y - w.y);
    }

    private final Point firstSegPoint;
    private final Point secondSegPoint;
    private final double segmentLengthSquared;  // precalculated once per segment
    private final double xDifference;
    private final double yDifference;

    public LineSegment(Point firstSegPoint, Point secondSegPoint) {
        this.firstSegPoint = firstSegPoint;
        this.secondSegPoint = secondSegPoint;
        this.segmentLengthSquared = distanceSquared(firstSegPoint, secondSegPoint);
        this.xDifference = secondSegPoint.x - firstSegPoint.x;
        this.yDifference = secondSegPoint.y - firstSegPoint.y;
    }

    public Point closestPoint(List<Point> pointList) {
        // Squared distances preserve ordering, so no square roots are needed.
        double minDistanceSquared = Double.POSITIVE_INFINITY;
        Point answer = null;
        for (Point point : pointList) {
            double distSquared = distToSegmentSquared(point);
            if (distSquared < minDistanceSquared) {
                answer = point;
                minDistanceSquared = distSquared;
            }
        }
        return answer;
    }

    private double distToSegmentSquared(Point input) {
        // Degenerate segment: both endpoints coincide.
        if (segmentLengthSquared == 0)
            return distanceSquared(input, firstSegPoint);
        // t parameterizes the projection of the input point onto the segment:
        // t = 0 at the first endpoint, t = 1 at the second.
        double xComponent = (input.x - firstSegPoint.x) * xDifference;
        double yComponent = (input.y - firstSegPoint.y) * yDifference;
        double t = (xComponent + yComponent) / segmentLengthSquared;
        if (closestPointIsFirst(t))
            return distanceSquared(input, firstSegPoint);
        if (closestPointIsSecond(t))
            return distanceSquared(input, secondSegPoint);
        // The projection falls within the segment.
        Point closestPointOnLine =
                new Point(firstSegPoint.x + t * xDifference,
                          firstSegPoint.y + t * yDifference);
        return distanceSquared(input, closestPointOnLine);
    }

    private boolean closestPointIsFirst(double t) {
        return t < 0;
    }

    private boolean closestPointIsSecond(double t) {
        return t > 1;
    }
}
See the full implementation here: http://ideone.com/fBFwda
Best Answer
Well, this turns out to be a non-trivial extension of Veldaeven's solution.
For each sequence of 3 points, calculate the 2 vectors between them, as suggested by Veldaeven.
Find the angle between the two vectors by summing their angles from the x-axis (NB: do not use the dot product, which always returns an angle less than pi and is hence too symmetrical for our problem).
Halve this angle. This is the perpendicular bisector along which the resulting vector will lie.
Now calculate which quadrant to place the point in (the "which side" problem). Find the difference between the two vectors. If the gradient of this difference (i.e., the "second derivative") is positive, then the path is turning back on itself through a maximum: add pi/2. If the gradient is negative, the path is going through a minimum: subtract pi/2.
Let's give it a simple rectangular path
Try out our function:
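The original code listing was not preserved here. The following Java sketch reconstructs the steps above, except that it resolves the "which side" problem with a left-normal sign test instead of the gradient-of-difference rule (it assumes the path is traversed counterclockwise, so the interior lies to the left of travel):

```java
import java.util.Arrays;

public class BisectorOffset {
    // For each vertex p1 (with neighbors p0 and p2), return the point at
    // distance `dist` along the inward angle bisector of the corner.
    public static double[][] insetPoints(double[][] path, double dist) {
        int n = path.length;
        double[][] out = new double[n][2];
        for (int i = 0; i < n; i++) {
            double[] p0 = path[Math.floorMod(i - 1, n)], p1 = path[i], p2 = path[(i + 1) % n];
            // Unit vectors from the vertex back along each adjacent edge.
            double[] u = unit(p0[0] - p1[0], p0[1] - p1[1]);
            double[] w = unit(p2[0] - p1[0], p2[1] - p1[1]);
            double bx = u[0] + w[0], by = u[1] + w[1];
            if (Math.hypot(bx, by) < 1e-12) {  // straight-through vertex: use the edge normal
                bx = -w[1];
                by = w[0];
            }
            // "Which side": the interior of a CCW path lies to the left of the
            // incoming direction v1 = p1 - p0; flip the bisector if it points right.
            double v1x = p1[0] - p0[0], v1y = p1[1] - p0[1];
            if (bx * (-v1y) + by * v1x < 0) { bx = -bx; by = -by; }
            double[] b = unit(bx, by);
            out[i] = new double[]{p1[0] + dist * b[0], p1[1] + dist * b[1]};
        }
        return out;
    }

    private static double[] unit(double x, double y) {
        double h = Math.hypot(x, y);
        return new double[]{x / h, y / h};
    }

    public static void main(String[] args) {
        // A simple rectangular path, traversed counterclockwise.
        double[][] rect = {{0, 0}, {2, 0}, {2, 2}, {0, 2}};
        for (double[] p : insetPoints(rect, Math.sqrt(2)))
            System.out.println(Arrays.toString(p));  // each corner insets to approximately (1, 1)
    }
}
```

For the 2-by-2 rectangle, every corner's inset point lands at approximately (1, 1), i.e. inside the path at distance sqrt(2) from each vertex.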
Yes, the points are all inside the input path, at a distance of sqrt(2) from each vertex.
NB: all methods using e.g. the cross product of the two vectors are too symmetrical.
Hopefully this will help someone in the future.