input: C matrix 2xN (2D points)
output: C matrix 2xM (2D points) with equal or less points.
Lets say we have C matrix 2xN that contains several 2D points, and it looks something like that:
What we want is to group "close" points to one point, measured by the average of the other points. For example, in the second image, every group of blue circle will be one point, the point coordinate will be the average point off all points in the blue circle. also by saying "close", I mean that: their distance one to each other will be smaller then DELTA (known scalar). So wanted output is:
About running time of the algorithm, I don't have upper-limit request but I call that method several times…
I am using Matlab, and what i have tried is this:
function C = ReduceClosePoints(C ,l_boundry)
x_size = abs(l_boundry(1,1)-l_boundry(1,2)); %220
DELTA = x_size/10;
T = [];
for i=1:size(C,2)
sum = C(:,i);
n=1;
for j=1:size(C,2)
if i~=j %not same point
D = DistancePointToPoint(C(:,i),C(:,j));
if D < DELTA
sum = sum + C(:,j);
n=n+1;
end
end
end
sum = sum./n; %new point -> save in T matrix
T = [T sum];
end
C = T;
end
And its not working 🙁
Also I am new to Matlab.
Thank you for your help!!
Best Answer
You need to do this in two steps. First, group nearby points together. Then, replace each group by one point.
Let
s
be a nonempty subset of your points anddistance(p,s)
the smallest distance between p and some element of s. Now here is some pseudocode for creating groups of "point clusters":Note that this algorithm does not guarantee that each pair of points of a group has a distance < DELTA, only that each found group cannot be divided into two smaller groups with distance >= DELTA. You have to decide for yourself if that is sufficient, but as @DieterLücking has pointed out, don't expect to find an algorithm to solve the problem exactly the way you have described it originally.
Once you have your groups of points calculated, you can replace each group by the median value of it's points.
Concerning running time: the running time depends heavily of calculating the
distance
function fast. So in case the straight forward implementation of subsets by lists or arrays is not fast enough (which you should test first before starting to optimize!), then you might consider to use an improved data structure here. For example, you could keep track of the "bounding box" of all elements of a subset, since ifdistance(p, boundingbox(s))>=DELTA
, then you can be sure thatdistance(p, s)>=DELTA
as well.