Algorithms in MATLAB – How to Reduce Close Points

algorithmsMATLAB

input: C matrix 2xN (2D points)

output: C matrix 2xM (2D points) with equal or less points.

Lets say we have C matrix 2xN that contains several 2D points, and it looks something like that:
enter image description here

What we want is to group "close" points to one point, measured by the average of the other points. For example, in the second image, every group of blue circle will be one point, the point coordinate will be the average point off all points in the blue circle. also by saying "close", I mean that: their distance one to each other will be smaller then DELTA (known scalar). So wanted output is:

enter image description here

About running time of the algorithm, I don't have upper-limit request but I call that method several times…
I am using Matlab, and what i have tried is this:

function C = ReduceClosePoints(C ,l_boundry)
    x_size = abs(l_boundry(1,1)-l_boundry(1,2)); %220
    DELTA = x_size/10;
    T = [];
    for i=1:size(C,2)
        sum = C(:,i);
        n=1;
        for j=1:size(C,2)
            if i~=j     %not same point
                D = DistancePointToPoint(C(:,i),C(:,j));
                if D < DELTA
                    sum = sum + C(:,j);
                    n=n+1;
                end
            end
        end
        sum = sum./n; %new point -> save in T matrix
        T = [T sum];
    end
    C = T;
end

And its not working 🙁
Also I am new to Matlab.
Thank you for your help!!

Best Answer

You need to do this in two steps. First, group nearby points together. Then, replace each group by one point.

Let s be a nonempty subset of your points and distance(p,s) the smallest distance between p and some element of s. Now here is some pseudocode for creating groups of "point clusters":

 input: pointlist
 groups:={}   // will contain groups of points, disjoint subsets of "pointlist"
 for each p in pointlist
     let newgroup:={p} 
     for each g in groups where `distance(p,g)<DELTA`
          newgroup := join(newgroup, g)  // join the two sets
          remove g from groups
     next g
     add newgroup to groups
 next p
 output: groups

Note that this algorithm does not guarantee that each pair of points of a group has a distance < DELTA, only that each found group cannot be divided into two smaller groups with distance >= DELTA. You have to decide for yourself if that is sufficient, but as @DieterLücking has pointed out, don't expect to find an algorithm to solve the problem exactly the way you have described it originally.

Once you have your groups of points calculated, you can replace each group by the median value of it's points.

Concerning running time: the running time depends heavily of calculating the distance function fast. So in case the straight forward implementation of subsets by lists or arrays is not fast enough (which you should test first before starting to optimize!), then you might consider to use an improved data structure here. For example, you could keep track of the "bounding box" of all elements of a subset, since if distance(p, boundingbox(s))>=DELTA, then you can be sure that distance(p, s)>=DELTA as well.

Related Topic