Algorithm to Find All Points Within a Certain Distance

algorithms

Let's say, we have… 50k points randomly allocated in 3D space, according to some volumetric function, so that in some parts of the space, points are closer together, but in other parts, they are further away. Every point has a random color (represented by an integer, for example)

I would like to find all points that:

  • Are within a certain (general) distance of each other, and
  • are of different color.

What is the fastest way to compute this?

Further, non-mandatory details:

My end goal, is to remove all points with the same color, until there are no more collisions. I would like to remove as few colors as possible, however, this is not mandatory I could simply remove all colors that are involved in any collisions AT ALL, though, it could result in a lower quality result.

The distance does not need to be exact. For instance, if an exact "collision radius" represents a perfect sphere, then I would be fine with the collision radius being the shape of, say, a cube. I am also fine with a trigger-happy solution. That is, I am fine if points get "detected" even if they are not too close. I just don't want this to happen very often, and I cannot accept points that are to close not being detected.

Also, I'm sure there is a family of algorithms devoted to this, but I can't for the life of me come up with any names… Knowing the name of an algorithm of this type would be immensely helpful.

Best Answer

Here is what I would do:

  1. Divide the space into a regular lattice of cubes. The the length of the side of each cube should be half the minimum distance between points.
  2. For the first point, see which cube it is in. Add a flag to the cube that contains the point to indicate that that cube contains a point of that color.
  3. For the next point, see which cube is in. If none of the adjacent cubes are flagged with a point of that color, flag the cube with the color of the point like in (2). Otherwise throw the point away.
  4. If you still have unprocessed points, go to 3.

Advantages

The advantages of this approach are:

  1. It is fast
  2. It is easy to understand and to implement

Limitations

This algorithm has several limitations:

Use of Cubes

The algorithm will throw away too many points because it uses cubes instead of spheres. The question suggests that this is OK.

It it is a problem, however, this limitation can be overcome.

One way is to add the coordinates of the point to the flags. Then, when you test points in step (3) and you find that there is already a point in an adjacent cube, check the distance to that point before you decide whether to throw the point away.

Of course, this will slow things down / add complexity / increase the memory requirements of the algorithm.

Order of Testing Points

The number of points you end up with will depend on the order in which you test the points. To prove this, imagine the situation where you have just 3 red points, and that the cubes they fall into happen to be in a row:

+-----+-----+-----+
|     |   . |     |
|.    |     |     |
|     |     |   . |
+-----+-----+-----+

In this case, you could either throw away the point in the centre cube or the two points in the adjacent cubes.

To improve on the original algorithm, then, you could use the cubes as a basis to count the points of a given color that are tooo close to each other, then throw away the points that have the most points too close. I have not explored this in much depth: I'd need to work through a set of scenarios on paper to figure out the best way to do this.

Memory Use

Another potential problem is that the algorithm could use up quite a lot of memory.

A sensible first step would be to limit the number of cubes created to the minimum by calculate the bounding cuboid for all the points and then only using enough cubes to contain the bounding cuboid.

Another improvement would be to divide the problem into larger cubic regions that contain sub-sets of the full set of points, and processing each of these regions separately. Naturally, special care will need to be taken at the boundaries of thes regions.

Related Topic