Algorithm to Find All Points Within a Certain Distance

algorithms

Let's say, we have… 50k points randomly allocated in 3D space, according to some volumetric function, so that in some parts of the space, points are closer together, but in other parts, they are further away. Every point has a random color (represented by an integer, for example)

I would like to find all points that:

Are within a certain (general) distance of each other, and
are of different color.

What is the fastest way to compute this?

Further, non-mandatory details:

My end goal, is to remove all points with the same color, until there are no more collisions. I would like to remove as few colors as possible, however, this is not mandatory I could simply remove all colors that are involved in any collisions AT ALL, though, it could result in a lower quality result.

The distance does not need to be exact. For instance, if an exact "collision radius" represents a perfect sphere, then I would be fine with the collision radius being the shape of, say, a cube. I am also fine with a trigger-happy solution. That is, I am fine if points get "detected" even if they are not too close. I just don't want this to happen very often, and I cannot accept points that are to close not being detected.

Also, I'm sure there is a family of algorithms devoted to this, but I can't for the life of me come up with any names… Knowing the name of an algorithm of this type would be immensely helpful.

Best Answer

Here is what I would do:

Divide the space into a regular lattice of cubes. The the length of the side of each cube should be half the minimum distance between points.
For the first point, see which cube it is in. Add a flag to the cube that contains the point to indicate that that cube contains a point of that color.
For the next point, see which cube is in. If none of the adjacent cubes are flagged with a point of that color, flag the cube with the color of the point like in (2). Otherwise throw the point away.
If you still have unprocessed points, go to 3.

Advantages

The advantages of this approach are:

It is fast
It is easy to understand and to implement

Limitations

This algorithm has several limitations:

Use of Cubes

The algorithm will throw away too many points because it uses cubes instead of spheres. The question suggests that this is OK.

It it is a problem, however, this limitation can be overcome.

One way is to add the coordinates of the point to the flags. Then, when you test points in step (3) and you find that there is already a point in an adjacent cube, check the distance to that point before you decide whether to throw the point away.

Of course, this will slow things down / add complexity / increase the memory requirements of the algorithm.

Order of Testing Points

The number of points you end up with will depend on the order in which you test the points. To prove this, imagine the situation where you have just 3 red points, and that the cubes they fall into happen to be in a row:

+-----+-----+-----+
|     |   . |     |
|.    |     |     |
|     |     |   . |
+-----+-----+-----+

In this case, you could either throw away the point in the centre cube or the two points in the adjacent cubes.

To improve on the original algorithm, then, you could use the cubes as a basis to count the points of a given color that are tooo close to each other, then throw away the points that have the most points too close. I have not explored this in much depth: I'd need to work through a set of scenarios on paper to figure out the best way to do this.

Memory Use

Another potential problem is that the algorithm could use up quite a lot of memory.

A sensible first step would be to limit the number of cubes created to the minimum by calculate the bounding cuboid for all the points and then only using enough cubes to contain the bounding cuboid.

Another improvement would be to divide the problem into larger cubic regions that contain sub-sets of the full set of points, and processing each of these regions separately. Naturally, special care will need to be taken at the boundaries of thes regions.

Related Solutions

Visiting points on a number line while minimizing a cost not related to distance

I think you can improve this by looking at the problem as a directed graph of pairs of positions.

For this example I will use the line with values -9, -6, -1, 3, and 5.

Because it's too hard to draw a graph with just text, I'm going to represent the pairs as a table. We can think of the cells as representing the state where all containers between those two positions have been collected. Each cell has two values, one representing the cost to go left, the other representing the cost to go right (to the next container).

The values can simply be calculated as the distance between the two points multiplied by the number of barrels outside of those two points + 1. Cells where both numbers have the same sign represent cases when all barrels of the opposite sign have been collected. These are calculated using only the number of barrels in the direction away from zero.

In the table values of X mean that you can't go in that direction (all the barrels in that direction have been taken). The rows represent the current position of the collector and the columns represent the location of the next opposite barrel.

    +------+------+------+------+------+
    |  -9  |  -6  |  -1  |   3  |   5  |
+---+------+------+------+------+------+
|-9 |      |      |      | X, 24| X, 14|
+---+------+------+------+------+------+
|-6 | 3, X |      |      | 9, 27| 6, 22|
+---+------+------+------+------+------+
|-1 |      |10, X |      |20, 8 |15, 18|
+---+------+------+------+------+------+
| 3 |24, 4 |27, 6 |16, 8 |      | X, 2 |
+---+------+------+------+------+------+
| 5 |14, X |22, X |18, X |      |      |
+---+------+------+------+------+------+

The rules for moving between squares can be somewhat confusing.

For negative numbered columns, the rules are as follows:

Going right moves down one cell.
Going left moves down one cell and then mirrors across the diagonal.
If the right value is X, going left moves up to the diagonal (where the column and row are equal) and left by one.

For positive numbered columns, the rules are as follows:

Going left moves up one cell.
Going right moves up one cell and then mirrors across the diagonal.
If the left value is X, going right moves down to the diagonal and right by one.

Now we can run Dijkstra's algorithm to calculate the best path, using these movement rules to traverse the graph. Our starting positions are (-1, 3) and (3, -1) with initial costs of 5 and 15, respectively. Once we've calculated values for the two end positions (left of (-9, -9) and right of (5, 5)) the smaller of the two is our answer.

The running time for each of the steps are:

O(N log N) for initially sorting the input values along the line
O(N^2) for calculating the table/graph
O(N^2 log N) for running Dijkstra's on the graph (Note: there are at most two edges for any given vertex).

The first two steps are dominated by the last, so your overall runtime is O(N^2 log N) which should be a good enough runtime for the challenge.

Design – How to Detect Sprites in a SpriteSheet

After a little browsing, I created my own solution a few months back but failed to post it here in the interest of anybody that would like to do the same. Therefore, I will quickly describe the method before showing the code. It's in kotlin, but if you're familiar with Java and/or AS3 you should be able to understand the syntax.

The Method

Scan the entire image and store each color in a table.
Determine which color shows up most often- this is the background color
Start at the top left and scan each line.
If a pixel does not match the background color, then it is a sprite.
Use a backtracking algorithm to "fill out" the entire sprite.
Create a rectangle out of this plot.
Store it and cut the sprite out of the image.
Repeat 4-7 until finished.

The Code

The source is available on github, and here is the code w/o documentation or imports.

 fun unpack(spriteSheet: Path): List<Image> {
    Preconditions.checkArgument(Files.exists(spriteSheet),
                                "The file ${spriteSheet.getFileName()} does not exist")

    logger.debug("Loading sprite sheet.")
    // TODO: Convert to png so we have an alpha layer to work with
    val spriteSheetImage = readImage(spriteSheet).toBufferedImage()

    logger.debug("Determining most probable background color.")
    val backgroundColor  = determineProbableBackgroundColor(spriteSheetImage)
    logger.debug("The most probable background color is $backgroundColor")

    return findSprites(spriteSheetImage, backgroundColor) map(::copySubImage.bindFirst(spriteSheetImage))
}

private fun findSprites(image: BufferedImage,
                        backgroundColor: Color): List<Rectangle> {
    val workingImage = copy(image)

    val spriteRectangles = ArrayList<Rectangle>()
    for (pixel in workingImage) {
        val (point, color) = pixel

        if (color != backgroundColor) {
            logger.debug("Found a sprite starting at (${point.x}, ${point.y})")
            val spritePlot = findContiguous(workingImage, point) { it != backgroundColor }
            val spriteRectangle = Rectangle(spritePlot, image)

            logger.debug("The identified sprite has an area of ${spriteRectangle.width}x${spriteRectangle.height}")

            spriteRectangles.add(spriteRectangle)
            eraseSprite(workingImage, backgroundColor, spritePlot)
        }
    }

    logger.info("Found ${spriteRectangles.size()} sprites.")
    return spriteRectangles
}

private fun findContiguous(image: BufferedImage, point: Point, predicate: (Color) -> Boolean): List<Point> {
    val unvisited = LinkedList<Point>()
    val visited   = HashSet<Point>()

    unvisited.addAll(neighbors(point, image) filter { predicate(Color(image.getRGB(it.x, it.y))) })

    while (unvisited.isNotEmpty()) {
        val currentPoint = unvisited.pop()
        val currentColor = Color(image.getRGB(currentPoint.x, currentPoint.y))

        if (predicate(currentColor)) {
            unvisited.addAll(neighbors(currentPoint, image) filter {
                !visited.contains(it) && !unvisited.contains(it) &&
                predicate(Color(image.getRGB(it.x, it.y)))
            })

            visited.add(currentPoint)
        }
    }

    return visited.toList()
}

private fun neighbors(point: Point, image: Image): List<Point> {
    val points = ArrayList<Point>()

    if (point.x > 0)
        points.add(Point(point.x - 1, point.y))
     if (point.x < image.getWidth(null) - 1)
     points.add(Point(point.x + 1, point.y))

     if (point.y > 0)
     points.add(Point(point.x, point.y - 1))
     if (point.y < image.getHeight(null) - 1)
     points.add(Point(point.x, point.y + 1))

     return points
 }

However, the code is not yet complete- packing and creating the metafile is missing- and has been languishing in my Github since March, but I'll hopefully complete it one day.

Best Answer

Related Solutions

Visiting points on a number line while minimizing a cost not related to distance

Design – How to Detect Sprites in a SpriteSheet

Related Topic