Visiting points on a number line while minimizing a cost not related to distance

algorithmsgraph

I need some help on this ACM ICPC problem. My current idea is to model this as a shortest path problem, which is described under the problem statement.

Problem

There are N = 1000 nuclear waste containers located along a 1-D number line at distinct positions from -500,000 to 500,000, except x=0. A person is tasked with collecting all the waste bins. Each second that a waste container isn't collected, it emits 1 unit of radiation. The person starts at x = 0 and can move 1 unit every second, and collecting the waste takes a negligible amount of time. We want to find the minimum amount of radiation released while collecting all of the containers.

Sample Input:

4 Containers located at [-12, -2, 3, 7].

The best order to collect these containers is [-2, 3, 7, -12], for a minimum emitting of 50 units. Explanation: the person goes to -2 in 2 seconds and during that time 2 units of radiation are emitted. He then goes to 3 (distance: 5) so that barrel has released 2 + 5 = 7 units of radiation. He takes 4 more seconds to get to x = 7 where that barrel has emitted 2 + 5 + 4 = 11 units. He takes 19 seconds to get to x = -12 where that barrel has emitted 2 + 5 + 4 + 19 = 30 units. 2 + 7 + 11 + 30 = 50, which is the answer.

Notes

There is an obvious O(N!) solution. However, I've explored greedy methods such as moving to the closest one, or moving to the closest cluster but those haven't worked.

I've thought about this problem for quite a while, and have kind of modeled it as a graph search problem:

  1. We insert 0 in as a baseline position (This will be the initial state)
  2. Then, we sort the positions from least to greatest.
  3. We then do a BFS/PFS, where the state consists of
    • Two integers l and r that represent a contiguous range in the sorted position array that we have visited already
    • An integer loc that tells us whether we're on the left or right endpoint of the range
    • An integer time that tells us the elapsed time
    • An integer 'cost' that tells us the total cost so far (based on nodes we've visited)
  4. From each state we can move to [l – 1, r] and [l, r + 1], tweaking the other 3 integers accordingly
  5. Final state is [0, N], checking both ending positions.

However, it seems that [L, R, loc] does not uniquely define a state, and we have to store L, R, loc, and time, while minimizing cost at each of these. This leads to an exponential algorithm, which is still way too slow for any good.

Can anyone help me expand on my idea or push my into the right direction?

Edit: Maybe this can be modeled as a dynamic programming optimization problem? Thinking about it, it has the same issues as the graph search solution – just because the current cost is low doesn't mean it is the optimal answer for that sub problem, since the time also affects the answer greatly.

Greedy doesn't work: I have a greedy selection algorithm that estimates the cost of moving to a certain place (e.g. if we move right, we double the distances to the left barrels and such).

Could you do a Priority-first search, with a heuristic? The heuristic could combine the cost of the current trip with the amount of time elapsed.

Best Answer

I think you can improve this by looking at the problem as a directed graph of pairs of positions.

For this example I will use the line with values -9, -6, -1, 3, and 5.

Because it's too hard to draw a graph with just text, I'm going to represent the pairs as a table. We can think of the cells as representing the state where all containers between those two positions have been collected. Each cell has two values, one representing the cost to go left, the other representing the cost to go right (to the next container).

The values can simply be calculated as the distance between the two points multiplied by the number of barrels outside of those two points + 1. Cells where both numbers have the same sign represent cases when all barrels of the opposite sign have been collected. These are calculated using only the number of barrels in the direction away from zero.

In the table values of X mean that you can't go in that direction (all the barrels in that direction have been taken). The rows represent the current position of the collector and the columns represent the location of the next opposite barrel.

    +------+------+------+------+------+
    |  -9  |  -6  |  -1  |   3  |   5  |
+---+------+------+------+------+------+
|-9 |      |      |      | X, 24| X, 14|
+---+------+------+------+------+------+
|-6 | 3, X |      |      | 9, 27| 6, 22|
+---+------+------+------+------+------+
|-1 |      |10, X |      |20, 8 |15, 18|
+---+------+------+------+------+------+
| 3 |24, 4 |27, 6 |16, 8 |      | X, 2 |
+---+------+------+------+------+------+
| 5 |14, X |22, X |18, X |      |      |
+---+------+------+------+------+------+

The rules for moving between squares can be somewhat confusing.

For negative numbered columns, the rules are as follows:

  • Going right moves down one cell.
  • Going left moves down one cell and then mirrors across the diagonal.
  • If the right value is X, going left moves up to the diagonal (where the column and row are equal) and left by one.

For positive numbered columns, the rules are as follows:

  • Going left moves up one cell.
  • Going right moves up one cell and then mirrors across the diagonal.
  • If the left value is X, going right moves down to the diagonal and right by one.

Now we can run Dijkstra's algorithm to calculate the best path, using these movement rules to traverse the graph. Our starting positions are (-1, 3) and (3, -1) with initial costs of 5 and 15, respectively. Once we've calculated values for the two end positions (left of (-9, -9) and right of (5, 5)) the smaller of the two is our answer.

The running time for each of the steps are:

  • O(N log N) for initially sorting the input values along the line
  • O(N^2) for calculating the table/graph
  • O(N^2 log N) for running Dijkstra's on the graph (Note: there are at most two edges for any given vertex).

The first two steps are dominated by the last, so your overall runtime is O(N^2 log N) which should be a good enough runtime for the challenge.

Related Topic