This is the standard problem known as "Multidimensional Scaling".
There are lots of varieties, so I recommend reading the Wikipedia page on the subject (it's not so long), but in your case it looks like plain PCA (principal component analysis) is the way to go.
In particular, a matrix D of squared distances is closely related to B, the product of the transposed (zero-centered) position matrix X with itself:
B = -0.5 * J D J
where J = I - (1/n) * ones(n, n)
(i.e. 1 - 1/n on the diagonal, -1/n elsewhere)
B = X^T X
(by construction)
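In case you're wondering why that double-centering gives inner products: since D_ij = |x_i|^2 + |x_j|^2 - 2 x_i^T x_j (remember, D holds squared distances), subtracting the row mean and the column mean of D and adding back the grand mean cancels all the |.|^2 terms:
B_ij = -0.5 * (D_ij - rowmean_i - colmean_j + grandmean) = (x_i - m)^T (x_j - m)
where m is the mean of the points. That's exactly what J D J computes elementwise, and it's also why the reconstructed points come out zero-centered.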
PCA will give you the eigenvalues and eigenvectors such that
B = U^T L U
with the rows of U being the eigenvectors and the diagonal of L being the eigenvalues. (If your library returns the eigenvectors as columns of a matrix V, then U = V^T.)
Then it's just a question of multiplication:
X = L^(1/2) U
If you're not used to math this may sound a little scary, but PCA is almost certainly available as a library on whatever platform you're on, and then it's just a question of multiplying a diagonal matrix with a matrix, which is easy to do manually if necessary. PCA will reconstruct a full-dimensional space, so if you want just the 2 or 3 most significant dimensions, simply crop the resultant matrix. The more significant dimensions are those with the highest eigenvalues; most PCA implementations return them sorted (check whether ascending or descending), so you can just take the corresponding few rows.
There are various documents online that explain this in more detail; alas, Wikipedia falls short here. See, for instance, Wickelmaier, F. (2003). "An Introduction to MDS." Reports from the Sound Quality Research Unit (SQRU), No. 7, pp. 9-11, for more info.
The above is a general pattern you might apply in many languages. To verify that it works, I tested it using Eigen, which is the fastest linear algebra library I know of that can express this kind of thing simply.
I'll walk you through it:
#include <iostream>
#include <Eigen/Core>
#include <Eigen/Eigenvalues>
using namespace Eigen;
using namespace std;
#define DBG(s) (cout << #s << ":\n" << (s) << "\n\n")
int main(int argc, char* argv[])
{
The above is just boilerplate: references to Eigen and a simple console-output macro.
auto const n = 4; // 4 points.
MatrixXd points(3, n);
points <<
0.0, 4.0, 2.0, 3.0,
3.0, 1.0, 0.0, -4.0,
0.01, -0.1, -0.05, -0.01;
MatrixXd D(n, n);
for (int i = 0; i < n; i++)
D.col(i) = (points.colwise() - points.col(i)).colwise().squaredNorm().transpose();
DBG(D);
// 0 20.0121 13.0036 58.0004
//20.0121 0 5.0025 26.0081
//13.0036 5.0025 0 17.0016
//58.0004 26.0081 17.0016 0
I'm initializing the input (squared) distance matrix D with real distances to make sure they sort of make sense. Notice that the third dimension's values are all really small: I intend to make a 2D reconstruction, and I'd like data that's not quite perfect :-).
MatrixXd J = MatrixXd::Identity(n, n) - MatrixXd::Constant(n, n, 1.0 / 4.0);
DBG(J);
// 0.75 -0.25 -0.25 -0.25
//-0.25 0.75 -0.25 -0.25
//-0.25 -0.25 0.75 -0.25
//-0.25 -0.25 -0.25 0.75
MatrixXd B = -0.5 * J * D * J;
DBG(B);
// 14.0648 -0.940469 0.561906 -13.6862
//-0.940469 4.06641 -0.436719 -2.68922
// 0.561906 -0.436719 0.0626563 -0.187844
// -13.6862 -2.68922 -0.187844 16.5633
Note that B has a non-zero diagonal (you worried about this in the comments below).
SelfAdjointEigenSolver<MatrixXd> eigendecomp(B);
DBG(eigendecomp.eigenvalues());
//-1.77725e-015
// 0.000568306
// 5.61749
// 29.139
Note that Eigen returns eigenvalues in ascending order; also note that the first eigenvalue is negative. This is a problem, because we're going to take the square root. Of course, this is just a numerical artifact; since our input data was 3D, it makes sense that the least-significant dimension of a 4D representation is just noise. It's tiny, too. In general, you need to make sure you pick the right number of dimensions to use: too few and you'll introduce error; too many and you'll run into negative eigenvalues. Note that if any of these negative eigenvalues were large, that would indicate your input is bad, e.g. that it violates the triangle inequality (the distance from A to C via B is never shorter than the distance from A to C directly).
Your input above has such a violation: points 0 and 2 are the same (they have distance 0), yet point 0 is at distance 0.4 from point 1, while point 2 is at distance 0.2. Regardless of dimensionality or where you put the points, that's impossible, and it will cause negative eigenvalues that aren't just numerical noise. There are various hacks to mitigate that (e.g. clamping negative eigenvalues to zero), but let's get back on track and use those eigenvalues:
MatrixXd X = eigendecomp.eigenvalues().bottomRows(2).cwiseSqrt().asDiagonal()
* eigendecomp.eigenvectors().transpose().bottomRows(2);
That's just the wordy way of saying X = L^(1/2) U
while only considering the most significant 2 dimensions.
DBG(X.transpose() * X);
// 14.0647 -0.940508 0.562078 -13.6863
//-0.940508 4.06638 -0.436623 -2.68925
// 0.562078 -0.436623 0.0622356 -0.187691
// -13.6863 -2.68925 -0.187691 16.5632
Right, so we can verify that the decomposition works: this is quite similar to B. It's not exactly the same because we chopped off two dimensions, after all.
DBG(X);
//-0.999577 1.99533 -0.232164 -0.763594
// 3.61463 0.291588 0.0912992 -3.99751
These points turn out not to be very similar to the original points we started from. That's not actually surprising: distances are invariant under rotation and translation, so you'll find that these points are zero-centered and arbitrarily rotated compared to the points we started with. You could reconstruct the rotation matrix that was effectively applied by running another PCA, but let's not :-).
MatrixXd D2(n, n);
for (int i = 0; i < n; i++)
D2.col(i) = (X.colwise() - X.col(i)).colwise().squaredNorm().transpose();
DBG(D2);
// 0 20.0121 13.0028 58.0004
//20.0121 0 5.00187 26.0081
//13.0028 5.00187 0 17.0008
//58.0004 26.0081 17.0008 0
return 0;
}
As a final verification, note that the distances these new points generate are pretty much identical to the distances we started with. Numerical inaccuracies and more importantly our 3D -> 2D reduction mean it's not completely perfect, but you should be able to see that D is almost the same as D2.
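If you'd rather have the recipe as one reusable function than as a walkthrough, here's how I'd package it. The name reconstructPositions is just one I picked for this example, and clamping negative eigenvalues to zero is the simple mitigation hack mentioned above:
#include <Eigen/Core>
#include <Eigen/Eigenvalues>
using namespace Eigen;

// Reconstruct a (dims x n) position matrix from an n x n matrix of
// squared distances, using the B = -0.5 * J * D * J trick from above.
// Sketch only; the function name is made up for this example.
MatrixXd reconstructPositions(MatrixXd const& D, int dims) {
    auto const n = D.rows();
    MatrixXd J = MatrixXd::Identity(n, n) - MatrixXd::Constant(n, n, 1.0 / n);
    MatrixXd B = -0.5 * J * D * J;
    SelfAdjointEigenSolver<MatrixXd> eig(B);
    // Eigen sorts eigenvalues in ascending order, so the most significant
    // dimensions are the last `dims`; negative eigenvalues (numerical noise
    // or non-Euclidean input) are clamped to zero before the square root.
    VectorXd scale = eig.eigenvalues().bottomRows(dims).cwiseMax(0.0).cwiseSqrt();
    return scale.asDiagonal() * eig.eigenvectors().transpose().bottomRows(dims);
}
Called as reconstructPositions(D, 2) on the D above, this should reproduce the X from the walkthrough (up to sign flips of the rows, which eigendecompositions leave arbitrary).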
Note for when your data really isn't very clean and/or doesn't actually represent Euclidean distances: MDS (using PCA) solves the optimization problem of minimizing the squared error of the distances. That's all fine and dandy, but notice that errors in large distances are (over-)emphasized here (1² - 0² = 1, but 11² - 10² = 21, so MDS will try 21 times as hard to fix the second error). If your distances aren't perfect, PCA will try to make the "most significant", i.e. largest, distances fit best. This is often exactly the wrong thing to do; usually, you care more about local structure than global structure. If your distance matrix is inaccurate and you cannot reconstruct a good overall position matrix, you might want to look at techniques that at least reconstruct local neighbourhoods well. State of the art there (AFAIK) is t-SNE: you won't get an absolutely accurate map from that technique, but you will get the right things close to each other, at least.
Good luck - sounds like a fun problem!
There's a problem with your divide-and-conquer approach of dividing the grid into four quadrants: the solutions for the quadrants are not independent of each other, so you can't find the optimal solution by combining the locally optimal sub-solutions of each quadrant.
+---+---+
| C | D |
+---+---+
| A | B |
+---+---+
Here, D can be calculated independently. However, the solution for C depends on D, B depends on D, and A depends on B, C, D. (Dependency is determined by reachability.)
Solving the problem for a quadrant does not mean we find the best path. Instead, we have to find the best path for each cell on its left and lower edges, since we do not know where the best path will enter that quadrant.
The exact space complexities depend on what we are trying to calculate:
- the best path: this will take O(n) space per record for a quadrant with n-long edges.
- the value of the best path: this will take O(1) space per record.
I'll sidestep that for this discussion and will denote the space requirement with s(n). Then, the solution of a quadrant requires O(n·s(n)) space for the records of the left and lower edge.
The total space complexity of the algorithm is then given by a recurrence. The four quadrants (each with edge length n/2) are solved one after another, so their working space can be reused: S(n) = S(n/2) + O(n · s(n)), which gives S(n) ∈ Θ(n · s(n)) for either choice of s(n).
Writing down the algorithm would be extremely tedious, so I'll not do that here. However, the signature of the recursive solver would be:
solve_quadrant(grid: int[n, n],
x_start: int, x_end: int,
y_start: int, y_end: int,
upper_edge: Result[],
right_edge: Result[])
-> (left_edge: Result[], lower_edge: Result[])
And the top-level invocation would be:
solve(grid: int[n, n]) -> Result {
left_edge, lower_edge = solve_quadrant(grid, 0, n, 0, n, {0, ..., 0}, {0, ..., 0})
return lower_edge[0] // assuming bottom-left is at coordinate (0, 0)
}
Since the divide-and-conquer algorithm has to satisfy the same data dependencies as a dynamic programming solution, it is unlikely to beat the DP solution with space complexity Θ(n · s(n)). In fact, the complexities of the “divide & conquer” algo and the DP algo are equivalent, since they do the same thing in the same way – only the strategy for subdivision is different. However, the DP solution is far easier to program since it doesn't have to consider as many edge conditions.
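For comparison, here is a minimal sketch of that DP solution for the "value of the best path" variant (so s(n) ∈ O(1)). Assumptions not fixed by the question: the path starts at the bottom-left, ends at the top-right, moves only right or up, "best" means minimum cost, and grid[y][x] is the cost of visiting a cell; bestPathValue is just an illustrative name.
#include <algorithm>
#include <limits>
#include <vector>

// Value of the best (minimum-cost) path from the bottom-left cell to the
// top-right cell of an n x n grid, moving only right or up. Only one row
// of records is kept, i.e. Theta(n) space, matching s(n) = O(1) above.
int bestPathValue(std::vector<std::vector<int>> const& grid) {
    int const n = static_cast<int>(grid.size());
    int const INF = std::numeric_limits<int>::max();
    // best[x] holds the cheapest cost from (x, y) to the top-right corner
    // for the row y currently being filled in (we sweep from the top down).
    std::vector<int> best(n, INF);
    for (int y = n - 1; y >= 0; --y) {
        for (int x = n - 1; x >= 0; --x) {
            int const right = (x + 1 < n) ? best[x + 1] : INF;  // already updated for row y
            int const up    = (y + 1 < n) ? best[x]     : INF;  // still holds row y + 1
            int const rest  = (x == n - 1 && y == n - 1) ? 0 : std::min(right, up);
            best[x] = grid[y][x] + rest;
        }
    }
    return best[0];  // the answer for the bottom-left start, cf. lower_edge[0]
}
For the "best path itself" variant (s(n) ∈ O(n)) you'd store a path per cell instead of a single int, which reproduces the Θ(n · s(n)) bound from above.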
Best Answer
In the classic version (k = 1: solve for the best path), the key insight is as follows.
1) In computing the best path for cell (i, j) (best[i][j] is the cost of the best path from the top-left to (i, j)), we notice that this path must come from either its left or its upper neighbor. Thus (leaving aside corner cases for a moment) we do: best[i][j] = value[i][j] + min(best[i-1][j], best[i][j-1]) (min or max, depending on what "best" means).
2) Now, if we want to compute the k best paths from the top-left corner to (i, j), we can again follow a similar, but more general, technique. Here, best[i][j] can be thought of as a list of the k best path costs instead of one best path, computed as best[i][j] = merge_and_select_k(best[i-1][j], best[i][j-1], k).
merge_and_select_k takes two input lists and an integer k: it concatenates the two lists and keeps only the k best values. Of course, you must also add value[i][j] to all the selected items (see the sketch after this list).
3) Loop over the matrix, just like in the classic version, and compute best[i][j] for every i and j. Your final result is the list best[n][n] (k values).
4) This works because we only need to consider best[i-1][j] and best[i][j-1] for computing best[i][j]. Time complexity is O(k * n * n) and space complexity is O(k * n), as we only need to keep the values for the current and previous rows.
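To make steps 1)-4) concrete, here's a sketch in C++ (matching the earlier answer's language). I'm assuming "best" means smallest path sum and that movement is right/down from the top-left; merge_and_select_k is implemented exactly as described above:
#include <algorithm>
#include <iterator>
#include <vector>

// Concatenate two sorted cost lists and keep only the k best (smallest).
std::vector<long long> merge_and_select_k(std::vector<long long> const& a,
                                          std::vector<long long> const& b,
                                          int k) {
    std::vector<long long> merged;
    merged.reserve(a.size() + b.size());
    std::merge(a.begin(), a.end(), b.begin(), b.end(), std::back_inserter(merged));
    if (merged.size() > static_cast<size_t>(k)) merged.resize(k);
    return merged;
}

// k best (smallest) path sums from the top-left to the bottom-right of an
// n x n grid, moving only right or down. O(k*n*n) time; O(k*n) space, since
// only the current and previous rows of lists are kept (step 4).
std::vector<long long> kBestPaths(std::vector<std::vector<int>> const& value, int k) {
    int const n = static_cast<int>(value.size());
    std::vector<std::vector<long long>> prev(n), cur(n);
    std::vector<long long> const none;  // stand-in for a missing neighbor
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) {
            // Step 2: combine the k best lists of the upper and left neighbors.
            std::vector<long long> best = (i == 0 && j == 0)
                ? std::vector<long long>{0}  // empty path at the start cell
                : merge_and_select_k(i > 0 ? prev[j] : none,
                                     j > 0 ? cur[j - 1] : none, k);
            for (auto& c : best) c += value[i][j];  // add this cell's value
            cur[j] = std::move(best);
        }
        std::swap(prev, cur);  // roll the rows: current becomes previous
    }
    return prev[n - 1];  // the k best path sums at the bottom-right cell
}
The lists stay sorted throughout (std::merge preserves order, and adding the same value[i][j] to every element doesn't change it), which is what makes the merge step O(k) per cell.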