Algorithms – Finding Points with Pairwise Distances Matching a Distance Matrix

algorithms, geometry

Problem. I have a symmetric distance matrix with entries between zero and one, like this one:

D = ( 0.0 0.4 0.0 0.5 )
    ( 0.4 0.0 0.2 1.0 )
    ( 0.0 0.2 0.0 0.7 )
    ( 0.5 1.0 0.7 0.0 )

I would like to find points in the plane that have (approximately) the pairwise distances given in D. I understand that this will usually not be possible with strictly correct distances, so I would be happy with a "good" approximation.

My matrices are smallish, no more than 10×10, so performance is not an issue.

Question. Does anyone know of an algorithm to do this?

Background. I have sets of probability densities between which I calculate Hellinger distances, which I would like to visualize as above. Each set contains no more than 10 densities (see above), but I have a couple of hundred sets.

What I did so far.

  • I did consider posting at math.SE, but looking at what gets tagged as "geometry" there, it seems like this kind of computational geometry question would be more on-topic here. If the community thinks this should be migrated, please go ahead.
  • This looks like a straightforward problem in computational geometry, and I would assume that anyone involved in clustering might be interested in such a visualization, but I haven't been able to google anything.
  • One simple approach would be to randomly plonk down points and perturb them until the distance matrix is close to D, e.g., using Simulated Annealing or a Genetic Algorithm. I have to admit that I haven't tried that yet, hoping for a smarter way (a rough sketch of this idea follows after this list).
  • One specific operationalization of a "good" approximation in the sense above is Problem 4 in the Open Problems section here, with k=2. Now, while finding an algorithm that is guaranteed to find the minimum l1-distance between D and the resulting distance matrix may be an open question, it still seems possible that there is at least some approximation to this optimal solution. If I don't get an answer here, I'll mail the gentleman who posed that problem and ask whether he knows of any approximation algorithm (and post any answer I get here).
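
To make that last idea a bit more concrete, here is the kind of thing I have in mind: an untested sketch in C++ with Eigen, using a plain greedy hill climb rather than full Simulated Annealing.

#include <iostream>
#include <random>
#include <Eigen/Core>

using namespace Eigen;

// Stress: sum of squared differences between the current pairwise distances and the targets.
double stress(const MatrixXd& pts, const MatrixXd& D)
{
    double s = 0.0;
    for (int i = 0; i < pts.cols(); i++)
        for (int j = i + 1; j < pts.cols(); j++) {
            double diff = (pts.col(i) - pts.col(j)).norm() - D(i, j);
            s += diff * diff;
        }
    return s;
}

int main()
{
    const int n = 4;
    MatrixXd D(n, n); // target distance matrix from above
    D << 0.0, 0.4, 0.0, 0.5,
         0.4, 0.0, 0.2, 1.0,
         0.0, 0.2, 0.0, 0.7,
         0.5, 1.0, 0.7, 0.0;

    std::mt19937 rng(42);
    std::normal_distribution<double> step(0.0, 0.05);

    MatrixXd pts = MatrixXd::Random(2, n); // random 2D starting points, one per column
    double best = stress(pts, D);

    for (int iter = 0; iter < 100000; iter++) { // keep any perturbation that lowers the stress
        MatrixXd cand = pts;
        cand(rng() % 2, rng() % n) += step(rng);
        double s = stress(cand, D);
        if (s < best) { best = s; pts = cand; }
    }

    std::cout << "stress: " << best << "\npoints:\n" << pts << "\n";
    return 0;
}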

Best Answer

This is the standard problem known as "Multidimensional Scaling".

There are lots of varieties, so I recommend reading the wikipedia page on the subject (it's not so long), but in your case, it looks like plain PCA (Principal component analysis) is the way to go.

In particular, a (squared) distance matrix D is closely related to B, the product of the transposed position matrix X with itself:

B = -0.5 * J D J where J = I - (1/n) * ones(n,n) (i.e. 1 - 1/n on the diagonal, -1/n elsewhere)

B = X^T X (by construction)
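
(In case the "by construction" step isn't obvious, this is the standard double-centering identity: for points x_i centered at the origin, d_ij^2 = |x_i|^2 + |x_j|^2 - 2 x_i·x_j. Multiplying by J on both sides subtracts the row and column means, which cancels the |x_i|^2 and |x_j|^2 terms, and the factor -0.5 leaves B_ij = x_i·x_j, i.e. B = X^T X.)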

PCA will give you the eigenvalues and eigenvectors such that

B = U^T L U

with the rows of U being the eigenvectors (i.e. U is the transpose of the usual matrix whose columns are eigenvectors), and the diagonal of L being the eigenvalues.

Then it's just a question of multiplication:

X = L^(1/2) U

If you're not used to math this may sound a little scary, but PCA is almost certainly available as a library on whatever platform you're on, and then it's just a question of multiplying a diagonal matrix with a matrix, which is easy to do manually if necessary. PCA will reconstruct a full-dimensional space, so if you want just the 2 or 3 most significant dimensions, simply crop the resultant matrix. The more significant dimensions are those with the highest eigenvalues; most PCA algorithms list them in order, so you can just take the top few rows.

There are various documents online that explain this in more detail; alas, Wikipedia falls short here. See, for instance, Wickelmaier, F. (2003). An Introduction to MDS. Reports from the Sound Quality Research Unit (SQRU), No. 7, pp. 9–11, for more info.


The above is a general pattern you might apply in many languages. To verify this works, I tested it using Eigen, which is the fastest linear algebra library I know of that can express this kind of thing simply.

I'll walk you through it:

#include <iostream>
#include <Eigen/Core>
#include <Eigen/Eigenvalues> 

using namespace Eigen;
using namespace std;

#define DBG(s) (cout << #s << ":\n" << (s) << "\n\n")


int main(int argc, char* argv[])
{

The above is just boilerplate: the Eigen headers and a simple console output macro.

    auto const n = 4;//4 points.

    MatrixXd points(3, n);
    points <<
        0.0,   4.0,  2.0,   3.0,
        3.0,   1.0,  0.0,  -4.0,
        0.01, -0.1, -0.05, -0.01;


    MatrixXd D(n, n);
    for (int i = 0; i < n; i++)
        D.col(i) = (points.colwise() - points.col(i)).colwise().squaredNorm().transpose();

    DBG(D);
    //      0 20.0121 13.0036 58.0004
    //20.0121       0  5.0025 26.0081
    //13.0036  5.0025       0 17.0016
    //58.0004 26.0081 17.0016       0

I'm initializing the input (squared) distance matrix D from actual points to make sure it sort of makes sense. Notice that the third dimension's values are all really small - I intend to make a 2D reconstruction and I'd like data that's not quite perfect :-).

    MatrixXd J = MatrixXd::Identity(n, n) - MatrixXd::Constant(n, n, 1.0 / n);
    DBG(J);
    // 0.75 -0.25 -0.25 -0.25
    //-0.25  0.75 -0.25 -0.25
    //-0.25 -0.25  0.75 -0.25
    //-0.25 -0.25 -0.25  0.75

    MatrixXd B = -0.5 * J * D * J;
    DBG(B);
    //  14.0648 -0.940469  0.561906  -13.6862
    //-0.940469   4.06641 -0.436719  -2.68922
    // 0.561906 -0.436719 0.0626563 -0.187844
    // -13.6862  -2.68922 -0.187844   16.5633

Note that B has a non-zero diagonal (you worried about this in the comments below).

    SelfAdjointEigenSolver<MatrixXd> eigendecomp(B);
    DBG(eigendecomp.eigenvalues());
    //-1.77725e-015
    //  0.000568306
    //      5.61749
    //       29.139

Note that Eigen returns eigenvalues in ascending order; also note that the first eigenvalue is negative. This is a problem, because we're going to take the square root. Of course, this is just a numerical artifact; since our input data was 3D, it makes sense that the least-significant dimension of a 4D representation is just noise. It's tiny too. In general, you need to make sure that you pick the right number of dimensions to use, because too few and you'll introduce error; too many and you'll run into negative eigenvalues. Note that if any of these negative eigenvalues were large, that would indicate your input is bad, e.g. that it violates the triangle inequality: going from A to C via B can never be shorter than going from A to C directly.

Your input above has such a violation: points 0 and 2 are the same (they have 0 distance), yet point 0 is at distance 0.4 from point 1 while point 2 is at distance only 0.2 from point 1. Clearly, regardless of dimensionality or where you put the points, this is impossible. That's going to cause negative eigenvalues that aren't just numerical noise. There are various hacks to mitigate that (e.g. clamping negative eigenvalues to zero), but let's get back on track and use those eigenvalues:

    MatrixXd X = eigendecomp.eigenvalues().bottomRows(2).cwiseSqrt().asDiagonal()
        * eigendecomp.eigenvectors().transpose().bottomRows(2);

That's just the wordy way of saying X = L^(1/2) U while only considering the most significant 2 dimensions.

    DBG(X.transpose() * X);
    //  14.0647 -0.940508  0.562078  -13.6863
    //-0.940508   4.06638 -0.436623  -2.68925
    // 0.562078 -0.436623 0.0622356 -0.187691
    // -13.6863  -2.68925 -0.187691   16.5632

Right, so we can verify that the decomposition works: this is quite similar to B. It's not exactly the same because we chopped off two dimensions, after all.

    DBG(X);
    //-0.999577   1.99533 -0.232164 -0.763594
    //  3.61463  0.291588 0.0912992  -3.99751

These points turn out not to be very similar to the original points we started from. That's not actually surprising; after all, distances are invariant under rotation and translation, so you'll find that these points are zero-centered, and they're arbitrarily rotated compared to the points we started with. You could actually reconstruct the rotation matrix effectively used with another PCA, but let's not :-).

    MatrixXd D2(n, n);
    for (int i = 0; i < n; i++)
        D2.col(i) = (X.colwise() - X.col(i)).colwise().squaredNorm().transpose();
    DBG(D2);
    //      0 20.0121 13.0028 58.0004
    //20.0121       0 5.00187 26.0081
    //13.0028 5.00187       0 17.0008
    //58.0004 26.0081 17.0008       0

    return 0;
}

As a final verification, note that the distances these new points generate are pretty much identical to the distances we started with. Numerical inaccuracies and more importantly our 3D -> 2D reduction mean it's not completely perfect, but you should be able to see that D is almost the same as D2.
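
One detail for when you plug in your own matrix: the D used above holds squared distances (note the squaredNorm calls), while your Hellinger matrix holds plain distances, so square its entries first. A hypothetical continuation of the same program (the variable names are mine), with negative eigenvalues clamped to zero before taking the square root, since your matrix violates the triangle inequality:

    MatrixXd H(4, 4); // the distance matrix from the question
    H << 0.0, 0.4, 0.0, 0.5,
         0.4, 0.0, 0.2, 1.0,
         0.0, 0.2, 0.0, 0.7,
         0.5, 1.0, 0.7, 0.0;
    MatrixXd Dsq = H.cwiseProduct(H);                 // element-wise squaring
    MatrixXd Jc  = MatrixXd::Identity(4, 4) - MatrixXd::Constant(4, 4, 1.0 / 4.0);
    MatrixXd Bq  = -0.5 * Jc * Dsq * Jc;
    SelfAdjointEigenSolver<MatrixXd> es(Bq);
    VectorXd ev  = es.eigenvalues().bottomRows(2).cwiseMax(0.0); // clamp to avoid sqrt of negatives
    MatrixXd Xq  = ev.cwiseSqrt().asDiagonal()
        * es.eigenvectors().transpose().bottomRows(2); // 2D coordinates, one point per column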


Note for when your data really isn't very clean and/or doesn't actually represent Euclidean distances: MDS (using PCA) solves the optimization problem of minimizing the squared error of the distances. That's all fine and dandy, but notice that errors in large distances are (over-)emphasized here (1² − 0² = 1, but 11² − 10² = 21, so MDS will try 21 times as hard to fix the second error). If your distances aren't perfect, PCA will try to make the "most significant", i.e. largest, distances fit best. This is often actually exactly the wrong thing to do; usually, you care more about local structure than global structure. If your distance matrix is inaccurate and you cannot reconstruct a good overall position matrix, you might want to look at techniques that at least reconstruct local neighbourhoods best. State of the art there (AFAIK) is t-SNE: you won't get an absolutely accurate map from that technique, but you will get the right things close to each other, at least.


Good luck - sounds like a fun problem!
