Merging similar graphs based solely on the graph structure

graph

I am looking for (or attempting to design) a technique for matching nodes from very similar graphs based on the structure of the graph*. In the examples below, the top graph has 5 nodes, and the bottom graph has 6 nodes. I would like to match the nodes from the top graph to the nodes in the bottom graph, such that the "0" nodes match, and the "1" nodes match, etc. This seems logically possible, because I can do it in my head for these simple examples. Now I just need to express my intuition in code. Are there any established algorithms or patterns I might consider?

(* When I say based on the structure of the graph, I mean the solution shouldn't depend on the node labels; the numeric labels on the nodes are only for demonstration.)

UPDATE: It has been correctly pointed out that using the structure alone, most of these graphs can't be solved with 100% accuracy. I've added some thoughts to each graph.

I'm also interested in the performance of any potential solutions. How well will they scale? Could I merge graphs with millions of nodes?

In more complex cases, I recognize that the best solution may be subject to interpretation. Still, I'm hoping for a "good" way to merge complex graphs.

(These are directed graphs; the thicker portion of an edge represents the head.)

Example 1
This graph is a cycle with 1 node "attached to the outside." After attaching a second node to the outside it's not possible to determine whether the new node (node 5) is attached to node 1 or 3. The 4 nodes composing the cycle can't be perfectly matched either.
Example 2
Nodes 1 and 5 can be swapped in the bottom graph without changing the structure. Thus, you can only guess with 50/50 accuracy which is the new node.
Example 3
This graph has symmetry in 2 dimensions and can be mirrored without changing the structure. Meaning you can swap several nodes without changing the structure.
Example 4
This last graph is the only one for which you can identify the new node with 100% accuracy.

Thanks Kirk Broadhurst, for pointing this out.

Best Answer

As others pointed out, a site like cstheory may be more helpful. From what I can understand of your problem this sounds pretty much exactly like the subgraph isomorphism problem:

Given two graphs G and H find a subgraph of H that is isomorphic to G.

What you tried to express with "matching" 0 and 1, and so on, is referred to as isomorphism in a graph-theoretic setting. This is the typical mathematical concept of isomorphisms and general graph isomorphism is an interesting problem, for which it is yet unclear if it is NP-complete.

The good news is that this problem is well-known, well-researched and there exists a recursive algorithm that solves it (the paper is linked from the wp article).

The bad news is that the algorithm is exponential (except for some special cases, which seem to not apply to your problem), and in fact, the very problem itself is NP-complete. Yes, unfortunate as it my seem, for the graph isomorphism problem it is not known whether it is NP-complete, but the sub-graph isomorphism problem definitely is NP-complete.

Related Topic