Java – Interview: Remove Loop in linked list – Java

data structuresjavalinked list

I was asked this question in interview: "How to detect the loop in linked list?", I solved this but immediately the interviewer asked me how do I remove the loop in a linked list. I fumbled.

So any pointers on how to solve this, may be pseudo code, or method definition?

I'm comfortable with Java so I have tagged this question under java.

For Instance this linked list has a loop

 0--->1---->2---->3---->4---->5---->6
                  ▲                 |
                  |                 ▼
                 11<—-22<—-12<—-9<—-8

Best Answer

There are two parts to this problem:

Detect if there is a loop in the list
Identify the start of the loop

Once you know where the loop starts, it's easy to identify the last element in the list since it's the element in the list following the start of the loop that ends up pointing back to the start of the loop. It is then trivial to set the next pointer/reference of this element to null to correct the cyclic link list (not circular linked list which is where the last elements points back to the first - this would be a specific instance of cyclic lists).

Floyd's cycle detect algorithm, also called the tortoise and hare algorithm as it involves using two pointers/references that move at different speeds, is one way of detecting the cycle. If there is a cycle, the two pointers (say p1 and p2) will end up pointing to the same element after a finite number of steps. Interestingly, it can be proved that the element at which they meet will be the same distance to the start of the loop (continuing to traverse the list in the same, forward direction) as the start of the loop is to the head of the list. That is, if the linear part of the list has k elements, the two pointers will meet inside the loop of length m at a point m-k from the start of the loop or k elements to the 'end' of the loop (of course, it's a loop so it has no 'end' - it's just the 'start' once again). And that gives us a way to find the start of the loop:
Once a cycle has been detected, let p2 remain pointing to the element where the loop for the step above terminated but reset p1 so that it's pointing back to the head of the list. Now, move each pointer one element at a time. Since p2 began inside the loop, it will continue looping. After k steps (equal to the distance of the start of the loop from the head of the list), p1 and p2 will meet again. This will give you a reference to the start of the loop.
It is now easy to set p1 (or p2) to point to the element starting the loop and traverse the loop until p1 ends up pointing back to the starting element. At this point p1 is referencing the 'last' element list and it's next pointer can be set to null.

Here's some quick and dirty Java code assuming a linked list of Nodes where a Node has a next reference. This could be optimized but it should give you the basic idea:

Node slow, fast, start;
fast = slow = head;

//PART I - Detect if a loop exists
while (true)
{
    // fast will always fall off the end of the list if it is linear
    if (fast == null || fast.next == null)
    {
        // no loop
        return;
    }
    else if (fast == slow || fast.next == slow)
    {
        // detected a loop
        break;
    }
    else
    {
        fast = fast.next.next; // move 2 nodes at at time
        slow = slow.next; // move 1 node at a time
    }
}

//PART II - Identify the node that is the start of the loop
fast = head; //reset one of the references to head of list

//until both the references are one short of the common element which is the start of the loop
while(fast.next != slow.next) 
{
    fast = fast.next;
    slow = slow.next;
}

start = fast.next;

//PART III - Eliminate the loop by setting the 'next' pointer 
//of the last element to null
fast = start;
while(fast.next != start)
{
    fast = fast.next;
}

fast.next = null; //break the loop

This explanation might help the why behind Part II:

Assume the length of the cycle is M, and the length of the rest of the linked list is L. Let's figure out what is the position in the cycle when t1/t2 first meet?

Define the first node in the cycle is position 0, following the links we have position 1, 2,..., up to M-1. ( when we walk in the cycle, our current position is (walk_length) mod M, right?) Suppose t1/t2 first meet at position p, then their travel time are the same, (L+k1*M+p)/v = (L+k2*M+p)/2v for some k1
So it concludes that if t1 start from p, t2 start from head and move at the same speed, then will grantee to meet at position 0, the first node of the cycle. QED.

More references:

http://www.quora.com/How-does-Floyds-cycle-finding-algorithm-work
Explain how finding cycle start node in cycle linked list work?
Proof of detecting the start of cycle in linked list
Hristo's answer to this question on this page also quotes an explanation from an interview book

Related Solutions

Java – What are the differences between a HashMap and a Hashtable in Java

There are several differences between HashMap and Hashtable in Java:

Hashtable is synchronized, whereas HashMap is not. This makes HashMap better for non-threaded applications, as unsynchronized Objects typically perform better than synchronized ones.
Hashtable does not allow null keys or values. HashMap allows one null key and any number of null values.
One of HashMap's subclasses is LinkedHashMap, so in the event that you'd want predictable iteration order (which is insertion order by default), you could easily swap out the HashMap for a LinkedHashMap. This wouldn't be as easy if you were using Hashtable.

Since synchronization is not an issue for you, I'd recommend HashMap. If synchronization becomes an issue, you may also look at ConcurrentHashMap.

Java – Is Java “pass-by-reference” or “pass-by-value”

Java is always pass-by-value. Unfortunately, when we deal with objects we are really dealing with object-handles called references which are passed-by-value as well. This terminology and semantics easily confuse many beginners.

It goes like this:

public static void main(String[] args) {
    Dog aDog = new Dog("Max");
    Dog oldDog = aDog;

    // we pass the object to foo
    foo(aDog);
    // aDog variable is still pointing to the "Max" dog when foo(...) returns
    aDog.getName().equals("Max"); // true
    aDog.getName().equals("Fifi"); // false
    aDog == oldDog; // true
}

public static void foo(Dog d) {
    d.getName().equals("Max"); // true
    // change d inside of foo() to point to a new Dog instance "Fifi"
    d = new Dog("Fifi");
    d.getName().equals("Fifi"); // true
}

In the example above aDog.getName() will still return "Max". The value aDog within main is not changed in the function foo with the Dog "Fifi" as the object reference is passed by value. If it were passed by reference, then the aDog.getName() in main would return "Fifi" after the call to foo.

Likewise:

public static void main(String[] args) {
    Dog aDog = new Dog("Max");
    Dog oldDog = aDog;

    foo(aDog);
    // when foo(...) returns, the name of the dog has been changed to "Fifi"
    aDog.getName().equals("Fifi"); // true
    // but it is still the same dog:
    aDog == oldDog; // true
}

public static void foo(Dog d) {
    d.getName().equals("Max"); // true
    // this changes the name of d to be "Fifi"
    d.setName("Fifi");
}

In the above example, Fifi is the dog's name after call to foo(aDog) because the object's name was set inside of foo(...). Any operations that foo performs on d are such that, for all practical purposes, they are performed on aDog, but it is not possible to change the value of the variable aDog itself.

For more information on pass by reference and pass by value, consult the following SO answer: https://stackoverflow.com/a/430958/6005228. This explains more thoroughly the semantics and history behind the two and also explains why Java and many other modern languages appear to do both in certain cases.

Best Answer

Related Solutions

Java – What are the differences between a HashMap and a Hashtable in Java

Java – Is Java “pass-by-reference” or “pass-by-value”

Related Topic