C# Code Quality – Preferred Approach to Terminate Reading Loop


When you have to iterate over a reader and the number of items to read is unknown, the only way to proceed is to keep reading until you hit the end.

This is often where you end up needing an endless loop.

  1. There is the always-true loop, which signals that there must be a break or return statement somewhere inside the block.

    int offset = 0;
    while(true)
    {
        Record r = Read(offset);
        if(r == null)
        {
            break;
        }
        // do work
        offset++;
    }
    
  2. There is the double-read for loop: one priming read before the loop and one read at the end of each iteration.

    Record r = Read(0);
    for(int offset = 1; r != null; offset++)
    {
        // do work
        r = Read(offset);
    }
    
  3. There is the single-read while loop. Not all languages allow an assignment inside the loop condition.

    int offset = 0;
    Record r = null;
    while((r = Read(offset++)) != null)
    {
        // do work
    }
    

I'm wondering which approach is the least likely to introduce a bug, the most readable, and the most commonly used.

Every time I have to write one of these I think "there has to be a better way".

Best Answer

I would take a step back here. You're concentrating on the picky details of the code but missing the larger picture. Let's take a look at one of your example loops:

int offset = 0;
while(true)
{
    Record r = Read(offset);
    if(r == null)
    {
        break;
    }
    // do work
    offset++;
}

What is the meaning of this code? The meaning is "do some work to each record in a file". But that is not what the code looks like. The code looks like "maintain an offset. Open a file. Enter a loop with no end condition. Read a record. Test for nullity." All that before we get to the work! The question you should be asking is "how can I make this code's appearance match its semantics?" This code should be:

foreach(Record record in RecordsFromFile())
    DoWork(record);

Now the code reads like its intention. Separate your mechanisms from your semantics. In your original code you mix up the mechanism -- the details of the loop -- with the semantics -- the work done to each record.

Now we have to implement RecordsFromFile(). What's the best way of implementing that? Who cares? That's not the code that anyone is going to be looking at. It's basic mechanism code and it's ten lines long. Write it however you want. How about this?

public IEnumerable<Record> RecordsFromFile()
{
    int offset = 0;
    while(true)
    {
        Record record = Read(offset);
        if (record == null) yield break;
        yield return record;
        offset += 1;
    }
}
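
To try the idea end to end, here is a minimal self-contained sketch. The `Record` class, the in-memory `Store` array, and the body of `Read` are assumptions made purely for illustration; the post never shows them. Only the shape of `RecordsFromFile()` comes from the code above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical record type; the original post does not define it.
public sealed class Record
{
    public string LastName { get; set; }
    public string City { get; set; }
}

public static class RecordSource
{
    // Stand-in for the file: an in-memory array (illustrative assumption).
    private static readonly Record[] Store =
    {
        new Record { LastName = "Smith", City = "London" },
        new Record { LastName = "Jones", City = "Paris" },
        new Record { LastName = "Adams", City = "London" },
    };

    // Mimics the post's Read(offset): returns null once past the last record.
    public static Record Read(int offset)
    {
        return offset >= 0 && offset < Store.Length ? Store[offset] : null;
    }

    // The loop mechanics live here and nowhere else.
    public static IEnumerable<Record> RecordsFromFile()
    {
        int offset = 0;
        while (true)
        {
            Record record = Read(offset);
            if (record == null) yield break;
            yield return record;
            offset += 1;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        // The calling code states only the intent: do something per record.
        foreach (Record record in RecordSource.RecordsFromFile())
            Console.WriteLine(record.LastName + ", " + record.City);
    }
}
```

Swapping the array for a real file or database reader would only touch `Read`; the `foreach` in `Main` would stay the same.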

Now that we are manipulating a lazily computed sequence of records, all sorts of scenarios become possible:

foreach(Record record in RecordsFromFile().Take(10))
    DoWork(record);

foreach(Record record in RecordsFromFile().OrderBy(r=>r.LastName))
    DoWork(record);

foreach(Record record in RecordsFromFile().Where(r=>r.City == "London"))
    DoWork(record);

And so on.
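
One way to see what "lazily computed" buys you is to count how often the underlying read actually runs. The sketch below is an illustration under assumptions: the `Names` array, the `ReadCount` counter, and the `LazyDemo` class are hypothetical helpers, not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical record type, reduced to the one property used here.
public sealed class Record
{
    public string LastName { get; set; }
}

public static class LazyDemo
{
    // Stand-in data source (illustrative assumption).
    private static readonly string[] Names = { "Smith", "Jones", "Adams", "Brown", "Clark" };

    // Incremented on every Read call so the amount of work can be observed.
    public static int ReadCount;

    public static Record Read(int offset)
    {
        ReadCount++;
        return offset >= 0 && offset < Names.Length
            ? new Record { LastName = Names[offset] }
            : null;
    }

    public static IEnumerable<Record> RecordsFromFile()
    {
        int offset = 0;
        while (true)
        {
            Record record = Read(offset);
            if (record == null) yield break;
            yield return record;
            offset += 1;
        }
    }

    public static void Main()
    {
        // Take(2) stops pulling after two records, so the rest is never read.
        ReadCount = 0;
        foreach (Record record in RecordsFromFile().Take(2))
            Console.WriteLine(record.LastName);
        Console.WriteLine("Reads for Take(2): " + ReadCount);

        // OrderBy must see every record before it can yield the first one,
        // so the whole source is read: five records plus the terminating null.
        ReadCount = 0;
        Record first = RecordsFromFile().OrderBy(r => r.LastName).First();
        Console.WriteLine(first.LastName + ", reads for OrderBy: " + ReadCount);
    }
}
```

The practical consequence is that operators such as `Take` and `Where` stream through the records, while `OrderBy` has to consume the entire sequence first, which matters if the source is large.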

Any time you write a loop, ask yourself "does this loop read like a mechanism or like the meaning of the code?" If the answer is "like a mechanism", then try to move that mechanism to its own method, and write the code to make the meaning more visible.
