First, I would definitely frame the first version as a for-loop:
for (List<String> currentStrings = getCurrentStrings();
currentStrings.size() > 0; // if your List has an isEmpty() prefer it
currentStrings = getCurrentStrings()) {
...
}
Unfortunately there's no idiomatic way in C++, Java or C# that I know of to get rid of the duplication between initializer and incrementer. I personally like abstracting the looping pattern into an Iterable
or Enumerable
or whatever your language provides. But in the end, that just moves the duplication into a reusable place. Here's a C# example:
IEnumerable<T> ValidResults<T>(Func<T> grab, Func<bool, T> validate) {
for (T t = grab(); validate(t); t = grab()) {
yield return t;
}
}
// != null is a common condition
IEnumerable<T> NonNullResults<T>(Func<T> grab) where T : class {
return ValidResults(grab, t => t != null);
}
Now you can do this:
foreach(var currentStrings in NonNullResults(getCurrentStrings)) {
...
}
C#'s yield
makes writing this easy; it's uglier in Java or C++.
C++ culture is more accepting of assignment-in-condition than the other languages, and implicit boolean conversions are actually used in some idioms, e.g. type queries:
if (Derived* d = dynamic_cast<Derived*>(base)) {...}
The above relies on the implicit conversion of pointers to bool and is idiomatic. Here's another:
std::string s;
while (std::getline(std::cin, s)) {...}
This modifies within the condition.
The common pattern, however, is that the condition itself is trivial, usually relying completely on some implicit conversion to bool. Since collections don't do that, putting an empty test there would be considered less idiomatic.
C culture is even more accepting, with the fgetc
loop idiom looking like this:
int c;
while((c = fgetc(stream)) != EOF) {...}
But in higher-level languages, this is frowned upon, because with the higher level usually comes lesser acceptance of tricky code.
In a nutshell: What your teacher probably meant is that the semantics of while
is pretty much the same in most languages, while the semantics of for
may change considerably (see discussion below). Hence, abstract language independent proof are more reliable with a while
, but one should be careful that a proof with a for
loop may not match the semantics of the for
loop in many languages.
Your question is not precise enough (though that may not be your
fault).
The point is that, afaik, there is no official, ISO supported
standard, or otherwise officially accepted reference definition of
for
and while
loops. The definition depends on the programming
language.
Hence you cannot make any general statement regarding their
equivalence before you have defined precisely what each can do. I
adress that more precisely, since it is one of the main argument used
in other answers (and the discussion will be useful in what follows).
On intertranslatability of for
and while
loops
Summary: it depends on the programming language, but is always
possible a long as you can have one infinite loop and a way to get out
of it.
But you can make such a statement for a specific programming language,
and the answer will depend on the features fo the language.
That also means that there is no general proof, but only one for each
programming language.
One thing that is generally true is that a while
loop can generally
mimic a for
loop, because the while loop can do the exit condition
testing of the for loop, doing the initialisation of the control
variable with an assignment before entering the loop, and doing the
incrementation at the end of the loop body, so that
for i from 1 by 2 to 10 do { xxx }
becomes
i=1
while i≤11 do { xxx; i←i+2 }
This more or less works for most languages, but it is not as obvious as
it seem, and there may be many "details" to worry about.
For example, in many languages, the for
loop evaluates it 3
arguments (initial value, increment, and final value) as strict
arguments, evaluated once before entering the loop, while others will
take then as thunk arguments to be reevaluated at each turn, or
possibly as lazy argument to be evaluated only when first needed.
Another point may be that the increment variable may be local to the
for
loop, or have to be a local variable of the function where the
loop appears.
Depending on such issues, the translation of a for
to a while
may
vary widely, though it is usually possible to achieve it.
The same holds for the converse, thranslating a while
into a for
loop.
Th first problem is that a while
loop will always reevaluate the
exit condition at each turn. But some for
loops do not provide for a
condition that is reevaluated at each turn, other than comparison of
the control variable with some fixed value computed on loop entry.
Then the translation is not possible unless there is some other mean
to jump out of the loop on some arbitrary conditions.
That is achievable with various devices, usually starting with a
conditional statement testing the condition, followed by an a jump out
implemented, as available, by a loop exit statement, a return
statement (after encapsulating the loop in a function), a goto
statement or an exception raising.
In other words, it is again very dependent on languages, and possibly
on subtle features of languages.
This say, as answered by @milleniumbug, the intertranslation is easy
in the language C, because a for
lopp is essentially a while
loop
plus some extra for an incremented control variable.
But this does not necessarily apply to other languages, and most
likely not in the same way.
This being said, programming languages are usually supposed to have
Turing power with only one of these loops, since all you need for it
is one infinite loop. So, as long as you have some way of looping for
ever, and possibly deciding to stop, you are pretty sure you can mimic
any other construct ... but not necessarily easily.
Regarding proofs
Summary: There is no reason known to me to assert that proofs should be significantly harder with one or the other (unless some weird feature of the language).
There is probably a misunderstanding, or your teacher had his mind on
something else.
Formals semantics can be defined for the various kinds of loops
defined in programming languages, and then used for proving
properties.
It may be, again depending on the language, that conducting formal
proofs regarding programs may be more complex in some cases. But that
depends on the language.
I cannot imagine a reason why proofs should be significantly harder with one construct more than with the other. The for
loop may be more complex since it can offer, as in C, all that is done with a while
plus other things. But if you did it with a while
, you would have to add the extras in some other form.
I could use the formal general argument of intertranslatability, as long as there is the possibility for a single infinite loop. I will however refrain from doing that, as the constructions involved are nothing you want to deal with in a proof, and it would clearly be an unfair statement, at least in practice.
Following the above discussion, however, we have seen that the difficulties for intertranslatability come from the great variability of the for
loop from language to language. Hence the following conclusion which is probably the right answer:
One possibility to understand your teacher's statement is that the semantics of the while
loop is pretty much the same in all programming languages, while the syntax and semantics of the for
loop can vary significantly from language to language. Hence, it is possible to make general "abstract" proofs with while
loops that have language independent semantics to a good extent, while this is not possible for the for
loop that has syntax and semantics changing too much from language to language. But this does not apply within a given language, when the semantics of both are precisely defined.
My best suggestion is that you should ask your teacher what he precisely meant, and whether he can give you an example. Misphrasing or misunderstanding is a common event.
Best Answer
Well, since a code smell is something that makes you take a second look at it, which you yourself are doing, I'd say it definitely qualifies. However, code smells don't automatically need removing, just a hard look to make sure it's really the best way to solve the problem.
In this particular case, the reason you don't often see code like that is it can cause the loop to never terminate under certain input conditions. You're also commingling two different responsibilities into the loop: detecting section starts and processing a section. I would try to have one loop that only detects section boundaries, and once the entire section is known, pass the section contents to another function for processing.
It's also possibly a sign your format is complex enough that a homegrown parser is going to have difficulty catching all the boundary conditions. You might want to look into a full parser like bison or antlr.