Why Old Programming Languages Are Still Revised

language-design

This question is not, "Why do people still use old programming languages?" I understand that quite well. In fact the two programming languages I know best are C and Scheme, both of which date back to the 70s.

Recently I was reading about the changes in C99 and C11 versus C89 (which still seems to be the most widely used version of C in practice, and the version I learned from K&R). Looking around, it seems like every programming language in heavy use gets a new specification at least once a decade or so. Even Fortran is still getting new revisions, even though most of the people using it still seem to be working in FORTRAN 77.

Contrast this with the approach of, say, the typesetting system TeX. In 1989, with the release of TeX 3.0, Donald Knuth declared that TeX was feature-complete and future releases would contain only bug fixes. Even beyond this, he has stated that upon his death, "all remaining bugs will become features" and absolutely no further updates will be made. Others are free to fork TeX and have done so, but the resulting systems are renamed to indicate that they are different from the official TeX. This is not because Knuth thinks TeX is perfect, but because he understands the value of a stable, predictable system that will do the same thing in fifty years that it does now.

Why do most programming language designers not follow the same principle? Of course, when a language is relatively new, it makes sense that it will go through a period of rapid change before settling down. And no one can really object to minor changes that don't do much more than codify existing pseudo-standards or correct unintended readings. But when a language still seems to need improvement after ten or twenty years, why not just fork it or start over, rather than try to change what is already in use? If some people really want to do object-oriented programming in Fortran, why not create "Objective Fortran" for that purpose, and leave Fortran itself alone?

I suppose one could say that, regardless of future revisions, C89 is already a standard and nothing stops people from continuing to use it. This is sort of true, but declaring a new standard has practical consequences for users of the old one: GCC, in pedantic mode, warns about syntax that is either deprecated in C99 or has a subtly different meaning under it, so C89 programmers can't simply ignore the new standard. There must therefore be some benefit in C99 sufficient to justify imposing this overhead on everyone who uses the language.
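To make that concrete, here is a small example of my own (hypothetical file divide.c, not taken from either standard's text) where the same source line means different things under C89 and C99, because C99 added // line comments; it is exactly the sort of thing -pedantic complains about in C89 mode:

    /* Sketch: the initializer below means different things under C89 and C99,
     * because C99 added // line comments.
     *
     *   gcc -std=c89 -pedantic divide.c   (warns that // is not ISO C90)
     *   gcc -std=c99 -pedantic divide.c   (accepts the // comment)
     */
    #include <stdio.h>

    int main(void)
    {
        int result = 10 //* divide by */ 2
            ;
        /* Strict C89: the block comment after the first slash collapses to a
         * space, so the initializer reads 10 / 2 and result is 5.
         * C99: // comments out the rest of the line, so result is 10. */
        printf("%d\n", result);
        return 0;
    }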

This is a real question, not an invitation to argue. Obviously I do have an opinion on this, but at the moment I'm simply trying to understand why this isn't already how things are done. I suppose the question is:

What are the (real or perceived) advantages of updating a language standard, as opposed to creating a new language based on the old?

Best Answer

I think the motivation for language designers to revise existing languages is to introduce innovation while ensuring that their target developer community stays together and adopts the new revision: moving an existing community to a new revision of an existing language is more effective than building a new community around a new language. Of course, this forces some developers to adopt the new standard even if they were fine with the old one: in a community, you sometimes have to impose certain decisions on a minority in order to keep the community together.

Also, consider that a general-purpose language tries to serve as many programmers as possible and is often applied in areas it wasn't designed for. So instead of aiming for simplicity and stability of design, the community can choose to incorporate new ideas (even from other languages) as the language moves into new application areas. In such a scenario, you cannot expect to get everything right on the first attempt.

This means that languages can undergo deep change over the years, and the latest revision may look very different from the first one. The name of the language is kept not for technical reasons, but because the community of developers agrees to use an old name for a new language. So the name of a programming language identifies the community of its users rather than the language itself.

IMO the reason many developers find this acceptable (or even desirable) is that a gradual transition to a slightly different language is easier and less confusing than a jump to a completely new language that would take more time and effort to master. Consider that there are a number of developers who have one or two favourite languages and are not very keen on learning new (radically different) ones. And even for those who do like learning new things, learning a new programming language is always a hard and time-consuming activity.

Also, it can be preferable to be part of a large community with a rich ecosystem than of a very small community using a lesser-known language. So, when the community decides to move on, many members follow in order to avoid isolation.

As a side comment, I think the argument that revising a language allows evolution while maintaining compatibility with legacy code is rather weak: Java can call C code, Scala integrates easily with Java code, and C# can integrate with C++. There are many examples showing that you can interface with legacy code written in another language without needing source-code compatibility.
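As a rough sketch of what I mean (the class, method and library names here are made up, but the JNI calls are the standard ones), this is all the glue the Java-calls-C case needs; the legacy code stays in C, the new code stays in Java, and no source-level compatibility between the two languages is required:

    /* Native (C) side of a hypothetical JNI binding.  The Java side would only
     * need the following declaration (class and method names are made up):
     *
     *     public class Legacy {
     *         static { System.loadLibrary("legacy"); }
     *         public static native int checksum(byte[] data);
     *     }
     *
     * The old C code is reused as-is; the Java code never has to be
     * source-compatible with it, it just calls across the JNI boundary.
     */
    #include <jni.h>

    JNIEXPORT jint JNICALL
    Java_Legacy_checksum(JNIEnv *env, jclass cls, jbyteArray data)
    {
        jsize len = (*env)->GetArrayLength(env, data);
        jbyte *bytes = (*env)->GetByteArrayElements(env, data, NULL);
        jint sum = 0;
        jsize i;

        (void) cls;  /* unused for a static native method */

        for (i = 0; i < len; i++)
            sum += (unsigned char) bytes[i];

        /* JNI_ABORT: release the buffer without copying changes back. */
        (*env)->ReleaseByteArrayElements(env, data, bytes, JNI_ABORT);
        return sum;
    }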

NOTE

From some answers and comments, I gather that some readers have interpreted the question as "Why do programming languages need to evolve?"

I think this is not the main point of the question, since it is obvious that programming languages need to evolve and why (new requirements, new ideas). The question is rather "Why does this evolution have to happen inside a programming language instead of spawning many new languages?"
