Java – Do modern languages still use parser generators

haskell java parsing ruby swift-language

I was reading about the GCC compiler suite on Wikipedia when this came up:

GCC started out using LALR parsers generated with Bison, but gradually switched to hand-written recursive-descent parsers; for C++ in 2004, and for C and Objective-C in 2006. Currently all front ends use hand-written recursive-descent parsers

So, going by that last sentence (and to the extent that I trust Wikipedia), I can say that
"C (gcc), C++ (g++), Objective-C, Objective-C++, Fortran (gfortran), Java (gcj), Ada (GNAT), Go (gccgo), Pascal (gpc),… Mercury, Modula-2, Modula-3, PL/I, D (gdc), and VHDL (ghdl)" are all front ends that no longer use a parser generator. That is, they all use hand-written parsers.

My question then is, is this practice ubiquitous? Specifically, I'm looking for exact answers to "does the standard/official implementation of x have a hand-written parser" for x in [Python, Swift, Ruby, Java, Scala, ML, Haskell]? (Actually, information on any other languages is also welcome here.) I'm sure I can find this on my own after a lot of digging. But I'm also sure this is easily answerable by the community. Thanks!

Best Answer

AFAIK, GCC uses hand-written parsers mainly to improve syntax error diagnostics (i.e. to give messages on syntax errors that are meaningful to humans).

Parsing theory (and the parser generators descended from it) is mostly about recognizing and parsing correct input. But we also expect a compiler to give a meaningful error message for incorrect input, and to keep parsing the rest of the input meaningfully after the syntax error.
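To make that concrete, here is a minimal sketch of a hand-written recursive-descent parser that reports a readable message and then recovers. The toy grammar (`ID = NUM + NUM ... ;`), the `Parser` class, and the "skip to the next `;`" recovery strategy are all assumptions chosen for illustration; this is not how GCC is actually implemented.

```python
import re


class ParseError(Exception):
    """Raised when the parser meets a token it did not expect."""


def tokenize(text):
    """Split text into (kind, value, column) triples, ending with EOF."""
    tokens = []
    for m in re.finditer(r"\d+|[A-Za-z_]\w*|\S", text):
        s = m.group()
        if s[0].isdigit():
            kind = "NUM"
        elif s[0].isalpha() or s[0] == "_":
            kind = "ID"
        else:
            kind = "OP"
        tokens.append((kind, s, m.start()))
    tokens.append(("EOF", "", len(text)))
    return tokens


class Parser:
    def __init__(self, text):
        self.toks = tokenize(text)
        self.i = 0
        self.errors = []

    def peek(self):
        return self.toks[self.i]

    def advance(self):
        tok = self.toks[self.i]
        if tok[0] != "EOF":  # never step past the EOF sentinel
            self.i += 1
        return tok

    def expect(self, kind, value=None, what="a token"):
        k, v, pos = self.peek()
        if k == kind and (value is None or v == value):
            return self.advance()
        found = "end of input" if k == "EOF" else repr(v)
        # Each call site says in plain words what it was looking for.
        raise ParseError(f"col {pos}: expected {what}, found {found}")

    def statement(self):
        # stmt := ID '=' expr ';'
        name = self.expect("ID", what="a variable name")[1]
        self.expect("OP", "=", what=f"'=' after {name!r}")
        value = self.expr()
        self.expect("OP", ";", what="';' to end the statement")
        return (name, value)

    def expr(self):
        # expr := NUM ('+' NUM)*
        total = int(self.expect("NUM", what="a number")[1])
        while self.peek()[1] == "+":
            self.advance()
            total += int(self.expect("NUM", what="a number after '+'")[1])
        return total

    def synchronize(self):
        # Panic-mode recovery: discard tokens up to and including the next ';'.
        while self.peek()[0] != "EOF":
            if self.advance()[1] == ";":
                return

    def program(self):
        results = []
        while self.peek()[0] != "EOF":
            try:
                results.append(self.statement())
            except ParseError as e:
                self.errors.append(str(e))
                self.synchronize()  # keep going after the error
        return results
```

Calling `Parser("a = 1 + 2; b = + 3; c = 4;").program()` still yields the two valid statements, while `errors` records one message pointing at the stray `+`. Because every `expect` call is ordinary code, the wording and the recovery point can be tuned per construct, which is harder to do inside a generated table-driven parser.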

Also, legacy languages such as C11 or C++11 (which are conceptually old, even if their latest revisions are only three years old) are not at all context-free. Handling that context sensitivity in grammars for parser generators (e.g. bison or even menhir) is tediously difficult.
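A well-known instance of this context sensitivity in C is the statement `T * x;`, which declares a pointer if `T` names a type and is a multiplication otherwise. The sketch below is a deliberately tiny illustration of that idea; the `parse` function and its three-token statement forms are hypothetical and handle nothing beyond this one case.

```python
def parse(source):
    """Classify each ';'-terminated statement, tracking typedef names as we go.

    Only two forms are understood (an assumption of this sketch):
      typedef <base> <name>
      <a> * <b>
    """
    typedef_names = set()  # the "context" a context-free grammar lacks
    out = []
    for stmt in source.split(";"):
        toks = stmt.replace("*", " * ").split()
        if not toks:
            continue  # empty piece after the final ';'
        if len(toks) == 3 and toks[0] == "typedef":
            typedef_names.add(toks[2])
            out.append(("typedef", toks[2]))
        elif len(toks) == 3 and toks[1] == "*":
            # Same token sequence, different parse, depending on earlier input.
            kind = "declaration" if toks[0] in typedef_names else "expression"
            out.append((kind, toks[0], toks[2]))
        else:
            raise ValueError(f"unsupported statement: {stmt.strip()!r}")
    return out
```

With `typedef int T;` in front, `T * x;` comes back as a declaration; without it, the same text is classified as an expression. A hand-written parser can consult such a table wherever it likes, whereas with a generator the usual workaround is to feed the table back into the lexer.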