GCC Parser – Why Did GCC Switch from Bison to a Recursive Descent Parser?

c, compiler, parsing

Was there a language change that required it or some practical reason why Bison was no longer appropriate or optimal?

I saw on Wikipedia that they switched; the article refers to the GCC 3.4 and GCC 4.1 release notes.

These release notes state:

A hand-written recursive-descent C++ parser has replaced the
YACC-derived C++ parser from previous GCC releases. The new parser
contains much improved infrastructure needed for better parsing of C++
source codes, handling of extensions, and clean separation (where
possible) between proper semantics analysis and parsing. The new
parser fixes many bugs that were found in the old parser.

And:

The old Bison-based C and Objective-C parser has been replaced by a
new, faster hand-written recursive-descent parser

What I would like to know is what actual problems they were having, and why those problems were impossible or impractical to solve with Bison.

Best Answer

GCC switched to hand-written parsing because error messages are more meaningful when using recursive descent techniques, as I explained here.
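To illustrate the error-message point, here is a minimal sketch in Python. It is not GCC's code: the toy arithmetic grammar, the `Parser` class, the `check` helper, and the wording of the messages are all assumptions made up for this example. The idea it shows is that each parsing function knows what construct it is in the middle of, so it can say what it expected and why (for instance, which opening parenthesis is still unclosed).

```python
import re

class ParseError(Exception):
    """Raised with a message that names the construct being parsed."""

TOKEN_RE = re.compile(r"[0-9]+|[-+*()]")

def tokenize(text):
    """Split text into (token, 1-based column) pairs, ending with '<end>'."""
    tokens, pos = [], 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ParseError(f"column {pos + 1}: unexpected character {text[pos]!r}")
        tokens.append((m.group(), pos + 1))
        pos = m.end()
    tokens.append(("<end>", len(text) + 1))
    return tokens

class Parser:
    """Toy recursive descent parser/evaluator for +, -, * and parentheses."""

    def __init__(self, text):
        self.tokens = tokenize(text)
        self.i = 0

    def peek(self):
        return self.tokens[self.i]

    def advance(self):
        tok = self.tokens[self.i]
        self.i += 1
        return tok

    def fail(self, expected, context):
        # The caller supplies the context, which is what makes the message useful.
        tok, col = self.peek()
        raise ParseError(f"column {col}: expected {expected} {context}, found {tok!r}")

    def parse(self):
        value = self.expr()
        if self.peek()[0] != "<end>":
            self.fail("an operator or end of input", "after expression")
        return value

    def expr(self):  # expr := term (('+' | '-') term)*
        value = self.term()
        while self.peek()[0] in ("+", "-"):
            op, _ = self.advance()
            rhs = self.term()
            value = value + rhs if op == "+" else value - rhs
        return value

    def term(self):  # term := factor ('*' factor)*
        value = self.factor()
        while self.peek()[0] == "*":
            self.advance()
            value = value * self.factor()
        return value

    def factor(self):  # factor := NUMBER | '(' expr ')'
        tok, col = self.peek()
        if tok.isdigit():
            self.advance()
            return int(tok)
        if tok == "(":
            self.advance()
            value = self.expr()
            if self.peek()[0] != ")":
                self.fail("')'", f"to close '(' opened at column {col}")
            self.advance()
            return value
        self.fail("a number or '('", "at start of operand")

def check(text):
    """Return the evaluated result, or the error message as a string."""
    try:
        return Parser(text).parse()
    except ParseError as err:
        return str(err)
```

Calling `check("2 * (3 + 4")` reports the missing `)` together with the column of the `(` it was supposed to close. A table-driven parser typically only knows that the current token is not acceptable in the current state, so getting a message of that kind out of it takes extra work.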

Also, C++ is becoming so syntactically complex that using a parser generator for it is not worthwhile.
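One well-known source of that complexity (my own illustration, not something taken from the release notes) is that the statement `a * b;` is a declaration of a pointer `b` if `a` names a type, and a multiplication otherwise. The parser therefore needs information from semantic analysis while it is parsing. A rough sketch in Python, where `classify_statement` and `type_names` are hypothetical names and only this one statement shape is handled:

```python
def classify_statement(tokens, type_names):
    """Classify a statement shaped like `a * b ;` using a set of known type names.

    `type_names` stands in for the compiler's symbol table (an assumption of
    this sketch); a hand-written parser can simply consult it at this point.
    """
    if len(tokens) == 4 and tokens[1] == "*" and tokens[3] == ";":
        if tokens[0] in type_names:
            return f"declaration: {tokens[2]} is a pointer to {tokens[0]}"
        return f"expression: {tokens[0]} multiplied by {tokens[2]}"
    raise ValueError("this sketch only handles statements shaped like 'a * b ;'")

stmt = ["a", "*", "b", ";"]
print(classify_statement(stmt, type_names={"a"}))   # `a` was declared as a type
print(classify_statement(stmt, type_names=set()))   # `a` is an ordinary variable
```

In a hand-written parser this kind of lookup is just an ordinary `if` in the middle of a parsing function; with a generated parser it usually has to be worked around, for example by having the lexer consult the symbol table.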

Finally, the bulk of the work in a real compiler is not parsing but optimizing. GCC's middle-end optimization passes are much more complex than its parser.

(BTW, you can customize GCC, e.g. with plugins or with MELT, but you cannot really extend the syntax of the language it accepts, except by adding attributes and pragmas.)
