I was wondering why C++ is a good choice for writing a compiler. Of course C is good for this purpose too, since many compilers are written in either C or C++, but I am more interested in C++ this time. Are there any good reasons? I searched the Internet but could not find any.
Why Use C++ to Write a Compiler
ccompiler
Related Solutions
q1. PyPy is the interpreter, an RPython program that can interpret Python code; there is no output language, so we can't consider it a compiler, right?
PyPy is similar to CPython: both have a compiler plus an interpreter. CPython has a compiler written in C that compiles Python to Python VM bytecode, then executes the bytecode in an interpreter written in C. PyPy has a compiler written in RPython that compiles Python to Python VM bytecode, then executes it in the PyPy interpreter written in RPython.
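To make the "compile to bytecode, then interpret" split concrete, here is a minimal sketch in C++. It is not how CPython or PyPy are implemented; the postfix input syntax, the `Op`/`Instr` types, and the `compile` and `run` functions are all made up for illustration, and the code assumes well-formed input.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical bytecode for a tiny stack machine.
enum class Op { Push, Add, Mul };
struct Instr { Op op; int arg; };

// "Compiler" stage: turns postfix source text such as "2 3 + 4 *" into bytecode.
std::vector<Instr> compile(const std::string& src) {
    std::vector<Instr> code;
    std::size_t i = 0;
    while (i < src.size()) {
        if (std::isdigit(static_cast<unsigned char>(src[i]))) {
            int value = 0;
            while (i < src.size() && std::isdigit(static_cast<unsigned char>(src[i]))) {
                value = value * 10 + (src[i] - '0');
                ++i;
            }
            code.push_back({Op::Push, value});
        } else {
            if (src[i] == '+') code.push_back({Op::Add, 0});
            else if (src[i] == '*') code.push_back({Op::Mul, 0});
            ++i;  // operators are consumed; whitespace and anything else is skipped
        }
    }
    return code;
}

// "Interpreter" stage: executes the bytecode on an operand stack.
// Assumes the bytecode is well formed (no stack underflow checks).
int run(const std::vector<Instr>& code) {
    std::vector<int> stack;
    for (const Instr& in : code) {
        if (in.op == Op::Push) {
            stack.push_back(in.arg);
            continue;
        }
        int rhs = stack.back(); stack.pop_back();
        int lhs = stack.back(); stack.pop_back();
        stack.push_back(in.op == Op::Add ? lhs + rhs : lhs * rhs);
    }
    return stack.back();
}
```

Calling `run(compile("2 3 + 4 *"))` goes through both stages; the bytecode never has to leave the process, which is why such a program looks like "just an interpreter" from the outside.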
q2. Can a compiler py2rpy exist, transforming all Python programs to RPython? The language it's written in is irrelevant. If yes, we get another compiler, py2c. What's the difference between pypy and py2rpy in nature? Is py2rpy much harder to write than pypy?
Can a compiler py2rpy exist? Theoretically, yes. Turing completeness guarantees it.
One method to construct py2rpy is to simply include the source code of a Python interpreter written in RPython in the generated source code. An example of a py2rpy compiler, written in Bash:
# suppose that /pypy/source/ contains the source code for PyPy (i.e. the Python interpreter written in RPython)
mkdir -p /tmp/py2rpy
cp -r /pypy/source/ /tmp/py2rpy/pypy/
# suppose $inputfile contains arbitrary Python source code
cp "$inputfile" /tmp/py2rpy/prog.py
# generate main.rpy
echo "import pypy; pypy.execfile('prog.py')" > /tmp/py2rpy/main.rpy
cp -r /tmp/py2rpy/ "$outputdir"
Now whenever you need to translate Python code to RPython code, you call this script, which produces, in $outputdir, an RPython main.rpy, the source code of the Python interpreter written in RPython, and a binary blob prog.py. You can then execute the generated RPython script by calling rpython main.rpy.
(Note: since I'm not familiar with the RPython project, the syntax for calling the rpython interpreter, the ability to import pypy and call pypy.execfile, and the .rpy extension are purely made up, but I think you get the point.)
q3. Are there any general rules or theory available about this?
Yes, any Turing-complete language can theoretically be translated to any other Turing-complete language. Some languages may be much more difficult to translate than others, but if the question is "is it possible?", the answer is "yes".
q4. ...
There is no question here.
They can, as shown by new languages that do.
But a design decision was made all those years ago (when the C compiler was multiple independent stages), and now, to maintain compatibility, the pre-processor has to act in a certain way to make sure old code compiles as expected.
As C++ inherits the way it processes header files from C, it kept the same techniques. We are supporting an old design decision, but changing the way it works is too risky: lots of code could potentially break. So now we have to teach new users of the language how to use include guards.
There are a couple of tricks with header files where you deliberately include a file multiple times (this does actually provide a useful feature). Though if we redesigned the paradigm from scratch, we could make this the non-default way to include files.
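As a rough illustration of both points, the sketch below shows an include guard and one well-known multiple-expansion trick in a single file. The header name point.h, the file colors.def, and every identifier are hypothetical. The second part uses the "X-macro" pattern; that this is the kind of trick meant above is an assumption, and a macro list stands in for the file that would normally be included repeatedly.

```cpp
#include <cassert>
#include <string>

// Part 1: an include guard.
// The two identical blocks below simulate what the preprocessor sees when a
// hypothetical header "point.h" is #included twice in one translation unit.
#ifndef POINT_H
#define POINT_H
struct Point { int x; int y; };
#endif  // POINT_H

#ifndef POINT_H  // second "inclusion": POINT_H is already defined, so this is skipped
#define POINT_H
struct Point { int x; int y; };  // would be a redefinition error without the guard
#endif  // POINT_H

// Part 2: deliberate repeated expansion (the X-macro pattern).
// COLOR_LIST stands in for an unguarded file such as a hypothetical
// "colors.def" that would be #included once per expansion.
#define COLOR_LIST(X) X(Red) X(Green) X(Blue)

// First expansion: generate the enumerators.
#define AS_ENUM(name) name,
enum class Color { COLOR_LIST(AS_ENUM) };
#undef AS_ENUM

// Second expansion: generate a matching name lookup from the same list.
#define AS_CASE(name) case Color::name: return #name;
inline std::string colorName(Color c) {
    switch (c) { COLOR_LIST(AS_CASE) }
    return "unknown";
}
#undef AS_CASE
```

With the guard, the second copy of `Point` is never seen by the compiler. With the list macro, adding a colour in one place updates both the enum and `colorName`, which is the useful feature that repeated inclusion buys.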
Best Answer
C++ has two sides to it. It has a low-level side, which makes it seem like a natural language for doing low-level things like code generation. It also has a high-level side (which C does not) that lets you structure a complex application (like a compiler) in a logical, object-oriented way while still maintaining performance. Because it has both the low-level and high-level aspects, it's a good choice for large applications that require low-level features or performance.
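A small sketch of what that combination can look like in practice: the syntax tree is modelled with classes and virtual functions (the high-level side), while each node writes raw bytes into a buffer (the low-level side). The opcode values, the instruction encoding, and all class names here are invented for illustration and do not correspond to any real instruction set or compiler.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Hypothetical opcodes for an imaginary stack machine.
constexpr std::uint8_t OP_PUSH = 0x01;  // followed by a 4-byte little-endian operand
constexpr std::uint8_t OP_ADD = 0x02;

// High-level side: the syntax tree is an ordinary class hierarchy.
struct Expr {
    virtual ~Expr() = default;
    virtual void emit(Bytes& out) const = 0;
};

struct Number : Expr {
    std::uint32_t value;
    explicit Number(std::uint32_t v) : value(v) {}
    // Low-level side: encode the operand byte by byte, least significant first.
    void emit(Bytes& out) const override {
        out.push_back(OP_PUSH);
        for (int shift = 0; shift < 32; shift += 8)
            out.push_back(static_cast<std::uint8_t>((value >> shift) & 0xFF));
    }
};

struct Add : Expr {
    std::unique_ptr<Expr> lhs, rhs;
    Add(std::unique_ptr<Expr> l, std::unique_ptr<Expr> r)
        : lhs(std::move(l)), rhs(std::move(r)) {}
    // Operands first, then the operator: postfix order suits a stack machine.
    void emit(Bytes& out) const override {
        lhs->emit(out);
        rhs->emit(out);
        out.push_back(OP_ADD);
    }
};

// Builds the tree for "1 + 258" and returns its encoded form.
inline Bytes compileSample() {
    Add tree(std::make_unique<Number>(1), std::make_unique<Number>(258));
    Bytes out;
    tree.emit(out);
    return out;
}
```

Ownership of child nodes is handled by `std::unique_ptr`, so the tree cleans itself up without manual `delete` calls, while the emitted buffer is still a plain contiguous array of bytes.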