q1. PyPy is the interpreter, an RPython program which can interpret Python code. There is no output language, so we can't consider it a compiler, right?
PyPy is similar to CPython: both have a compiler plus an interpreter. CPython has a compiler written in C that compiles Python to Python VM bytecode, then executes the bytecode in an interpreter written in C. PyPy has a compiler written in RPython that compiles Python to Python VM bytecode, then executes it in the PyPy interpreter written in RPython.
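The compile-then-interpret split can be observed from inside Python itself. The following is a minimal sketch using the built-in compile() and exec() and the standard dis module; the sample source string and the variable names are made up for illustration.

```python
import dis

# Made-up sample program, used only for illustration.
source = "x = 2 + 3\nresult = x * 10"

# Compile step: Python source text -> code object holding VM bytecode.
code_obj = compile(source, "<example>", "exec")

# Interpret step: the VM executes the bytecode in the given namespace.
namespace = {}
exec(code_obj, namespace)
print(namespace["result"])  # 50

# Show the bytecode instructions; the exact listing varies between
# implementations and versions.
dis.dis(code_obj)
```

The disassembly differs from one Python version to another, which is why no expected listing is shown here.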
q2. Can a compiler py2rpy exist, transforming all Python programs to RPython? Which language it is written in is irrelevant. If yes, we get another compiler, py2c. What's the difference between pypy and py2rpy in nature? Is py2rpy much harder to write than pypy?
Can a compiler py2rpy exist? Theoretically, yes. Turing completeness guarantees it.
One method to construct py2rpy is to simply include the source code of a Python interpreter written in RPython in the generated source code. An example of a py2rpy compiler, written in Bash:
# suppose that /pypy/source/ contains the source code for pypy (i.e. Python -> nothing, written in RPython)
mkdir -p /tmp/py2rpy
cp -r /pypy/source/ /tmp/py2rpy/pypy/
# suppose $inputfile contains arbitrary Python source code
cp "$inputfile" /tmp/py2rpy/prog.py
# generate the main.rpy
echo "import pypy; pypy.execfile('prog.py')" > /tmp/py2rpy/main.rpy
cp -r /tmp/py2rpy/ "$outputdir"
Now, whenever you need to translate Python code to RPython code, you call this script, which produces in $outputdir an RPython main.rpy, the source code of the Python interpreter written in RPython, and a binary blob prog.py. You can then execute the generated RPython script by calling rpython main.rpy.
(Note: since I'm not familiar with the RPython project, the syntax for calling the rpython interpreter, the ability to import pypy and call pypy.execfile, and the .rpy extension are purely made up, but I think you get the point.)
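The same trick can be sketched in runnable form by using Python as both source and target language. The "translator" does not translate anything: it emits a program that carries the input as data and hands it to an interpreter. The function name trivial_translate and the generated variable names are hypothetical, and exec() stands in for the bundled interpreter.

```python
def trivial_translate(python_source: str) -> str:
    """Return target-program text that embeds the source and interprets it.

    Hypothetical helper: nothing is really translated, the input is
    carried along as a string and run by an interpreter at run time.
    """
    return (
        "# generated program: the input travels as data\n"
        f"PROG = {python_source!r}\n"
        "namespace = {}\n"
        "exec(compile(PROG, 'prog.py', 'exec'), namespace)\n"
    )


# Usage: "translate" a tiny program, then run the generated program.
generated = trivial_translate("answer = 6 * 7")
scope = {}
exec(generated, scope)
print(scope["namespace"]["answer"])  # 42
```

This is why "can such a compiler exist?" has an easy yes: bundling an interpreter with the input always works, even though it produces nothing a person would call a real translation.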
q3. Is there some general rules or theory available about this?
Yes, any Turing-complete language can theoretically be translated to any other Turing-complete language. Some languages may be much more difficult to translate than others, but if the question is "is it possible?", the answer is "yes".
q4. ...
There is no question here.
I wouldn't get too hung up on the terminology. Google has some good definitions:
Design: (search for 'definition of design')
noun:
a plan or drawing produced to show the look and function or workings of a building, garment, or other object before it is built or made.
verb:
decide upon the look and functioning of (a building, garment, or other object), typically by making a detailed drawing of it.
Actual drawings sometimes exist (think flowcharts), but most designs I'm familiar with are typically written descriptions.
Implementation:
the process of putting a decision or plan into effect; execution.
As for your specific questions:
- They mean pretty much what the dictionary says they mean. "Designing a thing" means figuring out how it's going to work, possibly what it looks like, etc. "The design" is the output of the process of "Designing a thing". "Implementing a design" means actually doing the work to convert the idea (the design) into something real.
- Short answer, "yes". I would substitute "designing a system" for "design of a system" and "implementing a system" for "implementation of a system", but you have the right idea.
- If you asked this question by itself, it would be closed as 'too broad' within minutes. The process of designing anything will have common steps including collecting requirements, identifying possible solutions, analyzing those solutions, etc. Likewise, the process of implementing anything will also have certain steps in common including the actual construction, verification that the construction is correct, etc.
Best Answer
Think about how a human language works. In theory, there's some kind of document(s) that lays down what the rules are for what constitutes "English". There's a set of definitions for words, rules for how grammar works to assemble those words into meaningful sentences, and so forth. Each of us communicates in English using our own ideas of what those words mean and how English grammar works.
So we can analogize this to programming languages. You have a language like Python. The "people" who "speak" Python in this analogy are not programmers; the "speakers" of Python are what we call "implementations". These are the tools which understand Python and make the computer actually do what the Python instructions say to do.
CPython is one such "speaker" of Python. Once, it was the only Python implementation. CPython is a specific implementation that is largely written in C. Jython is an implementation of Python written in Java. They both effectively do the same thing, understanding the same grammar for the same purpose. But they do it in different ways.
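A running program can ask which "speaker" is executing it. This is a small sketch using the standard platform and sys modules; the variable names are illustrative only.

```python
import platform
import sys

# Name of the implementation running this script,
# for example "CPython", "PyPy" or "Jython".
impl_name = platform.python_implementation()

# Version of the Python language that this implementation targets.
language_version = platform.python_version()

print(f"Implementation: {impl_name}")
print(f"Language version: {language_version}")
print(f"sys.implementation.name: {sys.implementation.name}")
```

The same script can run unchanged under different implementations and will report a different implementation name on each.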
Note that in the first paragraph, I only said that "in theory" languages have some document defining what they mean. That's because there are many human languages that don't have formal rules. And even within formal rules, there will always be dialects, neologisms, and other things as languages evolve naturally. Note that this makes it possible for one person who speaks one language to fail to understand someone who ostensibly speaks the same language.
This happens because the two people, the two "implementations," disagree about what the language actually is.
Programming languages tend to take one of two routes with regard to definition, or "standardization". They can have a de-facto standard or a de-jure standard. Indeed, the word "implementation" in this context means to "implement" a "standard".
Languages with a de-jure standard have a hard document(s) that lays down everything about what the language is. Hopefully, it formally specifies everything about the language, detailing in depth the behavior of every syntactic construct. If two implementations differ in behavior when given the same code, then one of the following is happening:
Note that there is a huge difference between a formal specification for a language and reference documentation.
A de-facto standard is where you define the language by picking one implementation and saying "whatever that thing says is language X is language X". This means that if two implementations differ, the de-facto standard one is what is correct.
This also means that if the de-facto standard has quirks in it, every implementation of that language must reproduce those quirks. For example, if the de-facto standard has a hard limit of only 32 function parameters for some reason, then implementations of that language which allow more function parameters are technically wrong.
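The 32-parameter example can be turned into a toy conformance check, in which the reference implementation's behavior is the specification by definition. Everything in this sketch is hypothetical: the two "implementations" are stand-in functions and the limit is the made-up one from the paragraph above.

```python
MAX_PARAMS = 32  # hypothetical limit baked into the reference implementation


def reference_accepts(param_count: int) -> bool:
    """Stand-in for the de-facto standard implementation (hypothetical)."""
    return param_count <= MAX_PARAMS


def alternative_accepts(param_count: int) -> bool:
    """Stand-in for a second implementation with no limit (hypothetical)."""
    return True


def find_divergences(candidate, reference, cases):
    """Return the cases where the candidate disagrees with the reference."""
    return [case for case in cases if candidate(case) != reference(case)]


divergences = find_divergences(alternative_accepts, reference_accepts, [1, 32, 33])
print(divergences)  # [33]
```

Under a de-facto standard, every entry in the result counts against the alternative implementation, even where its behavior looks more reasonable.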
If the de-facto standard has (non-crashing) bugs in it... well, that's the thing: it can't have bugs. "Bug" is defined relative to the standard. And the de-facto standard implementation is the standard; "bugs" in it are therefore features, until the next version comes out that changes things. And such things are, on some level, changes to language features, not bug fixes.
Many languages start with de-facto standards and evolve towards de-jure standards. Both C# and Java started without formal language specifications, but both of them have them now. C++ was an ill-defined mish-mash of stuff until C++98 formalized what it meant to be C++.
Python is kind of half-and-half at this point. It has a language reference document that purports to define the language, but it isn't a "formal specification". From the language reference:
So in the case of ambiguity, you generally would defer to CPython's behavior. Then again, there are plenty of formal specifications that have ambiguous areas as well.