Python – the definition of implementation in programming languages? What is CPython?

implementations python

I came across this word "implementation". CPython is one of the most common implementations of Python. What exactly is an implementation?

I researched a bit on how Python code runs. First, it is compiled and converted into bytecode and then, using the PVM or an interpreter (not sure which), it is converted into machine language, which is then executed by the CPU.

In all of these procedures, where is CPython (or Jython, PyPy, etc.)?

Are the PVM and the interpreter the same?

Secondly, I read somewhere that the PVM is written in C. Is that true? Is that what it means when they say the implementation is in C?

Best Answer

Think about how a human language works. In theory, there's some kind of document(s) that lays down what the rules are for what constitutes "English". There's a set of definitions for words, rules for how grammar works to assemble those words into meaningful sentences, and so forth. Each of us communicates in English using our own ideas of what those words mean and how English grammar works.

So we can analogize this to programming languages. You have a language like Python. The "people" who "speak" Python in this analogy are not programmers; the "speakers" of Python are what we call "implementations". These are the tools which understand Python and make the computer actually do what the Python instructions say to do.

CPython is one such "speaker" of Python. Once, it was the only Python implementation. CPython is a specific implementation that is largely written in C. Jython is an implementation of Python written in Java. They both effectively do the same thing, understanding the same grammar for the same purpose. But they do it in different ways.
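As a small illustration of the point above, a running program can ask which "speaker" is executing it. This is only a sketch using the standard library's platform and sys modules; the variable names are chosen here for illustration, and the output depends on which implementation you run it with.

```python
import platform
import sys

# Human-readable name of the implementation running this script,
# for example "CPython", "PyPy" or "Jython".
implementation = platform.python_implementation()

# Lower-case identifier the implementation reports about itself.
internal_name = sys.implementation.name

# Version of the Python language this implementation claims to support.
language_version = platform.python_version()

print(f"Implementation: {implementation} ({internal_name})")
print(f"Language version: {language_version}")
```

The same script can be run unchanged under different implementations; only the reported names should differ.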

Note that in the first paragraph, I only said that "in theory" languages have some document defining what they mean. That's because there are many human languages that don't have formal rules. And even within formal rules, there will always be dialects, neologisms, and other things as languages evolve naturally. Note that this makes it possible for one person who speaks one language to fail to understand someone who ostensibly speaks the same language.

This happens because the two people, the two "implementations," disagree about what the language actually is.

Programming languages tend to take one of two routes with regard to definition, or "standardization". They can have a de-facto standard or a de-jure standard. Indeed, the word "implementation" in this context means to "implement" a "standard".

Languages with a de-jure standard have a hard document(s) that lays down everything about what the language is. Hopefully, it formally specifies everything about the language, detailing in depth the behavior of every syntactic construct. If two implementations differ in behavior when given the same code, then one of the following is happening:

1. One or both of them is not implementing the standard correctly.
2. The standard is poorly specified for that particular piece of language. That is, it doesn't say what the behavior ought to be, or is confusingly worded such that it is ambiguous as to what should actually happen.
3. The standard explicitly says that the behavior of the code is either defined by the implementation or completely undefined (C++ loves saying this).
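To make the cases above concrete, here is a toy sketch rather than anything taken from a real standard. Imagine a "standard" that only says "divide two integers and return an integer" without saying how to round. The two functions below are hypothetical implementations of that vague rule, and find_disagreements is an illustrative helper that feeds both the same inputs.

```python
def divide_floor(a: int, b: int) -> int:
    """Hypothetical implementation A: rounds toward negative infinity."""
    return a // b


def divide_truncate(a: int, b: int) -> int:
    """Hypothetical implementation B: rounds toward zero."""
    return int(a / b)


def find_disagreements(impl_a, impl_b, cases):
    """Return (inputs, result_a, result_b) for every case where the two differ."""
    disagreements = []
    for args in cases:
        result_a = impl_a(*args)
        result_b = impl_b(*args)
        if result_a != result_b:
            disagreements.append((args, result_a, result_b))
    return disagreements


cases = [(7, 2), (-7, 2), (7, -2), (6, 3)]
disagreements = find_disagreements(divide_floor, divide_truncate, cases)
for args, result_a, result_b in disagreements:
    print(f"inputs {args}: A gives {result_a}, B gives {result_b}")
```

Both functions agree on positive inputs and disagree only when exactly one operand is negative. Neither is "wrong" until the standard says which rounding it wants.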

Note that there is a huge difference between a formal specification for a language and reference documentation.

A de-facto standard is where you define the language by picking one implementation and saying "whatever that thing says is language X is language X". This means that if two implementations differ, the de-facto standard one is what is correct.

This also means that if the de-facto standard has quirks in it, every implementation of that language must reproduce those quirks. For example, if the de-facto standard has a hard limit of only 32 function parameters for some reason, then implementations of that language which allow more function parameters are technically wrong.

If the de-facto standard has (non-crashing) bugs in it... well, that's the thing: it can't have bugs. "Bug" is defined relative to the standard. And the de-facto standard implementation is the standard; "bugs" in it are therefore features, until the next version comes out that changes things. And such things are, on some level, changes to language features, not bug fixes.

Many languages start with de-facto standards and evolve towards de-jure standards. Both C# and Java started without formal language specifications, but both of them have them now. C++ was an ill-defined mish-mash of stuff until C++98 formalized what it meant to be C++.

Python is kind of half-and-half at this point. It has a language reference document that purports to define the language, but it isn't a "formal specification". From the language reference:

"I chose to use English rather than formal specifications for everything except syntax and lexical analysis. This should make the document more understandable to the average reader, but will leave room for ambiguities. Consequently, if you were coming from Mars and tried to re-implement Python from this document alone, you might have to guess things and in fact you would probably end up implementing quite a different language."

So in the case of ambiguity, you generally would defer to CPython's behavior. Then again, there are plenty of formal specifications that have ambiguous areas as well.
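One commonly cited place where implementations are allowed to differ is object finalization. The language reference describes reference counting as a CPython implementation detail and warns against relying on objects being finalized the moment they become unreachable. The sketch below (the Tracked class is made up for illustration) checks whether __del__ runs immediately after the last reference is dropped; under CPython it normally does, while a garbage-collected implementation such as PyPy may run it later.

```python
import platform


class Tracked:
    """Illustrative class that records whether its finalizer has run."""

    finalized = False

    def __del__(self):
        Tracked.finalized = True


obj = Tracked()
del obj  # drop the only reference to the object

# Capture the flag right away, before any garbage collector is asked to run.
finalized_immediately = Tracked.finalized

print(platform.python_implementation(), "finalized immediately:", finalized_immediately)
```

Code that depends on the CPython result is relying on that implementation's behavior rather than on something the language reference promises.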

Related Solutions

General Rules for Writing a Compiler in Python

q1. pypy is the interpreter, a RPython program which can interpret Python code, there is no output language, so we can't consider it as a compiler, right?

PyPy is similar to CPython: both have a compiler plus an interpreter. CPython has a compiler written in C that compiles Python to Python VM bytecode, then executes the bytecode in an interpreter written in C. PyPy has a compiler written in RPython that compiles Python to Python VM bytecode, then executes it in the PyPy interpreter written in RPython.
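The two stages described above can be observed from inside Python itself. This is a minimal sketch using the built-in compile and exec functions and the standard dis module; the source string and variable names are illustrative. The exact bytecode listing printed by dis differs between implementations and versions, so no particular output is shown.

```python
import dis

source = "result = 2 + 3 * 4"

# Stage 1: the compiler turns source text into a code object holding bytecode.
code_obj = compile(source, "<example>", "exec")
bytecode = code_obj.co_code  # raw bytecode as a bytes object

# Stage 2: the interpreter (the "virtual machine") executes that bytecode.
namespace = {}
exec(code_obj, namespace)
print(namespace["result"])

# Human-readable listing of the bytecode instructions.
dis.dis(code_obj)
```

Note that the interpreter executes the bytecode directly; this sketch involves no separate translation of the bytecode into machine language.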

q2. Can compiler py2rpy exist, transforming all Python programs to RPython? In which language it's written is irrelevant. If yes, we get another compiler py2c. What's the difference between pypy and py2rpy in nature? Is py2rpy much harder to write than pypy?

Can a py2rpy compiler exist? Theoretically, yes. Turing completeness guarantees it.

One method to construct py2rpy is to simply include the source code of a Python interpreter written in RPython in the generated source code. An example of a py2rpy compiler, written in Bash:

# suppose that /pypy/source/ contains the source code for pypy (i.e. Python -> Nothing RPython)
mkdir -p /tmp/py2rpy
cp -r /pypy/source/ /tmp/py2rpy/pypy/

# suppose $inputfile contains arbitrary Python source code
cp "$inputfile" /tmp/py2rpy/prog.py

# generate the main.rpy
echo "import pypy; pypy.execfile('prog.py')" > /tmp/py2rpy/main.rpy

# copy the whole generated tree to the output directory
cp -r /tmp/py2rpy/ "$outputdir"

Now, whenever you need to translate Python code to RPython code, you call this script, which produces, in $outputdir, an RPython main.rpy, the source code of the Python interpreter written in RPython, and a binary blob prog.py. You can then execute the generated RPython script by calling rpython main.rpy.

(Note: since I'm not familiar with the RPython project, the syntax for calling the rpython interpreter, the ability to import pypy and call pypy.execfile, and the .rpy extension are purely made up, but I think you get the point.)
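The same trick can be tried in a form that actually runs by swapping the target language: instead of emitting RPython, the sketch below emits ordinary Python. The function name trivial_compile and the embedded one-line program are made up for illustration. The "compiler" does no real translation; it just wraps the input as an opaque string and hands it to the interpreter that is already available.

```python
def trivial_compile(source: str) -> str:
    """Return a Python program that embeds `source` and passes it to the interpreter."""
    # repr() yields a valid string literal, so the input survives as an opaque blob.
    return f"exec(compile({source!r}, 'prog.py', 'exec'))\n"


# "Compile" a tiny input program.
generated_program = trivial_compile("answer = 6 * 7")
print(generated_program)

# Run the generated program in a fresh namespace and inspect what it defined.
namespace = {}
exec(generated_program, namespace)
print(namespace["answer"])
```

This satisfies the letter of "translate program X into language Y" while doing none of the work people usually expect from a compiler, which is the point of the Bash example.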

q3. Is there some general rules or theory available about this?

Yes, any Turing-complete language can theoretically be translated to any other Turing-complete language. Some languages may be much more difficult to translate than others, but if the question is "is it possible?", the answer is "yes".

q4. ...

There is no question here.

Design – What do design and implementation mean

I wouldn't get too hung up on the terminology. Google has some good definitions:

Design: (search for 'definition of design')

noun:

a plan or drawing produced to show the look and function or workings of a building, garment, or other object before it is built or made.

verb:

decide upon the look and functioning of (a building, garment, or other object), typically by making a detailed drawing of it.

Actual drawings sometimes exist (think flowcharts), but most designs I'm familiar with are typically written descriptions.

Implementation:

the process of putting a decision or plan into effect; execution.

As for your specific questions:

1. They mean pretty much what the dictionary says they mean. "Designing a thing" means figuring out how it's going to work, possibly what it looks like, etc. "The design" is the output of the process of "Designing a thing". "Implementing a design" means actually doing the work to convert the idea (the design) into something real.
2. Short answer, "yes". I would substitute "designing a system" for "design of a system" and "implementing a system" for "implementation of a system", but you have the right idea.
3. If you asked this question by itself, it would be closed as 'too broad' within minutes. The process of designing anything will have common steps including collecting requirements, identifying possible solutions, analyzing those solutions, etc. Likewise, the process of implementing anything will also have certain steps in common including the actual construction, verification that the construction is correct, etc.
