Why Doesn’t Python Need a Compiler?

compilerprogramming-languagespython

Just wondering (now that I've started with C++ which needs a compiler) why Python doesn't need a compiler?

I just enter the code, save it as an exec, and run it. In C++ I have to make builds and all of that other fun stuff.

Best Answer

Python has a compiler! You just don't notice it because it runs automatically. You can tell it's there, though: look at the .pyc (or .pyo if you have the optimizer turned on) files that are generated for modules that you import.

Also, it does not compile to the native machine's code. Instead, it compiles to a byte code that is used by a virtual machine. The virtual machine is itself a compiled program. This is very similar to how Java works; so similar, in fact, that there is a Python variant (Jython) that compiles to the Java Virtual Machine's byte code instead! There's also IronPython, which compiles to Microsoft's CLR (used by .NET). (The normal Python byte code compiler is sometimes called CPython to disambiguate it from these alternatives.)

C++ needs to expose its compilation process because the language itself is incomplete; it does not specify everything the linker needs to know to build your program, nor can it specify compile options portably (some compilers let you use #pragma, but that's not standard). So you have to do the rest of the work with makefiles and possibly auto hell (autoconf/automake/libtool). This is really just a holdover from how C did it. And C did it that way because it made the compiler simple, which is one main reason it is so popular (anyone could crank out a simple C compiler in the 80's).


Some things that can affect the compiler's or linker's operation but are not specified within C or C++'s syntax:

  • dependency resolution
  • external library requirements (including dependency order)
  • optimizer level
  • warning settings
  • language specification version
  • linker mappings (which section goes where in the final program)
  • target architecture

Some of these can be detected, but they can't be specified; e.g. I can detect which C++ is in use with __cplusplus, but I can't specify that C++98 is the one used for my code within the code itself; I have to pass it as a flag to the compiler in the Makefile, or make a setting in a dialog.

While you might think that a "dependency resolution" system exists in the compiler, automatically generating dependency records, these records only say which header files a given source file uses. They cannot indicate what additional source code modules are required to link into an executable program, because there is no standard way in C or C++ to indicate that a given header file is the interface definition for another source code module as opposed to just a bunch of lines you want to show up in multiple places so you don't repeat yourself. There are traditions in file naming conventions, but these are not known or enforced by the compiler and linker.

Several of these can be set using #pragma, but this is non-standard, and I was speaking of the standard. All of these things could be specified by a standard, but have not been in the interest of backward compatibility. The prevailing wisdom is that makefiles and IDEs aren't broke, so don't fix them.

Python handles all this in the language. For example, import specifies an explicit module dependency, implies the dependency tree, and modules are not split into header and source files (i.e. interface and implementation).

Related Topic