C++ vs Python – Technical Limitations Affecting Python Script Performance

c language-features performance python

I'm a long-time Python user. A few years ago, I started learning C++ to see what it could offer in terms of speed. During this time, I would continue to use Python as a tool for prototyping. This, it seemed, was a good system: agile development with Python, fast execution in C++.

Recently, I've been using Python more and more again, and learning how to avoid all of the pitfalls and anti-patterns that I was quick to use in my earlier years with the language. It's my understanding that using certain features (list comprehensions, enumerations, etc.) can increase performance.

But are there technical limitations or language features that prevent my Python script from being as fast as an equivalent C++ program?

Best Answer

I kind of hit this wall myself when I took a full-time Python programming job a couple years ago. I love Python, I really do, but when I started to do some performance tuning, I had some rude shocks.

The strict Pythonistas can correct me, but here are the things I found, painted in very broad strokes.

  • Python memory usage is kind of scary. Python represents everything as a full heap object with its own header (a reference count plus a type pointer), and instances of most classes also carry a dict of attributes. That is extremely powerful, but it means even simple data types are gigantic. I remember the character "a" taking 28 bytes of memory. If you're using big data structures in Python, make sure to rely on numpy or scipy, because their arrays are backed by contiguous blocks of raw, fixed-size values.

That has a performance impact, because it means there are extra levels of indirection at run time, in addition to slogging around huge amounts of memory compared to other languages.
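To get a feel for the overhead, here is a minimal sketch using only the standard library. It assumes CPython; the exact byte counts differ between versions and platforms, so none are quoted here. The `array` module stands in for numpy purely to keep the example dependency-free.

```python
import sys
from array import array

# Per-object sizes in bytes (CPython-specific; values vary by version and platform).
for value in (1, 1.0, "a", [], {}):
    print(f"{type(value).__name__:>5}: {sys.getsizeof(value)} bytes")

n = 100_000
as_list = list(range(n))
as_array = array("q", range(n))  # packed signed 64-bit integers

# A list only stores pointers to separate int objects, so count both.
list_total = sys.getsizeof(as_list) + sum(sys.getsizeof(x) for x in as_list)
array_total = sys.getsizeof(as_array)

print(f"list of {n} ints : {list_total} bytes")
print(f"array of {n} ints: {array_total} bytes")
```

The list figure includes every boxed int it points to, which is where the extra indirection comes from; the packed array holds the raw values inline.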

  • Python does have a global interpreter lock (GIL), which means that within one process only one thread executes Python bytecode at a time, so CPU-bound code effectively runs single-threaded. The standard multiprocessing module can distribute tasks across processes, but we were spinning up 32 or so instances of our Python script and running each single-threaded.
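A rough way to see the lock in action is to time the same CPU-bound work run sequentially and then across threads. This is only a sketch: `count_primes` and `LIMITS` are made-up names for illustration, and it assumes a standard CPython build with the GIL enabled. Timings are machine-dependent, so no expected numbers are shown.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def count_primes(limit):
    """Count primes below limit by trial division (deliberately CPU-bound)."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count

LIMITS = [20_000] * 4  # four identical, independent tasks

# Run the tasks one after another.
start = time.perf_counter()
sequential = [count_primes(limit) for limit in LIMITS]
sequential_seconds = time.perf_counter() - start

# Run the same tasks on four threads.
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as pool:
    threaded = list(pool.map(count_primes, LIMITS))
threaded_seconds = time.perf_counter() - start

print(f"sequential: {sequential_seconds:.2f}s")
print(f"4 threads : {threaded_seconds:.2f}s")
```

With the GIL, the threaded run usually takes about as long as the sequential one. `concurrent.futures.ProcessPoolExecutor` offers the same interface and sidesteps the lock by using separate processes, which is roughly what running many instances of a script does by hand.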

Others can speak to the execution model, but Python is compiled to bytecode at runtime and then interpreted, which means it doesn't go all the way to machine code. That also has a performance impact. You can easily link in C or C++ modules, or find existing ones, but if you just run straight-up Python, it's going to take a performance hit.
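You can look at that bytecode yourself with the standard `dis` module. A minimal sketch, assuming CPython (the function `add` is just an example, and the opcode names printed differ between interpreter versions):

```python
import dis

def add(a, b):
    return a + b

# Show the bytecode instructions the interpreter will execute for this function.
dis.dis(add)

# The compiled form is an ordinary code object attached to the function.
print(type(add.__code__).__name__)
print(len(add.__code__.co_code), "bytes of bytecode")
```

Even this one-line function becomes several instructions, each dispatched by the interpreter loop and each operating on boxed objects, whereas a C++ compiler would typically reduce the same addition to a machine instruction or two.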

Now, in web service benchmarks, Python compares favorably to the other compile-at-runtime languages like Ruby or PHP. But it's pretty far behind most of the compiled languages. Even the languages that compile to an intermediate language and run in a VM (like Java or C#) do much, much better, largely because their VMs JIT-compile hot code to machine code.

Here is a really interesting set of benchmark tests that I refer to occasionally:

http://www.techempower.com/benchmarks/

(All that said, I still love Python dearly, and if I get the chance to choose the language I'm working in, it's my first choice. Most of the time, I'm not constrained by crazy throughput requirements anyway.)