There is no strict reason why a bytecode-based language like C# or Java with a JIT compiler cannot be as fast as C++ code. However, C++ code was significantly faster for a long time, and in many cases still is today. This is mainly because the more advanced JIT optimizations are complicated to implement, and the really cool ones are only arriving now.
So C++ is faster, in many cases. But this is only part of the answer. The cases where C++ is actually faster are highly optimized programs, where expert programmers have thoroughly optimized the hell out of the code. This is not only very time consuming (and thus expensive), but it also commonly leads to errors due to over-optimization.
On the other hand, code in JIT-compiled languages gets faster in later versions of the runtime (the .NET CLR or the Java VM) without you doing anything. And there are a lot of useful optimizations JIT compilers can do that are simply impossible in languages with raw pointers. Also, some argue that garbage collection should generally be as fast as or faster than manual memory management, and in many cases it is. You can generally implement and achieve all of this in C++ or C, but it's going to be much more complicated and error prone.
As Donald Knuth said, "premature optimization is the root of all evil". If you really know for sure that your application will mostly consist of very performance-critical arithmetic, that it will be the bottleneck, that it's certainly going to be faster in C++, and that C++ won't conflict with your other requirements, go for C++. In any other case, concentrate on implementing your application correctly first, in whatever language suits you best, then find the performance bottlenecks if it runs too slowly, and only then think about how to optimize the code. In the worst case, you might need to call out to C code through a foreign function interface, so you'll still have the ability to write the critical parts in a lower-level language.
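If you do end up at that point, the FFI itself is usually not much work. Here's a minimal sketch using Python's ctypes to call sqrt() from the system C math library; the same mechanism works for your own compiled C code. This is purely illustrative and POSIX-only as written (find_library('m') won't resolve on Windows):

import ctypes
import ctypes.util

# load the system C math library and declare the signature of sqrt()
libm = ctypes.CDLL(ctypes.util.find_library('m'))
libm.sqrt.argtypes = [ctypes.c_double]
libm.sqrt.restype = ctypes.c_double

print(libm.sqrt(2.0))  # 1.4142135623730951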
Keep in mind that it's relatively easy to optimize a correct program, but much harder to correct an optimized program.
Giving actual percentages for speed advantages is impossible; it largely depends on your code. In many cases, the programming language implementation isn't even the bottleneck. Take the benchmarks at http://benchmarksgame.alioth.debian.org/ with a great deal of scepticism, as they largely test arithmetic code, which is most likely not similar to your code at all.
Rough results from the benchmark below: Redis is roughly 2x faster for writes and 3x faster for reads.
Here's a simple benchmark in Python that you can adapt to your purposes; I was looking at how well each performs at simply setting and retrieving values:
#!/usr/bin/env python2.7
import sys, time
from pymongo import Connection
import redis

# connect to redis & mongodb
redis_client = redis.Redis()  # named so it doesn't shadow the redis module
mongo = Connection().test
collection = mongo['test']
collection.ensure_index('key', unique=True)

def mongo_set(data):
    for k, v in data.iteritems():
        collection.insert({'key': k, 'value': v})

def mongo_get(data):
    for k in data.iterkeys():
        val = collection.find_one({'key': k}, fields=('value',)).get('value')

def redis_set(data):
    for k, v in data.iteritems():
        redis_client.set(k, v)

def redis_get(data):
    for k in data.iterkeys():
        val = redis_client.get(k)

def do_tests(num, tests):
    # setup dict with key/values to retrieve
    data = {'key' + str(i): 'val' + str(i)*100 for i in range(num)}
    # run tests
    for test in tests:
        start = time.time()
        test(data)
        elapsed = time.time() - start
        print "Completed %s: %d ops in %.2f seconds : %.1f ops/sec" % (test.__name__, num, elapsed, num / elapsed)

if __name__ == '__main__':
    num = 1000 if len(sys.argv) == 1 else int(sys.argv[1])
    tests = [mongo_set, mongo_get, redis_set, redis_get]  # order of tests is significant here!
    do_tests(num, tests)
Results with MongoDB 1.8.1, Redis 2.2.5, and the latest pymongo/redis-py:
$ ./cache_benchmark.py 10000
Completed mongo_set: 10000 ops in 1.40 seconds : 7167.6 ops/sec
Completed mongo_get: 10000 ops in 2.38 seconds : 4206.2 ops/sec
Completed redis_set: 10000 ops in 0.78 seconds : 12752.6 ops/sec
Completed redis_get: 10000 ops in 0.89 seconds : 11277.0 ops/sec
Take the results with a grain of salt, of course! If you are programming in another language, using other clients/different implementations, etc., your results will vary wildly. Not to mention that your usage will be completely different! Your best bet is to benchmark them yourself, in precisely the manner you intend to use them. As a corollary, you'll probably figure out the best way to make use of each. Always benchmark for yourself!
First, let's compare apples with apples: Reads and writes with MongoDB are like single reads and writes by primary key on a table with no non-clustered indexes in an RDBMS.
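To make the apples-with-apples comparison concrete, here is a sketch of the two "equivalent" primitive operations in Python. I'm using sqlite3 as a stand-in RDBMS purely so the snippet runs without a server; the MongoDB half assumes a local mongod and mirrors the old pymongo API used in the benchmark above:

import sqlite3
from pymongo import Connection  # old pymongo API, as in the benchmark above

# RDBMS side: single write and read by primary key, no secondary indexes
db = sqlite3.connect(':memory:')
db.execute('CREATE TABLE kv (key TEXT PRIMARY KEY, value TEXT)')
db.execute('INSERT INTO kv VALUES (?, ?)', ('key1', 'val1'))                   # write
row = db.execute('SELECT value FROM kv WHERE key = ?', ('key1',)).fetchone()   # read

# MongoDB side: the equivalent single-document operations
collection = Connection().test['kv']
collection.insert({'key': 'key1', 'value': 'val1'})                            # write
doc = collection.find_one({'key': 'key1'})                                     # read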
So let's benchmark exactly that: http://mysqlha.blogspot.de/2010/09/mysql-versus-mongodb-yet-another-silly.html
And it turns out that the speed difference in a fair comparison of exactly the same primitive operation is not big. In fact, MySQL is slightly faster. I'd say they are equivalent.
Why? Because in this particular benchmark, both systems are doing similar things. Returning a single row, looked up by primary key, is actually not that much work; it is a very fast operation. I suspect that cross-process communication overhead makes up a big part of the measured time.
My guess is that the more heavily tuned code in MySQL outweighs MongoDB's slightly lower systematic overheads (no logical locks and probably some other small things).
This leads to an interesting conclusion: You can use MySQL like a document database and get excellent performance out of it.
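For illustration, here's a minimal sketch of the "MySQL as a document store" idea: one table with a primary key and a JSON blob. The connection parameters and table name are made up; it assumes a running MySQL server and the MySQLdb (mysql-python) driver:

import json
import MySQLdb  # assumes the mysql-python driver is installed

conn = MySQLdb.connect(host='localhost', user='test', passwd='test', db='test')
cur = conn.cursor()
cur.execute("""CREATE TABLE IF NOT EXISTS docs (
                   id VARCHAR(64) PRIMARY KEY,
                   doc TEXT NOT NULL
               ) ENGINE=InnoDB""")

def put_doc(doc_id, doc):
    # upsert the serialized document by primary key
    cur.execute("REPLACE INTO docs (id, doc) VALUES (%s, %s)",
                (doc_id, json.dumps(doc)))
    conn.commit()

def get_doc(doc_id):
    # single read by primary key -- the operation benchmarked above
    cur.execute("SELECT doc FROM docs WHERE id = %s", (doc_id,))
    row = cur.fetchone()
    return json.loads(row[0]) if row else None

put_doc('user:1', {'name': 'alice', 'visits': 42})
print(get_doc('user:1'))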
If the interviewer said, "We don't care about documents or styles, we just need a much faster database; do you think we should use MySQL or MongoDB?", what would I answer?
I'd recommend disregarding performance for a moment and looking at the relative strengths of the two systems. Things like scaling (way up) and replication come to mind for MongoDB. MySQL, on the other hand, offers a lot more: rich queries, mature concurrency models, better tooling, overall maturity, and lots more.
Basically, you can trade features for performance. Are you willing to do that? That is a choice that cannot be made in general. If you opt for performance at any cost, consider tuning MySQL first before adding another technology.
Here is, roughly, what happens when a client retrieves a single row/document by primary key. I'll annotate the differences between the two systems:
1. The client serializes the request and sends it over the network; the server reads and decodes it (both systems).
2. The server parses the SQL statement (additional step for the RDBMS; MongoDB's wire protocol already delivers a structured BSON query).
3. The server compiles a query plan (additional step for the RDBMS; for a primary-key lookup this is trivial, and it is often served from a cache).
4. The server traverses the primary-key index, a B-tree in both systems, to locate the row/document.
5. The server reads the data, serializes the result, and sends it back over the network (both systems).
There are only two additional steps for a typical SQL-based RDBMS, and both are cheap for this operation. That's why there isn't really a difference.
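You can see the planning step for yourself. Here's a tiny sketch, again using sqlite3 as a stand-in, where EXPLAIN QUERY PLAN shows the primary-key lookup resolving to a single index search (the exact output text varies by SQLite version):

import sqlite3

db = sqlite3.connect(':memory:')
db.execute('CREATE TABLE kv (key TEXT PRIMARY KEY, value TEXT)')

# the plan for a primary-key lookup is a single index search
for row in db.execute("EXPLAIN QUERY PLAN SELECT value FROM kv WHERE key = ?", ('k',)):
    print(row)
# e.g. (0, 0, 0, 'SEARCH TABLE kv USING INDEX sqlite_autoindex_kv_1 (key=?)')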