B trees vs binary trees

b-treebinary treeperformance

If I am implementing in-memory(RAM) search operation with b trees, then would it be better in terms of caching or some other effects when compared with binary trees?

What I know is-

binary search tress---O(log n)
btrees ---------------O(c log n)

there was a lot of discussion regarding that on various blogs.

Best Answer

Algorithmic complexity is the same, since O(logb n) = O(c log n) = O(log n) but the constant factors, which are hidden in big-O notation, could vary noticeably, depending on implementation and hardware.

B-trees were designed for platter hard disks, which have a large access time (moving the head into position) after which an entire physical sector is read. Making the B-tree nodes as large as the sector minimizes the number of access times and maximizes the useful data out of each read operation.

But if you are working out of memory you have a negligible access time, therefore a better comparison is to count the number of single words accessed by your algorithm.

For example, let's plan a data structure to store 220 keys of 1 word each, for a total of 4MiB of raw data on a 32bit machine.

A binary search tree will have 220 nodes, each holding one key and two pointers (3 words). Depth will be log2(220) = 20. The average search will have to read the key and one of the pointers from each node in its path, from the root all the way down = 40 words.

A B-tree made for hard disks will have 4kB nodes. Each node could be stored internally as a sorted array of key and pointer couples, between 256 and 512 of them. What is the average search going to look like? Considering an average 3/4 fill, each node will contain 384 entries, and its internal binary search will have to visit on average log2(384) = 5.95 keys. The average depth will be log384(220) = 2.33, so our search will have to read on average 2.33 times 5.95 keys, or about 14 words.

In the case of a low-fanout (branching factor) B-tree, with each node holding between 16 and 32 keys, the average fill will be 24 keys, the average depth log24(220) = 4.36, the binary search in each node will make log2(24) = 4.58 comparisons, and the overall average search will have to read about 20 words.

Keep in mind that the last two data structures achieve a better result than binary trees because they optimize read operations over modifications. To insert a key into one of these B-trees you would have to rewrite on average an entire 384-word or 24-word node, if not more than one, while in the binary-tree case a write operation would still only need to touch up to 40 words.

(Previously I had it wrong. Thanks to @virco and @Groo for pointing out my mistake in the comments.)

In any case, it seems that memory-only B-trees with a low fanout appear to perform better than binary trees in practice.

32 keys per node in particular seems to be a sweet spot for current architectures, both 32- and 64-bit. Many newer languages and libraries are using 32-key B-trees as a built-in data structure, alongside or as a replacement for hash tables and arrays. This usage was spearheaded by Clojure and other functional languages, but was subsequently taken up by more mainstream languages such as Javascript, with the recent focus on immutable data structures (eg. Immutable.js)

This result may be explained not only by counting the number of words being read from memory, but the cache misses too, which are read operations that cause the CPU to stall and wait for the RAM. If the caching architecture can fetch chunks of RAM that contain an entire B-tree node at a time, we get the same optimization that has been used sucessfully for disk-based mass storage.

In the case of hard-disk optimized data structures, we would use B-trees with nodes as large as the physical disk sector, to minimize disk access times. In this case, we are using B-trees with nodes as large as the read operation that is performed by the Level 3 cache against the RAM, to minimize cache misses.