The simple thoughts of Jonathan J Hunt

Btrees Are the New Black

Self-balancing trees are an important class of data structures. In most textbooks, red/black binary trees are introduced first, and then btrees come later, usually mentioned as a data structure used in file systems. Both data structures have asymptotically identical performance ($O(\log n)$ time for insertion, lookup and deletion). Here, I will show why, despite your textbook’s ordering, btrees may often be a better choice.

Btrees (or variants) are often used in file systems because they make better use of devices which read and write blocks of data (such as hard drives or flash). They do this by trading some extra space and computation for increased data locality. Each node in a btree stores between $t$ and $2t - 1$ keys (except for the root node, which might contain less entries). Search may require examining each key in the nodes on the search path and any change to a node may require shifting all of the entries to maintain the sort. However, you will notice that all of these opererations are data local.

What else is block-like these days? Modern (misleadingly named) random access memory. Modern CPUs have multiple layers of caching to improve the performance dealing with the (relatively) slow main memory. My Intel Core 2 Duo has a 64 byte cache line size, meaning that requesting the contents of a (correctly aligned) memory address makes examining the contents of memory address within 64 bytes much cheaper (a full cache miss costs ~165 cycles). Another advantage is that btrees result in less calls to the memory allocator since each node stores more entries.

The full description of red/black trees and btrees is outside the scope of this article. For our purposes we just need a brief description of what the two tree types look like.

A red-black tree is a balanced binary tree. Each node in the tree stores one key to value mapping and points to the left (nodes with a key that is smaller than the current node) and right (nodes that are larger than the current node). As you can see, looking up an item in the tree involves branching at each node to the left or right until termination.

A btree, on the other hand, allows more entries per node (as above, between $t$ and $2t-1$ where $t$ is a parameter of the btree). Each non-leaf node has up to $2t$ children, each child tree containing the nodes intermediate in value between the constituent entries. Searching in a btree requires examining, on average, half of the entries in a node to determine which child to descend into. On the other hand, the depth of the tree is decreased since, even in the worst case, each node has at least $t$ entries (excluding the root node).

For many, binary trees are the standard choice for a tree data structure. With modern architectures this might not be the most efficient choice. Here, we examine the speed of inserting and then retrieving 107 elements. The mapping we are considering is from randomly chosen integers to void pointers.

Unlike binary trees, btrees have a tunable parameter $t$, which specifies the number of elements stored in each node. So first we examine the best choice of the $t$ parameter.

The graph shows the user CPU time for different choices of $t$. As you can see, the optimal choice of $t$ is far-away from a binary tree. The minimum runtime was at $t = 24$ (give or take some noise, but that’s close enough for our purposes). This means nodes store between $24$ to $47$ keys.

We then compared a btree with $t = 24$ against two other commonly chosen data structures. A dynamically sized hash map and a red-black tree. In all cases, I used a straightforward C implmentation and compiled with standard optimizations for benchmarking.

Unsuprisingly, a hash map performs far and above the rest. This is to be expected, mapping is exactly what hash maps are for and, in most situations, they should perform insertions and lookups with amortized $O(1)$ time complexity. However, for situations where you wish to preserve order, or avoid any insertion ever taking more than $O(\log n)$, a tree may be a better choice. For those situations you can see that a well-tuned btree was outperforming a red/black tree by more than 2 times.

As memory architectures begin to behave more like block devices than random access memory, it is worth revisiting algorithm choices that were made when memory was uniform. As a bonus I found a btree easier to reason about and implement (the number of lines needed to implement both a btree and red/black was very similar though).

All of the code for the algorithms are available on github.


  • Fixed typos. Thanks to a helpful readers for pointing them out.

  • Clarified wording and the amortized analysis in the paragraph on hash trees.

  • Discussion on hacker news. I learned that Google provides a STL-like implementation of btrees for similar reasons to those discussed here.

  • Corrected the date of the blog post.