Indices

The main goal of a database is to persist data on disk. However, we also need to search and retrieve the stored data efficiently. A database index is a data structure that helps quickly locate data with specific attributes (keys). Most index implementations use a balanced N-ary tree variant, such as the B+ tree.

A B+ tree is an N-ary tree, like a B tree, but the difference is that its internal nodes contain only keys; the values (or pointers to the records) are stored in the leaf nodes. The primary advantage of the B+ tree over a binary tree is the high fanout (the number of pointers to child nodes) at each node. High fanout keeps the tree shallow, so a search visits fewer nodes on its way from the root to a leaf. This is crucial for databases, because each node visited can cost an I/O operation, which is much more expensive than a memory access. The reason for storing only keys in the internal nodes (which is the difference between a B tree and a B+ tree) is that it allows much more search information to be packed into one disk block, thereby improving cache efficiency and reducing I/O operations on disk. The leaves of the B+ tree are often linked with one another to form a linked list; this enables efficient range queries and ordered iteration.
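To make the lookup path and the linked leaves concrete, here is a minimal sketch in Python. It is not a full B+ tree: insertion, node splitting, and disk blocks are left out, and the tree is built by hand. The names `Leaf`, `Internal`, `find_leaf`, `search`, and `range_scan`, the keys 1 through 6, and the `row-N` values are all assumptions made for illustration only.

```python
from bisect import bisect_left, bisect_right
from dataclasses import dataclass
from typing import Optional


@dataclass
class Leaf:
    keys: list                      # sorted keys stored in this leaf
    values: list                    # values[i] belongs to keys[i]
    next: Optional["Leaf"] = None   # link to the next leaf, in key order


@dataclass
class Internal:
    keys: list       # separator keys only, no values
    children: list   # len(children) == len(keys) + 1


def find_leaf(node, key):
    """Walk from the root down to the leaf that could contain key."""
    while isinstance(node, Internal):
        # Assumed convention: each separator is the smallest key of the
        # subtree to its right, so equal keys go to the right child.
        node = node.children[bisect_right(node.keys, key)]
    return node


def search(root, key):
    """Return the value stored for key, or None if the key is absent."""
    leaf = find_leaf(root, key)
    i = bisect_left(leaf.keys, key)
    if i < len(leaf.keys) and leaf.keys[i] == key:
        return leaf.values[i]
    return None


def range_scan(root, lo, hi):
    """Return all (key, value) pairs with lo <= key <= hi, in key order."""
    out = []
    leaf = find_leaf(root, lo)
    while leaf is not None:
        for k, v in zip(leaf.keys, leaf.values):
            if k > hi:
                return out
            if k >= lo:
                out.append((k, v))
        leaf = leaf.next   # follow the leaf-level linked list
    return out


# Hand-built tree: at most 3 children per node, 6 keys in total.
leaf_c = Leaf([5, 6], ["row-5", "row-6"])
leaf_b = Leaf([3, 4], ["row-3", "row-4"], next=leaf_c)
leaf_a = Leaf([1, 2], ["row-1", "row-2"], next=leaf_b)
root = Internal(keys=[3, 5], children=[leaf_a, leaf_b, leaf_c])

print(search(root, 4))         # row-4
print(search(root, 7))         # None
print(range_scan(root, 2, 5))  # keys 2, 3, 4, 5 with their values
```

Note that `range_scan` descends the tree only once, to find the starting leaf, and then follows the `next` links; it never goes back through the internal nodes. The effect of fanout on depth is also easy to estimate: with a fanout of 100, three levels below the root are enough to address about 100^3 = 1,000,000 leaf entries, whereas a balanced binary tree needs about 20 levels for the same number of keys.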

The following diagram shows a B+ tree with a maximum degree of 3, and 6 keys inserted:

For more details, please refer to http://www.cburch.com/cs/340/reading/btree/index.html.
