Data Structures Talk – Binary Search Trees

Today I will explain a data structure that is efficient for searching and sorting. You may have already guessed it, it’s the Binary Search Tree (BST). This blog will be based upon the assumption that the reader is already familiar with binary trees. Let’s begin.

Binary search trees are a subset of binary trees. They are basically binary trees with some extra restriction. This BST property is that every node, n, has a rule that: all nodes in the left subtree have smaller values than n, as well as, all nodes in the right subtree have larger values than n.

Here is an example of a Binary Search Tree.

You may be wondering now, how does this make searching efficient in a BST? We come to the observation that: if we are searching for a number, x, when we compare x to the root node and if x is not equal to the root, then we can eliminate one of the subtrees to search in. More precisely, if x is smaller than the root, then we search the left subtree, as it contains all the numbers smaller than the root node. Likewise, if x is larger than the root node, then we search the right subtree. If we run out of nodes to search, then x is non existent in the tree. This is the algorithm used to search a BST.

Comparing BST search to a linear search, we can further analyze the efficiency of this data structure. When searching in a linear list or a binary tree, the worst case is that we will have to make comparisons equal to the total number of nodes. When searching in a BST, the worst case is dependent on the height of the tree.

However, there are many ways of composing a BST, each with different heights. Here is an image displaying a bad BST, on the left, vs a good BST, on the right.

Hence, it’s important to minimize the height of the tree. A complete binary tree, also known as a balanced tree, with n nodes is a binary tree such that every level is full, with the exception of the bottom level which should be filled in left to right. A complete binary tree has the minimum height of any binary tree with n nodes. This minimum height, h, satisfies the following expression:  h ≤ log2 n. Which can be proven simply by using induction.

Therefore, if the binary search tree is also a complete binary tree, searching the BST can take log(n) steps, which is much faster than linear time.

There are many ways to represent a BST, and I challenge you to build the necessary classes and methods needed. It may come in handy one day. I hope you learned a thing or two about the binary search tree data structure, and until next time, peace. ✌


Leave a comment