Lowest Common Ancestor (BST)

The Binary Search Tree (BST) is a tree like data structure that allows for the quick lookup of data. They work by storing inserted data in a tree like structure that uses the “divide and conquer” approach to locate data. Each node in the tree has, at most, two children. The left hand side (lhs) node will always contain a value that is less than it’s parent. The right hand side (rhs) node will always contain a value that is greater-than or equal to it’s parent.

Using this property we can very quickly traverse the tree to find a value, with the number of look-ups required being no greater than the maximum height of the tree. If we assume that the tree is well balanced this gives us an amortized time complexity of order O(log n). In other words, the look-up time complexity is logarithmic. It’s important to know; however, that this only holds true if the tree is balanced. If it’s not the worse case asymptotic time complexity is order O(n), or linear.

Why is this? Simply because in the worse case an unordered tree is just a link list. Or, put another way, a linked list can be thought of as a special case tree; one that has just one branch and no divergences. Imagine a situation where we have a BST that isn’t self balancing and we inject data into it in an ascending order. Each value inserted will be put to the left of the last, which is just another way of saying we’ve created a list of sorted numbers!

So, from all this we can conclude that for a BST to be useful it really needs to be balanced. It just so happens that there are a number of BST implementations that will take care of this, automatically, when data is inserted. They have a “self balancing” property. Examples of such trees are the AVL Tree (named after the initialise of the inventors) and the Red/Black tree (often the basis of the std::set and std::map).

Of course, this article isn’t specifically about BST implementations but it is important to have a basic understanding of how a BST works before moving on to the topic in hand; namely, how to find the Lowest Common Ancestor of two values in our BST?!

I guess the first thing we should do is actually define what we mean by Lowest Common Ancestor. Put simply, the Lowest Common Ancestor of any node in a BST is the node where traversal of each of the child nodes diverges down different branches. This might sound complicated but it is anything but.

Let’s look as a simple BST.

In this example, the Lowest Common Ancestor of 1 and 7 is 3. This is because node 3 is the node where traversal in locating these two nodes, 1 and 7, diverges into separate branches. Our task is to figure out a simple algorithm for finding the Lowest Common Ancestor that will work for any BST of any size. Sounds complicated, right? Actually, it’s about a simple an operation as things get when it comes to messing with trees. Here’s how it works…

We start at the root node and check to see if 1 and 7 are less than it’s value. If they are we haven’t found the LCA but we do know it’s somewhere down the left hand side branch so we can now move down to that node. At the next node we perform the same check to see if 1 and 7 are less than it’s value. This time around only one of them is less. At this point we can stop since we’ve now found the LCA.

We know this because at this point one number can be found to the Lhs of the current node and the other can be found on the Rhs of the current node. In other words at this point the traversal will diverge. Let’s think about it a little more abstractly by introducing a little pseudo code

This loop will continue to traverse the tree one node at a time, either going left because both values can be found to the left of the current node or going right because bot values can be found to the right of the current node. As soon either of these statements is no longer true we have found the Lowest Common Ancestor.

And now, the real code…

ideone

 

Hopefully, this post has shown that finding the Lowest Common Ancestor of a BST is really not actually that hard. Like most algorithms, once you know and understand the logic behind how it works it’s pretty simple stuff. Of course, knowing is 90% of the battle! 🙂