Lowest Common Ancestor (non-BST)

In the earlier article, Lowest Common Ancestor (BST), I discussed how you can use the special ordering of a Binary Search Tree to quickly and easily identify the Lowest Common Ancestor of two nodes. Of course, not all trees are BSTs and so in this article we’ll look at a way of finding the LCA in a non-BST.

So, just to keep things simple I am actually going to use the same code as before, but with a modified find_lca function. This means the tree is actually a BST; however, it is important to note that this is an irrelevancy since we’re not making use of the BST’s ordering properties and that this algorithm would work on any binary tree.

There are a number of ways this can be done. One naive way is to perform a recursive Depth First Search (DFS) to ensure that the two nodes we’re looking for are both either on the left or the right of the current node. If they’re on the left we perform a recursive search on the left child. If they’re both on the right we perform a recursive search on the right child. If they fall either side we’ve found the LCA.

The problem with this approach is the search time is quadratic, order O(n^2). Why? Because we have to keep searching the same nodes over and over, only dropping down one level in the tree each time, until we find the LCA. Can we do better? Well, yes we can – as long as we’re prepared to trade a little time for a little space.

Since the DFS is generally performed recursively (although it can be performed iteratively using a real stack) we can, as the stack unwinds, store the node at each level. This will, effectively, allow us to trace out the route from the node in question back to the root node.

If we do this for both nodes we’ll have two lists. If we then compare those lists side-by-side we’ll see that the nodes, up to a certain point, match. Where the last item in the list that matches is the point of divergence and, this, the LCA.

Consider the following tree:

From the tree, above, we can note the following:

  • If we were to dispatch a search for [1] we’d end up with a list of 8->3->1.
  • If we were to dispatch a search for [1] we’d end up with a list of 8->3->6->7.
  • If we compare these two lists we see that 3 is the point of divergence and the LCA!

We’ve effectively traded a little liner space, order O(n), for a quadratic time complexity. The overall time complexity of this is now, also, liner; corresponding to the number of nodes in the tree.  I’d say that’s a pretty reasonable trade-off.

Here’s the code.

ideone

That’s all folks 🙂