In the earlier article, Lowest Common Ancestor (BST), I discussed how you can use the special ordering of a Binary Search Tree to quickly and easily identify the Lowest Common Ancestor of two nodes. Of course, not all trees are BSTs and so in this article we’ll look at a way of finding the LCA in a non-BST.

So, just to keep things simple I am actually going to use the same code as before, but with a modified find_lca function. This means the tree is actually a BST; however, it is important to note that this is an irrelevancy since we’re not making use of the BST’s ordering properties and that this algorithm would work on any binary tree.

There are a number of ways this can be done. One naive way is to perform a recursive Depth First Search (DFS) to ensure that the two nodes we’re looking for are both either on the left or the right of the current node. If they’re on the left we perform a recursive search on the left child. If they’re both on the right we perform a recursive search on the right child. If they fall either side we’ve found the LCA.

The problem with this approach is the search time is quadratic, order O(n^2). Why? Because we have to keep searching the same nodes over and over, only dropping down one level in the tree each time, until we find the LCA. Can we do better? Well, yes we can – as long as we’re prepared to trade a little time for a little space.

Since the DFS is generally performed recursively (although it can be performed iteratively using a real stack) we can, as the stack unwinds, store the node at each level. This will, effectively, allow us to trace out the route from the node in question back to the root node.

If we do this for both nodes we’ll have two lists. If we then compare those lists side-by-side we’ll see that the nodes, up to a certain point, match. Where the last item in the list that matches is the point of divergence and, this, the LCA.

Consider the following tree:

8 / (3) 9 / [1] 6 / 4 [7]

From the tree, above, we can note the following:

- If we were to dispatch a search for [1] we’d end up with a list of 8->3->1.
- If we were to dispatch a search for [1] we’d end up with a list of 8->3->6->7.
- If we compare these two lists we see that 3 is the point of divergence and the LCA!

We’ve effectively traded a little liner space, order O(n), for a quadratic time complexity. The overall time complexity of this is now, also, liner; corresponding to the number of nodes in the tree. I’d say that’s a pretty reasonable trade-off.

Here’s the code.

/** * Lowest Common Ancestor (non-BST) * */ #include #include #include #include #include namespace evilrix { namespace mostlycoding { /** * @brief A node object for our tree, below. * * @tparam T Generic type parameter representing our data. */ template struct Node { using Data = T; ///< The data using PNode = std::shared_ptr<Node>; ///< The node /** * @brief Initializes a new instance of the main class. * * @param data (Optional) the data. */ Node(Data const & data = 0) : data(data) {} PNode plhs; ///< The plhs PNode prhs; ///< The prhs Data data; ///< The data }; /** * @brief A tree, implemented as a BST. * * @tparam T T Generic type parameter representing our data. */ template class Tree { public: using Data = T; ///< The data using PNode = std::shared_ptr<Node>; ///< The node /** * @brief Initializes a new instance of the main class. */ Tree() {} /** * @brief Inserts the given data. * * @param data The data. */ void insert(Data const & data) { PNode * pproot = &proot_; // see my note in the "find_lca" function as to why I am using an // iterative rather than recursive traversal approach. while (*pproot) { pproot = data < (*pproot)->data ? &(*pproot)->plhs : &(*pproot)->prhs; } (*pproot) = PNode(new Node(data)); } /** * @brief Uses a DFS to find a route to the node with data value 'd' * * @param d The Data node to file. * @param [in,out] route The route. * @param ppnode The start node. * * @return true if it succeeds, false if it fails. */ bool find_route(Data const & d, std::vector & route, PNode const * ppnode) const { if (!ppnode || !(*ppnode)) { return false; } if ((*ppnode)->data == d) { route.push_back(*ppnode); return true; } if (find_route(d, route, &(*ppnode)->plhs)) { route.push_back(*ppnode); return true; } if (find_route(d, route, &(*ppnode)->prhs)) { route.push_back(*ppnode); return true; } return false; } /** * @brief Searches for the first lca. * * @param x The Data find. * @param y The Data find. * * @return The found lca. */ PNode find_lca(Data const & x, Data const & y) const { std::vector routex; find_route(x, routex, &proot_); std::vector routey; find_route(y, routey, &proot_); PNode result; while ( !routex.empty() && !routey.empty() && routex.back()->data == routey.back()->data ) { result = routex.back(); routex.pop_back(); routey.pop_back(); } return result; } PNode proot_; }; } } using namespace evilrix::mostlycoding; /** * @brief Main entry-point for this application. * * @return Exit-code for the process - 0 for success, else an error code. */ int main(/* int argc, char * argv[] */) { /* * 8 * / * (3) 9 * / * [1] 6 * / * 4 [7] */ Tree tree; tree.insert(8); tree.insert(3); tree.insert(1); tree.insert(6); tree.insert(4); tree.insert(7); tree.insert(9); Tree::PNode pnode1 = tree.find_lca(1, 7); std::cout << (pnode1 ? std::to_string(pnode1->data) : "(null)") << std::endl; }

That’s all folks 🙂