Distance between two nodes of a binary tree

AS
Ackshaey Singh
Difficulty Expert
Time Complexity
O(n)
Space Complexity
O(n)
Binary tree Lowest common ancestor
Amazon Facebook Linkedin

You're in an Amazon interview, and the interviewer draws a binary tree on the whiteboard. "Given two nodes in this tree," they say, "how would you find the number of edges between them?" This problem, also known as "Find Distance Between Two Nodes in a Binary Tree" on other platforms, combines two fundamental tree skills: finding the Lowest Common Ancestor and computing root-to-node distances. It's a favorite at companies like Amazon, Facebook, and LinkedIn because it tests whether you can break a complex tree problem into simpler sub-problems.

TL;DR

Find the Lowest Common Ancestor (LCA) of the two target nodes, then apply the formula: distance(n1, n2) = dist(root, n1) + dist(root, n2) - 2 * dist(root, LCA). The LCA is found by recursively checking whether each subtree contains one or both targets. Root-to-node distance is computed by counting edges from the root down to the target. The whole approach runs in O(n) time and O(n) space.

Why This Problem Matters

This problem sits at the intersection of two important tree concepts: the Lowest Common Ancestor and path computation. Interviewers love it because solving it requires you to decompose a hard problem into manageable pieces. If you can find the LCA and compute distances from the root, the final answer falls right out of a simple formula. That decomposition skill transfers directly to harder tree problems you'll encounter in interviews and production code.

Binary Trees: Setting the Stage

Let's make sure we're aligned on the tree structure we're working with. Here's the binary tree from our problem:

Loading visualization...

Each node holds an integer value, and we're guaranteed that both target values exist somewhere in this tree. The distance between two nodes is the minimum number of edges on the path connecting them, not the number of nodes.

Understanding the Problem

Given: The root of a binary tree and two integer values n1 and n2 Find: The minimum number of edges between the nodes containing n1 and n2 Guarantee: Both values exist in the tree, and all values are unique

Let's look at a few examples using the tree above:

Example 1: distance(4, 5) = 2

The path goes 4 -> 2 -> 5, crossing 2 edges.

Loading visualization...

Example 2: distance(2, 6) = 3

The path goes 2 -> 1 -> 3 -> 6, crossing 3 edges. Notice how the path must go up through the root before going back down.

Loading visualization...

Example 3: distance(4, 6) = 4

The path goes 4 -> 2 -> 1 -> 3 -> 6, crossing 4 edges.

Loading visualization...

The Pattern

Notice something? Every path between two nodes goes up from one node to some common ancestor, then back down to the other node. That common ancestor is the key to solving this problem efficiently.

Solution Approach

The Core Insight: Lowest Common Ancestor

The Lowest Common Ancestor (LCA) of two nodes is the deepest node in the tree that is an ancestor of both. For nodes 4 and 6, the LCA is node 1 (the root). For nodes 4 and 5, the LCA is node 2.

Once we know the LCA, we can compute the distance with a clean formula:

distance(n1, n2) = dist(root, n1) + dist(root, n2) - 2 * dist(root, LCA)

Why does this work? The paths from root to n1 and root to n2 both pass through the LCA. The segment from root to LCA gets counted twice in the sum, so we subtract it twice. What's left is exactly the path from n1 up to the LCA and back down to n2.

Loading visualization...

For nodes 4 and 6: dist(root, 4) + dist(root, 6) - 2 * dist(root, 1) = 2 + 2 - 2*0 = 4. That matches our earlier answer.

Breaking It Down

We need three helper pieces:

  1. Find the LCA of the two target nodes
  2. Compute the distance from the root to any given node
  3. Combine using the formula above

Finding the LCA

The LCA algorithm works by recursively searching the tree. At each node, we check three things:

  • Does the left subtree contain a target? (left)
  • Does the right subtree contain a target? (right)
  • Is the current node itself a target? (mid)

If any two of these three flags are true, the current node is the LCA. Think about why: if targets are in both subtrees, this node is where the paths diverge. If this node is a target and one target is below it, this node is an ancestor of the other target, making it the LCA.

Let's trace through finding the LCA of nodes 4 and 6:

Loading visualization...

At node 1, left=true (found 4 in the left subtree) and right=true (found 6 in the right subtree). Two flags are true, so node 1 is the LCA.

Computing Root-to-Node Distance

The rootDistance helper recursively walks down the tree, counting edges. If the current node matches the target, return the step count. Otherwise, search both subtrees and return the maximum (the non-matching subtree returns -1, so max picks the valid path).

Implementation

Prefer a different language? Jump to solutions in other languages.

Here's the complete solution in Java:

public class Solution {
  public int distance(TreeNode root, int n1, int n2) {
    // Store the LCA in a single-element array (mutable container)
    TreeNode[] lcaHolder = new TreeNode[1];
    getLca(root, n1, n2, lcaHolder);

    // Apply the distance formula
    return rootDistance(root, n1, 0)
             + rootDistance(root, n2, 0)
             - 2 * (rootDistance(root, lcaHolder[0].data, 0));
  }

  /**
   * Returns the distance (edge count) from the root to the node
   * whose data equals n. Returns -1 if the node is not found.
   */
  private int rootDistance(TreeNode node, int n, int stepsSoFar) {
    if (node == null) return -1;
    else if (node.data == n) return stepsSoFar;
    else return Math.max(
        rootDistance(node.left, n, stepsSoFar + 1),
        rootDistance(node.right, n, stepsSoFar + 1)
      );
  }

  /**
   * Finds the LCA of nodes with values n1 and n2.
   * Stores the result in lcaContainer[0].
   */
  private boolean getLca(TreeNode node, int n1, int n2, TreeNode[] lcaContainer) {
    if (node == null) return false;
    else {
      boolean left = getLca(node.left, n1, n2, lcaContainer);
      boolean right = getLca(node.right, n1, n2, lcaContainer);
      boolean mid = node.data == n1 || node.data == n2;

      if ((left && right) || (left && mid) || (right && mid))
        lcaContainer[0] = node;

      return left || mid || right;
    }
  }
}

A few things to notice about this implementation:

The lcaHolder array: Java doesn't have pass-by-reference for objects in the way we need here. We use a single-element array as a mutable container, so the getLca method can "return" the LCA node through a side effect while also returning a boolean indicating whether a target was found.

The rootDistance trick with Math.max: When searching for a node, only one subtree will contain it. The subtree that doesn't contain the target returns -1. Since any valid distance is non-negative, Math.max naturally selects the correct path.

The three-flag LCA check: The condition (left && right) || (left && mid) || (right && mid) covers all cases. If both subtrees contain a target, we're the LCA. If we are a target and one subtree contains the other target, we're the LCA.

Walkthrough

Let's trace through distance(root, 2, 6) on the tree [1, 2, 3, 4, 5, 6, 7]:

Step 1 - Find the LCA:

  • At node 1: recurse left (node 2) and right (node 3)
  • At node 2: mid=true (node 2 is a target). Recurse left (4) and right (5), both return false. One flag true, return true.
  • At node 3: recurse left (6) and right (7)
  • At node 6: mid=true (node 6 is a target). Return true.
  • At node 3: left=true from node 6. One flag true, return true.
  • Back at node 1: left=true (from node 2), right=true (from node 3). Two flags! LCA = node 1.

Step 2 - Compute distances:

  • rootDistance(root, 2, 0): root is 1, go left to 2 with stepsSoFar=1. Found! Return 1.
  • rootDistance(root, 6, 0): root is 1, go right to 3 (step 1), go left to 6 (step 2). Found! Return 2.
  • rootDistance(root, 1, 0): root is 1. Found immediately! Return 0.

Step 3 - Apply the formula: distance = 1 + 2 - 2 * 0 = 3

Complexity Analysis

Time Complexity: O(n)

We traverse the tree up to three times: once to find the LCA, and up to twice more for the root-to-node distances. Each traversal visits every node at most once. Three O(n) passes is still O(n).

Space Complexity: O(n)

The space comes from the recursive call stack. In the worst case (a completely skewed tree), the recursion depth equals the number of nodes. For a balanced tree, this drops to O(log n). The lcaHolder array uses O(1) extra space, which is negligible.

Common Pitfalls

  1. Forgetting the 2 * dist(root, LCA) subtraction: Without it, you're computing the sum of root-to-node distances, not the node-to-node distance. The shared root-to-LCA path must be removed.

  2. LCA detection errors: A common mistake is only checking left && right and forgetting the cases where one target is an ancestor of the other (the mid flag). If node n1 is an ancestor of n2, then at n1 we'd have mid=true and left=true or right=true.

  3. Returning -1 vs. 0 for null nodes in rootDistance: Returning 0 for null nodes would give incorrect distances. We return -1 to signal "not found," and Math.max naturally filters it out.

  4. Confusing edges vs. nodes: The problem asks for edges (the connections between nodes), not the count of nodes on the path. Our rootDistance function correctly counts edges by starting stepsSoFar at 0 for the root.

Interview Tips

When solving this in an interview:

  1. Start by identifying the sub-problems: Tell your interviewer, "I can break this into finding the LCA and computing root distances." That shows structured thinking.

  2. Draw the formula on the whiteboard: Sketch the tree, mark the LCA, and show how the root-to-node paths overlap. The formula will feel intuitive once you visualize the overlapping segments.

  3. Mention the BFS alternative: You could also treat the tree as an undirected graph and BFS from one node to the other. Mention this to show breadth of knowledge, then explain why the LCA approach is cleaner.

  4. Handle the edge case where n1 is an ancestor of n2: Walk through this scenario to show you understand the mid flag in the LCA algorithm.

  5. If you get stuck on the LCA: Start with the simpler rootDistance function. Once you have that, the overall solution becomes "just find the meeting point and apply the formula."

Key Takeaways

  • The distance between two nodes in a binary tree can be decomposed using the LCA: dist(n1, n2) = dist(root, n1) + dist(root, n2) - 2 * dist(root, LCA).
  • The LCA is found by checking three boolean flags at each node (left subtree found target, right subtree found target, current node is target). Two true flags means we found the LCA.
  • The rootDistance helper uses Math.max to filter out the -1 returned by the subtree that doesn't contain the target, keeping the code clean.
  • This problem builds directly on the Lowest Common Ancestor problem. If you're not comfortable with LCA, solve that first before attempting this one.
  • Space complexity is O(n) worst case due to recursion, but O(log n) for balanced trees. Mention this distinction in interviews.

Solutions in Other Languages

Python

class Solution:
    def distance(self, root, n1, n2):
        lca_holder = []
        self.get_lca(root, n1, n2, lca_holder)
        return (self.root_distance(root, n1, 0)
                + self.root_distance(root, n2, 0)
                - 2 * self.root_distance(root, lca_holder[0].data, 0))

    def root_distance(self, node, n, steps_so_far):
        if node is None:
            return -1
        elif node.data == n:
            return steps_so_far
        else:
            return max(self.root_distance(node.left, n, steps_so_far + 1),
                       self.root_distance(node.right, n, steps_so_far + 1))

    def get_lca(self, node, n1, n2, lca_container):
        if node is None:
            return False
        left = self.get_lca(node.left, n1, n2, lca_container)
        right = self.get_lca(node.right, n1, n2, lca_container)
        mid = node.data == n1 or node.data == n2
        if (left and right) or (left and mid) or (right and mid):
            lca_container.append(node)
        return left or mid or right

JavaScript

class Solution {
  distance(root, n1, n2) {
    const lcaHolder = [];
    this.getLca(root, n1, n2, lcaHolder);
    return this.rootDistance(root, n1, 0)
      + this.rootDistance(root, n2, 0)
      - 2 * (this.rootDistance(root, lcaHolder[0].data, 0));
  }

  rootDistance(node, n, stepsSoFar) {
    if (node === null) return -1;
    else if (node.data === n) return stepsSoFar;
    else return Math.max(
        this.rootDistance(node.left, n, stepsSoFar + 1),
        this.rootDistance(node.right, n, stepsSoFar + 1)
      );
  }

  getLca(node, n1, n2, lcaContainer) {
    if (node === null) return false;
    const left = this.getLca(node.left, n1, n2, lcaContainer);
    const right = this.getLca(node.right, n1, n2, lcaContainer);
    const mid = node.data === n1 || node.data === n2;
    if ((left && right) || (left && mid) || (right && mid))
      lcaContainer[0] = node;
    return left || mid || right;
  }
}

TypeScript

class Solution {
  distance(root: TreeNode | null, n1: number, n2: number): number {
    const lcaHolder: (TreeNode | null)[] = [];
    this.getLca(root, n1, n2, lcaHolder);
    return this.rootDistance(root, n1, 0)
      + this.rootDistance(root, n2, 0)
      - 2 * (this.rootDistance(root, lcaHolder[0]!.data, 0));
  }

  rootDistance(node: TreeNode | null, n: number, stepsSoFar: number): number {
    if (node === null) return -1;
    else if (node.data === n) return stepsSoFar;
    else return Math.max(
        this.rootDistance(node.left, n, stepsSoFar + 1),
        this.rootDistance(node.right, n, stepsSoFar + 1)
      );
  }

  getLca(node: TreeNode | null, n1: number, n2: number, lcaContainer: (TreeNode | null)[]): boolean {
    if (node === null) return false;
    const left = this.getLca(node.left, n1, n2, lcaContainer);
    const right = this.getLca(node.right, n1, n2, lcaContainer);
    const mid = node.data === n1 || node.data === n2;
    if ((left && right) || (left && mid) || (right && mid))
      lcaContainer[0] = node;
    return left || mid || right;
  }
}

C++

class Solution {
public:
  int distance(TreeNode *root, int n1, int n2) {
    int lca = 0;
    getLca(root, n1, n2, &lca);
    return rootDistance(root, n1, 0) + rootDistance(root, n2, 0)
           - 2 * (rootDistance(root, lca, 0));
  }

private:
  int rootDistance(TreeNode *node, int n, int stepsSoFar) {
    if (node == nullptr) return -1;
    else if (node->data == n) return stepsSoFar;
    else return std::max(
        rootDistance(node->left, n, stepsSoFar + 1),
        rootDistance(node->right, n, stepsSoFar + 1));
  }

  bool getLca(TreeNode *node, int n1, int n2, int *lcaRef) {
    if (node == nullptr) return false;
    bool left = getLca(node->left, n1, n2, lcaRef);
    bool right = getLca(node->right, n1, n2, lcaRef);
    bool mid = node->data == n1 || node->data == n2;
    if ((left && right) || (left && mid) || (right && mid))
      *lcaRef = node->data;
    return left || right || mid;
  }
};

The C++ version stores the LCA's data value directly (an int passed by pointer) instead of a TreeNode reference, since raw pointer manipulation is more natural in C++.

Scala

class Solution {
  def distance(root: TreeNode, n1: Int, n2: Int): Int = {
    val lcaHolder = new Array[TreeNode](1)
    getLca(root, n1, n2, lcaHolder)
    rootDistance(root, n1, 0) +
      rootDistance(root, n2, 0) -
      2 * rootDistance(root, lcaHolder(0).data, 0)
  }

  private def rootDistance(node: TreeNode, n: Int, stepsSoFar: Int): Int = {
    if (node == null) -1
    else if (node.data == n) stepsSoFar
    else Math.max(
      rootDistance(node.left, n, stepsSoFar + 1),
      rootDistance(node.right, n, stepsSoFar + 1)
    )
  }

  private def getLca(node: TreeNode, n1: Int, n2: Int,
    lcaContainer: Array[TreeNode]): Boolean = {
    if (node == null) false
    else {
      val left = getLca(node.left, n1, n2, lcaContainer)
      val right = getLca(node.right, n1, n2, lcaContainer)
      val mid = node.data == n1 || node.data == n2
      if ((left && right) || (left && mid) || (right && mid))
        lcaContainer(0) = node
      left || mid || right
    }
  }
}

Kotlin

class Solution {
    fun distance(root: TreeNode?, n1: Int, n2: Int): Int {
        val lcaHolder = arrayOfNulls<TreeNode>(1)
        getLca(root, n1, n2, lcaHolder)
        return rootDistance(root, n1, 0)
                 + rootDistance(root, n2, 0)
                 - 2 * (rootDistance(root, lcaHolder[0]!!.data, 0))
    }

    private fun rootDistance(node: TreeNode?, n: Int, stepsSoFar: Int): Int {
        if (node == null) return -1
        else if (node.data == n) return stepsSoFar
        else return maxOf(
            rootDistance(node.left, n, stepsSoFar + 1),
            rootDistance(node.right, n, stepsSoFar + 1)
        )
    }

    private fun getLca(node: TreeNode?, n1: Int, n2: Int,
        lcaContainer: Array<TreeNode?>): Boolean {
        if (node == null) return false
        val left = getLca(node.left, n1, n2, lcaContainer)
        val right = getLca(node.right, n1, n2, lcaContainer)
        val mid = node.data == n1 || node.data == n2
        if ((left && right) || (left && mid) || (right && mid))
            lcaContainer[0] = node
        return left || mid || right
    }
}

Swift

class Solution {
    func distance(_ root: TreeNode?, _ n1: Int, _ n2: Int) -> Int {
        var lcaHolder: [TreeNode?] = [nil]
        getLca(root, n1, n2, &lcaHolder)
        return rootDistance(root, n1, 0)
                 + rootDistance(root, n2, 0)
                 - 2 * (rootDistance(root, lcaHolder[0]!.data, 0))
    }

    private func rootDistance(_ node: TreeNode?, _ n: Int, _ stepsSoFar: Int) -> Int {
        if node == nil { return -1 }
        else if node!.data == n { return stepsSoFar }
        else { return max(
            rootDistance(node!.left, n, stepsSoFar + 1),
            rootDistance(node!.right, n, stepsSoFar + 1)
        )}
    }

    @discardableResult
    private func getLca(_ node: TreeNode?, _ n1: Int, _ n2: Int,
        _ lcaContainer: inout [TreeNode?]) -> Bool {
        if node == nil { return false }
        let left = getLca(node!.left, n1, n2, &lcaContainer)
        let right = getLca(node!.right, n1, n2, &lcaContainer)
        let mid = node!.data == n1 || node!.data == n2
        if (left && right) || (left && mid) || (right && mid) {
            lcaContainer[0] = node
        }
        return left || mid || right
    }
}

Swift uses inout parameters for pass-by-reference, and @discardableResult suppresses the warning when the boolean return value is unused.

Ruby

class Solution
  def distance(root, n1, n2)
    lca_holder = [nil]
    get_lca(root, n1, n2, lca_holder)
    root_distance(root, n1, 0) + root_distance(root, n2, 0) -
      2 * root_distance(root, lca_holder[0].data, 0)
  end

  def root_distance(node, n, steps_so_far)
    return -1 if node.nil?
    return steps_so_far if node.data == n
    [root_distance(node.left, n, steps_so_far + 1),
     root_distance(node.right, n, steps_so_far + 1)].max
  end

  def get_lca(node, n1, n2, lca_container)
    return false if node.nil?
    left = get_lca(node.left, n1, n2, lca_container)
    right = get_lca(node.right, n1, n2, lca_container)
    mid = node.data == n1 || node.data == n2
    lca_container[0] = node if (left && right) || (left && mid) || (right && mid)
    left || mid || right
  end
end

C#

public class Solution {
    public int Distance(TreeNode? root, int n1, int n2) {
        TreeNode?[] lcaHolder = new TreeNode?[1];
        GetLca(root, n1, n2, lcaHolder);
        return RootDistance(root, n1, 0)
                 + RootDistance(root, n2, 0)
                 - 2 * (RootDistance(root, lcaHolder[0]!.data, 0));
    }

    private int RootDistance(TreeNode? node, int n, int stepsSoFar) {
        if (node == null) return -1;
        else if (node.data == n) return stepsSoFar;
        else return Math.Max(
            RootDistance(node.left, n, stepsSoFar + 1),
            RootDistance(node.right, n, stepsSoFar + 1)
        );
    }

    private bool GetLca(TreeNode? node, int n1, int n2, TreeNode?[] lcaContainer) {
        if (node == null) return false;
        bool left = GetLca(node.left, n1, n2, lcaContainer);
        bool right = GetLca(node.right, n1, n2, lcaContainer);
        bool mid = node.data == n1 || node.data == n2;
        if ((left && right) || (left && mid) || (right && mid))
            lcaContainer[0] = node;
        return left || mid || right;
    }
}

Dart

class Solution {
  int distance(TreeNode? root, int n1, int n2) {
    List<TreeNode?> lcaHolder = [null];
    _getLca(root, n1, n2, lcaHolder);
    return _rootDistance(root, n1, 0)
             + _rootDistance(root, n2, 0)
             - 2 * (_rootDistance(root, lcaHolder[0]!.data, 0));
  }

  int _rootDistance(TreeNode? node, int n, int stepsSoFar) {
    if (node == null) return -1;
    else if (node.data == n) return stepsSoFar;
    else return max(
        _rootDistance(node.left, n, stepsSoFar + 1),
        _rootDistance(node.right, n, stepsSoFar + 1)
      );
  }

  bool _getLca(TreeNode? node, int n1, int n2, List<TreeNode?> lcaContainer) {
    if (node == null) return false;
    bool left = _getLca(node.left, n1, n2, lcaContainer);
    bool right = _getLca(node.right, n1, n2, lcaContainer);
    bool mid = node.data == n1 || node.data == n2;
    if ((left && right) || (left && mid) || (right && mid))
      lcaContainer[0] = node;
    return left || mid || right;
  }
}

PHP

class Solution {
    public function distance(?TreeNode $root, int $n1, int $n2): int {
        $lcaHolder = [];
        $this->getLca($root, $n1, $n2, $lcaHolder);
        return $this->rootDistance($root, $n1, 0)
            + $this->rootDistance($root, $n2, 0)
            - 2 * ($this->rootDistance($root, $lcaHolder[0]->data, 0));
    }

    private function rootDistance(?TreeNode $node, int $n, int $stepsSoFar): int {
        if ($node === null) return -1;
        else if ($node->data === $n) return $stepsSoFar;
        else return max(
            $this->rootDistance($node->left, $n, $stepsSoFar + 1),
            $this->rootDistance($node->right, $n, $stepsSoFar + 1)
        );
    }

    private function getLca(?TreeNode $node, int $n1, int $n2, array &$lcaContainer): bool {
        if ($node === null) return false;
        $left = $this->getLca($node->left, $n1, $n2, $lcaContainer);
        $right = $this->getLca($node->right, $n1, $n2, $lcaContainer);
        $mid = $node->data === $n1 || $node->data === $n2;
        if (($left && $right) || ($left && $mid) || ($right && $mid))
            $lcaContainer[0] = $node;
        return $left || $mid || $right;
    }
}

PHP uses pass-by-reference (&$lcaContainer) to allow the helper to modify the caller's array.

Rust

impl Solution {
    pub fn distance(&self, root: Option<Box<TreeNode>>, n1: i32, n2: i32) -> i32 {
        let mut lca_data = 0;
        Self::get_lca(&root, n1, n2, &mut lca_data);
        Self::root_distance(&root, n1, 0)
            + Self::root_distance(&root, n2, 0)
            - 2 * Self::root_distance(&root, lca_data, 0)
    }

    fn root_distance(node: &Option<Box<TreeNode>>, n: i32, steps_so_far: i32) -> i32 {
        match node {
            None => -1,
            Some(current) => {
                if current.data == n {
                    return steps_so_far;
                }
                std::cmp::max(
                    Self::root_distance(&current.left, n, steps_so_far + 1),
                    Self::root_distance(&current.right, n, steps_so_far + 1),
                )
            }
        }
    }

    fn get_lca(node: &Option<Box<TreeNode>>, n1: i32, n2: i32, lca_data: &mut i32) -> bool {
        match node {
            None => false,
            Some(current) => {
                let left = Self::get_lca(&current.left, n1, n2, lca_data);
                let right = Self::get_lca(&current.right, n1, n2, lca_data);
                let mid = current.data == n1 || current.data == n2;
                if (left && right) || (left && mid) || (right && mid) {
                    *lca_data = current.data;
                }
                left || mid || right
            }
        }
    }
}

The Rust version uses &mut i32 to pass the LCA data value by mutable reference, and match on Option<Box<TreeNode>> for clean null handling.

Related Problems

Once you're comfortable with this problem, try these to build on the same skills:

  • Lowest Common Ancestor (the building block for this problem)
  • Diameter of Binary Tree (another LCA-adjacent distance problem)
  • Distance of a Node from the Root of a Binary Tree (the simpler helper used here)
  • Height of a Binary Tree (foundational tree recursion)

This problem and hundreds of others are available on Firecode, where over 50,000 engineers have prepared for technical interviews and landed roles at top tech companies. If this problem felt challenging, try the LCA problem first, then come back. If it felt comfortable, try the diameter problem next for a tougher variation.

Frequently Asked Questions

What is the distance between two nodes in a binary tree?
The distance between two nodes is the minimum number of edges you must traverse to get from one node to the other. For example, in a complete binary tree [1,2,3,4,5,6,7], the distance between sibling nodes 4 and 5 is 2 (4->2->5), while the distance between nodes on opposite subtrees like 4 and 6 is 4 (4->2->1->3->6).
Why do we need the Lowest Common Ancestor to find the distance?
The LCA is the deepest node that is an ancestor of both target nodes. Any path between two nodes in a tree must pass through their LCA. By finding the LCA, we can decompose the problem: distance(n1, n2) = distance(root, n1) + distance(root, n2) - 2 * distance(root, LCA). This avoids having to search for a direct path through the tree.
What is the time complexity of the LCA-based distance algorithm?
The time complexity is O(n) where n is the number of nodes. Finding the LCA requires one traversal of the tree, and computing the distance from the root to each node requires up to one traversal each. In total, we visit each node a constant number of times, giving O(n) overall.
What is the space complexity of this approach?
The space complexity is O(n) in the worst case due to the recursive call stack. For a skewed tree (essentially a linked list), the recursion depth equals the number of nodes. For a balanced tree, the space is O(log n).
How does the distance formula work mathematically?
The formula distance(n1, n2) = dist(root, n1) + dist(root, n2) - 2 * dist(root, LCA) works because the root-to-n1 path and the root-to-n2 path both pass through the LCA. The shared segment from root to LCA is counted twice in the sum, so we subtract it twice. What remains is the path length from LCA to n1 plus the path length from LCA to n2, which is the direct distance between the two nodes.
Can this problem be solved without finding the LCA?
Yes, you can use BFS from one node to find the other, treating the tree as an undirected graph. This also runs in O(n) time and space. However, the LCA approach is more elegant and is the expected solution in interviews because it demonstrates deeper understanding of tree properties. It also naturally extends to answering multiple distance queries efficiently.
What happens when one node is an ancestor of the other?
The algorithm handles this correctly. If node n1 is an ancestor of n2, then the LCA is n1 itself. The formula still works: dist(root, n1) + dist(root, n2) - 2 * dist(root, n1) simplifies to dist(root, n2) - dist(root, n1), which is the number of edges between them.
How does this relate to the Lowest Common Ancestor problem?
Finding the distance between two nodes is essentially the LCA problem plus a distance calculation step. If you have already solved the LCA problem, you only need to add a helper function to compute root-to-node distances. This makes it a natural follow-up question in interviews after the LCA problem.
What are real-world applications of node distance in trees?
Node distance calculations appear in network routing (finding hop count between routers), file system operations (computing relative paths between directories), phylogenetic trees in biology (measuring evolutionary distance between species), and organizational hierarchies (finding management chain length between two employees).
How would you optimize this for multiple distance queries on the same tree?
For multiple queries, precompute the depth of every node and use a precomputed LCA structure (like binary lifting or Euler tour + sparse table). Binary lifting preprocesses the tree in O(n log n) time and answers each LCA query in O(log n), making repeated distance queries much faster than recomputing from scratch each time.