Binary tree decompression

AS
Ackshaey Singh
Difficulty Hard
Time Complexity
O(n)
Space Complexity
O(n)
Binary tree String Queue
Dropbox Facebook

You're sitting in a Dropbox phone screen and the interviewer says, "We store file system metadata as serialized trees. Given a compressed string, can you reconstruct the original tree?" You know binary trees. You know BFS. But the twist here is the compression format, where missing subtrees are encoded as a single asterisk rather than padding out every null position to the last level. This problem, also known as "Deserialize Binary Tree from Level-Order String" on other platforms, tests whether you can combine string parsing with level-order reconstruction in a clean, efficient way.

TL;DR

Split the compressed string on commas to produce a list of tokens. Convert each token to a TreeNode (or null for *). Use a queue seeded with the root node, then iterate through the remaining tokens: for each parent dequeued, assign its left child from the next token, then its right child from the following token. Enqueue any non-null children. This runs in O(n) time and O(n) space.

Prefer a different language? Jump to solutions in other languages.

Why This Problem Matters

Serialization and deserialization sit at the heart of real systems. Every time Dropbox syncs your file tree, or Facebook stores a comment thread, some form of tree serialization is at work. Interviewers at these companies love this problem because it tests three things at once: your comfort with binary trees, your ability to use a queue for BFS, and your attention to tricky encoding details. Nail this, and you're well prepared for a wide family of tree reconstruction problems.

Understanding the Serialization Format

The input is a comma-separated string where each token is either an integer (a node value) or an asterisk * (a missing node). Tokens appear in level-order, left to right, top to bottom.

The compressed part is important. Consider a right-skewed tree:

Loading visualization...

The serialized string is "1,*,2,*,3". Without compression, a full level-order encoding would be "1,*,2,*,*,*,3" because you would need to represent the missing children of node 1's nonexistent left subtree. The compression drops those trailing nulls that carry no structural information.

Similarly, a left-skewed tree:

Loading visualization...

is "1,2,*,3,*", not "1,2,*,3,*,*,*". Once we see * for node 1's right child, we know that entire right subtree is empty and do not need to encode its missing children.

Reading the Examples

For the simple case "1,2,3", the result is a complete tree with root 1, left child 2, right child 3:

Loading visualization...

For "1,2,3,*,4,*,5", the asterisks indicate that node 2 has no left child and node 3 has no left child:

Loading visualization...

And a full balanced tree "1,2,3,4,5,6,7" reconstructs perfectly:

Loading visualization...

Edge Cases to Keep in Mind

  1. Single node: "1" produces a lone root with no children
  2. Right-skewed: "1,*,2,*,3" where every left child is null
  3. Left-skewed: "1,2,*,3,*" where every right child is null
  4. Full tree: No asterisks at all when every level is complete

Solution Approach

The serialized string was created by visiting nodes in level-order (BFS). To reverse the process, we reconstruct in the same order. The core idea is straightforward: use a queue to track which parent nodes still need children assigned, then consume tokens one by one, alternating between left and right child assignments.

The Algorithm

  1. Split the string on commas
  2. Convert each token to a TreeNode (or null for *)
  3. Initialize a queue with the root node
  4. Walk through the remaining tokens with a counter that alternates between 0 (left child) and 1 (right child)
  5. When the counter is 0, dequeue the next parent and assign its left child
  6. When the counter is 1, assign the right child and reset the counter
  7. Enqueue any non-null child so it can later receive its own children

Tracing Through an Example

Let's walk through "1,2,3,*,5,6,*" step by step.

Step 1: Parse tokens into [1, 2, 3, *, 5, 6, *]. Create root node 1, enqueue it. Queue: [1]

Loading visualization...

Step 2: Token 2 (index 1). Counter is 0, so dequeue parent 1 and set 1.left = 2. Counter becomes 1. Enqueue 2. Queue: [2]

Loading visualization...

Step 3: Token 3 (index 2). Counter is 1, so set 1.right = 3. Counter resets to 0. Enqueue 3. Queue: [2, 3]

Loading visualization...

Step 4: Token * (index 3). Counter is 0, so dequeue parent 2 and set 2.left = null. Counter becomes 1. Do not enqueue null.

Step 5: Token 5 (index 4). Counter is 1, so set 2.right = 5. Counter resets to 0. Enqueue 5. Queue: [3, 5]

Loading visualization...

Step 6: Token 6 (index 5). Counter is 0, so dequeue parent 3 and set 3.left = 6. Counter becomes 1. Enqueue 6. Queue: [5, 6]

Step 7: Token * (index 6). Counter is 1, so set 3.right = null. Counter resets to 0. Do not enqueue null.

Final tree:

Loading visualization...

The queue still holds nodes 5 and 6, but we have no more tokens, so we stop. Those leaf nodes simply never receive children.

Implementation

Here is the Java solution:

import java.util.*;
import java.util.stream.Collectors;

public class Solution {
  public TreeNode decompress(String compressed) {
    // Convert comma-separated tokens into TreeNode or null
    List<TreeNode> treeNodes =
      Arrays.stream(compressed.split(","))
        .map(s -> s.equals("*") ? null : new TreeNode(Integer.parseInt(s)))
        .collect(Collectors.toList());

    // Queue for level-order reconstruction
    Deque<TreeNode> queue = new LinkedList<>();
    TreeNode node = treeNodes.get(0);
    queue.offer(node);
    int childrenAdded = 0;

    for (int i = 1; i < treeNodes.size(); i++) {
      TreeNode nextNode = treeNodes.get(i);

      if (childrenAdded == 0) {
        // Dequeue next parent, assign left child
        node = queue.poll();
        node.left = nextNode;
        childrenAdded++;
      } else {
        // Assign right child, reset counter
        node.right = nextNode;
        childrenAdded = 0;
      }

      // Only enqueue non-null nodes
      if (nextNode != null) queue.offer(nextNode);
    }

    return treeNodes.get(0);
  }
}

The childrenAdded counter is the key bookkeeping mechanism. It alternates between 0 and 1, telling us whether the next token is a left child (triggering a dequeue of the next parent) or a right child (staying with the current parent). This single integer replaces what would otherwise require more complex state tracking.

Complexity Analysis

Time Complexity: O(n)

We split the string once in O(n) time. We then iterate through the token list once, performing O(1) work per token: one queue operation and one pointer assignment. Total: O(n), where n is the number of tokens in the serialized string.

Space Complexity: O(n)

The treeNodes list stores n elements. The queue holds at most one full level of the tree at a time, which is at most n/2 for a balanced tree. The output tree itself uses O(n) space. All together, this is O(n).

Could We Do Better?

Not really. We need to read every token at least once, so O(n) time is a lower bound. We also need to store the entire output tree, so O(n) space is required. This solution is optimal on both fronts.

Common Pitfalls

  1. Forgetting to skip null children in the queue: If you enqueue null values, the algorithm will try to assign children to a nonexistent node and crash with a NullPointerException. Always check nextNode != null before enqueueing.

  2. Off-by-one on the counter logic: The counter must start at 0 and alternate between 0 (left) and 1 (right). Starting at 1 or resetting at the wrong time shifts all children by one position.

  3. Confusing compressed vs. uncompressed formats: Some candidates try to pad the string to a full binary tree representation first. That is unnecessary and wasteful. The queue-based approach handles the compressed format directly.

  4. Using a stack instead of a queue: Since the tokens are in level-order (BFS order), you must process parents in FIFO order. Using a stack reverses the parent processing order and produces an incorrect tree.

Interview Tips

When presenting this solution in an interview:

  1. Clarify the format first: Ask whether asterisks represent null nodes and whether the string is always valid. Confirm that the tree uses a compressed level-order encoding.

  2. Sketch the reconstruction: Draw the tree for "1,2,3,*,5" on the whiteboard. Show the queue state at each step. This demonstrates you understand BFS reconstruction, not just the code.

  3. Mention the symmetry: Point out that serialization and deserialization are inverse operations. If the interviewer asks a follow-up about serialization, you can describe the reverse process using the same BFS pattern.

  4. Discuss tradeoffs: Level-order serialization is compact for balanced trees but can be wasteful for skewed trees. Preorder with null markers is another common format. Mentioning alternatives shows breadth of knowledge.

Key Takeaways

  • The queue-based BFS approach mirrors the level-order encoding, making reconstruction natural and straightforward.
  • A single childrenAdded counter handles the alternation between left and right child assignment without complex state.
  • Always skip null children when enqueueing, otherwise the algorithm will attempt to assign children to nonexistent nodes.
  • The compressed format avoids encoding null subtrees beyond the first missing node, so your algorithm must consume tokens in order rather than trying to index into a full binary tree array.
  • Both time and space complexity are O(n), which is optimal since every token must be read and every node must be stored.

Solutions in Other Languages

Python

from collections import deque

class Solution:
    def decompress(self, compressed: str) -> TreeNode:
        tree_nodes = [None if s == "*" else TreeNode(int(s))
                      for s in compressed.split(",")]
        queue = deque()
        node = tree_nodes[0]
        queue.append(node)
        children_added = 0

        for i in range(1, len(tree_nodes)):
            next_node = tree_nodes[i]

            if children_added == 0:
                node = queue.popleft()
                node.left = next_node
                children_added += 1
            else:
                node.right = next_node
                children_added = 0

            if next_node is not None:
                queue.append(next_node)

        return tree_nodes[0]

JavaScript

class Solution {
  decompress(compressed) {
    const treeNodes = compressed
      .split(',')
      .map(s => s === '*' ? null : new TreeNode(parseInt(s)));

    const queue = [];
    let node = treeNodes[0];
    queue.push(node);
    let childrenAdded = 0;

    for (let i = 1; i < treeNodes.length; i++) {
      const nextNode = treeNodes[i];

      if (childrenAdded === 0) {
        node = queue.shift();
        node.left = nextNode;
        childrenAdded++;
      } else {
        node.right = nextNode;
        childrenAdded = 0;
      }

      if (nextNode !== null) queue.push(nextNode);
    }

    return treeNodes[0];
  }
}

TypeScript

class Solution {
  decompress(compressed: string): TreeNode | null {
    const treeNodes: (TreeNode | null)[] = compressed
      .split(',')
      .map(s => s === '*' ? null : new TreeNode(parseInt(s)));

    const queue: TreeNode[] = [];
    let node = treeNodes[0];
    queue.push(node!);
    let childrenAdded = 0;

    for (let i = 1; i < treeNodes.length; i++) {
      const nextNode = treeNodes[i];

      if (childrenAdded === 0) {
        node = queue.shift()!;
        node.left = nextNode;
        childrenAdded++;
      } else {
        node!.right = nextNode;
        childrenAdded = 0;
      }

      if (nextNode !== null) queue.push(nextNode!);
    }

    return treeNodes[0];
  }
}

C++

#include <queue>
#include <vector>
#include <string>

class Solution {
public:
  TreeNode* decompress(const std::string& compressed) {
    std::vector<std::string> nodesData = split(compressed, ',');
    std::queue<TreeNode*> queue;
    TreeNode* node = genNode(nodesData[0]);
    TreeNode* root = node;
    queue.push(node);
    int childrenAdded = 0;

    for (int i = 1; i < nodesData.size(); i++) {
      TreeNode* nextNode = genNode(nodesData[i]);

      if (childrenAdded == 0) {
        node = queue.front();
        queue.pop();
        node->left = nextNode;
        childrenAdded++;
      } else {
        node->right = nextNode;
        childrenAdded = 0;
      }

      if (nextNode != nullptr) queue.push(nextNode);
    }

    return root;
  }

private:
  static TreeNode* genNode(const std::string& s) {
    return s == "*" ? nullptr : new TreeNode(stoi(s));
  }
};

Scala

import scala.collection.mutable

class Solution {
  def decompress(compressed: String): TreeNode = {
    val treeNodes = compressed.split(",").map {
      case "*" => null
      case s => TreeNode(s.toInt)
    }

    val queue = mutable.Queue[TreeNode]()
    var node: TreeNode = treeNodes(0)
    queue.enqueue(node)
    var childrenAdded = 0

    for (i <- 1 until treeNodes.length) {
      val nextNode = treeNodes(i)

      if (childrenAdded == 0) {
        node = queue.dequeue
        node.left = nextNode
        childrenAdded += 1
      } else {
        node.right = nextNode
        childrenAdded = 0
      }

      if (nextNode != null) queue.enqueue(nextNode)
    }

    treeNodes(0)
  }
}

Kotlin

import java.util.*

class Solution {
    fun decompress(compressed: String): TreeNode? {
        val treeNodes = compressed.split(",").map { s ->
            if (s == "*") null else TreeNode(s.toInt())
        }

        val queue = LinkedList<TreeNode?>()
        var node = treeNodes[0]
        queue.add(node)
        var childrenAdded = 0

        for (i in 1 until treeNodes.size) {
            val nextNode = treeNodes[i]

            if (childrenAdded == 0) {
                node = queue.removeFirst()
                node?.left = nextNode
                childrenAdded++
            } else {
                node?.right = nextNode
                childrenAdded = 0
            }

            if (nextNode != null) queue.add(nextNode)
        }

        return treeNodes[0]
    }
}

Ruby

class Solution
  def decompress(compressed)
    tree_nodes = compressed.split(',').map { |s| s == '*' ? nil : TreeNode.new(s.to_i) }
    queue = []
    node = tree_nodes[0]
    queue.push(node)
    children_added = 0

    (1...tree_nodes.length).each do |i|
      next_node = tree_nodes[i]

      if children_added == 0
        node = queue.shift
        node.left = next_node
        children_added += 1
      else
        node.right = next_node
        children_added = 0
      end

      queue.push(next_node) if next_node
    end

    tree_nodes[0]
  end
end

Swift

class Solution {
    func decompress(_ compressed: String) -> TreeNode? {
        let treeNodes = compressed.split(separator: ",").map {
            $0 == "*" ? nil : TreeNode(Int($0)!)
        }

        var queue: [TreeNode] = []
        var node = treeNodes[0]
        queue.append(node!)
        var childrenAdded = 0

        for i in 1..<treeNodes.count {
            let nextNode = treeNodes[i]

            if childrenAdded == 0 {
                node = queue.removeFirst()
                node!.left = nextNode
                childrenAdded += 1
            } else {
                node!.right = nextNode
                childrenAdded = 0
            }

            if let nextNode = nextNode { queue.append(nextNode) }
        }

        return treeNodes[0]
    }
}

Rust

The Rust implementation uses raw pointers for tree construction since Box<TreeNode> ownership makes shared mutable access tricky during BFS reconstruction:

pub struct Solution;

impl Solution {
    pub fn decompress(&self, compressed: String) -> Option<Box<TreeNode>> {
        let tree_nodes: Vec<Option<Box<TreeNode>>> = compressed
            .split(',')
            .map(|s| {
                if s == "*" { None }
                else { Some(Box::new(TreeNode::new(s.parse::<i32>().unwrap()))) }
            })
            .collect();

        let mut ptrs: Vec<Option<*mut TreeNode>> = tree_nodes
            .into_iter()
            .map(|opt| opt.map(|b| Box::into_raw(b)))
            .collect();

        let mut queue: VecDeque<*mut TreeNode> = VecDeque::new();
        if let Some(root_ptr) = ptrs[0] {
            queue.push_back(root_ptr);
        }

        let mut children_added = 0;
        let mut current: *mut TreeNode = std::ptr::null_mut();

        for i in 1..ptrs.len() {
            if children_added == 0 {
                current = queue.pop_front().unwrap();
                if let Some(child_ptr) = ptrs[i].take() {
                    unsafe {
                        (*current).left = Some(Box::from_raw(child_ptr));
                        queue.push_back(
                            (*current).left.as_mut().unwrap().as_mut() as *mut TreeNode
                        );
                    }
                }
                children_added += 1;
            } else {
                if let Some(child_ptr) = ptrs[i].take() {
                    unsafe {
                        (*current).right = Some(Box::from_raw(child_ptr));
                        queue.push_back(
                            (*current).right.as_mut().unwrap().as_mut() as *mut TreeNode
                        );
                    }
                }
                children_added = 0;
            }
        }

        ptrs[0].map(|ptr| unsafe { Box::from_raw(ptr) })
    }
}

Dart

import 'dart:collection';

class Solution {
  TreeNode? decompress(String compressed) {
    List<TreeNode?> treeNodes = compressed
        .split(",")
        .map((s) => s == "*" ? null : TreeNode(int.parse(s)))
        .toList();

    Queue<TreeNode> queue = Queue();
    TreeNode? node = treeNodes[0];
    queue.add(node!);
    int childrenAdded = 0;

    for (int i = 1; i < treeNodes.length; i++) {
      TreeNode? nextNode = treeNodes[i];

      if (childrenAdded == 0) {
        node = queue.removeFirst();
        node!.left = nextNode;
        childrenAdded++;
      } else {
        node!.right = nextNode;
        childrenAdded = 0;
      }

      if (nextNode != null) queue.add(nextNode);
    }

    return treeNodes[0];
  }
}

PHP

class Solution {
    public function decompress(string $compressed): ?TreeNode {
        $treeNodes = array_map(
            fn($s) => $s === "*" ? null : new TreeNode(intval($s)),
            explode(",", $compressed)
        );

        $queue = [];
        $node = $treeNodes[0];
        $queue[] = $node;
        $childrenAdded = 0;

        for ($i = 1; $i < count($treeNodes); $i++) {
            $nextNode = $treeNodes[$i];

            if ($childrenAdded === 0) {
                $node = array_shift($queue);
                $node->left = $nextNode;
                $childrenAdded++;
            } else {
                $node->right = $nextNode;
                $childrenAdded = 0;
            }

            if ($nextNode !== null) $queue[] = $nextNode;
        }

        return $treeNodes[0];
    }
}

C#

using System.Linq;

public class Solution {
    public TreeNode? Decompress(string compressed) {
        var treeNodes = compressed.Split(',')
            .Select(s => s == "*" ? null : new TreeNode(int.Parse(s)))
            .ToList();

        var queue = new Queue<TreeNode>();
        TreeNode node = treeNodes[0]!;
        queue.Enqueue(node);
        int childrenAdded = 0;

        for (int i = 1; i < treeNodes.Count; i++) {
            TreeNode? nextNode = treeNodes[i];

            if (childrenAdded == 0) {
                node = queue.Dequeue();
                node.left = nextNode;
                childrenAdded++;
            } else {
                node.right = nextNode;
                childrenAdded = 0;
            }

            if (nextNode != null) queue.Enqueue(nextNode);
        }

        return treeNodes[0];
    }
}

Practice Makes Perfect

Binary tree serialization is one piece of a larger family of tree problems that interviewers love. Once you are comfortable with this, try these related challenges to strengthen your tree skills:

  • Level-order traversal (the forward direction of this problem)
  • Binary tree from preorder and inorder traversal
  • Construct a BST from a sorted array
  • Serialize and deserialize a binary tree (the bidirectional version)

This problem and many others are available on Firecode, where over 50,000 engineers have prepared for technical interviews and landed roles at top tech companies. Consistent practice with problems like this builds the pattern recognition that makes interview day feel familiar rather than stressful.

Frequently Asked Questions

What is binary tree serialization and deserialization?
Serialization converts a binary tree into a string format for storage or transmission. Deserialization reconstructs the original tree from that string. Level-order serialization uses BFS to visit nodes left to right, encoding null children as asterisks (*) to preserve tree structure.
What is the time complexity of deserializing a binary tree from a level-order string?
The time complexity is O(n) where n is the number of tokens in the serialized string. Each token is processed exactly once: parsed into a TreeNode or null, then linked to its parent via the queue. No repeated traversals are needed.
What is the space complexity of queue-based binary tree deserialization?
The space complexity is O(n) where n is the number of nodes. The queue holds at most one full level of the tree at a time, which for a balanced tree is roughly n/2 nodes. The output tree itself also requires O(n) space for its nodes.
How does the compression in this serialization format work?
The compression eliminates trailing asterisks that carry no structural information. If node 1's left subtree is null, we encode one asterisk for that missing child, but we do not encode the missing grandchildren of that null subtree. This reduces the string length from 2^d - 1 tokens down to only the tokens needed to reconstruct the tree.
Why use a queue instead of recursion for level-order deserialization?
Level-order serialization encodes nodes breadth-first, so deserialization must also process tokens breadth-first. A queue naturally supports this by holding parent nodes waiting for their children. Recursion follows depth-first order, which does not align with the token ordering in the serialized string.
How do you handle null nodes during tree reconstruction?
When a token is an asterisk (*), create a null reference instead of a TreeNode. Assign it as the left or right child of the current parent. Since null nodes have no children, do not add them to the queue. This prevents the algorithm from trying to assign children to nonexistent nodes.
What are the edge cases for binary tree deserialization?
Key edge cases include: a single node tree (string contains just one number), right-skewed trees (alternating asterisks and values), left-skewed trees, and complete binary trees. The algorithm handles all of these correctly because the queue-based approach processes every token in order regardless of tree shape.
How does this compare to other tree serialization formats like preorder or parenthesized notation?
Level-order serialization with compression is space-efficient for balanced trees and simple to implement with a queue. Preorder serialization with null markers works well with recursive reconstruction. Parenthesized notation like Lisp-style s-expressions is human-readable but verbose. Level-order is common in competitive programming and interview settings.
Can this algorithm handle very large trees efficiently?
Yes. The algorithm processes each token once in O(1) time, giving O(n) total. Memory usage is bounded by the tree size plus a queue holding at most one level of width. For a tree with 10,000 nodes the algorithm completes almost instantly. The main constraint is memory for storing the tree itself.
How is binary tree serialization used in real-world systems?
Tree serialization appears in database storage (B-trees persisted to disk), network protocols (transmitting hierarchical data), file systems (directory structures), and distributed systems (sending tree-structured data between services). Companies like Dropbox and Facebook use similar techniques for structured data transfer and caching.