Binary tree serialization and deserialization
You're in a Dropbox interview, and the interviewer draws a binary tree on the whiteboard. "We need to save this tree to disk and load it back later," they say. "How would you convert it to a string, and then reconstruct it from that string?" This is the classic binary tree serialization and deserialization problem, sometimes called "Serialize and Deserialize Binary Tree" on other platforms. It tests your ability to design a data format, apply tree traversals, and handle edge cases around null nodes.
TL;DR
Serialize a binary tree by performing a recursive pre-order traversal, recording each node's value and using a sentinel (like *) for null children. Join the values into a comma-separated string. To deserialize, split the string into tokens, then recursively consume tokens from the front: create a node for each value, return null for each *, and recursively assign left and right children. Both operations run in O(n) time and O(n) space.
Why This Problem Matters
Serialization is everywhere in software engineering. Every time you save a data structure to a file, send it over a network, or cache it in Redis, you are serializing and deserializing. This problem strips that concept down to its core: given a recursive structure, design a flat representation that preserves all the information needed to reconstruct it. Companies like Dropbox and Twitter ask this because it combines tree traversal, string manipulation, and design thinking into a single question.
Binary Trees: A Quick Refresher
A binary tree is a hierarchical data structure where each node has at most two children. Each node stores a value and pointers to its left and right child.
Loading visualization...
Key properties:
- Root: The topmost node (node 1 above)
- Leaf nodes: Nodes with no children (nodes 4 and 5)
- Null children: A node can have zero, one, or two children. Missing children are null.
- Shape matters: Two trees with the same values but different structure are different trees.
That last point is critical for this problem. We need our serialization to capture not just the values, but the exact shape of the tree.
Understanding the Problem
Given: The root of a binary tree (with up to 10,000 nodes) Implement:
serialize(root)- Convert the tree to a stringdeserialize(string)- Reconstruct the original tree from the string Goal:deserialize(serialize(tree))should produce an identical copy of the original tree
Here is the example from the problem:
Loading visualization...
The tree above has nodes 1, 2, 3, 4, and 5. Node 2 has no left child (null) but has a right child (4). Node 3 has no left child but has a right child (5). Our serialization needs to capture all of this.
Why Null Nodes Matter
Consider these two trees. Both contain nodes 6 and 4:
Loading visualization...
If we only stored "6,4", we could not tell whether 4 is the left child or right child of 6. By encoding null positions, we produce "6,4,*,*,*" for a left-child-only tree, which unambiguously describes the structure.
Edge Cases
- Empty tree (null root): Serializes to
"*" - Single node: Serializes to
"4,*,*" - Complete binary tree: Every level is full, no wasted null markers until the leaves
- Skewed tree: All nodes on one side, produces many
*markers
Solution Approach: Pre-Order Traversal
Several traversal orders could work, but pre-order (root, left, right) is the most natural choice. The root appears first in the serialized output, which makes deserialization straightforward: the first token is always the root.
The Serialization Algorithm
- Visit the current node
- If it is null, record
* - Otherwise, record its value, then recursively serialize the left subtree, then the right subtree
- Join everything with commas
Let's trace through the simple tree [6, 4, 8]:
Loading visualization...
Loading visualization...
The pre-order traversal visits 6 first, then goes left to 4, records 4's null children as *, backtracks, goes right to 8, and records 8's null children. The result is "6,4,*,*,8,*,*".
The Deserialization Algorithm
Deserialization reverses the process. Split the string by commas to get a list of tokens, then consume them one by one:
- Pop the first token from the list
- If it is
*, return null - Otherwise, create a new node with that value
- Recursively build the left subtree (which consumes more tokens)
- Recursively build the right subtree (which consumes the remaining tokens)
- Return the node
The recursive calls naturally consume the right number of tokens because each subtree, including its null terminators, forms a contiguous block in the pre-order sequence.
Loading visualization...
Prefer a different language? Jump to solutions in other languages.
Implementation
Here is the Java solution:
import java.util.Arrays;
import java.util.LinkedList;
import java.util.stream.Collectors;
public class Solution {
String serialize(TreeNode root) {
LinkedList<TreeNode> preOrderNodes = new LinkedList<>();
serializePreOrder(root, preOrderNodes);
return preOrderNodes
.stream()
.map(t -> t == null ? "*" : String.valueOf(t.data))
.collect(Collectors.joining(","));
}
void serializePreOrder(TreeNode node, LinkedList<TreeNode> preOrderNodes) {
preOrderNodes.add(node);
if (node != null) {
serializePreOrder(node.left, preOrderNodes);
serializePreOrder(node.right, preOrderNodes);
}
}
TreeNode deserialize(String serialized) {
return deserializePreOrder(
Arrays.stream(serialized.split(","))
.map(s -> s.equals("*") ? null : new TreeNode(Integer.parseInt(s)))
.collect(Collectors.toCollection(LinkedList::new))
);
}
TreeNode deserializePreOrder(LinkedList<TreeNode> preOrderNodes) {
TreeNode node = preOrderNodes.remove();
if (node != null) {
node.left = deserializePreOrder(preOrderNodes);
node.right = deserializePreOrder(preOrderNodes);
}
return node;
}
}
Walking Through the Code
Serialization builds a LinkedList of TreeNode references (including nulls) via pre-order traversal. After traversal completes, it maps each entry to either its value or "*" and joins them with commas.
Deserialization splits the string back into tokens, converts each to a TreeNode (or null for "*"), and stores them in a LinkedList. The deserializePreOrder method removes nodes from the front of the list. Because LinkedList is passed by reference, each recursive call advances the position in the list automatically. When the method encounters a null node, it returns immediately without consuming additional tokens.
Tracing a Larger Example
For the complete binary tree [1, 2, 3, 4, 5, 6, 7]:
Loading visualization...
Serialization produces: "1,2,4,*,*,5,*,*,3,6,*,*,7,*,*"
Deserialization reads tokens left to right:
- Pop
1, create root - Pop
2, assign asroot.left - Pop
4, assign asnode(2).left - Pop
*and*for node(4)'s children - Pop
5, assign asnode(2).right - Pop
*and*for node(5)'s children - Pop
3, assign asroot.right - And so on for nodes 6 and 7
Each subtree consumes exactly its share of tokens, and the tree is rebuilt perfectly.
Complexity Analysis
Time Complexity: O(n)
Serialization visits every node once during pre-order traversal. The string joining step processes n values plus the null markers. Deserialization processes each token exactly once. Both operations are linear in the total number of tokens, which is 2n + 1 for a tree with n nodes (each node contributes one value, and there are n + 1 null markers).
Space Complexity: O(n)
The serialized string itself is O(n). The recursion stack during traversal is O(h) where h is the tree height (O(log n) for balanced trees, O(n) for skewed trees). The intermediate list of nodes is O(n). Overall space is O(n).
Alternative Approaches
- BFS (level-order) serialization: Serialize level by level using a queue. Produces output like
"1,2,3,#,4,#,5". More intuitive for array-based tree representations but requires more bookkeeping. - In-order + pre-order: Store both traversals to reconstruct the tree. Works but uses twice the space and does not handle duplicate values well.
- Parenthesized format: Something like
"1(2()(4))(3()(5))". Compact but harder to parse.
Pre-order with null markers is the cleanest approach for interviews because it uses a single traversal and produces a format that maps directly to a simple recursive parser.
Common Pitfalls
-
Forgetting null markers: Without them, you cannot distinguish between structurally different trees that share the same values. This is the most frequent mistake I see.
-
Mismatched traversal orders: If you serialize with pre-order but try to deserialize as if it were in-order, the tree will be garbled. Both methods must use the same traversal.
-
Not consuming from the front: During deserialization, tokens must be consumed from the front of the list (not the back). Using
remove()on aLinkedListremoves the first element. Usingpop()would remove the last. -
Multi-digit values: If your split logic is wrong,
"12,3"might be parsed as"1","2","3". Always split on the delimiter, not character by character.
Interview Tips
When you encounter this problem in an interview:
-
Clarify the format: Ask whether there are constraints on the serialization format. Most interviewers are flexible, but some want a specific encoding.
-
Start with serialization: It is the easier half. Explain your traversal choice and how you handle nulls.
-
Draw the deserialization: Sketch the token list and show how consuming from the front rebuilds the tree. This makes the "magic" of the recursive approach tangible.
-
Mention alternatives: Briefly note that BFS would also work, and explain why you chose pre-order (simpler recursion, cleaner code).
-
Test with edge cases: Walk through the empty tree (just
"*") and a single node ("4,*,*") to prove your solution handles boundaries.
Solutions in Other Languages
Python
class Solution:
def serialize(self, root):
pre_order_nodes = []
self.serialize_pre_order(root, pre_order_nodes)
return ",".join(["*" if t is None else str(t.data)
for t in pre_order_nodes])
def serialize_pre_order(self, node, pre_order_nodes):
pre_order_nodes.append(node)
if node is not None:
self.serialize_pre_order(node.left, pre_order_nodes)
self.serialize_pre_order(node.right, pre_order_nodes)
def deserialize(self, serialized):
return self.deserialize_pre_order(
[None if s == "*" else TreeNode(int(s))
for s in serialized.split(",")]
)
def deserialize_pre_order(self, pre_order_nodes):
node = pre_order_nodes.pop(0)
if node is not None:
node.left = self.deserialize_pre_order(pre_order_nodes)
node.right = self.deserialize_pre_order(pre_order_nodes)
return node
JavaScript
class Solution {
serialize(root) {
const preOrderNodes = [];
this.serializePreOrder(root, preOrderNodes);
return preOrderNodes
.map(t => t === null ? "*" : t.data.toString())
.join(",");
}
serializePreOrder(node, preOrderNodes) {
preOrderNodes.push(node);
if (node !== null) {
this.serializePreOrder(node.left, preOrderNodes);
this.serializePreOrder(node.right, preOrderNodes);
}
}
deserialize(serialized) {
return this.deserializePreOrder(
serialized.split(",")
.map(s => s === "*" ? null : new TreeNode(parseInt(s)))
);
}
deserializePreOrder(preOrderNodes) {
const node = preOrderNodes.shift();
if (node !== null) {
node.left = this.deserializePreOrder(preOrderNodes);
node.right = this.deserializePreOrder(preOrderNodes);
}
return node;
}
}
TypeScript
class Solution {
serialize(root: TreeNode | null): string {
const preOrderNodes: (TreeNode | null)[] = [];
this.serializePreOrder(root, preOrderNodes);
return preOrderNodes
.map(t => t === null ? "*" : t.data.toString())
.join(",");
}
serializePreOrder(node: TreeNode | null, preOrderNodes: (TreeNode | null)[]): void {
preOrderNodes.push(node);
if (node !== null) {
this.serializePreOrder(node.left, preOrderNodes);
this.serializePreOrder(node.right, preOrderNodes);
}
}
deserialize(serialized: string): TreeNode | null {
return this.deserializePreOrder(
serialized.split(",")
.map(s => s === "*" ? null : new TreeNode(parseInt(s)))
);
}
deserializePreOrder(preOrderNodes: (TreeNode | null)[]): TreeNode | null {
const node = preOrderNodes.shift()!;
if (node !== null) {
node.left = this.deserializePreOrder(preOrderNodes);
node.right = this.deserializePreOrder(preOrderNodes);
}
return node;
}
}
C++
#include <iostream>
#include <string>
#include <vector>
#include <sstream>
using namespace std;
class Solution {
public:
string serialize(TreeNode* root) {
vector<TreeNode*> preOrderNodes;
serializePreOrder(root, preOrderNodes);
string result;
for (int i = 0; i < preOrderNodes.size(); i++) {
if (i > 0) result += ",";
result += (preOrderNodes[i] == nullptr ? "*" : to_string(preOrderNodes[i]->data));
}
return result;
}
void serializePreOrder(TreeNode* node, vector<TreeNode*>& preOrderNodes) {
preOrderNodes.push_back(node);
if (node != nullptr) {
serializePreOrder(node->left, preOrderNodes);
serializePreOrder(node->right, preOrderNodes);
}
}
TreeNode* deserialize(string serialized) {
vector<TreeNode*> nodes;
stringstream ss(serialized);
string token;
while (getline(ss, token, ',')) {
nodes.push_back(token == "*" ? nullptr : new TreeNode(stoi(token)));
}
int index = 0;
return deserializePreOrder(nodes, index);
}
TreeNode* deserializePreOrder(vector<TreeNode*>& preOrderNodes, int& index) {
TreeNode* node = preOrderNodes[index++];
if (node != nullptr) {
node->left = deserializePreOrder(preOrderNodes, index);
node->right = deserializePreOrder(preOrderNodes, index);
}
return node;
}
};
Note that the C++ version uses an index reference instead of removing elements from the front of a vector, which avoids O(n) shifts per removal.
Scala
import scala.collection.mutable
class Solution {
def serialize(root: TreeNode): String = {
val preOrderNodes = mutable.ArrayBuffer[TreeNode]()
serializePreOrder(root, preOrderNodes)
preOrderNodes
.map(t => if (t == null) "*" else t.data.toString)
.mkString(",")
}
def serializePreOrder(node: TreeNode, preOrderNodes: mutable.ArrayBuffer[TreeNode]): Unit = {
preOrderNodes.append(node)
if (node != null) {
serializePreOrder(node.left, preOrderNodes)
serializePreOrder(node.right, preOrderNodes)
}
}
def deserialize(serialized: String): TreeNode = {
deserializePreOrder(
mutable.Queue.from(serialized
.split(",")
.map(s => if (s == "*") null else TreeNode(s.toInt)))
)
}
def deserializePreOrder(preOrderNodes: mutable.Queue[TreeNode]): TreeNode = {
val node = preOrderNodes.removeHead()
if (node != null) {
node.left = deserializePreOrder(preOrderNodes)
node.right = deserializePreOrder(preOrderNodes)
}
node
}
}
Kotlin
import java.util.*
class Solution {
fun serialize(root: TreeNode?): String {
val preOrderNodes = LinkedList<TreeNode?>()
serializePreOrder(root, preOrderNodes)
return preOrderNodes
.map { if (it == null) "*" else it.data.toString() }
.joinToString(",")
}
fun serializePreOrder(node: TreeNode?, preOrderNodes: LinkedList<TreeNode?>) {
preOrderNodes.add(node)
if (node != null) {
serializePreOrder(node.left, preOrderNodes)
serializePreOrder(node.right, preOrderNodes)
}
}
fun deserialize(serialized: String): TreeNode? {
return deserializePreOrder(
serialized.split(",")
.map { if (it == "*") null else TreeNode(it.toInt()) }
.toCollection(LinkedList())
)
}
fun deserializePreOrder(preOrderNodes: LinkedList<TreeNode?>): TreeNode? {
val node = preOrderNodes.remove()
if (node != null) {
node.left = deserializePreOrder(preOrderNodes)
node.right = deserializePreOrder(preOrderNodes)
}
return node
}
}
Swift
import Foundation
class Solution {
func serialize(_ root: TreeNode?) -> String {
var preOrderNodes = [TreeNode?]()
serializePreOrder(root, &preOrderNodes)
return preOrderNodes
.map { $0 == nil ? "*" : String($0!.data) }
.joined(separator: ",")
}
private func serializePreOrder(_ node: TreeNode?, _ preOrderNodes: inout [TreeNode?]) {
preOrderNodes.append(node)
if node != nil {
serializePreOrder(node!.left, &preOrderNodes)
serializePreOrder(node!.right, &preOrderNodes)
}
}
func deserialize(_ serialized: String) -> TreeNode? {
var nodes = serialized.components(separatedBy: ",")
.map { $0 == "*" ? nil : TreeNode(Int($0)!) }
return deserializePreOrder(&nodes)
}
private func deserializePreOrder(_ preOrderNodes: inout [TreeNode?]) -> TreeNode? {
let node = preOrderNodes.removeFirst()
if node != nil {
node!.left = deserializePreOrder(&preOrderNodes)
node!.right = deserializePreOrder(&preOrderNodes)
}
return node
}
}
Ruby
class Solution
def serialize(root)
pre_order_nodes = []
serialize_pre_order(root, pre_order_nodes)
pre_order_nodes.map { |node| node.nil? ? '*' : node.data.to_s }.join(',')
end
def serialize_pre_order(node, pre_order_nodes)
pre_order_nodes << node
unless node.nil?
serialize_pre_order(node.left, pre_order_nodes)
serialize_pre_order(node.right, pre_order_nodes)
end
end
def deserialize(serialized)
nodes = serialized.split(',').map { |s| s == '*' ? nil : TreeNode.new(s.to_i) }
deserialize_pre_order(nodes)
end
def deserialize_pre_order(pre_order_nodes)
node = pre_order_nodes.shift
unless node.nil?
node.left = deserialize_pre_order(pre_order_nodes)
node.right = deserialize_pre_order(pre_order_nodes)
end
node
end
end
Rust
pub struct Solution;
impl Solution {
fn serialize(root: &Option<Box<TreeNode>>) -> String {
let mut pre_order_values: Vec<String> = Vec::new();
Self::serialize_pre_order(root, &mut pre_order_values);
pre_order_values.join(",")
}
fn serialize_pre_order(node: &Option<Box<TreeNode>>, pre_order_values: &mut Vec<String>) {
match node {
None => {
pre_order_values.push("*".to_string());
}
Some(current) => {
pre_order_values.push(current.data.to_string());
Self::serialize_pre_order(¤t.left, pre_order_values);
Self::serialize_pre_order(¤t.right, pre_order_values);
}
}
}
fn deserialize(serialized: &str) -> Option<Box<TreeNode>> {
let mut tokens: VecDeque<&str> = serialized.split(',').collect();
Self::deserialize_pre_order(&mut tokens)
}
fn deserialize_pre_order(tokens: &mut VecDeque<&str>) -> Option<Box<TreeNode>> {
let token = tokens.pop_front()?;
if token == "*" {
return None;
}
let mut node = Box::new(TreeNode::new(token.parse::<i32>().unwrap()));
node.left = Self::deserialize_pre_order(tokens);
node.right = Self::deserialize_pre_order(tokens);
Some(node)
}
}
The Rust version uses pattern matching in serialization and VecDeque for efficient front-removal during deserialization.
Dart
class Solution {
String serialize(TreeNode? root) {
List<TreeNode?> preOrderNodes = [];
_serializePreOrder(root, preOrderNodes);
return preOrderNodes
.map((t) => t == null ? "*" : t.data.toString())
.join(",");
}
void _serializePreOrder(TreeNode? node, List<TreeNode?> preOrderNodes) {
preOrderNodes.add(node);
if (node != null) {
_serializePreOrder(node.left, preOrderNodes);
_serializePreOrder(node.right, preOrderNodes);
}
}
TreeNode? deserialize(String serialized) {
return _deserializePreOrder(
serialized.split(",")
.map((s) => s == "*" ? null : TreeNode(int.parse(s)))
.toList(),
);
}
TreeNode? _deserializePreOrder(List<TreeNode?> preOrderNodes) {
TreeNode? node = preOrderNodes.removeAt(0);
if (node != null) {
node.left = _deserializePreOrder(preOrderNodes);
node.right = _deserializePreOrder(preOrderNodes);
}
return node;
}
}
C#
using System.Linq;
using System.Collections.Generic;
public class Solution {
public string Serialize(TreeNode? root) {
var preOrderNodes = new List<TreeNode?>();
SerializePreOrder(root, preOrderNodes);
return string.Join(",", preOrderNodes.Select(t => t == null ? "*" : t.data.ToString()));
}
private void SerializePreOrder(TreeNode? node, List<TreeNode?> preOrderNodes) {
preOrderNodes.Add(node);
if (node != null) {
SerializePreOrder(node.left, preOrderNodes);
SerializePreOrder(node.right, preOrderNodes);
}
}
public TreeNode? Deserialize(string serialized) {
return DeserializePreOrder(
new LinkedList<TreeNode?>(
serialized.Split(',')
.Select(s => s == "*" ? null : new TreeNode(int.Parse(s)))
)
);
}
private TreeNode? DeserializePreOrder(LinkedList<TreeNode?> preOrderNodes) {
var node = preOrderNodes.First!.Value;
preOrderNodes.RemoveFirst();
if (node != null) {
node.left = DeserializePreOrder(preOrderNodes);
node.right = DeserializePreOrder(preOrderNodes);
}
return node;
}
}
PHP
class Solution {
public function serialize(?TreeNode $root): string {
$preOrderNodes = [];
$this->serializePreOrder($root, $preOrderNodes);
return implode(',', array_map(
fn($t) => $t === null ? '*' : strval($t->data),
$preOrderNodes
));
}
private function serializePreOrder(?TreeNode $node, array &$preOrderNodes): void {
$preOrderNodes[] = $node;
if ($node !== null) {
$this->serializePreOrder($node->left, $preOrderNodes);
$this->serializePreOrder($node->right, $preOrderNodes);
}
}
public function deserialize(string $serialized): ?TreeNode {
$nodes = array_map(
fn($s) => $s === '*' ? null : new TreeNode(intval($s)),
explode(',', $serialized)
);
return $this->deserializePreOrder($nodes);
}
private function deserializePreOrder(array &$preOrderNodes): ?TreeNode {
$node = array_shift($preOrderNodes);
if ($node !== null) {
$node->left = $this->deserializePreOrder($preOrderNodes);
$node->right = $this->deserializePreOrder($preOrderNodes);
}
return $node;
}
}
Key Takeaways
- Pre-order traversal with null markers is the standard approach for binary tree serialization because the root-first ordering maps directly to a recursive deserializer.
- Null markers are not optional. Without them, structurally different trees serialize to the same string, making reconstruction impossible.
- Both serialize and deserialize run in O(n) time and O(n) space. The recursion stack adds O(h) where h is the tree height, but h is at most n.
- During deserialization, consuming tokens from the front of a shared list lets each recursive call automatically advance past the tokens belonging to its subtree.
- Test your solution with edge cases: empty tree, single node, left-only subtree, and a complete binary tree. These catch null-handling bugs and off-by-one errors.
Practice Makes Perfect
Once you are comfortable with serialization and deserialization, try these related problems to strengthen your tree skills:
- Binary tree decompression (reconstructing a tree from a compressed format)
- Level-order traversal (which is the basis for BFS-style serialization)
- Lowest common ancestor (another problem that requires deep understanding of tree structure)
This problem and hundreds of others are available on Firecode, where over 50,000 users have prepared for technical interviews and landed roles at top tech companies. Whether you are practicing for your first interview or sharpening your skills for a senior role, mastering tree serialization will serve you well.