Remove duplicate nodes from a linked list
"Remove duplicate nodes from a linked list." This problem, also known as Remove Duplicates from Sorted List in its simpler sorted variant, is one of the cleanest introductions to combining linked list traversal with set-based lookups. It strips the linked list manipulation pattern down to its essence: walk the list, remember what you have seen, and unlink anything you have seen before.
TL;DR
Traverse the linked list with two pointers: iterator (current node) and previous (the node before it). Maintain a HashSet of values you have already seen. If the current value is in the set, remove the node by pointing previous.next to iterator.next. Otherwise, add the value to the set and advance previous. Return the original head.
Why This Problem Matters
Linked list pointer manipulation is a staple of coding interviews because it tests a skill that does not come naturally to most developers: reasoning about mutable references in a sequential data structure. Adding a HashSet on top of that tests whether you can combine two data structures to solve a problem efficiently. This exact pattern, using a set to deduplicate while traversing a sequence, appears in dozens of other problems: removing duplicates from arrays, finding unique elements in streams, and detecting first non-repeating characters.
Understanding the Problem
Given the head ListNode of a singly linked list, remove all subsequent duplicate nodes, keeping only the first occurrence of each value. Return the head of the modified list.
removeDuplicates([1, 2, 3, 1, 2]) -> [1, 2, 3]
removeDuplicates([1, 1, 1, 1, 1]) -> [1]
removeDuplicates([1, 2, 3]) -> [1, 2, 3]
removeDuplicates([]) -> []
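The examples above can be checked with a minimal, self-contained Python sketch. The ListNode shape (data and next fields) matches what the solutions later in this article assume; from_list and to_list are hypothetical helper names added here purely for illustration.

```python
class ListNode:
    """Minimal singly linked list node with the data/next fields used throughout."""
    def __init__(self, data):
        self.data = data
        self.next = None

def from_list(values):
    """Hypothetical helper: build a linked list from a Python list."""
    head = tail = None
    for v in values:
        node = ListNode(v)
        if head is None:
            head = node
        else:
            tail.next = node
        tail = node
    return head

def to_list(head):
    """Hypothetical helper: collect node values back into a Python list."""
    out = []
    while head is not None:
        out.append(head.data)
        head = head.next
    return out

def remove_duplicates(head):
    """Single pass with a set, as described in the TL;DR."""
    visited = set()
    iterator, previous = head, None
    while iterator is not None:
        if iterator.data in visited:
            previous.next = iterator.next  # unlink the duplicate
        else:
            visited.add(iterator.data)
            previous = iterator
        iterator = iterator.next
    return head

print(to_list(remove_duplicates(from_list([1, 2, 3, 1, 2]))))  # [1, 2, 3]
```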
Edge Cases
- Empty list: Return null immediately (the loop never executes).
- Single node: One value gets added to the set, no duplicates possible. Return head.
- All duplicates: Every node after the first is removed. The list shrinks to one node.
- No duplicates: Every value is unique. The list remains unchanged.
Solution Approach
The key insight is that we need to remember which values we have already encountered. A HashSet answers the question "have I seen this before?" in O(1) time. We walk the list once, and at each node, we either add its value to the set (first occurrence) or unlink it from the chain (duplicate).
We maintain two pointers as we traverse:
- iterator: The node we are currently examining
- previous: The last non-duplicate node we passed
When we find a duplicate, we bypass it by pointing previous.next to iterator.next, effectively removing the duplicate from the chain without physically deleting it.
Let's trace through the algorithm on [1, 2, 3, 1, 2]:
- Step 0: iterator at the first 1, set is empty. Add 1; visited = {1}, previous moves to 1.
- Step 1: iterator at 2, not in the set. Add 2; visited = {1, 2}, previous moves to 2.
- Step 2: iterator at 3, not in the set. Add 3; visited = {1, 2, 3}, previous moves to 3.
- Step 3: iterator at the second 1, already in the set. Bypass it: previous.next = iterator.next. previous stays at 3.
- Step 4: iterator at the second 2, already in the set. Bypass it again. The list is now 1 -> 2 -> 3.
Notice how when we encounter the second 1 at step 3, we already have {1, 2, 3} in our visited set. We skip the duplicate by setting previous.next = iterator.next. The same happens with the second 2 at step 4.
All Duplicates Case
When every node has the same value, only the first survives:
For [1, 1, 1, 1, 1], the first 1 goes into the set, and each of the remaining four nodes is bypassed with previous.next = iterator.next. previous never advances past the head, and the list shrinks to [1].
No Duplicates Case
When all values are unique, the algorithm simply adds each one to the set without removing anything:
Implementation
Prefer a different language? Jump to solutions in other languages.
import java.util.HashSet;
import java.util.Set;

public class Solution {
    public ListNode removeDuplicates(ListNode head) {
        Set<Integer> visited = new HashSet<>();
        ListNode iterator = head, previous = null;
        while (iterator != null) {
            if (visited.contains(iterator.data)) {
                previous.next = iterator.next;
            } else {
                visited.add(iterator.data);
                previous = iterator;
            }
            iterator = iterator.next;
        }
        return head;
    }
}
The logic breaks down into three parts:
- Initialize: Create a HashSet to track seen values, and set up two pointers. iterator starts at the head, previous starts as null (nothing comes before the first node).
- Traverse and decide: For each node, check if its value is in the set. If yes, bypass it by pointing previous.next to iterator.next. If no, add the value to the set and update previous to point to this node.
- Return: After the loop finishes, head still points to the first node (which is never a duplicate of itself), so return it directly.
A subtle but important detail: when we remove a duplicate, previous stays where it is. Only iterator advances. This ensures previous always points to the last kept node, ready to bypass the next duplicate if one follows immediately.
Complexity Analysis
Time: O(n). We visit each node exactly once. Each HashSet operation (contains, add) runs in O(1) average time. Total work is proportional to the number of nodes.
Space: O(n). In the worst case (all unique values), the HashSet stores every value in the list. This is the tradeoff for achieving linear time on an unsorted list. Without the set, you would need nested loops running in O(n^2) time.
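For contrast, here is a sketch of that no-set alternative: for each kept node, scan the rest of the list and unlink every later node with the same value. It uses O(1) extra space but O(n^2) time. The ListNode class and the function name remove_duplicates_quadratic are illustrative stand-ins, not part of the original article.

```python
class ListNode:
    """Minimal stand-in for the article's node type."""
    def __init__(self, data):
        self.data = data
        self.next = None

def remove_duplicates_quadratic(head):
    """O(n^2) time, O(1) space: for each node, unlink later nodes with the same value."""
    current = head
    while current is not None:
        runner = current
        # Scan the remainder of the list, bypassing any node that repeats current's value.
        while runner.next is not None:
            if runner.next.data == current.data:
                runner.next = runner.next.next
            else:
                runner = runner.next
        current = current.next
    return head
```

Because every node has a predecessor scan anchored at runner, this variant never needs the visited set, but each node may be compared against every node after it.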
Common Pitfalls
- Updating previous on every iteration: If you always set previous = iterator, you lose the ability to remove consecutive duplicates. Only advance previous when the current node is not a duplicate.
- Worrying about previous being null: On the first iteration, previous is null. But the first node can never be a duplicate (the set is empty), so we always enter the else branch first and set previous = iterator. No null pointer risk.
- Confusing this with the sorted-list variant: The sorted version does not need a set because duplicates are adjacent. If you reach for the no-set approach on an unsorted list, you will miss non-adjacent duplicates entirely.
- Returning previous or iterator instead of head: The first node never changes. Always return the original head.
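The sorted-list variant mentioned in the pitfalls can be sketched without a set, since duplicates sit next to each other. The ListNode class and the name remove_duplicates_sorted are illustrative assumptions matching the data/next node shape used throughout this article.

```python
class ListNode:
    """Minimal stand-in for the article's node type."""
    def __init__(self, data):
        self.data = data
        self.next = None

def remove_duplicates_sorted(head):
    """Sorted-list variant: duplicates are adjacent, so compare each node to its successor."""
    current = head
    while current is not None and current.next is not None:
        if current.next.data == current.data:
            current.next = current.next.next  # skip the adjacent duplicate
        else:
            current = current.next
    return head
```

Running this on the unsorted [1, 2, 3, 1, 2] leaves the list unchanged, because the repeated 1 and 2 are never adjacent: exactly the failure mode the sorted-variant pitfall warns about.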
Interview Tips
When presenting this solution:
- Start by clarifying whether the list is sorted or unsorted. This determines whether you need the HashSet.
- Explain the two-pointer pattern: "previous tracks the last kept node so I can unlink duplicates."
- Mention the time-space tradeoff: O(n) time with O(n) space using a set, versus O(n^2) time with O(1) space using nested loops.
- Draw the pointer manipulation for a removal step on the whiteboard. Showing previous.next = iterator.next makes the logic immediately clear.
Key Takeaways
- A HashSet provides O(1) lookups, making single-pass deduplication possible in O(n) time. Without it, you would need O(n^2) nested loops.
- The two-pointer pattern (iterator + previous) is essential for singly linked list deletions. You cannot remove a node without access to the node before it.
- This algorithm works on both sorted and unsorted lists. The sorted-only variant is simpler (no set needed) but less general.
- Space complexity is O(n) in the worst case because the set may store every value. This is a common and acceptable tradeoff in interview settings.
- Edge cases (empty list, single node, all duplicates) are handled naturally by the loop without special-case code.
Practice and Related Problems
Once you have this pattern down, try these natural progressions:
- Reverse a linked list (another core pointer-manipulation problem)
- Detect a cycle in a linked list (uses the fast/slow pointer variant)
- Merge two sorted linked lists (combines traversal with comparison logic)
This problem and hundreds of others are available on Firecode, where spaced repetition ensures you internalize patterns like set-based deduplication rather than just memorizing solutions. Building strong linked list intuition here pays dividends across your entire interview preparation.
Solutions in Other Languages
Python
class Solution:
    def remove_duplicates(self, head):
        visited = set()
        iterator, previous = head, None
        while iterator is not None:
            if iterator.data in visited:
                previous.next = iterator.next
            else:
                visited.add(iterator.data)
                previous = iterator
            iterator = iterator.next
        return head
JavaScript
class Solution {
  removeDuplicates(head) {
    const visited = new Set();
    let iterator = head, previous = null;
    while (iterator !== null) {
      if (visited.has(iterator.data)) {
        previous.next = iterator.next;
      } else {
        visited.add(iterator.data);
        previous = iterator;
      }
      iterator = iterator.next;
    }
    return head;
  }
}
TypeScript
class Solution {
  removeDuplicates(head: ListNode | null): ListNode | null {
    const visited = new Set<number>();
    let iterator = head, previous: ListNode | null = null;
    while (iterator !== null) {
      if (visited.has(iterator.data)) {
        previous!.next = iterator.next;
      } else {
        visited.add(iterator.data);
        previous = iterator;
      }
      iterator = iterator.next;
    }
    return head;
  }
}
C++
#include <unordered_set>

class Solution {
public:
    ListNode *removeDuplicates(ListNode *head) {
        std::unordered_set<int> visited;
        ListNode *iterator = head, *previous = nullptr;
        while (iterator != nullptr) {
            if (visited.find(iterator->data) != visited.end()) {
                previous->next = iterator->next;
            } else {
                visited.insert(iterator->data);
                previous = iterator;
            }
            iterator = iterator->next;
        }
        return head;
    }
};
Go
package solution

func (s *Solution) RemoveDuplicates(head *ListNode) *ListNode {
	visited := make(map[int]bool)
	iterator, previous := head, (*ListNode)(nil)
	for iterator != nil {
		if visited[iterator.Data] {
			previous.Next = iterator.Next
		} else {
			visited[iterator.Data] = true
			previous = iterator
		}
		iterator = iterator.Next
	}
	return head
}
Scala
import scala.collection.mutable

class Solution {
  def removeDuplicates(head: ListNode): ListNode = {
    val visited = mutable.Set[Int]()
    var iterator = head
    var previous: ListNode = null
    while (iterator != null) {
      if (visited.contains(iterator.data)) {
        previous.next = iterator.next
      } else {
        visited.add(iterator.data)
        previous = iterator
      }
      iterator = iterator.next
    }
    head
  }
}
Kotlin
import java.util.HashSet

class Solution {
    fun removeDuplicates(head: ListNode?): ListNode? {
        val visited = HashSet<Int>()
        var iterator = head
        var previous: ListNode? = null
        while (iterator != null) {
            if (visited.contains(iterator.data)) {
                previous?.next = iterator.next
            } else {
                visited.add(iterator.data)
                previous = iterator
            }
            iterator = iterator.next
        }
        return head
    }
}
Swift
import Foundation

class Solution {
    func removeDuplicates(_ head: ListNode?) -> ListNode? {
        var visited = Set<Int>()
        var iterator = head
        var previous: ListNode? = nil
        while iterator != nil {
            if visited.contains(iterator!.data) {
                previous?.next = iterator!.next
            } else {
                visited.insert(iterator!.data)
                previous = iterator
            }
            iterator = iterator!.next
        }
        return head
    }
}
Rust
use std::collections::HashSet;

impl Solution {
    pub fn remove_duplicates(&self, head: Option<Box<ListNode>>) -> Option<Box<ListNode>> {
        let mut visited = HashSet::new();
        // A dummy node in front of the head means every real node has a
        // predecessor, so removal always operates on current.next.
        let mut dummy = Box::new(ListNode::new(0));
        dummy.next = head;
        let mut current = &mut dummy;
        while current.next.is_some() {
            if visited.contains(&current.next.as_ref().unwrap().data) {
                let removed = current.next.take().unwrap();
                current.next = removed.next;
            } else {
                visited.insert(current.next.as_ref().unwrap().data);
                current = current.next.as_mut().unwrap();
            }
        }
        dummy.next
    }
}
C#
using System.Collections.Generic;

public class Solution {
    public ListNode? removeDuplicates(ListNode? head) {
        var visited = new HashSet<int>();
        ListNode? iterator = head, previous = null;
        while (iterator != null) {
            if (visited.Contains(iterator.data)) {
                previous!.next = iterator.next;
            } else {
                visited.Add(iterator.data);
                previous = iterator;
            }
            iterator = iterator.next;
        }
        return head;
    }
}
Dart
class Solution {
  ListNode? removeDuplicates(ListNode? head) {
    Set<int> visited = {};
    ListNode? iterator = head;
    ListNode? previous;
    while (iterator != null) {
      if (visited.contains(iterator.data)) {
        previous!.next = iterator.next;
      } else {
        visited.add(iterator.data);
        previous = iterator;
      }
      iterator = iterator.next;
    }
    return head;
  }
}
PHP
class Solution {
    public function removeDuplicates(?ListNode $head): ?ListNode {
        $visited = [];
        $iterator = $head;
        $previous = null;
        while ($iterator !== null) {
            if (isset($visited[$iterator->data])) {
                $previous->next = $iterator->next;
            } else {
                $visited[$iterator->data] = true;
                $previous = $iterator;
            }
            $iterator = $iterator->next;
        }
        return $head;
    }
}
Ruby
require 'set'

class Solution
  def remove_duplicates(head)
    visited = Set.new
    iterator, previous = head, nil
    while iterator
      if visited.include?(iterator.data)
        previous.next = iterator.next
      else
        visited.add(iterator.data)
        previous = iterator
      end
      iterator = iterator.next
    end
    head
  end
end