Remove duplicate nodes from a linked list

Ackshaey Singh
Difficulty: Medium
Time Complexity: O(n)
Space Complexity: O(n)
Topics: Linked List, Set
Companies: Oracle, IBM

"Remove duplicate nodes from a linked list." This problem, also known as Remove Duplicates from Sorted List in its simpler sorted variant, is one of the cleanest introductions to combining linked list traversal with set-based lookups. It strips the linked list manipulation pattern down to its essence: walk the list, remember what you have seen, and unlink anything you have seen before.

TL;DR

Traverse the linked list with two pointers: iterator (current node) and previous (the node before it). Maintain a HashSet of values you have already seen. If the current value is in the set, remove the node by pointing previous.next to iterator.next. Otherwise, add the value to the set and advance previous. Return the original head.

Why This Problem Matters

Linked list pointer manipulation is a staple of coding interviews because it tests a skill that does not come naturally to most developers: reasoning about mutable references in a sequential data structure. Adding a HashSet on top of that tests whether you can combine two data structures to solve a problem efficiently. This exact pattern, using a set to deduplicate while traversing a sequence, appears in dozens of other problems: removing duplicates from arrays, finding unique elements in streams, and detecting first non-repeating characters.

Understanding the Problem

Given the head ListNode of a singly linked list, remove all subsequent duplicate nodes, keeping only the first occurrence of each value. Return the head of the modified list.

removeDuplicates([1, 2, 3, 1, 2]) -> [1, 2, 3]
removeDuplicates([1, 1, 1, 1, 1]) -> [1]
removeDuplicates([1, 2, 3])       -> [1, 2, 3]
removeDuplicates([])              -> []
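
The node type itself is provided by the problem; a minimal sketch of what these examples assume looks like the class below. The `fromArray` and `toDisplay` helpers are illustrative additions for experimenting locally, not part of the problem's API.

```java
// Minimal ListNode sketch; the platform-provided class may differ in detail.
class ListNode {
  int data;
  ListNode next;

  ListNode(int data) { this.data = data; }

  // Illustrative helper: build a list from an array of values.
  static ListNode fromArray(int[] values) {
    ListNode dummy = new ListNode(0), tail = dummy;
    for (int v : values) {
      tail.next = new ListNode(v);
      tail = tail.next;
    }
    return dummy.next;
  }

  // Illustrative helper: render a list as "[a, b, c]" for quick checks.
  static String toDisplay(ListNode head) {
    StringBuilder sb = new StringBuilder("[");
    for (ListNode n = head; n != null; n = n.next) {
      if (sb.length() > 1) sb.append(", ");
      sb.append(n.data);
    }
    return sb.append("]").toString();
  }
}
```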

Edge Cases

  1. Empty list: Return null immediately (the loop never executes).
  2. Single node: The one value is added to the set; no duplicate is possible. Return head.
  3. All duplicates: Every node after the first is removed. The list shrinks to one node.
  4. No duplicates: Every value is unique. The list remains unchanged.

Solution Approach

The key insight is that we need to remember which values we have already encountered. A HashSet answers the question "have I seen this before?" in O(1) time. We walk the list once, and at each node, we either add its value to the set (first occurrence) or unlink it from the chain (duplicate).

We maintain two pointers as we traverse:

  • iterator: The node we are currently examining
  • previous: The last non-duplicate node we passed

When we find a duplicate, we bypass it by pointing previous.next to iterator.next, effectively removing the duplicate from the chain without physically deleting it.

Let's trace through the algorithm on [1, 2, 3, 1, 2]:

  Step 0: value 1 not in set, keep it      set = {1}
  Step 1: value 2 not in set, keep it      set = {1, 2}
  Step 2: value 3 not in set, keep it      set = {1, 2, 3}
  Step 3: value 1 already seen, unlink it  set = {1, 2, 3}
  Step 4: value 2 already seen, unlink it  set = {1, 2, 3}

Notice how when we encounter the second 1 at step 3, we already have {1, 2, 3} in our visited set. We skip the duplicate by setting previous.next = iterator.next. The same happens with the second 2 at step 4.

All Duplicates Case

When every node has the same value, only the first survives: [1, 1, 1, 1, 1] shrinks to [1]. The first 1 enters the set, and every later 1 is unlinked while previous stays parked on the head.

No Duplicates Case

When all values are unique, the algorithm simply adds each value to the set without removing anything: [1, 2, 3] is returned unchanged.

Implementation

Prefer a different language? Jump to solutions in other languages.

import java.util.HashSet;
import java.util.Set;

public class Solution {
  public ListNode removeDuplicates(ListNode head) {
    Set<Integer> visited = new HashSet<>();
    ListNode iterator = head, previous = null;

    while (iterator != null) {
      if (visited.contains(iterator.data)) {
        // Duplicate: bypass it. previous is never null here, because the
        // first node always takes the else branch below.
        previous.next = iterator.next;
      } else {
        // First occurrence: remember the value and advance previous.
        visited.add(iterator.data);
        previous = iterator;
      }
      iterator = iterator.next;
    }

    return head;
  }
}

The logic breaks down into three parts:

  1. Initialize: Create a HashSet to track seen values, and set up two pointers. iterator starts at the head, previous starts as null (nothing comes before the first node).

  2. Traverse and decide: For each node, check if its value is in the set. If yes, bypass it by pointing previous.next to iterator.next. If no, add the value to the set and update previous to point to this node.

  3. Return: After the loop finishes, head still points to the first node (which is never a duplicate of itself), so return it directly.

A subtle but important detail: when we remove a duplicate, previous stays where it is. Only iterator advances. This ensures previous always points to the last kept node, ready to bypass the next duplicate if one follows immediately.

Complexity Analysis

Time: O(n). We visit each node exactly once. Each HashSet operation (contains, add) runs in O(1) average time. Total work is proportional to the number of nodes.

Space: O(n). In the worst case (all unique values), the HashSet stores every value in the list. This is the tradeoff for achieving linear time on an unsorted list. Without the set, you would need nested loops at O(n^2) time.
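
That nested-loop alternative can be sketched as follows. `NestedLoopDedup` and its inner `Node` class are illustrative stand-ins (assuming the same data/next shape as ListNode), kept self-contained for clarity:

```java
// O(n^2) time, O(1) extra space alternative for unsorted lists:
// for each kept node, scan the remainder and unlink matching values.
class NestedLoopDedup {
  // Local stand-in for ListNode, so the sketch is self-contained.
  static class Node {
    int data;
    Node next;
    Node(int data) { this.data = data; }
  }

  static Node removeDuplicates(Node head) {
    for (Node kept = head; kept != null; kept = kept.next) {
      Node previous = kept;
      for (Node runner = kept.next; runner != null; runner = runner.next) {
        if (runner.data == kept.data) {
          previous.next = runner.next; // bypass the duplicate
        } else {
          previous = runner; // distinct value: advance the trailing pointer
        }
      }
    }
    return head;
  }
}
```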

Common Pitfalls

  1. Updating previous on every iteration: If you always set previous = iterator, you lose the ability to remove consecutive duplicates. Only advance previous when the current node is not a duplicate.

  2. Worrying about previous being null: On the first iteration, previous is null. But the first node can never be a duplicate (the set is empty), so we always enter the else branch first and set previous = iterator. No null pointer risk.

  3. Confusing this with the sorted-list variant: The sorted version does not need a set because duplicates are adjacent. If you reach for the no-set approach on an unsorted list, you will miss non-adjacent duplicates entirely.

  4. Returning previous or iterator instead of head: The first node never changes. Always return the original head.
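
Pitfall 3 contrasts with the sorted-list variant, where duplicates are always adjacent, so comparing each node with its successor suffices and no set is needed. A minimal sketch (again with an illustrative stand-in `Node` class):

```java
// Sorted-list variant: duplicates sit next to each other, so one pass
// comparing each node to its successor runs in O(n) time and O(1) space.
class SortedDedup {
  static class Node {
    int data;
    Node next;
    Node(int data) { this.data = data; }
  }

  static Node removeDuplicates(Node head) {
    Node current = head;
    while (current != null && current.next != null) {
      if (current.next.data == current.data) {
        current.next = current.next.next; // skip the adjacent duplicate
      } else {
        current = current.next; // distinct value, move on
      }
    }
    return head;
  }
}
```

Note that this version stays put after a removal, just as the main algorithm keeps previous in place, so runs of three or more equal values collapse correctly.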

Interview Tips

When presenting this solution:

  • Start by clarifying whether the list is sorted or unsorted. This determines whether you need the HashSet.
  • Explain the two-pointer pattern: "previous tracks the last kept node so I can unlink duplicates."
  • Mention the time-space tradeoff: O(n) time with O(n) space using a set, versus O(n^2) time with O(1) space using nested loops.
  • Draw the pointer manipulation for a removal step on the whiteboard. Showing previous.next = iterator.next makes the logic immediately clear.

Key Takeaways

  • A HashSet provides O(1) lookups, making single-pass deduplication possible in O(n) time. Without it, you would need O(n^2) nested loops.
  • The two-pointer pattern (iterator + previous) is essential for singly linked list deletions. You cannot remove a node without access to the node before it.
  • This algorithm works on both sorted and unsorted lists. The sorted-only variant is simpler (no set needed) but less general.
  • Space complexity is O(n) in the worst case because the set may store every value. This is a common and acceptable tradeoff in interview settings.
  • Edge cases (empty list, single node, all duplicates) are handled naturally by the loop without special-case code.

Practice and Related Problems

Once you have this pattern down, try these natural progressions:

  • Reverse a linked list (another core pointer-manipulation problem)
  • Detect a cycle in a linked list (uses the fast/slow pointer variant)
  • Merge two sorted linked lists (combines traversal with comparison logic)

This problem and hundreds of others are available on Firecode, where spaced repetition ensures you internalize patterns like set-based deduplication rather than just memorizing solutions. Building strong linked list intuition here pays dividends across your entire interview preparation.

Solutions in Other Languages

Python

class Solution:
    def remove_duplicates(self, head):
        visited = set()
        iterator, previous = head, None

        while iterator is not None:
            if iterator.data in visited:
                previous.next = iterator.next
            else:
                visited.add(iterator.data)
                previous = iterator
            iterator = iterator.next

        return head

JavaScript

class Solution {
  removeDuplicates(head) {
    const visited = new Set();
    let iterator = head, previous = null;

    while (iterator !== null) {
      if (visited.has(iterator.data)) {
        previous.next = iterator.next;
      } else {
        visited.add(iterator.data);
        previous = iterator;
      }
      iterator = iterator.next;
    }

    return head;
  }
}

TypeScript

class Solution {
  removeDuplicates(head: ListNode | null): ListNode | null {
    const visited = new Set<number>();
    let iterator = head, previous: ListNode | null = null;

    while (iterator !== null) {
      if (visited.has(iterator.data)) {
        previous!.next = iterator.next;
      } else {
        visited.add(iterator.data);
        previous = iterator;
      }
      iterator = iterator.next;
    }

    return head;
  }
}

C++

#include <unordered_set>

class Solution {
public:
  ListNode *removeDuplicates(ListNode *head) {
    std::unordered_set<int> visited;
    ListNode *iterator = head, *previous = nullptr;

    while (iterator != nullptr) {
      if (visited.find(iterator->data) != visited.end()) {
        previous->next = iterator->next;
      } else {
        visited.insert(iterator->data);
        previous = iterator;
      }
      iterator = iterator->next;
    }

    return head;
  }
};

Go

package solution

func (s *Solution) RemoveDuplicates(head *ListNode) *ListNode {
    visited := make(map[int]bool)
    iterator, previous := head, (*ListNode)(nil)

    for iterator != nil {
        if visited[iterator.Data] {
            previous.Next = iterator.Next
        } else {
            visited[iterator.Data] = true
            previous = iterator
        }
        iterator = iterator.Next
    }

    return head
}

Scala

import scala.collection.mutable

class Solution {
  def removeDuplicates(head: ListNode): ListNode = {
    val visited = mutable.Set[Int]()
    var iterator = head
    var previous: ListNode = null

    while (iterator != null) {
      if (visited.contains(iterator.data)) {
        previous.next = iterator.next
      } else {
        visited.add(iterator.data)
        previous = iterator
      }
      iterator = iterator.next
    }

    head
  }
}

Kotlin

import java.util.HashSet

class Solution {
    fun removeDuplicates(head: ListNode?): ListNode? {
        val visited = HashSet<Int>()
        var iterator = head
        var previous: ListNode? = null

        while (iterator != null) {
            if (visited.contains(iterator.data)) {
                previous?.next = iterator.next
            } else {
                visited.add(iterator.data)
                previous = iterator
            }
            iterator = iterator.next
        }

        return head
    }
}

Swift

import Foundation

class Solution {
    func removeDuplicates(_ head: ListNode?) -> ListNode? {
        var visited = Set<Int>()
        var iterator = head
        var previous: ListNode? = nil

        while iterator != nil {
            if visited.contains(iterator!.data) {
                previous?.next = iterator!.next
            } else {
                visited.insert(iterator!.data)
                previous = iterator
            }
            iterator = iterator!.next
        }

        return head
    }
}

Rust

use std::collections::HashSet;

impl Solution {
    pub fn remove_duplicates(&self, head: Option<Box<ListNode>>) -> Option<Box<ListNode>> {
        let mut visited = HashSet::new();
        let mut dummy = Box::new(ListNode::new(0));
        dummy.next = head;
        let mut current = &mut dummy;

        while current.next.is_some() {
            if visited.contains(&current.next.as_ref().unwrap().data) {
                let removed = current.next.take().unwrap();
                current.next = removed.next;
            } else {
                visited.insert(current.next.as_ref().unwrap().data);
                current = current.next.as_mut().unwrap();
            }
        }

        dummy.next
    }
}

C#

using System.Collections.Generic;

public class Solution {
    public ListNode? removeDuplicates(ListNode? head) {
        var visited = new HashSet<int>();
        ListNode? iterator = head, previous = null;

        while (iterator != null) {
            if (visited.Contains(iterator.data)) {
                previous!.next = iterator.next;
            } else {
                visited.Add(iterator.data);
                previous = iterator;
            }
            iterator = iterator.next;
        }

        return head;
    }
}

Dart

class Solution {
  ListNode? removeDuplicates(ListNode? head) {
    Set<int> visited = {};
    ListNode? iterator = head;
    ListNode? previous = null;

    while (iterator != null) {
      if (visited.contains(iterator.data)) {
        previous!.next = iterator.next;
      } else {
        visited.add(iterator.data);
        previous = iterator;
      }
      iterator = iterator.next;
    }

    return head;
  }
}

PHP

class Solution {
    public function removeDuplicates(?ListNode $head): ?ListNode {
        $visited = [];
        $iterator = $head;
        $previous = null;

        while ($iterator !== null) {
            if (isset($visited[$iterator->data])) {
                $previous->next = $iterator->next;
            } else {
                $visited[$iterator->data] = true;
                $previous = $iterator;
            }
            $iterator = $iterator->next;
        }

        return $head;
    }
}

Ruby

require 'set'

class Solution
  def remove_duplicates(head)
    visited = Set.new
    iterator, previous = head, nil

    while iterator
      if visited.include?(iterator.data)
        previous.next = iterator.next
      else
        visited.add(iterator.data)
        previous = iterator
      end
      iterator = iterator.next
    end

    head
  end
end

Frequently Asked Questions

What data structure is best for tracking duplicates in a linked list?
A HashSet (or equivalent) is the best choice because it provides O(1) average-time lookups and insertions. As you traverse the list, you check whether each node's value is already in the set. If it is, you remove the node. If not, you add the value and move on. This gives you an O(n) overall solution.
Why do we need a 'previous' pointer in this algorithm?
In a singly linked list, you can only traverse forward. When you find a duplicate node that needs to be removed, you must update the previous node's 'next' pointer to skip over the duplicate. Without tracking the previous node, you would have no way to unlink the duplicate from the chain.
What happens when the list is empty or has only one node?
An empty list (null head) passes through the while loop without executing, so the method returns null. A single-node list adds the one value to the visited set and returns the head unchanged. Both are handled naturally by the algorithm without any special-case code.
Does this algorithm work on unsorted linked lists?
Yes. Unlike the adjacent-comparison approach that only works on sorted lists, this HashSet-based approach works regardless of node ordering. It removes all subsequent occurrences of any value, keeping only the first. The tradeoff is O(n) extra space for the set.
Can you remove duplicates from a linked list without extra space?
If the list is sorted, you can use a single pointer comparing each node to its next, achieving O(1) space. For unsorted lists, you can use nested loops (O(n^2) time, O(1) space) by comparing each node against all following nodes. The HashSet approach trades O(n) space for O(n) time.
Why is the time complexity O(n) and not O(n^2)?
The algorithm makes exactly one pass through the list, visiting each node once. At each node, it performs a HashSet lookup (O(1) average) and possibly a removal (O(1) pointer update). Since we visit n nodes and do constant work at each, the total time is O(n).
What is the space complexity and why?
The space complexity is O(n) in the worst case. If every node in the list has a unique value, the HashSet will store all n values. In the best case (all nodes identical), the set holds only one element, but we measure worst-case complexity.
How does this differ from LeetCode's Remove Duplicates from Sorted List?
LeetCode's version (problem 83) assumes the list is already sorted, which means duplicates are always adjacent. That allows a simpler O(1) space solution. This problem makes no assumption about ordering, so you need a HashSet to track seen values across non-adjacent positions in the list.