Pattern-driven string matcher with regex support

AS
Ackshaey Singh
Difficulty Expert
Time Complexity
O(m * n)
Space Complexity
O(m * n)
String Dynamic programming Recursion
Google Amazon Yahoo Microsoft

You are 40 minutes into a Google phone screen when the interviewer says, "Let's talk about pattern matching. Given a string and a pattern that supports dot and star, can you tell me if the entire string matches?" If you have seen this before, you know it is one of the hardest classic dynamic programming problems. If you have not, the interplay between . and * can feel overwhelming at first. This problem is also known as "Regular Expression Matching" on other platforms, and it sits comfortably in the top tier of DP questions asked at Google, Amazon, Yahoo, and Microsoft.

Let's break it apart piece by piece.

TL;DR

Build a 2D boolean table dp[i][j] where each cell answers: "Do the first i characters of string s match the first j characters of pattern p?" The base case is dp[0][0] = true (empty matches empty). For *, either skip the preceding character entirely (zero occurrences) or consume one more character if it matches. For ., treat it as matching any single character. Fill the table row by row and read the answer from dp[m][n]. Time and space are both O(m * n).

Why This Problem Matters

Regex matching is one of those problems that separates candidates who understand DP from those who merely memorize templates. It forces you to reason about two dimensions of state simultaneously, handle a quantifier that modifies the meaning of the character before it, and correctly initialize edge cases. If you can solve this cleanly, problems like wildcard matching, edit distance, and longest common subsequence will feel significantly more approachable.

Understanding the Problem

Given a string s and a pattern p, determine whether p matches the entirety of s. The pattern supports two special characters:

  • . matches any single character
  • * matches zero or more of the character immediately before it

A few examples to anchor your intuition:

isMatch("aa", "a")     => false   // pattern is just "a", needs exactly one 'a'
isMatch("aa", "a*")    => true    // "a*" means zero or more 'a's
isMatch("ab", ".*")    => true    // ".*" means zero or more of any character
isMatch("aab", "c*a*b") => true   // zero c's, two a's, one b

Constraints:

  • s length is between 1 and 20
  • p length is between 1 and 30
  • s contains only lowercase English letters
  • p contains only lowercase letters, ., and *
  • Every * in p is preceded by a valid character

The Tricky Part

The * does not stand alone. It always pairs with the character before it, forming a unit like a* or .*. The unit a* can expand to "", "a", "aa", "aaa", and so on. The unit .* is the most powerful, matching any sequence of any length.

This pairing is what makes naive recursion explode. When you hit a* in the pattern, you face a choice: skip the pair (zero occurrences) or consume one character and stay on the same pattern position (one or more occurrences). Without memoization, these branches multiply exponentially.

Building the DP Table

Defining the State

Let dp[i][j] be true if s[0..i-1] matches p[0..j-1]. We need an (m+1) x (n+1) table where m = s.length and n = p.length. The extra row and column represent the empty string and empty pattern.

Base Cases

Empty string, empty pattern: dp[0][0] = true.

Empty string, non-empty pattern: A pattern like "a*b*" can match an empty string because each x* unit can expand to zero occurrences. We scan the pattern: for every * at position j, set dp[0][j] = dp[0][j-2]. This "looks back" past the x* pair to check whether the pattern before it also matches the empty string.

Non-empty string, empty pattern: Always false. You cannot match characters without a pattern.

Transition Rules

For each cell dp[i][j] where both i and j are at least 1:

  1. If p[j-1] is a letter or .: Check if the current characters match. If p[j-1] == s[i-1] or p[j-1] == '.', then dp[i][j] = dp[i-1][j-1]. Otherwise, dp[i][j] = false.

  2. If p[j-1] is *: Two sub-cases:

    • Zero occurrences: Skip the x* pair entirely. dp[i][j] = dp[i][j-2].
    • One or more occurrences: The character before * must match the current string character. If p[j-2] == s[i-1] or p[j-2] == '.', then we can "consume" one character: dp[i][j] = dp[i][j] || dp[i-1][j].

The dp[i-1][j] lookup is the key insight for one-or-more matches. It says: "If the string without the current character already matched this same pattern (including the *), then adding one more matching character still works."

Watching the Table Fill

Let's trace through s = "aab", p = "c*a*b". The column headers represent each character of the pattern, and the row headers represent each character of the string. Use the step controls to walk through the computation at your own pace:

Loading visualization...

Notice how c* contributes nothing (zero c's) while a* absorbs both a characters. The final cell confirms the match.

The Star Matching Flow

Here is a compact view of how * processes the string "aa" against pattern "a*":

Loading visualization...

The * first tries zero occurrences (checking dp[i][j-2]), then checks if the preceding character matches and pulls from dp[i-1][j] to absorb additional characters.

The Dot-Star Wildcard

The combination .* is the most permissive pattern unit. Since . matches any character and * allows zero or more repetitions, .* matches any string of any length. Here is the final DP table for s = "ab", p = ".*":

Loading visualization...

Every cell along the .* column propagates true downward because . always matches the current character, and dp[i-1][j] carries the match forward.

Implementation

Prefer a different language? Jump to solutions in other languages.

public class Solution {
  public boolean isMatch(String s, String p) {
    int m = s.length();
    int n = p.length();
    boolean[][] dp = new boolean[m + 1][n + 1];
    dp[0][0] = true;

    // Initialize: patterns like a*, a*b*, a*b*c* can match empty string
    for (int j = 1; j <= n; j++) {
      if (p.charAt(j - 1) == '*') {
        dp[0][j] = dp[0][j - 2];
      }
    }

    for (int i = 1; i <= m; i++) {
      for (int j = 1; j <= n; j++) {
        if (p.charAt(j - 1) == '.' || p.charAt(j - 1) == s.charAt(i - 1)) {
          // Direct character match or dot wildcard
          dp[i][j] = dp[i - 1][j - 1];
        } else if (p.charAt(j - 1) == '*') {
          // Zero occurrences of the preceding character
          dp[i][j] = dp[i][j - 2];
          // One or more occurrences if preceding char matches
          if (p.charAt(j - 2) == '.' || p.charAt(j - 2) == s.charAt(i - 1)) {
            dp[i][j] = dp[i][j] || dp[i - 1][j];
          }
        }
      }
    }

    return dp[m][n];
  }
}

Walking Through the Code

The implementation mirrors the transition rules exactly:

  1. Lines 4-5: Create the (m+1) x (n+1) DP table, initialized to false.
  2. Lines 8-11: Handle patterns that can match an empty string. Every * at position j inherits from two positions back.
  3. Lines 14-17: If the pattern character is a literal match or ., carry forward the diagonal value dp[i-1][j-1].
  4. Lines 18-23: If the pattern character is *, first assume zero occurrences (dp[i][j-2]). Then check if one-or-more is possible by verifying the preceding pattern character matches the current string character.

Complexity Analysis

Time Complexity: O(m * n)

We fill an (m+1) x (n+1) table, and each cell requires O(1) work: a couple of character comparisons and table lookups. No cell is computed more than once.

Space Complexity: O(m * n)

The DP table stores (m+1) * (n+1) boolean values. You could optimize to O(n) space by keeping only two rows at a time, but the full table is clearer for interviews and the constraints (m ≤ 20, n ≤ 30) make optimization unnecessary.

Common Pitfalls

  1. Treating * as a standalone wildcard. It is not. * always modifies the character before it. If you process * independently, your transitions will be wrong.

  2. Forgetting the initialization loop. Patterns like "a*b*c*" match the empty string, but only if you correctly propagate dp[0][j] = dp[0][j-2] for each * position.

  3. Off-by-one indexing. The DP table is 1-indexed (row 0 and column 0 represent empty strings), but s and p are 0-indexed. Mixing these up causes subtle bugs. The pattern s.charAt(i - 1) and p.charAt(j - 1) is your friend.

  4. Missing the dp[i-1][j] transition. For * matching one or more characters, you need dp[i-1][j], not dp[i-1][j-1]. The j stays the same because the * can keep consuming characters.

Interview Tips

When you get this problem in an interview:

  1. Start with examples. Work through isMatch("aa", "a*") and isMatch("aab", "c*a*b") on paper. Show the interviewer you understand the mechanics of * before touching code.

  2. Define the DP state clearly. Write dp[i][j] = s[0..i-1] matches p[0..j-1] on the whiteboard. This anchors the entire discussion.

  3. Handle base cases first. The initialization loop for * patterns matching empty strings is easy to forget under pressure. Write it before the main nested loop.

  4. Mention the recursive alternative. If you want to show breadth, note that top-down recursion with memoization also works in O(m * n) time. The bottom-up DP approach is generally preferred because it avoids stack overflow on deep recursion.

  5. Discuss the space optimization. Mentioning that you can reduce space to O(n) with a rolling array shows engineering maturity, even if you do not implement it.

Key Takeaways

  • Build a 2D table where dp[i][j] tracks whether the first i characters of s match the first j characters of p. The answer lives at dp[m][n].
  • The * quantifier creates two branches: skip the x* pair (zero occurrences via dp[i][j-2]) or consume one character if it matches (via dp[i-1][j]).
  • Initialize dp[0][j] for every * in the pattern by propagating from dp[0][j-2], so patterns like "a*b*" correctly match the empty string.
  • The . wildcard simplifies to the same logic as a matching character, just replace the equality check with an unconditional pass.
  • Both time and space are O(m * n). Space can be reduced to O(n) with a rolling array, but the full table is clearer for whiteboard discussions.

Solutions in Other Languages

Python

class Solution:
    def is_match(self, s: str, p: str) -> bool:
        dp = [[False] * (len(p) + 1) for _ in range(len(s) + 1)]
        dp[0][0] = True

        for j in range(2, len(p) + 1):
            if p[j - 1] == '*':
                dp[0][j] = dp[0][j - 2]

        for i in range(1, len(s) + 1):
            for j in range(1, len(p) + 1):
                if p[j - 1] == '.' or p[j - 1] == s[i - 1]:
                    dp[i][j] = dp[i - 1][j - 1]
                elif p[j - 1] == '*':
                    dp[i][j] = dp[i][j - 2] or (dp[i - 1][j] and (p[j - 2] == s[i - 1] or p[j - 2] == '.'))

        return dp[len(s)][len(p)]

JavaScript

class Solution {
  isMatch(s, p) {
    const dp = Array(s.length + 1).fill(false).map(() => Array(p.length + 1).fill(false));
    dp[0][0] = true;

    for (let j = 1; j <= p.length; j++) {
      if (p[j - 1] === '*') {
        dp[0][j] = dp[0][j - 2];
      }
    }

    for (let i = 1; i <= s.length; i++) {
      for (let j = 1; j <= p.length; j++) {
        if (p[j - 1] === '.' || p[j - 1] === s[i - 1]) {
          dp[i][j] = dp[i - 1][j - 1];
        } else if (p[j - 1] === '*') {
          dp[i][j] = dp[i][j - 2] || (dp[i - 1][j] && (p[j - 2] === s[i - 1] || p[j - 2] === '.'));
        }
      }
    }

    return dp[s.length][p.length];
  }
}

TypeScript

class Solution {
  isMatch(s: string, p: string): boolean {
    const dp: boolean[][] = Array(s.length + 1).fill(false).map(() => Array(p.length + 1).fill(false));
    dp[0][0] = true;

    for (let j = 1; j <= p.length; j++) {
      if (p[j - 1] === '*') {
        dp[0][j] = dp[0][j - 2];
      }
    }

    for (let i = 1; i <= s.length; i++) {
      for (let j = 1; j <= p.length; j++) {
        if (p[j - 1] === '.' || p[j - 1] === s[i - 1]) {
          dp[i][j] = dp[i - 1][j - 1];
        } else if (p[j - 1] === '*') {
          dp[i][j] = dp[i][j - 2] || (dp[i - 1][j] && (p[j - 2] === s[i - 1] || p[j - 2] === '.'));
        }
      }
    }

    return dp[s.length][p.length];
  }
}

C++

#include <vector>
#include <string>

class Solution {
public:
  bool isMatch(std::string &s, std::string &p) {
    int m = s.size();
    int n = p.size();
    std::vector<std::vector<bool>> dp(m + 1, std::vector<bool>(n + 1, false));
    dp[0][0] = true;

    for (int j = 1; j <= n; ++j) {
      if (p[j - 1] == '*') {
        dp[0][j] = dp[0][j - 2];
      }
    }

    for (int i = 1; i <= m; ++i) {
      for (int j = 1; j <= n; ++j) {
        if (p[j - 1] == '.' || p[j - 1] == s[i - 1]) {
          dp[i][j] = dp[i - 1][j - 1];
        } else if (p[j - 1] == '*') {
          dp[i][j] = dp[i][j - 2] || ((p[j - 2] == s[i - 1] || p[j - 2] == '.') && dp[i - 1][j]);
        }
      }
    }

    return dp[m][n];
  }
};

Go

func IsMatch(s string, p string) bool {
    dp := make([][]bool, len(s)+1)
    for i := range dp {
        dp[i] = make([]bool, len(p)+1)
    }
    dp[0][0] = true

    for j := 1; j <= len(p); j++ {
        if p[j-1] == '*' {
            dp[0][j] = dp[0][j-2]
        }
    }

    for i := 1; i <= len(s); i++ {
        for j := 1; j <= len(p); j++ {
            if p[j-1] == '.' || p[j-1] == s[i-1] {
                dp[i][j] = dp[i-1][j-1]
            } else if p[j-1] == '*' {
                dp[i][j] = dp[i][j-2] || (dp[i-1][j] && (p[j-2] == s[i-1] || p[j-2] == '.'))
            }
        }
    }

    return dp[len(s)][len(p)]
}

Scala

class Solution {
  def isMatch(s: String, p: String): Boolean = {
    val m = s.length
    val n = p.length
    val dp = Array.fill(m + 1, n + 1)(false)
    dp(0)(0) = true

    for (j <- 1 to n) {
      if (p(j - 1) == '*') {
        dp(0)(j) = dp(0)(j - 2)
      }
    }

    for (i <- 1 to m) {
      for (j <- 1 to n) {
        if (p(j - 1) == '.' || p(j - 1) == s(i - 1)) {
          dp(i)(j) = dp(i - 1)(j - 1)
        } else if (p(j - 1) == '*') {
          dp(i)(j) = dp(i)(j - 2)
          if (p(j - 2) == '.' || p(j - 2) == s(i - 1)) {
            dp(i)(j) = dp(i)(j) || dp(i - 1)(j)
          }
        }
      }
    }

    dp(m)(n)
  }
}

Kotlin

class Solution {
  fun isMatch(s: String, p: String): Boolean {
    val m = s.length
    val n = p.length
    val dp = Array(m + 1) { BooleanArray(n + 1) }
    dp[0][0] = true

    for (j in 1..n) {
      if (p[j - 1] == '*') {
        dp[0][j] = dp[0][j - 2]
      }
    }

    for (i in 1..m) {
      for (j in 1..n) {
        if (p[j - 1] == '.' || p[j - 1] == s[i - 1]) {
          dp[i][j] = dp[i - 1][j - 1]
        } else if (p[j - 1] == '*') {
          dp[i][j] = dp[i][j - 2]
          if (p[j - 2] == '.' || p[j - 2] == s[i - 1]) {
            dp[i][j] = dp[i][j] || dp[i - 1][j]
          }
        }
      }
    }

    return dp[m][n]
  }
}

Ruby

class Solution
  def is_match(s, p)
    string_length = s.length
    pattern_length = p.length
    dp = Array.new(string_length + 1) { Array.new(pattern_length + 1, false) }
    dp[0][0] = true

    (1..pattern_length).each do |j|
      if p[j - 1] == '*'
        dp[0][j] = dp[0][j - 2]
      end
    end

    (1..string_length).each do |i|
      (1..pattern_length).each do |j|
        if p[j - 1] == '.' || p[j - 1] == s[i - 1]
          dp[i][j] = dp[i - 1][j - 1]
        elsif p[j - 1] == '*'
          dp[i][j] = dp[i][j - 2]
          if p[j - 2] == '.' || p[j - 2] == s[i - 1]
            dp[i][j] = dp[i][j] || dp[i - 1][j]
          end
        end
      end
    end

    dp[string_length][pattern_length]
  end
end

Rust

impl Solution {
    pub fn is_match(&self, s: String, p: String) -> bool {
        let m = s.len();
        let n = p.len();
        let s_bytes = s.as_bytes();
        let p_bytes = p.as_bytes();
        let mut dp = vec![vec![false; n + 1]; m + 1];
        dp[0][0] = true;

        for j in 1..=n {
            if p_bytes[j - 1] == b'*' {
                dp[0][j] = dp[0][j - 2];
            }
        }

        for i in 1..=m {
            for j in 1..=n {
                if p_bytes[j - 1] == b'.' || p_bytes[j - 1] == s_bytes[i - 1] {
                    dp[i][j] = dp[i - 1][j - 1];
                } else if p_bytes[j - 1] == b'*' {
                    dp[i][j] = dp[i][j - 2];
                    if p_bytes[j - 2] == b'.' || p_bytes[j - 2] == s_bytes[i - 1] {
                        dp[i][j] = dp[i][j] || dp[i - 1][j];
                    }
                }
            }
        }

        dp[m][n]
    }
}

Swift

class Solution {
    func isMatch(_ s: String, _ p: String) -> Bool {
        let sChars = Array(s)
        let pChars = Array(p)
        let m = sChars.count
        let n = pChars.count
        var dp = Array(repeating: Array(repeating: false, count: n + 1), count: m + 1)
        dp[0][0] = true

        for j in stride(from: 1, through: n, by: 1) {
            if pChars[j - 1] == Character("*") {
                dp[0][j] = dp[0][j - 2]
            }
        }

        for i in stride(from: 1, through: m, by: 1) {
            for j in stride(from: 1, through: n, by: 1) {
                if pChars[j - 1] == Character(".") || pChars[j - 1] == sChars[i - 1] {
                    dp[i][j] = dp[i - 1][j - 1]
                } else if pChars[j - 1] == Character("*") {
                    dp[i][j] = dp[i][j - 2]
                    if pChars[j - 2] == Character(".") || pChars[j - 2] == sChars[i - 1] {
                        dp[i][j] = dp[i][j] || dp[i - 1][j]
                    }
                }
            }
        }

        return dp[m][n]
    }
}

Dart

class Solution {
  bool isMatch(String s, String p) {
    int m = s.length;
    int n = p.length;
    List<List<bool>> dp = List.generate(m + 1, (_) => List.filled(n + 1, false));
    dp[0][0] = true;

    for (int j = 1; j <= n; j++) {
      if (p[j - 1] == '*') {
        dp[0][j] = dp[0][j - 2];
      }
    }

    for (int i = 1; i <= m; i++) {
      for (int j = 1; j <= n; j++) {
        if (p[j - 1] == '.' || p[j - 1] == s[i - 1]) {
          dp[i][j] = dp[i - 1][j - 1];
        } else if (p[j - 1] == '*') {
          dp[i][j] = dp[i][j - 2];
          if (p[j - 2] == '.' || p[j - 2] == s[i - 1]) {
            dp[i][j] = dp[i][j] || dp[i - 1][j];
          }
        }
      }
    }

    return dp[m][n];
  }
}

PHP

class Solution {
    public function isMatch(string $s, string $p): bool {
        $m = strlen($s);
        $n = strlen($p);
        $dp = array_fill(0, $m + 1, array_fill(0, $n + 1, false));
        $dp[0][0] = true;

        for ($j = 1; $j <= $n; $j++) {
            if ($p[$j - 1] === '*') {
                $dp[0][$j] = $dp[0][$j - 2];
            }
        }

        for ($i = 1; $i <= $m; $i++) {
            for ($j = 1; $j <= $n; $j++) {
                if ($p[$j - 1] === '.' || $p[$j - 1] === $s[$i - 1]) {
                    $dp[$i][$j] = $dp[$i - 1][$j - 1];
                } elseif ($p[$j - 1] === '*') {
                    $dp[$i][$j] = $dp[$i][$j - 2];
                    if ($p[$j - 2] === '.' || $p[$j - 2] === $s[$i - 1]) {
                        $dp[$i][$j] = $dp[$i][$j] || $dp[$i - 1][$j];
                    }
                }
            }
        }

        return $dp[$m][$n];
    }
}

C#

public class Solution {
    public bool isMatch(string s, string p) {
        int m = s.Length;
        int n = p.Length;
        bool[,] dp = new bool[m + 1, n + 1];
        dp[0, 0] = true;

        for (int j = 1; j <= n; j++) {
            if (p[j - 1] == '*') {
                dp[0, j] = dp[0, j - 2];
            }
        }

        for (int i = 1; i <= m; i++) {
            for (int j = 1; j <= n; j++) {
                if (p[j - 1] == '.' || p[j - 1] == s[i - 1]) {
                    dp[i, j] = dp[i - 1, j - 1];
                } else if (p[j - 1] == '*') {
                    dp[i, j] = dp[i, j - 2];
                    if (p[j - 2] == '.' || p[j - 2] == s[i - 1]) {
                        dp[i, j] = dp[i, j] || dp[i - 1, j];
                    }
                }
            }
        }

        return dp[m, n];
    }
}

Practice Makes Perfect

Regex matching is a challenging problem, but the DP framework it teaches applies broadly. Once you are comfortable here, try these related problems to solidify your understanding:

  • Wildcard Pattern Matching - Similar idea but * matches any sequence independently
  • Edit Distance - Another 2D DP problem with string transformations
  • Longest Common Subsequence - Foundation for many string DP problems

This problem and hundreds of others are available on Firecode, where over 50,000 engineers have prepared for technical interviews and landed roles at top tech companies. Whether you are targeting Google, Amazon, or your next startup, building fluency with 2D DP problems like this one will pay dividends across your entire interview prep.

Frequently Asked Questions

What is the time complexity of regular expression matching with DP?
The time complexity is O(m * n) where m is the length of the string and n is the length of the pattern. Each cell in the 2D DP table is computed once, and each computation takes O(1) time by looking at previously computed cells.
Why is recursion alone insufficient for regex matching?
A pure recursive approach without memoization has exponential time complexity because it recomputes the same subproblems repeatedly. For example, when a '*' can match zero or more characters, the recursion branches at each step, leading to O(2^(m+n)) time in the worst case. DP eliminates this by storing results in a table.
How does the '.' wildcard differ from '*' in pattern matching?
The '.' character matches exactly one character of any kind, acting as a single-character wildcard. The '*' character is a quantifier that modifies the preceding character, allowing it to match zero or more occurrences of that character. So 'a.' matches any two-character string starting with 'a', while 'a*' matches zero or more consecutive 'a' characters.
What does dp[i][j] represent in the regex matching DP table?
dp[i][j] represents whether the first i characters of string s match the first j characters of pattern p. dp[0][0] is true because an empty string matches an empty pattern. The final answer is dp[m][n] where m and n are the lengths of s and p respectively.
How do you handle patterns like 'a*b*c*' that can match an empty string?
During initialization, iterate through the pattern and for each '*' at position j, set dp[0][j] = dp[0][j-2]. This propagates the fact that x* can match zero occurrences. So for 'a*b*c*', dp[0][2] = dp[0][0] = true, dp[0][4] = dp[0][2] = true, and dp[0][6] = dp[0][4] = true.
What is the difference between this problem and wildcard pattern matching?
In wildcard matching, '*' matches any sequence of characters independently. In regex matching, '*' is a quantifier tied to the preceding character, matching zero or more of that specific character. Regex matching requires tracking which character '*' modifies, making the DP transitions more nuanced.
Can this problem be solved with finite automata instead of DP?
Yes. You can convert the regex pattern to a nondeterministic finite automaton (NFA) and simulate it. Thompson's construction builds the NFA in O(n) time, and simulation runs in O(m * n) time. However, the DP approach is more commonly expected in interviews because it is easier to implement and explain under time pressure.
How often does regex matching appear in coding interviews?
Regular expression matching is a classic hard problem frequently asked at Google, Amazon, Microsoft, and Yahoo. It tests understanding of 2D dynamic programming, string processing, and handling complex state transitions. It appears regularly on interview prep platforms and is considered a must-know problem for senior engineering roles.