Find the first occurrence of a substring
You're in a Google phone screen and the interviewer asks, "Given two strings, find where the first one appears inside the second." It sounds almost too simple, but this problem has a surprising amount of depth. It tests your ability to handle string indices carefully, think about edge cases, and reason about loop boundaries. This problem is also known as "Find the Index of the First Occurrence in a String" on other platforms, and it comes up frequently at companies like Google, Uber, Adobe, and Amazon.
TL;DR
Slide a window of length m (needle length) across the haystack string, comparing the substring at each position to needle. If a match is found, return the starting index. If no position matches, return -1. This runs in O(n * m) time and O(1) space, where n is the haystack length and m is the needle length. The key detail is setting the correct loop boundary: iterate only while i ≤ haystackLength - needleLength, since any position beyond that cannot contain a full match.
Why This Problem Matters
Substring search is one of the fundamental operations in computer science. Every text editor's "Find" feature, every search engine's query matcher, and every compiler's lexer relies on some form of pattern matching. While production systems use optimized algorithms like KMP or Rabin-Karp, the brute force approach taught here is the foundation they all build upon. Nail this, and you have a solid base for understanding more advanced string algorithms.
Understanding the Problem
Given two strings, needle and haystack, find the index where needle first appears in haystack. If needle is not present, return -1.
strStr("haystack", "stack") => 3
strStr("hello", "ll") => 2
strStr("aaaaa", "bba") => -1
Constraints:
- Both strings contain only lowercase English letters
- Lengths are between 1 and 10,000
Edge Cases
- Empty needle: Return
0(the empty string is found at the start of any string) - Needle longer than haystack: Return
-1immediately - Exact match:
needle == haystackreturns0 - Needle at the end:
strStr("haystack", "ack")returns5 - Partial matches before the real one:
strStr("mississippi", "issip")returns4
That last case is particularly tricky. The substring "issi" appears starting at index 1, but "issip" does not match there because the fifth character is 's', not 'p'. The real match starts at index 4.
Solution Approach
The strategy is straightforward: slide a comparison window across the haystack and check if the substring at each position matches the needle.
For each valid starting position i from 0 to haystackLength - needleLength:
- Extract the substring
haystack[i..i+needleLength] - Compare it to
needle - If they match, return
i - If no position produces a match, return
-1
Walkthrough: Finding "ll" in "hello"
The needle has length 2, so we check positions 0 through 3 (since 5 - 2 = 3).
Loading visualization...
At position 0, we compare "he" against "ll". No match. At position 1, "el" versus "ll". Still no match. At position 2, "ll" versus "ll". That matches, so we return 2.
Walkthrough: "bba" Not Found in "aaaaa"
When the needle does not exist in the haystack, we exhaust all valid positions and return -1.
Loading visualization...
Every window of length 3 in "aaaaa" contains only 'a' characters, so "bba" never matches.
Handling Partial Matches
Finding "issip" in "mississippi" is a good test of correctness. Several positions produce partial matches where the first few characters align but the full substring does not.
Loading visualization...
The algorithm does not need special logic for partial matches. It simply compares the full substring at each position. If the comparison fails, it moves on to the next starting index.
Implementation
Prefer a different language? Jump to solutions in other languages.
public class Solution {
public int strStr(String haystack, String needle) {
// An empty needle is trivially found at index 0
if (needle.isEmpty()) {
return 0;
}
int haystackLength = haystack.length();
int needleLength = needle.length();
// Check each valid starting position
for (int i = 0; i <= haystackLength - needleLength; i++) {
// Compare the substring starting at i with needle
if (haystack.substring(i, i + needleLength).equals(needle)) {
return i;
}
}
// No match found
return -1;
}
}
The loop condition i <= haystackLength - needleLength is the most important detail. If the remaining characters in haystack are fewer than needle's length, there is no point checking. This bounds the loop correctly and prevents index-out-of-bounds errors.
Why Not Use Built-in Methods?
In a real codebase, you would use haystack.indexOf(needle) in Java or haystack.find(needle) in Python. But in an interview, the whole point is to demonstrate that you understand how substring search works under the hood. Using built-in methods defeats the purpose of the question.
Complexity Analysis
Time Complexity: O(n * m)
For each of the n - m + 1 starting positions, we compare up to m characters. The worst case occurs with inputs like haystack = "aaaaaa" and needle = "aab", where every position requires comparing almost all characters of needle before the mismatch at the end.
In practice, mismatches tend to occur on the first or second character, so average performance is usually closer to O(n). But the worst case is O(n * m), and that is what you should report in an interview.
Space Complexity: O(1)
We only use a few integer variables. In Java, substring() does create a new String object of length m, but no auxiliary data structures are needed. The algorithm is an in-place scan.
Can We Do Better?
Yes. The KMP (Knuth-Morris-Pratt) algorithm preprocesses the needle into a failure function that allows it to skip redundant comparisons, bringing the time down to O(n + m). Rabin-Karp uses rolling hash functions for O(n) average-case performance. For most interviews, the brute force O(n * m) solution is expected first. Mention KMP or Rabin-Karp as follow-ups to show depth of knowledge.
Common Pitfalls
-
Off-by-one in the loop bound: Using
i < haystackLength - needleLengthinstead ofi <= haystackLength - needleLengthmisses the case where needle appears at the very end. ForstrStr("abc", "c"), the correct answer is 2, which requires checking position3 - 1 = 2. -
Forgetting the empty needle case: The expected behavior is to return
0for an empty needle. Some candidates skip this check and get incorrect results or exceptions. -
Integer overflow on length subtraction: In C++ with
size_t(unsigned), ifhaystackis shorter thanneedle, thenhaystack.size() - needle.size()wraps around to a huge positive number. Adding a guardif (haystack.size() < needle.size()) return -1;before the loop prevents this. -
Using
==instead of.equals()in Java: The==operator compares object references, not string content. Always use.equals()for string comparison.
Interview Tips
When this problem comes up in an interview:
-
Start with examples: Write out
strStr("hello", "ll")and trace through it by hand. This helps you catch boundary issues before coding. -
State the loop bound explicitly: Say "I iterate from 0 to
haystackLength - needleLengthinclusive, because any later start position can't fit the full needle." This shows you have thought about correctness. -
Handle edge cases first: Check for empty needle and needle longer than haystack before entering the loop.
-
Mention advanced algorithms: After presenting the brute force solution, briefly mention KMP or Rabin-Karp. You do not need to implement them unless asked, but knowing they exist demonstrates breadth.
-
Discuss the complexity honestly: Do not claim O(n) for brute force. Acknowledge the O(n * m) worst case and explain why average performance is typically better.
Key Takeaways
- The brute force substring search slides a window of length
macross the haystack, comparing at each position. The loop boundi≤haystackLength - needleLengthis critical for correctness. - Time complexity is O(n * m) worst case, but average performance is often closer to O(n) because mismatches tend to occur early in the comparison.
- Always handle the empty needle edge case (return 0) and guard against needle being longer than haystack (return -1).
- In C++, watch out for unsigned integer underflow when computing
haystack.size() - needle.size(). Add a length check before the loop. - For follow-up discussion, KMP achieves O(n + m) by preprocessing the pattern, and Rabin-Karp uses rolling hashes for O(n) average time. Mention these to show depth.
Related Problems and Practice
Once you are comfortable with basic substring search, try these related problems to strengthen your string matching skills:
- Longest Substring Without Repeated Characters (sliding window variant)
- Shared Prefix Finder (comparing multiple strings simultaneously)
- Pattern-Driven String Matcher (regex-style matching)
This problem and hundreds of others are available on Firecode, where over 50,000 engineers have prepared for technical interviews at top companies. Consistent practice with problems like this builds the pattern recognition that makes interview day feel routine.
Solutions in Other Languages
Python
class Solution:
def str_str(self, haystack: str, needle: str) -> int:
if not needle:
return 0
needle_length = len(needle)
haystack_length = len(haystack)
for i in range(haystack_length - needle_length + 1):
if haystack[i:i + needle_length] == needle:
return i
return -1
JavaScript
class Solution {
strStr(haystack, needle) {
if (needle === "") return 0;
const needleLength = needle.length;
const haystackLength = haystack.length;
for (let i = 0; i <= haystackLength - needleLength; i++) {
if (haystack.substring(i, i + needleLength) === needle) {
return i;
}
}
return -1;
}
}
TypeScript
class Solution {
strStr(haystack: string, needle: string): number {
if (needle === "") return 0;
const needleLength = needle.length;
const haystackLength = haystack.length;
for (let i = 0; i <= haystackLength - needleLength; i++) {
if (haystack.substring(i, i + needleLength) === needle) {
return i;
}
}
return -1;
}
}
C++
#include <string>
class Solution {
public:
int strStr(std::string haystack, std::string needle) {
if (needle.empty()) return 0;
if (haystack.size() < needle.size()) return -1;
for (size_t i = 0; i <= haystack.size() - needle.size(); ++i) {
if (haystack.substr(i, needle.size()) == needle) {
return i;
}
}
return -1;
}
};
Note the guard if (haystack.size() < needle.size()) return -1; before the loop. Since size() returns an unsigned type, subtracting a larger value from a smaller one wraps around instead of going negative.
Go
func strStr(haystack string, needle string) int {
if len(needle) == 0 {
return 0
}
if len(haystack) < len(needle) {
return -1
}
for i := 0; i <= len(haystack)-len(needle); i++ {
if haystack[i:i+len(needle)] == needle {
return i
}
}
return -1
}
Scala
class Solution {
def strStr(haystack: String, needle: String): Int = {
if (needle.isEmpty) return 0
val haystackLength = haystack.length
val needleLength = needle.length
for (i <- 0 to (haystackLength - needleLength)) {
if (haystack.substring(i, i + needleLength) == needle) {
return i
}
}
-1
}
}
Kotlin
class Solution {
fun strStr(haystack: String, needle: String): Int {
if (needle.isEmpty()) return 0
val haystackLength = haystack.length
val needleLength = needle.length
for (i in 0..(haystackLength - needleLength)) {
if (haystack.substring(i, i + needleLength) == needle) {
return i
}
}
return -1
}
}
Swift
class Solution {
func strStr(_ haystack: String, _ needle: String) -> Int {
if needle.isEmpty { return 0 }
let haystackLength = haystack.count
let needleLength = needle.count
if needleLength > haystackLength { return -1 }
for i in 0...(haystackLength - needleLength) {
let startIdx = haystack.index(haystack.startIndex, offsetBy: i)
let endIdx = haystack.index(startIdx, offsetBy: needleLength)
if String(haystack[startIdx..<endIdx]) == needle {
return i
}
}
return -1
}
}
Swift's string indexing uses String.Index rather than integers, which makes the code more verbose. The logic is identical to the other implementations.
Ruby
class Solution
def str_str(haystack, needle)
return 0 if needle.empty?
haystack_length = haystack.length
needle_length = needle.length
(0..haystack_length - needle_length).each do |i|
if haystack[i, needle_length] == needle
return i
end
end
-1
end
end
Rust
pub fn str_str(haystack: String, needle: String) -> i32 {
if needle.is_empty() {
return 0;
}
let haystack_length = haystack.len();
let needle_length = needle.len();
for i in 0..=haystack_length - needle_length {
if &haystack[i..i + needle_length] == needle.as_str() {
return i as i32;
}
}
-1
}
C#
public class Solution {
public int strStr(string haystack, string needle) {
if (needle.Length == 0) return 0;
var haystackLength = haystack.Length;
var needleLength = needle.Length;
for (int i = 0; i <= haystackLength - needleLength; i++) {
if (haystack.Substring(i, needleLength) == needle) {
return i;
}
}
return -1;
}
}
Dart
class Solution {
int strStr(String haystack, String needle) {
if (needle.isEmpty) return 0;
int haystackLength = haystack.length;
int needleLength = needle.length;
for (int i = 0; i <= haystackLength - needleLength; i++) {
if (haystack.substring(i, i + needleLength) == needle) {
return i;
}
}
return -1;
}
}
PHP
class Solution {
public function strStr(string $haystack, string $needle): int {
if (strlen($needle) === 0) return 0;
$haystackLength = strlen($haystack);
$needleLength = strlen($needle);
for ($i = 0; $i <= $haystackLength - $needleLength; $i++) {
if (substr($haystack, $i, $needleLength) === $needle) {
return $i;
}
}
return -1;
}
}