Mutate DNA pairs by inserting G between adjacent duplicates
You're in the middle of a Microsoft phone screen, and the interviewer poses what sounds like a biology question: "Given a DNA sequence, insert a G between every pair of adjacent identical nucleotides." Before you start worrying about double helixes, recognize that this is a string recursion problem wearing a lab coat. The DNA framing is just flavor. Underneath, you're walking through a string character by character and deciding whether to insert something between neighbors.
TL;DR
Process the string one character at a time using recursion. Compare the first two characters: if they match, prepend the first character plus "G" to the recursive result; if they differ, just prepend the first character. The base case handles strings of length 0 or 1. This runs in O(n) time and O(n) space.
Why This Problem Matters
This problem is a clean introduction to recursive string building. It tests whether you can identify a repeating subproblem, define a base case, and trust the recursion to handle the rest. Companies like Microsoft and Amazon use problems like this early in interview loops because the logic is approachable, but the recursive thinking transfers directly to harder problems like string permutations, palindrome partitioning, and tree traversals.
Prefer a different language? Jump to solutions in other languages.
Understanding the Problem
Given a string of DNA nucleotides (A, T, C, G), insert a "G" between every pair of adjacent identical characters.
Examples:
mutateDnaPairs("AA") -> "AGA"
mutateDnaPairs("TTTT") -> "TGTGTGT"
mutateDnaPairs("TATA") -> "TATA"
mutateDnaPairs("") -> ""
The first example has one pair of identical neighbors (A, A), so we get "AGA". The second has three consecutive T's, which means T-T, T-T, T-T, producing "TGTGTGT". The third has no adjacent duplicates, so nothing changes.
Here is what "TTTT" looks like after mutation:
Loading visualization...
Each original T is preserved, and a G is spliced in between every pair of matching neighbors.
Edge Cases
- Empty string: Return
""immediately - Single character: Return it as-is, nothing to compare
- No adjacent matches: The string passes through unchanged (e.g., "ATCG")
- All identical characters: Maximum insertions (e.g., "GGGGGGGG" becomes "GGGGGGGGGGGGGGGG")
Solution Approach
The recursive insight is straightforward: look at the first two characters, make a decision, then hand the rest of the string back to the same function.
Breaking It Down
- Base case: If the string has 0 or 1 characters, return it. There is nothing to compare.
- Recursive case: Compare
sequence[0]andsequence[1].- If they match, return
sequence[0] + "G" + recurse(sequence[1:]). - If they differ, return
sequence[0] + recurse(sequence[1:]).
- If they match, return
Each call chops one character off the front and passes the remainder forward. The recursion bottoms out when only one character (or none) remains.
Tracing the Recursion
Let's trace mutateDnaPairs("AAGG") to see this in action:
Loading visualization...
The recursion walks down the string one character at a time. At each level, it checks whether the current character matches its neighbor and decides whether to insert a G.
How the Result Builds Up
As the recursion unwinds, each level prepends its contribution to the result from below:
Loading visualization...
Starting from the base case "G", the result grows rightward as each recursive frame adds its piece. The final output is "AGAGGG".
Implementation
public class Solution {
public String mutateDnaPairs(String sequence) {
// Base case: nothing to compare
if (sequence.length() <= 1) {
return sequence;
}
char firstChar = sequence.charAt(0);
// If the first two characters match, insert G between them
return firstChar +
(firstChar == sequence.charAt(1) ? "G" : "") +
mutateDnaPairs(sequence.substring(1));
}
}
The ternary expression keeps the logic compact. When firstChar equals the next character, we concatenate "G"; otherwise we concatenate an empty string, effectively a no-op.
Why substring(1) Works
sequence.substring(1) returns everything after the first character. This guarantees each recursive call processes a strictly shorter string, so we always converge on the base case. For the input "AAGG":
- Call 1:
"AAGG"- processesA, recurses on"AGG" - Call 2:
"AGG"- processesA, recurses on"GG" - Call 3:
"GG"- processesG, recurses on"G" - Call 4:
"G"- base case, returns"G"
Complexity Analysis
Time Complexity: O(n)
Each character in the string is visited exactly once across all recursive calls. The comparison and conditional insertion are O(1) per call, giving O(n) total.
A subtle note: in Java, substring() creates a new String object, and string concatenation with + also allocates new objects. In the worst case (all characters identical), the concatenation at each level copies an increasingly long string, making the true cost closer to O(n^2). For interview purposes, O(n) is the standard answer, but mentioning this nuance shows depth.
Space Complexity: O(n)
The call stack holds one frame per character, so the recursion depth is O(n). The output string also occupies O(n) space. Combined, space usage is O(n).
Iterative Alternative
An iterative version using StringBuilder would eliminate the call stack overhead:
public String mutateDnaPairsIterative(String sequence) {
StringBuilder result = new StringBuilder();
for (int i = 0; i < sequence.length(); i++) {
result.append(sequence.charAt(i));
if (i + 1 < sequence.length() && sequence.charAt(i) == sequence.charAt(i + 1)) {
result.append('G');
}
}
return result.toString();
}
This achieves true O(n) time with O(n) space for the output only. Mention it as a follow-up to show engineering judgment.
Common Pitfalls
-
Forgetting the base case: Without
if (sequence.length() <= 1), the recursion never stops. An empty or single-character string has no pair to compare. -
Off-by-one on substring: Using
substring(2)instead ofsubstring(1)would skip characters. The recursive call always advances by exactly one character. -
Checking the wrong pair: The comparison must be between positions 0 and 1, not between positions 0 and the last character.
-
Stack overflow on long strings: For inputs approaching 1000 characters, the recursive approach uses 1000 stack frames. Java's default stack is large enough, but in languages with smaller stacks you may need the iterative approach.
Interview Tips
When you encounter this problem:
-
Clarify the alphabet: Confirm the input contains only A, T, C, G. Ask about empty strings and single characters.
-
State the approach clearly: "I'll use recursion. At each call, I compare the first two characters, optionally insert a G, and recurse on the rest of the string."
-
Draw a short trace: Walk through "AA" or "AAG" on the whiteboard. Three characters is enough to show the pattern.
-
Mention trade-offs: "The recursive version is clean but uses O(n) stack space. An iterative version with StringBuilder avoids that."
-
Watch for the follow-up: The interviewer might ask you to generalize the insertion character, handle different rules for different pairs, or process the string iteratively.
Key Takeaways
- The recursive decomposition is "handle one character, recurse on the rest." This pattern shows up in dozens of string and list problems.
- The base case is strings of length 0 or 1. Always identify the smallest input that requires no work.
- String concatenation in Java creates new objects at every level. For production code, prefer
StringBuilder; for interviews, the recursive version is fine unless the interviewer asks for optimization. - Tracing two or three calls by hand is the fastest way to verify your logic before writing code.
- The iterative version is worth mentioning as a follow-up to demonstrate awareness of stack depth and performance trade-offs.
Practice and Related Problems
Once you're comfortable with this recursive string builder pattern, try these related challenges:
- Palindrome Tester - another character-by-character string recursion
- Recursive String Permutations - builds on the same shrinking-substring idea but with branching
- Find and Replace CSV Delimiter - iterative string processing with conditional insertion
This problem and hundreds of others are available on Firecode, where over 50,000 engineers have prepared for technical interviews at Microsoft, Amazon, Google, and other top companies. Consistent practice with problems like this builds the recursive thinking muscle that pays off in harder interviews.
Solutions in Other Languages
Python
class Solution:
def mutate_dna_pairs(self, sequence: str) -> str:
if len(sequence) <= 1:
return sequence
first_char = sequence[0]
mutator = "G" if first_char == sequence[1] else ""
return first_char + mutator + self.mutate_dna_pairs(sequence[1:])
Python's string slicing with sequence[1:] is the equivalent of Java's substring(1). The conditional expression keeps the logic on one line.
JavaScript
class Solution {
mutateDnaPairs(sequence) {
if (sequence.length <= 1) {
return sequence;
}
const firstChar = sequence.charAt(0);
return firstChar +
(firstChar === sequence[1] ? "G" : "") +
this.mutateDnaPairs(sequence.substring(1));
}
}
TypeScript
class Solution {
mutateDnaPairs(sequence: string): string {
if (sequence.length <= 1) {
return sequence;
}
const firstChar = sequence.charAt(0);
return firstChar +
(firstChar === sequence[1] ? "G" : "") +
this.mutateDnaPairs(sequence.substring(1));
}
}
C++
class Solution {
public:
std::string mutateDnaPairs(std::string sequence) {
if (sequence.length() <= 1) {
return sequence;
}
std::string result;
result.push_back(sequence.at(0));
if (sequence.at(0) == sequence.at(1)) {
result.push_back('G');
}
result.append(mutateDnaPairs(sequence.substr(1)));
return result;
}
};
The C++ version builds the result with push_back and append rather than string concatenation, which is more idiomatic for performance-sensitive C++ code.
Go
func (s *Solution) MutateDnaPairs(sequence string) string {
if len(sequence) <= 1 {
return sequence
}
var builder strings.Builder
builder.WriteByte(sequence[0])
if sequence[0] == sequence[1] {
builder.WriteByte('G')
}
builder.WriteString(s.MutateDnaPairs(sequence[1:]))
return builder.String()
}
Go's strings.Builder avoids repeated string allocations. Slice syntax sequence[1:] handles the substring naturally.
Scala
class Solution {
def mutateDnaPairs(sequence: String): String = {
if (sequence.length <= 1) return sequence
val firstChar = sequence(0)
val mutator = if (firstChar == sequence(1)) "G" else ""
firstChar + mutator + mutateDnaPairs(sequence.substring(1))
}
}
Kotlin
class Solution {
fun mutateDnaPairs(sequence: String): String {
if (sequence.length <= 1) {
return sequence
}
val firstChar = sequence[0]
return firstChar +
(if (firstChar == sequence[1]) "G" else "") +
mutateDnaPairs(sequence.substring(1))
}
}
Swift
func mutateDnaPairs(_ sequence: String) -> String {
if sequence.count <= 1 {
return sequence
}
let firstChar = sequence[sequence.startIndex]
return String(firstChar) +
(firstChar == sequence[sequence.index(after: sequence.startIndex)] ? "G" : "") +
mutateDnaPairs(String(sequence.dropFirst()))
}
Swift's string indexing is more verbose than other languages since it uses String.Index rather than integer subscripts.
Rust
pub fn mutate_dna_pairs(&self, sequence: String) -> String {
if sequence.len() <= 1 {
return sequence;
}
let chars: Vec<char> = sequence.chars().collect();
let first_char = chars[0];
let rest = self.mutate_dna_pairs(sequence[1..].to_string());
if first_char == chars[1] {
format!("{}G{}", first_char, rest)
} else {
format!("{}{}", first_char, rest)
}
}
Rust requires explicit conversion between string slices and owned Strings. The chars().collect() pattern handles Unicode-safe character access.
Ruby
class Solution
def mutate_dna_pairs(sequence)
return sequence if sequence.length <= 1
first_char = sequence[0]
mutator = first_char == sequence[1] ? "G" : ""
first_char + mutator + mutate_dna_pairs(sequence[1..])
end
end
Ruby's range syntax sequence[1..] returns everything from index 1 onward.
Dart
String mutateDnaPairs(String sequence) {
if (sequence.length <= 1) {
return sequence;
}
String firstChar = sequence[0];
return firstChar +
(firstChar == sequence[1] ? "G" : "") +
mutateDnaPairs(sequence.substring(1));
}
C#
public string mutateDnaPairs(string sequence) {
if (sequence.Length <= 1) {
return sequence;
}
char firstChar = sequence[0];
return firstChar +
(firstChar == sequence[1] ? "G" : "") +
mutateDnaPairs(sequence.Substring(1));
}
PHP
public function mutateDnaPairs(string $sequence): string {
if (strlen($sequence) <= 1) {
return $sequence;
}
$firstChar = $sequence[0];
$mutator = ($firstChar === $sequence[1]) ? "G" : "";
return $firstChar . $mutator . $this->mutateDnaPairs(substr($sequence, 1));
}
PHP uses the dot operator for string concatenation and substr() for slicing.