The Z algorithm is the string-matching pattern for reusing previously known match intervals against the full string prefix.
Strong candidates explain the [L, R] box invariant clearly, because that is what turns repeated character comparisons into a linear scan.
Interview lens
A strong explanation should name the invariant, the safe transition, and the condition that makes this pattern preferable to brute force.
Pattern Summary Table
| Pattern | When to Use | Key Idea | Example |
|---|---|---|---|
| 04 15 Z Algorithm String Matching | prefix matches at every position are useful for search or string analysis | reuse the current prefix-match window before extending comparisons | Pattern Search with Z Array |
Problem Statement
Given a string or a pattern-plus-text combination, compute how much each suffix-prefix alignment matches without rescanning everything from scratch.
Note
Emphasize the constraints before coding. The real signal is often whether the brute-force search space, update volume, or graph model makes the naive solution impossible.
Pattern Recognition Signals
- Keywords in the problem: Z array, prefix matches, [L,R] window, linear string matching.
- Structural signal: positions inside the current match box can borrow information from the mirrored prefix region.
- Complexity signal: the optimized version avoids repeated rescans, recomputation, or state explosion that brute force would suffer.
Important
If prefix-match lengths at every position are the useful state, think Z algorithm.
Java Template
public int[] zArray(String s) {
int n = s.length();
int[] z = new int[n];
for (int i = 1, l = 0, r = 0; i < n; i++) {
if (i <= r) z[i] = Math.min(r - i + 1, z[i - l]);
while (i + z[i] < n && s.charAt(z[i]) == s.charAt(i + z[i])) z[i]++;
if (i + z[i] - 1 > r) { l = i; r = i + z[i] - 1; }
}
return z;
}
Pattern Matching with Z-Array
For finding pattern in text:
- build string
pattern + "#" + text - compute Z-array
- any index with
Z[i] == pattern.length()is a match
public List<Integer> zMatch(String text, String pat) {
String s = pat + "#" + text;
int[] z = zArray(s);
List<Integer> out = new ArrayList<>();
int m = pat.length();
for (int i = m + 1; i < s.length(); i++) {
if (z[i] == m) out.add(i - m - 1);
}
return out;
}
Dry Run (Conceptual)
text="ababcab", pat="ab"
Combined: "ab#ababcab"
Z values equal to 2 at text offsets where "ab" starts:
- index
0 - index
2 - index
5
So matches are [0,2,5].
Z-Box Invariant
Maintain [l, r] as rightmost segment where substring matches prefix.
If i is inside this box, reuse previous Z information before extending.
This reuse is why algorithm is linear O(n).
Problem 1: Pattern Matching with the Z Algorithm
Problem description:
Given text and pattern, return every index where pattern occurs inside text.
What we are solving actually: We want pattern matches at every text position, but recomputing common prefixes repeatedly is wasteful. The Z-box keeps the best known matching window and reuses it.
What we are doing actually:
- Build
pattern + "#" + textso any full match has lengthpattern.length(). - Compute the Z array on the combined string.
- Reuse the current
[l, r]matching window when possible. - Convert full-match positions back into text indices.
public List<Integer> findOccurrences(String text, String pattern) {
String combined = pattern + "#" + text;
int[] z = buildZ(combined);
List<Integer> ans = new ArrayList<>();
for (int i = pattern.length() + 1; i < combined.length(); i++) {
if (z[i] == pattern.length()) {
ans.add(i - pattern.length() - 1); // Translate combined-string index back to the text index.
}
}
return ans;
}
private int[] buildZ(String s) {
int[] z = new int[s.length()];
for (int i = 1, l = 0, r = 0; i < s.length(); i++) {
if (i <= r) {
z[i] = Math.min(r - i + 1, z[i - l]); // Reuse the known matching window when possible.
}
while (i + z[i] < s.length() && s.charAt(z[i]) == s.charAt(i + z[i])) {
z[i]++; // Extend the match beyond the current window.
}
if (i + z[i] - 1 > r) {
l = i;
r = i + z[i] - 1; // Record the new rightmost Z-box.
}
}
return z;
}
Debug steps:
- print the Z array for
"aba#abacaba"to inspect reused windows - test a case where pattern appears multiple times with overlap
- verify the invariant that
[l, r]always stores the rightmost segment matching the prefix
Problem-Fit Checklist
- Identify whether input size or query count requires preprocessing or specialized data structures.
- Confirm problem constraints (sorted input, non-negative weights, DAG-only, immutable array, etc.).
- Validate that the pattern gives asymptotic improvement over brute-force under worst-case input.
- Define explicit success criteria: value only, index recovery, count, path reconstruction, or ordering.
Invariant and Reasoning
- Write one invariant that must stay true after every transition (loop step, recursion return, or update).
- Ensure each step makes measurable progress toward termination.
- Guard boundary states explicitly (empty input, singleton, duplicates, overflow, disconnected graph).
- Add a quick correctness check using a tiny hand-worked example before coding full solution.
Complexity and Design Notes
- Compute time complexity for both preprocessing and per-query/per-update operations.
- Track memory overhead and object allocations, not only Big-O notation.
- Prefer primitives and tight loops in hot paths to reduce GC pressure in Java.
- If multiple variants exist, choose the one with the simplest correctness proof first.
Production Perspective
- Convert algorithmic states into explicit metrics (queue size, active nodes, cache hit ratio, relaxation count).
- Add guardrails for pathological inputs to avoid latency spikes.
- Keep implementation deterministic where possible to simplify debugging and incident analysis.
- Separate pure algorithm logic from I/O and parsing so the core stays testable.
Implementation Workflow
- Implement the minimal correct template with clear invariants.
- Add edge-case tests before optimizing.
- Measure complexity-sensitive operations on realistic input sizes.
- Refactor for readability only after behavior is locked by tests.
Common Mistakes
- Choosing the pattern without proving problem fit.
- Ignoring edge cases (empty input, duplicates, overflow, disconnected state).
- Mixing multiple strategies without clear invariants.
- No complexity analysis against worst-case input.
Practice Set (Recommended Order)
- Pattern matching tasks via Z-array (CP)
- Longest Happy Prefix (LC 1392)
LeetCode
Key Takeaways
- This pattern is most effective when transitions are explicit and invariants are enforced at every step.
- Strong preconditions and boundary handling make these implementations production-safe.
- Reuse this template and adapt it per problem constraints.
Categories
Tags