Part 1 made schema safety explicit. Part 2 is where we stop trusting memory and review culture alone. Compatibility policy has to move into CI, because once delivery pressure rises, any rule that depends entirely on human attention will eventually be bypassed.
This part is not about teaching people what backward compatibility means. It is about enforcing the policy before the change reaches a merge button.
What CI Is Actually Protecting
The point of schema checks in CI is not bureaucracy. It is to move detection earlier than runtime and earlier than human fatigue.
A useful gate can answer:
- Does this change violate the declared compatibility mode?
- Do the subject names in the PR match real production subjects?
- Is there a migration note when the syntax passes but the semantics are risky?
```mermaid
flowchart LR
    A[Schema change in PR] --> B[CI compatibility gate]
    B -->|Pass| C[Merge can proceed]
    B -->|Fail| D[Engineer fixes or documents migration]
```
This is the difference between “we believe we follow schema discipline” and “the repo actually enforces it.”
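The subject-naming question is the easiest of the three to automate. Here is a minimal sketch, assuming the pipeline can produce two lists: the subjects a PR touches and the subjects that exist in production. The function name and the sample subjects are hypothetical.

```python
def unknown_subjects(changed_subjects, production_subjects):
    """Return subjects touched by the PR that do not exist in production."""
    known = set(production_subjects)
    return sorted(s for s in changed_subjects if s not in known)


# Hypothetical data: in a real pipeline the first list would come from the
# registry's subject listing and the second from the PR diff.
production = ["orders-value", "payments-value"]
changed = ["orders-value", "payment-value"]  # note the typo in the second name

print(unknown_subjects(changed, production))  # prints ['payment-value']
```

A genuinely new subject would also show up in this list, so a real gate needs an explicit way to acknowledge intentional additions rather than failing on every unknown name.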
Why Automation Is Not Enough by Itself
A registry check can tell you whether a rule was violated syntactically. It usually cannot tell you whether the field meaning was repurposed in a way that will confuse downstream consumers.
That is why the stronger pattern is:
- automated compatibility enforcement
- required migration note for risky changes
- human review for semantic meaning
CI handles repeatable rules. People still have to evaluate meaning.
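To make the gap concrete, here is a deliberately simplified backward-compatibility check. It is not how any particular registry implements the rule; the field model (a dict of name to type, default, and doc) and the `amount` example are assumptions for illustration.

```python
def backward_compatible(old_fields, new_fields):
    """Simplified check: can a reader on the new schema decode old data?

    Each schema is a dict of field name -> {"type": ..., optional "default", "doc"}.
    """
    for name, spec in new_fields.items():
        if name not in old_fields:
            if "default" not in spec:
                return False  # new required field: old records lack it
        elif old_fields[name]["type"] != spec["type"]:
            return False  # type changed (real tools also allow some promotions)
    return True


# Same name, same type, different meaning: dollars became cents.
old = {"amount": {"type": "long", "doc": "order total in dollars"}}
new = {"amount": {"type": "long", "doc": "order total in cents"}}

print(backward_compatible(old, new))  # prints True
```

The check passes because nothing structural changed, yet every consumer that still reads `amount` as dollars is now off by a factor of 100. That is the kind of change a migration note and a human reviewer have to catch.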
What a Useful Gate Looks Like
Even if the exact vendor tooling differs, the policy is recognizable:
CI gates:
- backward compatibility
- forward compatibility when the rollout requires it
- prohibition on field renumbering or unsafe narrowing
- migration note for semantically risky changes
The key is that the build fails loudly when the contract is broken.
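The sketch below shows what failing loudly can look like for two of the rules above plus the migration-note requirement. It assumes a Protobuf-style model where each field is a name mapped to a field number and a type; the `TYPE_WIDTH` table, function names, and sample fields are hypothetical, and the compatibility check itself is left to the registry tooling.

```python
import sys

# Hypothetical widths used to decide whether a type change narrows a field.
TYPE_WIDTH = {"int32": 32, "int64": 64}


def gate_violations(old_fields, new_fields, semantically_risky, has_migration_note):
    """Collect policy violations for one schema change.

    Fields are modelled as name -> (field_number, type), a simplification
    of a Protobuf-style message.
    """
    violations = []
    for name, (old_number, old_type) in old_fields.items():
        if name not in new_fields:
            continue  # removals are left to the registry compatibility check
        new_number, new_type = new_fields[name]
        if new_number != old_number:
            violations.append(f"{name}: field renumbered {old_number} -> {new_number}")
        old_width = TYPE_WIDTH.get(old_type)
        new_width = TYPE_WIDTH.get(new_type)
        if old_width and new_width and new_width < old_width:
            violations.append(f"{name}: unsafe narrowing {old_type} -> {new_type}")
    if semantically_risky and not has_migration_note:
        violations.append("migration note required for semantically risky change")
    return violations


def report(violations):
    """Print every violation to stderr and return a process exit code."""
    for violation in violations:
        print(f"SCHEMA GATE: {violation}", file=sys.stderr)
    return 1 if violations else 0


old = {"id": (1, "int64"), "qty": (2, "int64")}
new = {"id": (1, "int64"), "qty": (3, "int32")}
exit_code = report(gate_violations(old, new, semantically_risky=True, has_migration_note=False))
print(exit_code)  # prints 1; a CI entry point would pass this to sys.exit()
```

Returning a non-zero exit code is what turns the report into a gate: the CI system marks the job as failed and the merge is blocked without anyone having to read the log first.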
The Test That Builds Confidence
Do not just run the command and call it done. Keep one intentionally incompatible change around as a repeatable proof that the gate still catches what it claims to catch.
That simple drill does two things:
- verifies the pipeline is checking the right subjects and rules
- prevents the safety net from quietly drifting into irrelevance
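One way to keep that drill repeatable is a small self-test that runs the gate against a known-good and a known-bad fixture. This is a sketch: `gate_self_test`, `sample_check`, and the fixtures are hypothetical stand-ins for whatever check the pipeline really runs.

```python
def gate_self_test(check, known_good, known_bad):
    """Fail if the gate accepts the known-bad fixture or rejects the known-good one.

    `check` is any callable taking (old_schema, new_schema) and returning True
    when the change is allowed.
    """
    problems = []
    if not check(*known_good):
        problems.append("gate rejected the known-good fixture")
    if check(*known_bad):
        problems.append("gate accepted the known-bad fixture")
    return problems


def sample_check(old_required, new_required):
    # Stand-in rule: the new schema may not add required fields.
    return set(new_required) <= set(old_required)


good = (["id"], ["id"])
bad = (["id"], ["id", "region"])

print(gate_self_test(sample_check, good, bad))             # prints []
print(gate_self_test(lambda old, new: True, good, bad))    # reports the bad fixture slipping through
```

The second call simulates a gate that has drifted into accepting everything, for example because it points at the wrong subject. The self-test is what notices.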
Local Setup
Prerequisites
- Docker Desktop
- Java 21
- Kafka CLI tools
Local Stack
```yaml
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:7.6.1
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
  kafka:
    image: confluentinc/cp-kafka:7.6.1
    depends_on: [zookeeper]
    ports: ["9092:9092"]
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_LISTENERS: PLAINTEXT://0.0.0.0:9092
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://localhost:9092
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
```

```shell
docker compose up -d
```
The Right Verification for Part 2
Use a deliberately incompatible schema change in a test branch or local CI path and confirm the build blocks it.
```shell
# run schema compatibility check in CI or locally
# exact command depends on registry tooling
```
The meaningful proof is not “the job exists.” It is “a bad change cannot slip through quietly.”
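If the registry is Confluent Schema Registry, one option is its REST compatibility endpoint, which tests a candidate schema against a registered version of a subject. The sketch below only builds the request and parses the verdict. The registry URL, the `orders-value` subject, and the sample schema are assumptions, and the Compose file above does not start a registry, so sending the request needs one running separately.

```python
import json
import urllib.parse
import urllib.request


def build_compat_request(registry_url, subject, schema):
    """Build the request that asks the registry to test a candidate schema
    against the latest registered version of `subject`."""
    url = (
        f"{registry_url.rstrip('/')}/compatibility/subjects/"
        f"{urllib.parse.quote(subject, safe='')}/versions/latest"
    )
    # The registry expects the schema itself as a JSON-encoded string.
    body = json.dumps({"schema": json.dumps(schema)}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/vnd.schemaregistry.v1+json"},
        method="POST",
    )


def is_compatible(response_body):
    """Read the registry's verdict; treat a missing field as a failure."""
    return json.loads(response_body).get("is_compatible") is True


# Assumed values: a registry at localhost:8081 and a subject named orders-value.
candidate = {"type": "record", "name": "Order",
             "fields": [{"name": "id", "type": "long"}]}
request = build_compat_request("http://localhost:8081", "orders-value", candidate)
print(request.get_method(), request.full_url)

# Sending it requires a running registry:
#   with urllib.request.urlopen(request) as resp:
#       ok = is_compatible(resp.read())
```

For the drill itself, point this at the deliberately incompatible schema and fail the build unless the verdict is `False`.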
Common Mistakes
Gating only one compatibility direction
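Checking only backward compatibility can be enough when consumers always upgrade before producers. It is incomplete when old readers still have to handle data from new writers during rollout, which is what forward compatibility covers. Choose the direction from how old and new readers and writers actually coexist.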
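One way to make that choice explicit is to derive the required directions from the rollout order instead of hard-coding a single mode. A small sketch, where the `upgrade_order` values are hypothetical labels a team would define for itself:

```python
def required_directions(upgrade_order):
    """Map a rollout order to the compatibility directions the gate should check."""
    if upgrade_order == "consumers_first":
        return {"backward"}  # new readers must decode data from old writers
    if upgrade_order == "producers_first":
        return {"forward"}   # old readers must decode data from new writers
    if upgrade_order == "mixed":
        return {"backward", "forward"}
    raise ValueError(f"unknown upgrade order: {upgrade_order}")


print(sorted(required_directions("mixed")))  # prints ['backward', 'forward']
```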
Allowing manual registry edits outside review
Once production subjects can change outside the controlled path, the CI gate stops being authoritative.
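A scheduled drift check can at least detect when that has happened. The sketch below compares the reviewed schemas in the repo with what the registry currently serves. The fingerprint is a plain hash over sorted-key JSON, not a format-specific canonical form, and the function names and sample schemas are hypothetical.

```python
import hashlib
import json


def fingerprint(schema):
    """Hash a schema after normalising key order and whitespace."""
    canonical = json.dumps(schema, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def drifted_subjects(repo_schemas, registry_schemas):
    """Return subjects whose registered schema differs from the reviewed copy."""
    return sorted(
        subject
        for subject, schema in registry_schemas.items()
        if subject not in repo_schemas
        or fingerprint(repo_schemas[subject]) != fingerprint(schema)
    )


# Hypothetical data: the registry copy gained a field nobody reviewed.
repo = {"orders-value": {"type": "record", "name": "Order",
                         "fields": [{"name": "id", "type": "long"}]}}
registry = {"orders-value": {"name": "Order", "type": "record",
                             "fields": [{"name": "id", "type": "long"},
                                        {"name": "note", "type": "string"}]}}

print(drifted_subjects(repo, registry))  # prints ['orders-value']
```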
Assuming syntactic safety equals semantic safety
A field can remain technically compatible while still changing meaning in a way that hurts consumers.
> [!important]
> Schema CI should be a merge guard, not an advisory report. If the policy matters, the build has to own the consequence.
What This Part Should Leave You With
After Part 2, the team should understand:
- why schema safety has to become an automated gate
- what automation can and cannot verify
- why migration notes still matter for semantic risk
That is how schema discipline survives real delivery pressure instead of disappearing at the first rushed release.