Outbox Plus CDC with Debezium for Reliable Event Publishing (Part 1)

The transactional outbox pattern exists because “write to the database, then publish to Kafka” is not one action. It is two actions with a crash window between them. If the application commits business state and dies before publishing, you now have data that says one thing and an event stream that says nothing.

Part 1 is about closing that gap with the simplest reliable shape: business row and outbox row in one transaction, then CDC moves the outbox row into Kafka.

The Failure Window You Are Actually Removing

Without an outbox, a typical service flow looks harmless:

1. insert or update the business record
2. commit the database transaction
3. publish the event

The problem is that step 3 happens after the commit. A process crash in that gap does not roll anything back.
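A minimal sketch of that gap, using Python's built-in sqlite3 as a stand-in for the real database and a plain list as a stand-in for Kafka. The function name and the simulated crash flag are assumptions made for illustration only.

```python
import sqlite3

published = []  # stand-in for a Kafka topic


def create_order_dual_write(conn, order_id, crash_before_publish):
    """Naive flow: commit first, publish second."""
    conn.execute(
        "insert into orders(id, amount_minor, status) values (?, ?, 'CREATED')",
        (order_id, 2500),
    )
    conn.commit()  # step 2: the order is now durable
    if crash_before_publish:
        raise RuntimeError("process died after commit")  # simulated crash
    published.append({"type": "OrderCreated", "orderId": order_id})  # step 3


conn = sqlite3.connect(":memory:")
conn.execute(
    "create table orders (id text primary key, amount_minor integer, status text)"
)

try:
    create_order_dual_write(conn, "ord-1001", crash_before_publish=True)
except RuntimeError:
    pass  # nothing rolls back: the commit already happened

order_count = conn.execute("select count(*) from orders").fetchone()[0]
print(order_count, len(published))  # one durable order, zero events
```

The order survives the crash and the event never exists, which is exactly the disagreement the outbox is meant to remove.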

flowchart LR
    A[Application transaction] --> B[Order row committed]
    B --> C[Process crashes]
    C --> D[No Kafka event emitted]
    E[Outbox pattern] --> F[Order row + outbox row committed together]
    F --> G[Debezium reads WAL/binlog]
    G --> H[Kafka event published later]

That is why this pattern is not mainly about convenience. It is about moving publish reliability out of the application process.

Where This Pattern Fits Best

Transactional outbox is strongest when:

the source of truth is a relational database
the service already commits meaningful business state there
event publication must reflect committed state, not best effort
the team wants replayable, inspectable publication behavior

It is less compelling when there is no durable database write in the path, or when the outbox becomes an excuse to keep too much event-shaping logic inside the write transaction.

A More Realistic Order-Service Example

Suppose an order service does three things:

inserts the order row
reserves some internal state such as payment intent linkage
emits OrderCreated

If publishing is done directly after the commit, a crash can leave the order durable but invisible to every downstream consumer. Inventory, analytics, notifications, and orchestration pipelines now disagree about reality.

With an outbox table, the same transaction persists both:

the business change
the fact that an event must be published

CDC then turns that recorded intent into an actual Kafka event.

Schema Baseline

Keep the outbox shape boring. Boring is good here.

create table orders (
  id varchar(64) primary key,
  amount_minor bigint not null,
  status varchar(32) not null,
  created_at timestamp not null default current_timestamp
);

create table outbox_event (
  id varchar(64) primary key,
  aggregate_type varchar(64) not null,
  aggregate_id varchar(64) not null,
  event_type varchar(128) not null,
  payload json not null,
  created_at timestamp not null default current_timestamp
);

The goal is not to recreate Kafka inside the database. The outbox should be just enough to express “this committed change needs to be published.”

Write Path in One Transaction

This is the core guarantee:

begin;

insert into orders(id, amount_minor, status)
values ('ord-1001', 2500, 'CREATED');

insert into outbox_event(id, aggregate_type, aggregate_id, event_type, payload)
values (
  'evt-1001',
  'Order',
  'ord-1001',
  'OrderCreated',
  '{"orderId":"ord-1001","amountMinor":2500}'
);

commit;

If the transaction rolls back, neither row exists. If it commits, both rows exist. That is the reliability boundary you want.
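The same guarantee can be checked from application code. The sketch below assumes SQLite as a local stand-in for the production database and a hypothetical create_order helper; it forces the second insert to fail and confirms that the order row disappears with it.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    create table orders (
      id text primary key, amount_minor integer not null, status text not null
    );
    create table outbox_event (
      id text primary key, aggregate_type text not null, aggregate_id text not null,
      event_type text not null, payload text not null
    );
    """
)


def create_order(conn, order_id, event_id, amount_minor):
    """Insert the order and its outbox row in one transaction."""
    payload = json.dumps({"orderId": order_id, "amountMinor": amount_minor})
    with conn:  # commits on success, rolls back if any statement raises
        conn.execute(
            "insert into orders(id, amount_minor, status) values (?, ?, 'CREATED')",
            (order_id, amount_minor),
        )
        conn.execute(
            "insert into outbox_event(id, aggregate_type, aggregate_id, event_type, payload)"
            " values (?, 'Order', ?, 'OrderCreated', ?)",
            (event_id, order_id, payload),
        )


def counts(conn):
    orders = conn.execute("select count(*) from orders").fetchone()[0]
    events = conn.execute("select count(*) from outbox_event").fetchone()[0]
    return orders, events


create_order(conn, "ord-1001", "evt-1001", 2500)
after_commit = counts(conn)

try:
    # Reusing evt-1001 violates the outbox primary key, so the whole transaction fails.
    create_order(conn, "ord-1002", "evt-1001", 900)
except sqlite3.IntegrityError:
    pass
after_rollback = counts(conn)  # ord-1002 was rolled back along with its outbox row
```

Both counts stay at one order and one outbox row: the failed attempt leaves no partial state behind.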

Important: the outbox pattern does not mean “publish inside the transaction.” It means “persist the instruction to publish inside the transaction.”

That distinction keeps the write path stable and lets CDC handle delivery separately.

CDC with Debezium

Once the outbox row is committed, Debezium can read the database log and emit the change into Kafka without asking the application to stay alive long enough to do it itself.

That is the real architectural benefit:

the application is responsible for recording intent
CDC is responsible for forwarding that intent reliably

Those are cleaner failure boundaries than a direct dual write.
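As a toy model of that split (not Debezium itself), the sketch below uses a Python list as the database's commit log and a hypothetical LogRelay class as the CDC process. The point it illustrates is that the relay only needs the log and its own offset; the application is not involved once the row is committed.

```python
class LogRelay:
    """Toy stand-in for a CDC connector: forwards committed log entries past its offset."""

    def __init__(self, log, topic):
        self.log = log      # committed changes, in commit order
        self.topic = topic  # stand-in for a Kafka topic
        self.offset = 0     # position of the next unread log entry

    def poll(self):
        batch = self.log[self.offset:]
        self.topic.extend(batch)
        self.offset += len(batch)
        return len(batch)


commit_log = []  # stand-in for the WAL/binlog
topic = []
relay = LogRelay(commit_log, topic)

# The application commits an outbox row and then goes away entirely.
commit_log.append(
    {"table": "outbox_event", "id": "evt-1001", "event_type": "OrderCreated"}
)

forwarded_first = relay.poll()   # the relay runs later, with no application involved
forwarded_second = relay.poll()  # nothing new in the log, nothing forwarded
```

A real connector also persists its offset durably so it can resume after its own restart, which this sketch leaves out.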

Run It Locally

Prerequisites

Docker Desktop
Java 21
Kafka CLI tools

Local Stack

services:
  zookeeper:
    image: confluentinc/cp-zookeeper:7.6.1
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181

  kafka:
    image: confluentinc/cp-kafka:7.6.1
    depends_on: [zookeeper]
    ports: ["9092:9092"]
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_LISTENERS: PLAINTEXT://0.0.0.0:9092
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://localhost:9092
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1

docker compose up -d

What to Verify

The important proof for Part 1 is not “a message appeared in Kafka once.” It is:

business row and outbox row commit together
the application can disappear after commit
the event still arrives through CDC

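If you only test the happy path with the app still running, you miss the entire reason the pattern exists.

The two commands below register the connector and then tail the outbox topic. They assume a Kafka Connect worker with the Debezium connector plugin listening on localhost:8083 and a source database it can reach, neither of which is part of the compose file above.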

curl -X POST http://localhost:8083/connectors \
  -H "Content-Type: application/json" \
  -d @connector-outbox.json

kafka-console-consumer \
  --bootstrap-server localhost:9092 \
  --topic ordersdb.public.outbox_event \
  --from-beginning
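The connector-outbox.json file posted to Kafka Connect above is not shown in this part. As a rough sketch, assuming a PostgreSQL source (which the topic name ordersdb.public.outbox_event suggests) and Debezium 2.x property names, it could be generated like this. The connector name, hostname, credentials, and database name are placeholders, not values from the article.

```python
import json

# Placeholder values for a local setup; adjust to your own environment.
connector = {
    "name": "outbox-connector",
    "config": {
        "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
        "plugin.name": "pgoutput",
        "database.hostname": "postgres",
        "database.port": "5432",
        "database.user": "postgres",
        "database.password": "postgres",
        "database.dbname": "ordersdb",
        # Debezium names topics <topic.prefix>.<schema>.<table>.
        "topic.prefix": "ordersdb",
        # Capture only the outbox table, not the business tables.
        "table.include.list": "public.outbox_event",
    },
}

body = json.dumps(connector, indent=2)
topic = connector["config"]["topic.prefix"] + "." + connector["config"]["table.include.list"]
print(body)
```

Redirecting the printed output into connector-outbox.json gives the curl command something to post. Older Debezium 1.x releases use database.server.name where 2.x uses topic.prefix.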

Operational Guidance

Keep event shaping simple in the transaction

Complex branching in the write transaction is the fastest way to make the outbox path fragile. Build the payload you need, but do not turn the transaction into an orchestration engine.

Monitor backlog, not just success

An outbox table that keeps growing is not a harmless queue. It is a sign that publication is falling behind, and the gap between committed truth and published truth is widening.
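One simple signal is the row count together with the age of the oldest row. The sketch below assumes rows are removed once they have been published, so anything still in the table counts as backlog; it uses SQLite and a hypothetical outbox_backlog helper, with timestamps stored as ISO-8601 text.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def outbox_backlog(conn, now):
    """Return (pending_rows, age_of_oldest_row_in_seconds)."""
    count, oldest = conn.execute(
        "select count(*), min(created_at) from outbox_event"
    ).fetchone()
    if oldest is None:
        return 0, 0.0
    age = (now - datetime.fromisoformat(oldest)).total_seconds()
    return count, age


conn = sqlite3.connect(":memory:")
conn.execute("create table outbox_event (id text primary key, created_at text not null)")
now = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
for i, minutes_ago in enumerate([10, 3, 1]):
    created = (now - timedelta(minutes=minutes_ago)).isoformat()
    conn.execute("insert into outbox_event values (?, ?)", (f"evt-{i}", created))

pending, oldest_age = outbox_backlog(conn, now)
print(pending, oldest_age)  # 3 rows pending, oldest is 600 seconds old
```

Alerting on the oldest age rather than the raw count tends to be more useful, because a busy service can have many rows in flight while still publishing promptly.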

Decide who owns cleanup

Many teams implement outbox and forget retention. If rows are never archived or deleted safely, the pattern slowly becomes a storage and operational burden.
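A retention job can be as small as a batched delete. The sketch below assumes that any row older than the retention window has already been captured by CDC, and it uses SQLite with a hypothetical purge_outbox helper; the seven-day window and batch size are arbitrary.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def purge_outbox(conn, now, retention, batch_size=1000):
    """Delete outbox rows older than `retention`, one small transaction per batch."""
    cutoff = (now - retention).isoformat()
    total = 0
    while True:
        with conn:  # each batch commits on its own to keep locks short
            cur = conn.execute(
                "delete from outbox_event where id in ("
                " select id from outbox_event where created_at < ? limit ?)",
                (cutoff, batch_size),
            )
        total += cur.rowcount
        if cur.rowcount < batch_size:
            return total


conn = sqlite3.connect(":memory:")
conn.execute("create table outbox_event (id text primary key, created_at text not null)")
now = datetime(2024, 1, 10, tzinfo=timezone.utc)
for i, days_old in enumerate([9, 8, 8, 2, 0]):
    conn.execute(
        "insert into outbox_event values (?, ?)",
        (f"evt-{i}", (now - timedelta(days=days_old)).isoformat()),
    )
conn.commit()

deleted = purge_outbox(conn, now, retention=timedelta(days=7), batch_size=2)
remaining = conn.execute("select count(*) from outbox_event").fetchone()[0]
```

Keep in mind that deletes also show up in the change stream, so either the connector configuration or the consumers need to ignore them.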

Common Misunderstandings

“Outbox gives exactly-once everywhere”

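No. It gives a safer bridge from committed database state to published event intent. Delivery through CDC is generally at-least-once, so the same event can arrive more than once after a connector restart or retry. Consumers still need their own correctness model, typically idempotent handling keyed on the event id.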
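One common consumer-side model is deduplication on the outbox row id. The sketch below is a minimal in-memory version with a hypothetical IdempotentHandler class; a real consumer would keep the set of seen ids in a durable store, ideally updated in the same transaction as the side effect.

```python
class IdempotentHandler:
    """Applies each event at most once, keyed by the outbox row id."""

    def __init__(self):
        self.seen = set()  # in production this would be a durable store
        self.applied = []

    def handle(self, event):
        if event["id"] in self.seen:
            return False  # duplicate delivery, skip it
        self.applied.append(event["payload"])
        self.seen.add(event["id"])
        return True


handler = IdempotentHandler()
deliveries = [
    {"id": "evt-1001", "payload": {"orderId": "ord-1001"}},
    {"id": "evt-1001", "payload": {"orderId": "ord-1001"}},  # redelivered after a restart
    {"id": "evt-1002", "payload": {"orderId": "ord-1002"}},
]
results = [handler.handle(e) for e in deliveries]
```

The redelivered event is recognized and skipped, so two distinct events produce exactly two applied changes.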

“CDC means we no longer need to think about schemas”

Also no. Badly designed payloads stay badly designed after CDC. Reliability and event quality are different concerns.

What This Part Should Leave You With

After Part 1, the team should be clear on three things:

why direct dual writes fail under crash timing
why the outbox row belongs in the same transaction as business state
why CDC is useful precisely because it decouples event delivery from application uptime

That is the right baseline before you optimize connector shape, routing, or downstream contracts.
