Part 1 is about drawing the first boundary correctly. Idempotent producers are extremely valuable, but they solve a narrower problem than many teams assume. They suppress duplicate appends caused by producer retries. They do not magically make a distributed workflow exactly-once.
That difference matters in production because the config is easy to enable, the guarantee is easy to overstate, and the resulting confusion usually shows up during incident review.
What Idempotence Actually Protects
The classic failure looks like this:
- the producer sends a record
- the broker writes it
- the acknowledgement is delayed, dropped, or times out
- the producer retries
- without idempotence, the same logical record can be appended twice
With idempotence enabled, the broker tracks a producer ID and per-partition sequence numbers, so a retried batch is recognized and does not create a duplicate append.
sequenceDiagram
participant P as Producer
participant B as Broker leader
P->>B: Record seq=41
B-->>P: Ack delayed
P->>B: Retry seq=41
Note over B: Duplicate sequence is rejected for append
B-->>P: Success
That is a meaningful guarantee, but it lives at the producer-to-broker write path. It does not say anything yet about consumer retries, external side effects, or end-to-end deduplication.
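To make the diagram concrete, here is a toy model of the broker-side check. It is a sketch under simplifying assumptions, not Kafka's real implementation: `ToyPartitionLog`, its `append` method, and the producer id `7` are hypothetical names, and the real broker keeps more bookkeeping (producer epochs, for example) than a single "last sequence" value.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of a single partition log. Hypothetical; not Kafka's broker code. */
final class ToyPartitionLog {
    private final List<String> records = new ArrayList<>();
    // Highest sequence number appended so far, per (hypothetical) producer id.
    private final Map<Long, Integer> lastSequence = new HashMap<>();
    private final boolean idempotenceEnabled;

    ToyPartitionLog(boolean idempotenceEnabled) {
        this.idempotenceEnabled = idempotenceEnabled;
    }

    /**
     * Returns true if the record was appended, false if it was recognized
     * as a retry of an already-appended sequence and skipped.
     */
    boolean append(long producerId, int sequence, String value) {
        if (idempotenceEnabled) {
            Integer last = lastSequence.get(producerId);
            if (last != null && sequence <= last) {
                return false; // duplicate: acknowledge, but do not append again
            }
            lastSequence.put(producerId, sequence);
        }
        records.add(value);
        return true;
    }

    int size() {
        return records.size();
    }

    public static void main(String[] args) {
        for (boolean enabled : new boolean[] {false, true}) {
            ToyPartitionLog log = new ToyPartitionLog(enabled);
            log.append(7L, 41, "OrderCreated"); // first attempt, ack is lost
            log.append(7L, 41, "OrderCreated"); // retry with the same sequence
            System.out.println("idempotence=" + enabled + " -> records in log: " + log.size());
        }
    }
}
```

Running `main` shows two records in the log with the check disabled and one with it enabled. Note that the model says nothing about what a consumer does with the record afterwards, which is exactly where the guarantee stops.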
What It Does Not Protect
Idempotence does not fix:
- a consumer writing the same database row twice
- an HTTP side effect being retried downstream
- a producer sending a semantically duplicate business event with a new key
- application bugs that publish the same event twice before Kafka sees it
[!warning] “We turned on enable.idempotence=true” should never be rewritten in team language as “duplicates are impossible now.”
That sentence causes a surprising number of bad architectural assumptions.
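The gaps listed above have to be closed somewhere else, usually in the consumer. One common approach is to make the handler itself idempotent by keying on a business event id. The sketch below assumes each event carries such an id; `DedupingHandler`, `eventId`, and the in-memory set are hypothetical, and a real service would more likely use a durable store (for example, a unique constraint in the database it writes to).

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical consumer-side handler that applies each event id at most once. */
final class DedupingHandler {
    // Assumption: in-memory only, so the dedup state is lost on restart.
    private final Set<String> processedEventIds = new HashSet<>();
    private int sideEffects = 0;

    /**
     * Returns true if the side effect ran, false if this event id
     * was already handled and the call was skipped.
     */
    boolean handle(String eventId, String payload) {
        if (!processedEventIds.add(eventId)) {
            return false; // already seen: skip the side effect
        }
        sideEffects++; // stands in for a database write or HTTP call
        return true;
    }

    int sideEffectCount() {
        return sideEffects;
    }
}
```

This protects against redelivery and application-level double publishes, which producer idempotence never sees. The two mechanisms cover different boundaries and do not replace each other.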
The Producer Settings That Belong Together
Idempotence works best when the producer settings are treated as one safety bundle instead of one isolated flag.
Properties props = new Properties();
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "true");
props.put(ProducerConfig.ACKS_CONFIG, "all");
props.put(ProducerConfig.RETRIES_CONFIG, Integer.toString(Integer.MAX_VALUE));
props.put(ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION, "5");
The key idea is operational, not ceremonial:
- acks=all keeps the producer from treating weak durability as success
- high retries let transient failures be retried instead of surfaced as immediate loss
- max.in.flight.requests.per.connection stays within the safe bound for ordering under retry (5 or fewer)
If different services set these inconsistently, you end up with a fleet where some producers are actually idempotent and some are only “close enough.”
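One way to catch that drift is a small check in a shared library or a startup hook. The sketch below uses plain string keys so it runs without the Kafka client on the classpath; `IdempotenceBundleCheck` and `violations` are hypothetical names. It also treats a missing setting as a violation, which is a deliberate assumption: client defaults differ between versions, so the sketch asks for the bundle to be set explicitly.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

/** Hypothetical helper that checks the idempotence settings as one bundle. */
final class IdempotenceBundleCheck {

    /** Returns a human-readable problem per violated rule; empty means the bundle is consistent. */
    static List<String> violations(Properties props) {
        List<String> problems = new ArrayList<>();

        if (!"true".equalsIgnoreCase(props.getProperty("enable.idempotence"))) {
            problems.add("enable.idempotence must be explicitly set to true");
        }

        String acks = props.getProperty("acks");
        if (!"all".equals(acks) && !"-1".equals(acks)) {
            problems.add("acks must be all (or -1), found: " + acks);
        }

        if (parseIntOr(props.getProperty("retries"), 0) <= 0) {
            problems.add("retries must be greater than 0");
        }

        int inFlight = parseIntOr(props.getProperty("max.in.flight.requests.per.connection"), -1);
        if (inFlight < 1 || inFlight > 5) {
            problems.add("max.in.flight.requests.per.connection must be between 1 and 5");
        }
        return problems;
    }

    // getProperty only returns values stored as Strings, matching the article's props.put(..., "...") style.
    private static int parseIntOr(String raw, int fallback) {
        if (raw == null) {
            return fallback;
        }
        try {
            return Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            return fallback;
        }
    }
}
```

Calling `IdempotenceBundleCheck.violations(props)` before constructing the producer, and failing fast on a non-empty result, keeps "close enough" configurations out of the fleet.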
A Better Real-World Example
Suppose an order service publishes OrderCreated after persisting an order. During a short broker hiccup:
- the broker leader accepts the record
- the ack times out
- the producer retries
Without idempotence, the topic may contain two identical logical order-created events. Downstream consumers now have to absorb duplicate work they did not ask for.
With idempotence, the broker suppresses the duplicate append. That immediately lowers pressure on downstream deduplication logic and reduces confusion during recovery.
Run It Locally
Prerequisites
- Docker Desktop
- Java 21
- Kafka CLI tools
Local Stack
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:7.6.1
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
  kafka:
    image: confluentinc/cp-kafka:7.6.1
    depends_on: [zookeeper]
    ports: ["9092:9092"]
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_LISTENERS: PLAINTEXT://0.0.0.0:9092
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://localhost:9092
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
docker compose up -d
kafka-topics --bootstrap-server localhost:9092 \
--create \
--topic orders.out \
--partitions 3 \
--replication-factor 1
Minimal Producer Loop
Use one stable key so the output stays easy to inspect:
try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
    ProducerRecord<String, String> record =
        new ProducerRecord<>("orders.out", "order-1001", "{\"event\":\"OrderCreated\"}");
    // Block until the broker acknowledges the write; get() throws if the send ultimately fails.
    producer.send(record).get();
}
That is enough for the baseline. Part 1 is about verifying producer behavior, not yet introducing transactions or full consume-transform-produce flows.
How to Test It Instead of Trusting It
The mistake here is to stop at configuration review. A better test is:
- enable idempotence
- produce under a condition that forces retries
- inspect the committed records for duplicate appends
Even a crude local test is valuable because it proves the team understands where the guarantee lives.
kafka-console-consumer \
--bootstrap-server localhost:9092 \
--topic orders.out \
--from-beginning \
--property print.key=true
If you replay the same failure with idempotence disabled, the difference becomes very easy to explain to others.
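Eyeballing consumer output gets unreliable past a few dozen records, so a small counting helper can do the comparison. The sketch below reads lines from standard input and reports any line that appears more than once. `DuplicateLineReport` is a hypothetical name, and the approach assumes each test record carries a unique value (an attempt counter, for instance); with the fixed payload from the minimal loop above, two intentional sends would look identical to a duplicate append.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

/** Hypothetical helper: reports consumer output lines that occur more than once. */
final class DuplicateLineReport {

    /** Maps each repeated line to its occurrence count, in first-seen order. */
    static Map<String, Integer> duplicates(List<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            if (line.isBlank()) {
                continue; // ignore empty lines in the captured output
            }
            counts.merge(line, 1, Integer::sum);
        }
        counts.values().removeIf(count -> count < 2); // keep only repeated lines
        return counts;
    }

    public static void main(String[] args) {
        List<String> lines = new BufferedReader(
                new InputStreamReader(System.in, StandardCharsets.UTF_8))
                .lines()
                .collect(Collectors.toList());
        duplicates(lines).forEach((line, count) -> System.out.println(count + "x " + line));
    }
}
```

Capture the console consumer output to a file for each run (idempotence on, then off) and feed the file to this program on standard input. An empty report for the first run and a non-empty one for the second is the difference you are trying to demonstrate.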
Operational Guidance
Treat this as a platform default
If your organization has a shared producer library, idempotence should usually be the baseline rather than an opt-in per service.
Document the guarantee boundary
Write down exactly what the team gets:
- duplicate append suppression caused by producer retries
- not consumer deduplication
- not business-level exactly-once
- not automatic external side-effect safety
That short list prevents a lot of future misuse.
Watch for false confidence during incidents
When a duplicate shows up downstream, the right question is not “why did Kafka idempotence fail?” It is “where in the pipeline does the guarantee stop?”
What This Part Should Leave You With
After Part 1, the team should understand:
- why producer retries create duplicates without idempotence
- which settings make idempotence safe in practice
- where this protection ends
That gives you the right foundation for transactions later, without pretending you already solved the full exactly-once problem.