kafka-users mailing list archives

From Pieter Hameete <pieter.hame...@blockbax.com>
Subject Repeating UNKNOWN_PRODUCER_ID errors for Kafka streams applications
Date Tue, 04 Jun 2019 09:09:45 GMT

Our Kafka Streams applications are showing the following warning every few seconds (on each
of our 3 brokers, and on each of the 2 instances of the Streams application):

[Producer clientId=event-rule-engine-dd71ae9b-523c-425d-a7c0-c62993315b30-StreamThread-1-1_24-producer,
transactionalId=event-rule-engine-1_24] Resetting sequence number of batch with current sequence
1 for partition event-rule-engine-KSTREAM-REDUCE-STATE-STORE-0000000015-repartition-24 to

Followed by:

[Producer clientId=event-rule-engine-dd71ae9b-523c-425d-a7c0-c62993315b30-StreamThread-1-1_24-producer,
transactionalId=event-rule-engine-1_24] Got error produce response with correlation id 5902
on topic-partition event-rule-engine-KSTREAM-REDUCE-STATE-STORE-0000000015-repartition-24,
retrying (2147483646 attempts left). Error: UNKNOWN_PRODUCER_ID

The brokers are showing errors that look related:

Error processing append operation on partition event-rule-engine-KSTREAM-REDUCE-STATE-STORE-0000000015-repartition-24

org.apache.kafka.common.errors.UnknownProducerIdException: Found no record of producerId=72
on the broker. It is possible that the last message with the producerId=72 has been removed
due to hitting the retention limit.

We would expect the UNKNOWN_PRODUCER_ID error to occur only once: after a retry, the record
would be published to the partition and the producer ID would be known again. However, the
error keeps occurring every few seconds. This is roughly the rate at which records are produced
on the input topic partitions, so it seems to occur for (nearly) every input record.

The JIRA issue https://issues.apache.org/jira/browse/KAFKA-7190 looks related. However, that
issue mentions ‘little traffic’, and I am not sure whether one message every few seconds
counts as little traffic. Matthias mentions in the issue that a workaround seems to be to
increase the topic configs `segment.bytes`, `segment.index.bytes`, and `segment.ms` for the
corresponding repartition topics. We tried manually overriding these configs for one of the
affected topics with the values from the linked pull request
(https://github.com/apache/kafka/pull/6511), but the errors did not disappear.
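
For reference, here is a minimal sketch of how such overrides could be written down, assuming
the three config names from the JIRA issue. The class name, the helper methods, and the numeric
values are hypothetical placeholders; they are not the values from the pull request. The sketch
only builds strings and does not talk to a broker.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper, for illustration only.
class RepartitionTopicOverrides {

    // Builds the three topic-level overrides named in KAFKA-7190.
    // The caller supplies the values; none are recommended here.
    static Map<String, String> overrides(long segmentBytes, long segmentIndexBytes, long segmentMs) {
        Map<String, String> configs = new LinkedHashMap<>(); // keeps insertion order
        configs.put("segment.bytes", Long.toString(segmentBytes));
        configs.put("segment.index.bytes", Long.toString(segmentIndexBytes));
        configs.put("segment.ms", Long.toString(segmentMs));
        return configs;
    }

    // Renders the overrides as a comma-separated key=value list, the form
    // taken by the --add-config option of the kafka-configs tool.
    static String asAddConfigArgument(Map<String, String> configs) {
        return configs.entrySet().stream()
                .map(e -> e.getKey() + "=" + e.getValue())
                .collect(Collectors.joining(","));
    }

    // Prefixes each key with "topic.", the prefix Kafka Streams reads
    // (StreamsConfig.TOPIC_PREFIX) for configs of internal topics it creates.
    static Map<String, String> asStreamsProperties(Map<String, String> configs) {
        Map<String, String> props = new LinkedHashMap<>();
        configs.forEach((k, v) -> props.put("topic." + k, v));
        return props;
    }

    public static void main(String[] args) {
        // Placeholder values, chosen only to show the output format.
        Map<String, String> configs = overrides(104857600L, 10485760L, 3600000L);
        System.out.println(asAddConfigArgument(configs));
        System.out.println(asStreamsProperties(configs));
    }
}
```

Note that, as far as I understand, the `topic.` prefixed properties only take effect when
Kafka Streams creates an internal topic. A repartition topic that already exists has to be
altered on the brokers instead, which is what the manual override described above does.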

Could anyone help us to figure out what is happening here, and why the proposed fix for the
above JIRA issue is not working in this case?


