kafka-users mailing list archives

From Phillip Mann <pm...@trulia.com>
Subject Kafka Connect offset.storage.topic not receiving messages (i.e. how to access Kafka Connect offset metadata?)
Date Thu, 05 Jan 2017 19:30:25 GMT
I am working on setting up a Kafka Connect Distributed Mode application which will be a Kafka
to S3 pipeline. I am using Kafka 0.10.1.0-1 and Kafka Connect 3.1.1-1. So far things are going
smoothly but one aspect that is important to the larger system I am working with requires
knowing offset information of the Kafka -> FileSystem pipeline. According to the documentation,
the offset.storage.topic configuration will be the location the distributed mode application
uses for storing offset information. This makes sense given how Kafka stores consumer offsets
in the 'new' Kafka. However, after doing some testing with the FileStreamSinkConnector, nothing
is being written to my offset.storage.topic, which is set to the default value: connect-offsets.

To be specific, I am using a Python Kafka producer to push data to a topic and using Kafka
Connect with the FileStreamSinkConnector to output the data from the topic to a file. This works
and behaves as I expect the connector to behave. Additionally, when I stop the connector and
start the connector, the application remembers the state in the topic and there is no data
duplication. However, when I go to the offset.storage.topic to see what offset metadata is
stored, there is nothing in the topic.
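
For reference, the test setup described above can be sketched roughly as the request body one would POST to the Connect REST API (/connectors) to create the file sink. This is only a minimal sketch: the connector name, topic name, and output path are hypothetical placeholders, not the values from the actual deployment, and the helper function is made up for illustration.

```python
import json


def build_file_sink_config(name, topic, path, tasks_max=1):
    """Build a JSON-serializable body for creating a FileStreamSinkConnector.

    All argument values are caller-supplied; nothing here talks to a broker
    or to the Connect worker.
    """
    return {
        "name": name,
        "config": {
            "connector.class": "org.apache.kafka.connect.file.FileStreamSinkConnector",
            "tasks.max": str(tasks_max),  # Connect config values are strings
            "topics": topic,              # topic(s) the sink reads from
            "file": path,                 # file the sink appends records to
        },
    }


# Hypothetical placeholder values, assumed for illustration only.
payload = build_file_sink_config("file-sink-test", "test-topic", "/tmp/test-sink.txt")
print(json.dumps(payload, indent=2))
```

The printed JSON is what would be submitted to a distributed-mode worker; it only describes the connector and says nothing about where its offsets end up.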

This is the command that I use:

kafka-console-consumer --bootstrap-server kafka1:9092,kafka2:9092,kafka3:9092 --topic connect-offsets --from-beginning

I receive this message after letting this command run for a minute or so:

Processed a total of 0 messages

So to summarize, I have 2 questions:

  1.  Why is offset metadata not being written to the topic that should be storing this even
though my distributed application is keeping state correctly?

  2.  How do I access offset metadata information for a Kafka Connect distributed mode application?
This is 100% necessary for my team's Lambda Architecture implementation of our system.

Thanks for the help.