kafka-users mailing list archives

From Ewen Cheslack-Postava <e...@confluent.io>
Subject Re: kafka connect(copycat) question
Date Thu, 10 Dec 2015 18:23:31 GMT
Roman,

Agreed, this is definitely a gap in the docs (both Kafka's and Confluent's)
right now. The reason it was lower priority for documentation than other
items is that we expect there will be relatively few converter
implementations, especially compared to the number of connectors.
Converters correspond to serialization formats (and any supporting pieces,
like Confluent's Schema Registry for the AvroConverter), so there might be
a few for, e.g., Avro, JSON, Protocol Buffers, Thrift, and possibly
variants (e.g. if you have a different approach for managing Avro schemas
than Confluent's schema registry).

https://cwiki.apache.org/confluence/display/KAFKA/Copycat+Data+API has a
slightly outdated image that explains how Converters fit into the data
processing pipeline in Kafka Connect. The API is also quite simple:
http://docs.confluent.io/2.0.0/connect/javadocs/org/apache/kafka/connect/storage/Converter.html
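
The whole interface is only three methods (paraphrasing the javadoc linked
above); a Converter translates between Connect's in-memory data API and the
raw bytes stored in Kafka:

    // org.apache.kafka.connect.storage.Converter (Kafka 0.9.0)
    public interface Converter {
        // isKey tells the converter whether it is handling record keys
        // or record values.
        void configure(Map<String, ?> configs, boolean isKey);

        // Serialize a Connect schema + value pair into the bytes that get
        // written to Kafka.
        byte[] fromConnectData(String topic, Schema schema, Object value);

        // Deserialize bytes read from Kafka back into a schema + value pair.
        SchemaAndValue toConnectData(String topic, byte[] value);
    }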

-Ewen

On Thu, Dec 10, 2015 at 3:34 AM, Roman Shtykh <rshtykh@yahoo.com.invalid>
wrote:

> Ewen,
>
> I just thought it would be helpful to have more detailed information on
> converters (including what you described here) on
> http://docs.confluent.io/2.0.0/connect/devguide.html
>
> Thanks,
> Roman
>
>
>
> On Wednesday, November 11, 2015 6:59 AM, Ewen Cheslack-Postava <
> ewen@confluent.io> wrote:
> Hi Venkatesh,
>
> If you're using the default settings included in the sample configs, it'll
> expect JSON data in a special format to support passing schemas along with
> the data. This is turned on by default because it makes it possible to work
> with a *lot* more connectors and data storage systems (many require
> schemas!), though it does mean consuming regular JSON data won't work out
> of the box. You can easily switch this off by changing these lines in the
> worker config:
>
> key.converter.schemas.enable=true
> value.converter.schemas.enable=true
>
> to be false instead. However, note that this will only work with connectors
> that can work with "schemaless" data. This wouldn't work for, e.g., writing
> Avro files in HDFS since they need schema information, but it might work
> for other formats. This would allow you to consume JSON data from any
> topic where it already exists.
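>
> To make that "special format" concrete: with schemas enabled, the
> JsonConverter expects each message wrapped in an envelope that carries its
> schema, along these lines (a sketch; the record contents are invented):
>
>   {"schema": {"type": "struct", "optional": false,
>               "fields": [{"field": "name", "type": "string", "optional": false}]},
>    "payload": {"name": "kafka"}}
>
> With schemas.enable=false, it accepts the bare JSON value instead:
>
>   {"name": "kafka"}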
>
> Note that JSON is not the only format you can use. You can also substitute
> other implementations of the Converter interface. Confluent has implemented
> an Avro version that works well with our schema registry (
> https://github.com/confluentinc/schema-registry/tree/master/avro-converter
> ).
> The JSON implementation made sense to add as the one included with Kafka
> simply because it didn't introduce any other dependencies that weren't
> already in Kafka. It's also possible to write implementations for other
> formats (e.g. Thrift, Protocol Buffers, Cap'n Proto, MessagePack, and
> more), but I'm not aware of anyone who has started to tackle those
> converters yet.
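>
> To give a feel for what writing one involves, here is a skeleton for a
> hypothetical MessagePack converter (the class and package names are
> invented, and the actual serialization is left out):
>
>   package com.example;
>
>   import java.util.Map;
>   import org.apache.kafka.connect.data.Schema;
>   import org.apache.kafka.connect.data.SchemaAndValue;
>   import org.apache.kafka.connect.storage.Converter;
>
>   public class MessagePackConverter implements Converter {
>       @Override
>       public void configure(Map<String, ?> configs, boolean isKey) {
>           // Pick up any converter-specific settings from the worker config.
>       }
>
>       @Override
>       public byte[] fromConnectData(String topic, Schema schema, Object value) {
>           // Encode the schema + value as MessagePack bytes here.
>           throw new UnsupportedOperationException("not implemented");
>       }
>
>       @Override
>       public SchemaAndValue toConnectData(String topic, byte[] value) {
>           // Decode MessagePack bytes back into a schema + value pair here.
>           throw new UnsupportedOperationException("not implemented");
>       }
>   }
>
> You would then point the worker at it via the key.converter and
> value.converter settings, e.g. value.converter=com.example.MessagePackConverter.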
>
> -Ewen
>
> On Tue, Nov 10, 2015 at 1:23 PM, Venkatesh Rudraraju <
> venkatengineering@gmail.com> wrote:
>
> > Hi,
> >
> > I am trying out the new kakfa connect service.
> >
> > version : kafka_2.11-0.9.0.0
> > mode    : standalone
> >
> > I have a conceptual question on the service.
> >
> > Can I just start a sink connector which reads from Kafka and writes to,
> > say, HDFS?
> > From what I have tried, it's expecting a source-connector as well, because
> > the sink-connector expects a particular format for the messages in the
> > kafka topic.
> >
> > Thanks,
> > Venkat
>
> --
> Thanks,
> Ewen
>



-- 
Thanks,
Ewen
