kafka-users mailing list archives

From Ken Goodhope <kengoodh...@gmail.com>
Subject Re: Moving raw logs from kafka into hadoop using camus
Date Thu, 01 Aug 2013 19:42:07 GMT
Hi Vadim,

Sorry for the slow response.  If your topics share a common format, you should
be able to implement one decoder to handle all of them.  On the other hand,
if your kafka data differs by topic, you might need
separate decoders for each topic.  I don't recall whether we added the ability
to specify a decoder per topic.  If we didn't, then you may need to set up
separate camus instances, using the whitelist and blacklist to pull the topics
that require a different decoder.
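
As a rough illustration of the single-decoder option, here is a minimal
Java sketch of one decoder that dispatches on the topic name.  It does not
use the actual camus decoder API: the class TopicDispatchingDecoder, its
methods, and the "clicks" topic are all hypothetical names made up for this
example.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: one decoder that picks a parser by topic name.
// None of these names come from camus itself.
final class TopicDispatchingDecoder {
    // Per-topic parsers: raw kafka payload bytes in, parsed record out.
    private final Map<String, Function<byte[], Object>> parsers = new HashMap<>();
    // Parser used for any topic without a registered parser.
    private final Function<byte[], Object> fallback;

    TopicDispatchingDecoder(Function<byte[], Object> fallback) {
        this.fallback = fallback;
    }

    TopicDispatchingDecoder register(String topic, Function<byte[], Object> parser) {
        parsers.put(topic, parser);
        return this;
    }

    Object decode(String topic, byte[] payload) {
        return parsers.getOrDefault(topic, fallback).apply(payload);
    }

    public static void main(String[] args) {
        TopicDispatchingDecoder decoder = new TopicDispatchingDecoder(
                bytes -> new String(bytes, StandardCharsets.UTF_8))
            // Assumed format for the made-up "clicks" topic: tab-separated fields.
            .register("clicks", bytes -> new String(bytes, StandardCharsets.UTF_8).split("\t"));

        Object record = decoder.decode("clicks", "u1\t/home".getBytes(StandardCharsets.UTF_8));
        System.out.println(((String[]) record).length);
    }
}
```

If the formats diverge too much for this to stay readable, separate
decoders (and possibly separate camus instances) are the cleaner route.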

As for the schema registry, this is used for kafka data that is encoded in
avro and allows you to specify a schema identifier in your kafka message
that tells the decoder which schema is needed from the registry to decode
the kafka bytes.
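
To make the idea concrete, here is a small Java sketch of a message that
carries a schema identifier which the decoder resolves against a registry.
The framing (a 4-byte integer id in front of the body) and the in-memory
map standing in for the registry are assumptions for illustration only.
They are not the wire format or registry interface that camus uses, and
the class name SchemaIdFraming is hypothetical.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of schema-id framing; not the camus wire format.
final class SchemaIdFraming {
    // In-memory stand-in for a schema registry: id -> schema definition.
    private final Map<Integer, String> registry = new HashMap<>();

    void register(int schemaId, String schemaJson) {
        registry.put(schemaId, schemaJson);
    }

    // Producer side: prefix the encoded body with the 4-byte schema id.
    static byte[] frame(int schemaId, byte[] body) {
        return ByteBuffer.allocate(4 + body.length).putInt(schemaId).put(body).array();
    }

    // Decoder side: read the id and look up the schema needed to decode the body.
    String schemaFor(byte[] message) {
        if (message.length < 4) {
            throw new IllegalArgumentException("Message too short to carry a schema id");
        }
        int id = ByteBuffer.wrap(message).getInt();
        String schema = registry.get(id);
        if (schema == null) {
            throw new IllegalArgumentException("Unknown schema id " + id);
        }
        return schema;
    }

    // The bytes after the id, which would be handed to the avro decoder.
    static byte[] body(byte[] message) {
        return Arrays.copyOfRange(message, 4, message.length);
    }

    public static void main(String[] args) {
        SchemaIdFraming registry = new SchemaIdFraming();
        registry.register(7, "{\"type\":\"string\"}");
        byte[] message = frame(7, new byte[] {1, 2, 3});
        System.out.println(registry.schemaFor(message));
    }
}
```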

If you are not using avro, then you can have your decoder return whatever
object you like as long as you provide a record writer that can handle
writing that object to hdfs.  The ability to plug in a custom writer was
added recently, and I haven't had a chance to review how that works.  I am
looking into it now, and will send an update to this mailing list shortly.
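
In the meantime, the general shape for non-avro data might look like the
Java sketch below: the decoder hands back a list of fields, and a writer
joins them with a delimiter, one record per line.  This is not the camus
record writer interface, which I have not reviewed yet.  The class
DelimitedRecordWriter is hypothetical, and a plain java.io.Writer stands
in for an hdfs output stream.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of a writer for delimited text records.
// A java.io.Writer stands in for the hdfs output stream.
final class DelimitedRecordWriter {
    private final Writer out;
    private final String delimiter;

    DelimitedRecordWriter(Writer out, String delimiter) {
        this.out = out;
        this.delimiter = delimiter;
    }

    // Writes one record as a single line, fields joined by the delimiter.
    void write(List<String> fields) {
        try {
            out.write(String.join(delimiter, fields));
            out.write('\n');
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        StringWriter sink = new StringWriter();
        DelimitedRecordWriter writer = new DelimitedRecordWriter(sink, "\t");
        writer.write(Arrays.asList("2013-08-01", "GET", "/index.html"));
        System.out.print(sink);
    }
}
```

In this sketch the field delimiter is chosen in the writer, so the decoder
only has to know how to split the incoming log line.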


On Wed, Jul 31, 2013 at 4:31 PM, Vadim Keylis <vkeylis2009@gmail.com> wrote:

> Good afternoon. I am new to camus and would like to use it to move raw
> data from kafka to hadoop.
>  Do I have to create the avro schema in advance, or is it automatically
> created for me?
>  What is the role of the decoder class specified
> by camus.message.decoder.class in the property file?   Do I need to implement
> a decoder for each topic in order to parse it?
> Where do I provide the log field delimiter?
> What is the usage of the class specified
> by kafka.message.coder.schema.registry.class=?
> Thanks in advance!
