kafka-users mailing list archives

From Joe Stein <joe.st...@stealth.ly>
Subject Re: Experiences with larger message sizes
Date Tue, 24 Jun 2014 16:26:21 GMT
Hi Denny, have you considered saving those files to HDFS and sending the
"event" information to Kafka?

You could then pass that off to Apache Spark in a consumer and get data
locality for the saved file (or something of the sort [no pun intended]).
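
As a rough sketch of that first pattern (the topic name, JSON layout, and
broker list below are just placeholders, and this assumes the new Java
producer API plus the Hadoop FileSystem client):

import java.util.Properties;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class FileEventPublisher {
    public static void main(String[] args) throws Exception {
        // 1. Copy the large file to HDFS so Kafka never carries the 50MB payload.
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path local = new Path(args[0]);
        Path hdfs = new Path("/ingest/" + local.getName()); // illustrative target dir
        fs.copyFromLocalFile(local, hdfs);

        // 2. Publish a small "event" record that just points at the file.
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092"); // assumption: your broker list
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        KafkaProducer<String, String> producer = new KafkaProducer<>(props);

        String event = String.format(
            "{\"path\":\"%s\",\"bytes\":%d,\"ts\":%d}",
            hdfs.toString(), fs.getFileStatus(hdfs).getLen(), System.currentTimeMillis());
        producer.send(new ProducerRecord<>("file-events", local.getName(), event));
        producer.close();
    }
}

The Spark consumer then reads the small event, opens the HDFS path itself,
and processing gets scheduled where the blocks live.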

You could also stream every line of the file (or however you want to "chunk"
it) as a separate message to the broker, with a wrapping message object so
you know which file you are dealing with when consuming.
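
A sketch of that wrapping (again, the topic name and JSON wrapper are
placeholders; a real setup would use a proper serializer like Avro rather
than hand-built strings):

import java.io.BufferedReader;
import java.io.FileReader;
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class FileChunkProducer {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092"); // assumption: your broker list
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        KafkaProducer<String, String> producer = new KafkaProducer<>(props);

        String fileId = args[0];
        long lineNo = 0;
        try (BufferedReader in = new BufferedReader(new FileReader(fileId))) {
            for (String line; (line = in.readLine()) != null; lineNo++) {
                // Wrap each chunk with the file id and a sequence number so the
                // consumer can reassemble. Keying by fileId keeps all of a file's
                // chunks on one partition, so one consumer sees them in order.
                // (Note: this sketch does not escape quotes inside `line`.)
                String wrapped = String.format(
                    "{\"file\":\"%s\",\"line\":%d,\"data\":\"%s\"}", fileId, lineNo, line);
                producer.send(new ProducerRecord<>("file-chunks", fileId, wrapped));
            }
        }
        producer.close();
    }
}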

What you plan to do with the data has a lot to do with how you are going to
process and manage it.
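
On the 50MB question itself: if you do send messages that large, the size
limits have to be raised in a few places. Roughly (the 64MB values below are
illustrative, not recommendations):

# broker, server.properties
message.max.bytes=67108864
# replicas must also be able to fetch messages of that size
replica.fetch.max.bytes=67108864

# high-level consumer
fetch.message.max.bytes=67108864

# new (Java) producer
max.request.size=67108864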

/*******************************************
 Joe Stein
 Founder, Principal Consultant
 Big Data Open Source Security LLC
 http://www.stealth.ly
 Twitter: @allthingshadoop <http://www.twitter.com/allthingshadoop>
********************************************/


On Tue, Jun 24, 2014 at 11:35 AM, Denny Lee <denny.g.lee@gmail.com> wrote:

> By any chance has anyone worked with Kafka using messages that are
> approximately 50MB in size?  Based on some of the previous threads, there
> are probably concerns about memory pressure due to compression on the
> broker and decompression on the consumer, as well as best practices for
> ensuring batch size (so that the compressed message does not exceed the
> message size limit).
>
> Any other best practices or thoughts concerning this scenario?
>
> Thanks!
> Denny
>
>
