kafka-users mailing list archives

From Mukesh Jha <me.mukesh....@gmail.com>
Subject Re: Kafka getMetadata api
Date Fri, 02 Jan 2015 19:14:30 GMT
Thanks for your responses, guys.

Filtering messages in the application works fine for me. I am just thinking
out loud: fetching just the key would be much faster than fetching the
entire message and metadata, and would avoid unnecessary data transfer
between the Kafka and consumer nodes. So an API that exposes just the key
sounds useful to me.
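
A minimal sketch of the application-side filtering described above, written in Python against plain in-memory records rather than a real Kafka client. The names `Record`, `consume_relevant`, `is_relevant` and `decode_value` are hypothetical and are not part of any Kafka API. It only shows that the expensive step (decoding the value) can be skipped when the key is not relevant; the value bytes still reach the consumer in this approach, which is the transfer cost mentioned above.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, Tuple


@dataclass(frozen=True)
class Record:
    """Hypothetical stand-in for a consumed message: key = metadata, value = payload."""
    offset: int
    key: str
    value: bytes


def consume_relevant(
    records: Iterable[Record],
    is_relevant: Callable[[str], bool],
    decode_value: Callable[[bytes], str],
) -> Iterator[Tuple[int, str]]:
    """Yield (offset, decoded value) only for records whose key passes the check.

    Records with an irrelevant key are skipped without decoding their value.
    """
    for record in records:
        if is_relevant(record.key):
            yield record.offset, decode_value(record.value)


if __name__ == "__main__":
    batch = [
        Record(0, "type=order", b"order-1"),
        Record(1, "type=audit", b"audit-1"),
        Record(2, "type=order", b"order-2"),
    ]
    for offset, payload in consume_relevant(
        batch, lambda k: k == "type=order", lambda v: v.decode("utf-8")
    ):
        print(offset, payload)
```

In this sketch the key check is a cheap string comparison, while the decoder stands in for something costlier such as Avro deserialization.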

I cannot split each consumer's messages into different topics, as that
would result in the same data being replicated across multiple topics
(wasted space) and in a very high number of topics. Also, the consumers
should be able to decide dynamically whether a given message is relevant to
them, based on some cache (or config, or DB call).
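
To illustrate the dynamic relevance decision, here is a rough Python sketch of a per-key cache in front of a slower lookup. `RelevanceCache` and `lookup` are hypothetical names; the lookup stands in for whatever config or DB call a consumer would actually use.

```python
from typing import Callable, Dict


class RelevanceCache:
    """Caches the answer to "is this key relevant to me?" per key.

    `lookup` is a hypothetical slow source of truth (config, database, ...).
    """

    def __init__(self, lookup: Callable[[str], bool]) -> None:
        self._lookup = lookup
        self._cache: Dict[str, bool] = {}

    def is_relevant(self, key: str) -> bool:
        # Only consult the slow lookup the first time a key is seen.
        if key not in self._cache:
            self._cache[key] = self._lookup(key)
        return self._cache[key]

    def invalidate(self) -> None:
        """Drop cached answers so later checks pick up changed rules."""
        self._cache.clear()


if __name__ == "__main__":
    wanted = {"customer-1"}
    cache = RelevanceCache(lambda key: key in wanted)
    print(cache.is_relevant("customer-1"))
    print(cache.is_relevant("customer-2"))
    wanted.add("customer-2")   # rules change at runtime
    cache.invalidate()
    print(cache.is_relevant("customer-2"))
```

Because the rules live outside the topic layout, they can change at runtime without creating or re-partitioning topics; the trade-off is that cached answers stay stale until `invalidate` is called.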

On Fri, Jan 2, 2015 at 11:36 PM, Joe Stein <joe.stein@stealth.ly> wrote:

> I think partitioning is best left for the semantics of the message (i.e.
> userId, customerId, etc) and not the type of message. If your consumers
> only need specific message types then separate the messages types by
> topics. This will make the consumers that don't need those message types
> work better not having to ignore them and focus on what they are built for.
> If you have consumers that need multiple message types that are now across
> topics then those consumers should consume from multiple topics i.e.
>
> https://github.com/apache/kafka/blob/0.8.1/core/src/main/scala/kafka/consumer/ConsoleConsumer.scala#L196
>
> /*******************************************
>  Joe Stein
>  Founder, Principal Consultant
>  Big Data Open Source Security LLC
>  http://www.stealth.ly
>  Twitter: @allthingshadoop <http://www.twitter.com/allthingshadoop>
> ********************************************/
>
> On Fri, Jan 2, 2015 at 12:34 PM, Manikumar Reddy <kumar@nmsworks.co.in>
> wrote:
>
> > Hi,
> >
> > One option is to partition the data using key and consume from relevant
> > partition.
> > Or your current approach (filtering messages in the application) should
> > be OK.
> >
> > Using separate getMetaData/getkey and getMessage may hit the consumer
> > performance/throughput.
> >
> >
> > Regards,
> > Kumar
> >
> > On Fri, Jan 2, 2015 at 9:53 PM, Mukesh Jha <me.mukesh.jha@gmail.com>
> > wrote:
> >
> > > Any pointers guys?
> > > On 1 Jan 2015 15:26, "Mukesh Jha" <me.mukesh.jha@gmail.com> wrote:
> > >
> > > > Hello Experts,
> > > >
> > > > I'm using a kafka topic to store bunch of messages where the key contains
> > > > metadata and value is the data (avro file in our case).
> > > > There are multiple consumers for each topic and the consumer can decide if
> > > > the message is relevant for it or not based on the metadata i.e. the key of
> > > > the message.
> > > > Using this a group-consumer can check the key if the message is required
> > > > by it and then it can retrieve the entire message otherwise it'll just
> > > > commit the offset and move on to the next message.
> > > >
> > > > So I was wondering if kafka has an api in kafka that lets the consumer to
> > > > get just the metadata i.e. key, something like getMetadata instead
> > > > of getMessageAndMetadata?
> > > >
> > > > If kafka has something like this can you help me out with some
> > > > documentation for the same?
> > > > I think this will be useful in a lot of scenarios so if its not there I
> > > > can file a JIRA and take a dig at it. Let me know what you all think.
> > > >
> > > > Thanks for your help & suggestions.
> > > >
> > > > --
> > > > Thanks & Regards,
> > > >
> > > > *Mukesh Jha <me.mukesh.jha@gmail.com>*
> > > >
> > >
> >
>



-- 


Thanks & Regards,

*Mukesh Jha <me.mukesh.jha@gmail.com>*
