kafka-users mailing list archives

From Alvaro Gareppe <agare...@gmail.com>
Subject Re: message filterin or "selector"
Date Thu, 06 Aug 2015 17:21:51 GMT
Thanks

On Thu, Aug 6, 2015 at 2:20 PM, Grant Henke <ghenke@cloudera.com> wrote:

> I completely agree with Ben's response, especially the invitation to
> propose and get involved in adding functionality to Kafka. A first step
> toward a change this large would be to thoroughly describe your
> motivations, needed features, and proposed changes or architecture in a
> KIP. This way the community can discuss whether features like this belong
> in Kafka, where they belong, and options for implementation. More
> information about that process can be found here:
>
> https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals
>
> On Thu, Aug 6, 2015 at 11:55 AM, Ben Stopford <ben@confluent.io> wrote:
>
> > I think the short answer here is that if you need freeform selector
> > semantics, as per JMS message selectors, then you’d need to wrap the API
> > yourself (or get involved in adding the functionality to Kafka).
> >
> > As Gwen and Grant say, you could synthesise something simpler using
> > topics/partitions to provide separate routing, but it would have to be a
> > relatively simple use case. Kafka will support a large number of
> > topic/partition pairs, but each one incurs a cost. Thus this route may
> > not be wise for the use case you are describing.
> >
> > B
> > > On 6 Aug 2015, at 16:38, Alvaro Gareppe <agareppe@gmail.com> wrote:
> > >
> > > > It is not because of throughput; it is more about security. I can't
> > > > allow all clients to have access to all the topic content (in some
> > > > cases). I know that access control is something that is not implemented
> > > > yet, but is planned. My idea is to plug a customisation in there to add
> > > > security at the selection level too. But if the "selector" applies only
> > > > on the client side, I won't get any information on the server side about
> > > > how the user is planning to select, and therefore I won't be able to
> > > > restrict or grant access.
> > >
> > > > I am planning to replace ActiveMQ with Kafka, but I need to keep some
> > > > functionality, like security and selection, that is not yet implemented
> > > > in Kafka, so I need to get creative with workarounds to be able to use
> > > > it.
> > >
> > > > You comment that I can do some custom partitioning in my particular
> > > > case. But I'm not sure I can do something like that, because even though
> > > > I know which "fields" can be used for filtering, I don't know the
> > > > values. But I don't know...
> > >
> > > > Let's say the message has a property X that I can use as a selection
> > > > criterion. I can create a partitioning based on X, so that would split
> > > > the topic based on X values, and connect the clients to the specific
> > > > partition; that could work. But what if I have X and Y as possible
> > > > selection criteria? Can I split based on two properties? If yes, can I
> > > > connect based only on X?
> > >
> > > > If I do it like this, the number of partitions that I'm going to create
> > > > is going to be amazingly large. How is Kafka going to perform?
> > >
> > > > Maybe I'm trying to fit a problem into a system that is not meant for
> > > > it. I would love to have the amazing performance of Kafka, but sadly I'm
> > > > not sure it's the best fit for me because of this...
> > >
> > >
> > > Thank you very much guys for the responses
> > >
> > > > On Thu, Aug 6, 2015 at 12:10 PM, Grant Henke <ghenke@cloudera.com> wrote:
> > >
> > > >> The filtering logic there is topic filtering and not message filtering.
> > > >> The idea is to subscribe to multiple topics via a regex whitelist or
> > > >> blacklist. This does exist today, as it does not depend on understanding
> > > >> the content of the message, but I don't think it is what you are looking
> > > >> for.
> > >>
> > > >> As far as message filtering goes, as Gwen said, "The way Kafka is
> > > >> currently implemented is that Kafka is not aware of the content of
> > > >> messages, so there is no Selector logic available." However, if you know
> > > >> upfront how you would like to filter the messages, you could write your
> > > >> producer to use multiple topics, or even some custom partitioning, and
> > > >> implement a consumer that can understand and filter based on that logic.
> > > >> That would be an involved and creative implementation based on your use
> > > >> case.
> > >>
> > > >> I would recommend starting simple and just dropping the messages you
> > > >> don't care about on the consumer side. If throughput becomes a problem,
> > > >> then consider alternatives.
> > >>
> > >>
> > > >> On Thu, Aug 6, 2015 at 9:47 AM, Alvaro Gareppe <agareppe@gmail.com> wrote:
> > >>
> > > >>> Is this implemented?
> > > >>>
> > > >>> https://cwiki.apache.org/confluence/display/KAFKA/Consumer+API+changes
> > > >>>
> > > >>> Is this message filtering on the client or the server side?
> > >>>
> > > >>> On Tue, Aug 4, 2015 at 9:54 PM, Gwen Shapira <gwen@confluent.io> wrote:
> > >>>
> > > >>>> The way Kafka is currently implemented is that Kafka is not aware of
> > > >>>> the content of messages, so there is no Selector logic available.
> > >>>>
> > > >>>> The way to go is to implement the Selector in your client, i.e. your
> > > >>>> consume() loop will get all messages but will throw away those that
> > > >>>> don't fit your pattern.
> > >>>>
> > >>>>
> > > >>>> It may be worthwhile to add a ticket for pluggable selector logic in
> > > >>>> the new consumer. I can't guarantee it will happen; there are infinite
> > > >>>> things that can be plugged into consumers and we need to draw the line
> > > >>>> somewhere, but it is worth a discussion.
> > >>>>
> > > >>>> On Tue, Aug 4, 2015 at 2:05 PM, Alvaro Gareppe <agareppe@gmail.com> wrote:
> > >>>>
> > > >>>>> Is there a way to implement "selector" logic in Kafka (similar to
> > > >>>>> JMS selectors)?
> > > >>>>>
> > > >>>>> That is, to allow a message to be consumed only if it contains a
> > > >>>>> certain header or content?
> > > >>>>>
> > > >>>>> I'm evaluating a migration from ActiveMQ to Kafka, and I'm using the
> > > >>>>> selector logic widely in the application.
> > >>>>>
> > >>>>> --
> > >>>>> Ing. Alvaro Gareppe
> > >>>>> agareppe@gmail.com
> > >>>>>
> > >>>>
> > >>>
> > >>>
> > >>>
> > >>> --
> > >>> Ing. Alvaro Gareppe
> > >>> agareppe@gmail.com
> > >>>
> > >>
> > >>
> > >>
> > >> --
> > >> Grant Henke
> > >> Software Engineer | Cloudera
> > >> grant@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke
> > >>
> > >
> > >
> > >
> > > --
> > > Ing. Alvaro Gareppe
> > > agareppe@gmail.com
> >
> >
>
>
> --
> Grant Henke
> Software Engineer | Cloudera
> grant@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke
>



-- 
Ing. Alvaro Gareppe
agareppe@gmail.com
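
The client-side approach suggested in the thread (consume everything, throw away what does not fit the pattern) can be sketched as follows. This is a minimal illustration in plain Python, not Kafka client code: the `select` helper, the dict-shaped messages, and the sample `incoming` list are all assumptions standing in for whatever a real consume() loop would return.

```python
from typing import Callable, Iterable, Iterator, Mapping

# Hypothetical message shape: a mapping of property names to values.
Message = Mapping[str, object]


def select(
    messages: Iterable[Message],
    predicate: Callable[[Message], bool],
) -> Iterator[Message]:
    """Yield only the messages the predicate accepts; drop the rest."""
    for message in messages:
        if predicate(message):
            yield message


def eu_orders(message: Message) -> bool:
    """Example selector: orders from the EU region only."""
    return message.get("type") == "order" and message.get("region") == "EU"


# Stand-in for the messages a consumer loop would receive (assumed data).
incoming = [
    {"type": "order", "region": "EU", "amount": 120},
    {"type": "audit", "region": "US", "amount": 0},
    {"type": "order", "region": "US", "amount": 75},
]

selected = list(select(incoming, eu_orders))
print(selected)
```

Because the filter runs after delivery, every client still receives every message, so this covers selection but not the access-control concern raised in the thread.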

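The partition-by-property idea discussed in the thread can be sketched in the same spirit. The snippet assumes a hypothetical fixed partition count and uses CRC32 purely as a stand-in hash (it does not reproduce Kafka's own partitioner); `partition_for` and the `x`/`y` fields are illustrative names only.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 8  # hypothetical partition count for the topic


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a key to a partition deterministically (CRC32 is a stand-in hash)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


# Assumed sample messages with two candidate selection properties.
messages = [
    {"x": "alpha", "y": "1"},
    {"x": "alpha", "y": "2"},
    {"x": "beta", "y": "1"},
]

# Route on X only: all messages sharing an X value land in one partition,
# whatever their Y value is.
by_partition = defaultdict(list)
for message in messages:
    by_partition[partition_for(message["x"])].append(message)

# A reader interested in x == "alpha" reads a single partition, but still
# re-checks X because other values may hash to the same partition.
alpha_partition = partition_for("alpha")
alpha_messages = [m for m in by_partition[alpha_partition] if m["x"] == "alpha"]
print(alpha_partition, alpha_messages)
```

Keying on X and Y together would instead tend to spread a single X value across several partitions, so a reader selecting on X alone would have to read all of them and filter on the client side anyway.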