kafka-users mailing list archives

From Grant Henke <ghe...@cloudera.com>
Subject Re: message filterin or "selector"
Date Thu, 06 Aug 2015 17:20:33 GMT
I completely agree with Ben's response, especially the invitation to
propose and get involved in adding functionality to Kafka. A first step
toward a change this large would be to thoroughly describe your motivations,
needed features, and proposed changes or architecture in a KIP.
This way the community can discuss if features like this belong in Kafka,
where they belong, and options for implementation. More information about
that process can be found here:
https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals

On Thu, Aug 6, 2015 at 11:55 AM, Ben Stopford <ben@confluent.io> wrote:

> I think the short answer here is that if you need freeform selector
> semantics, as with JMS message selectors, then you’d need to wrap the API
> yourself (or get involved in adding the functionality to Kafka).
>
> As Gwen and Grant say, you could synthesise something simpler using
> topics/partitions to provide separate routing, but it would have to be a
> relatively simple use case. Kafka will support a large number of
> topic/partition pairs, but each one incurs a cost, so this route may not
> be wise for the use case you are describing.
>
> B
> > On 6 Aug 2015, at 16:38, Alvaro Gareppe <agareppe@gmail.com> wrote:
> >
> > It is not because of throughput; it is more about security. I can't allow
> > all clients to have access to all the topic content (in some cases).
> > I know that access control is not implemented yet, but it is planned. My
> > idea is to plug a customisation in there to add security at the selection
> > level too. But if the "selector" applies only on the client side, I won't
> > get any information on the server side about how the user is planning to
> > select, and therefore I won't be able to restrict or grant access.
> >
> > I am planning to replace ActiveMQ with Kafka, but I need to keep some
> > functionality, like security and selection, that is not yet implemented in
> > Kafka, so I need to get creative with workarounds to be able to use it.
> >
> > You commented that I can do some custom partitioning in my particular
> > case, but I'm not sure I can do something like that: even though I know
> > which "fields" can be used for filtering, I don't know the values.
> >
> > Let's say the message has a property X that I can use as a selection
> > criterion. I can create a partitioning based on X, which would split the
> > topic by X values, and connect the clients to the specific partition. That
> > could work. But what if I have X and Y as possible selection criteria? Can
> > I split based on two properties? If yes, can I connect based only on X?
> >
> > If I do it like this, the number of partitions that I'm going to create is
> > going to be extremely large. How is Kafka going to perform?
> >
> > Maybe I'm trying to fit a problem into a system that is not meant for it.
> > I would love to have the amazing performance of Kafka, but sadly I'm not
> > sure it's the best fit for me because of this...
> >
> >
> > Thank you very much guys for the responses
> >
> > On Thu, Aug 6, 2015 at 12:10 PM, Grant Henke <ghenke@cloudera.com> wrote:
> >
> >> The filtering logic there is topic filtering, not message filtering. The
> >> idea is to subscribe to multiple topics via a regex whitelist or
> >> blacklist. This does exist today, as it does not depend on understanding
> >> the content of the message, but I don't think it is what you are looking
> >> for.
> >>
> >> As far as message filtering goes: as Gwen said, "The way Kafka is
> >> currently implemented is that Kafka is not aware of the content of
> >> messages, so there is no Selector logic available." However, if you know
> >> upfront how you would like to filter the messages, you could write your
> >> producer to use multiple topics, or even some custom partitioning, and
> >> implement a consumer that can understand and filter based on that logic.
> >> That would be an involved and creative implementation based on your use
> >> case.
> >>
> >> I would recommend starting simple and just dropping the messages you
> >> don't care about on the consumer side. If throughput becomes a problem,
> >> then consider alternatives.
> >>
> >>
> >> On Thu, Aug 6, 2015 at 9:47 AM, Alvaro Gareppe <agareppe@gmail.com> wrote:
> >>
> >>> Is this implemented?
> >>> https://cwiki.apache.org/confluence/display/KAFKA/Consumer+API+changes
> >>>
> >>> Is this message filtering on the client or the server side?
> >>>
> >>> On Tue, Aug 4, 2015 at 9:54 PM, Gwen Shapira <gwen@confluent.io> wrote:
> >>>
> >>>> The way Kafka is currently implemented is that Kafka is not aware of
> >>>> the content of messages, so there is no Selector logic available.
> >>>>
> >>>> The way to go is to implement the Selector in your client, i.e. your
> >>>> consume() loop will get all messages but will throw away those that
> >>>> don't fit your pattern.
> >>>>
> >>>>
> >>>> It may be worthwhile to add a ticket for pluggable selector logic in
> >>>> the new consumer. I can't guarantee it will happen; there are infinite
> >>>> things that can be plugged into consumers and we need to draw the line
> >>>> somewhere, but it is worth a discussion.
> >>>>
> >>>> On Tue, Aug 4, 2015 at 2:05 PM, Alvaro Gareppe <agareppe@gmail.com> wrote:
> >>>>
> >>>>> Is there a way to implement "selector" logic in Kafka (similar to JMS
> >>>>> selectors)?
> >>>>>
> >>>>> That is, to allow a message to be consumed only if it contains a
> >>>>> certain header or content?
> >>>>>
> >>>>> I'm evaluating a migration from ActiveMQ to Kafka, and I use selector
> >>>>> logic widely in the application.
> >>>>>
> >>>>> --
> >>>>> Ing. Alvaro Gareppe
> >>>>> agareppe@gmail.com
> >>>>>
> >>>>
> >>>
> >>>
> >>>
> >>> --
> >>> Ing. Alvaro Gareppe
> >>> agareppe@gmail.com
> >>>
> >>
> >>
> >>
> >> --
> >> Grant Henke
> >> Software Engineer | Cloudera
> >> grant@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke
> >>
> >
> >
> >
> > --
> > Ing. Alvaro Gareppe
> > agareppe@gmail.com
>
>
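
For reference, the client-side approach Gwen describes in the quoted thread (consume everything, throw away what doesn't fit) could look roughly like the sketch below. It is only an illustration: the message shape (a dict with "headers" and "value"), the helper names select and header_equals, and the in-memory list standing in for the consume() loop are all assumptions, not Kafka API. Note that every message still reaches the client, so this does nothing for the access-control concern Alvaro raises.

```python
from typing import Any, Callable, Dict, Iterable, Iterator

# Hypothetical message shape, assumed for illustration only:
# {"headers": {...}, "value": ...}
Message = Dict[str, Any]


def select(messages: Iterable[Message],
           selector: Callable[[Message], bool]) -> Iterator[Message]:
    """Yield only the messages the selector accepts; the rest are dropped."""
    for message in messages:
        if selector(message):
            yield message


def header_equals(name: str, expected: Any) -> Callable[[Message], bool]:
    """Build a selector that matches on a single header value."""
    def matches(message: Message) -> bool:
        return message.get("headers", {}).get(name) == expected
    return matches


# Stand-in for whatever the real consume() loop would return.
received = [
    {"headers": {"region": "EU"}, "value": "order-1"},
    {"headers": {"region": "US"}, "value": "order-2"},
    {"headers": {}, "value": "order-3"},
]

eu_only = [m["value"] for m in select(received, header_equals("region", "EU"))]
print(eu_only)
```

In a real consumer, the list would be replaced by the client's message iterator; the selector itself stays the same.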

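The "multiple topics" idea from the quoted thread, combined with regex topic subscription, might be sketched as follows. This is a rough illustration under assumptions: the naming scheme (base name plus the value of property X), and the helpers topic_for and matching_topics, are hypothetical. The regex match here is done in plain Python to show which topics a whitelist pattern would pick up; it is not a call into any Kafka client.

```python
import re
from typing import Iterable, List


def topic_for(base: str, x_value: str) -> str:
    """Derive a topic name from a base name and the value of property X.

    Kafka topic names are limited to letters, digits, '.', '_' and '-',
    so any other character is replaced with '_'.
    """
    safe = re.sub(r"[^A-Za-z0-9._-]", "_", x_value)
    return f"{base}.{safe}"


def matching_topics(topics: Iterable[str], pattern: str) -> List[str]:
    """Return the topics a whitelist regex would select (full match)."""
    compiled = re.compile(pattern)
    return sorted(t for t in topics if compiled.fullmatch(t))


# Producer side: route each message by its X value (values are made up).
topics = [topic_for("orders", x) for x in ["EU", "US", "AP/South"]]

# Consumer side: a whitelist covering only the EU and US topics.
selected = matching_topics(topics, r"orders\.(EU|US)")
print(topics)
print(selected)
```

As Ben points out, each extra topic or partition has a cost, so this only suits a small, known set of values.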

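On the question of splitting by two properties: a back-of-the-envelope sketch, assuming one partition per (X, Y) combination, shows why the partition count grows quickly and what "connecting based only on X" would involve. The value lists and the helpers partition_index and partitions_for_x are made up for illustration.

```python
from itertools import product
from typing import Dict, List, Sequence, Tuple


def partition_index(x_values: Sequence[str],
                    y_values: Sequence[str]) -> Dict[Tuple[str, str], int]:
    """Assign one partition number to every (x, y) combination."""
    return {combo: i for i, combo in enumerate(product(x_values, y_values))}


def partitions_for_x(index: Dict[Tuple[str, str], int], x: str) -> List[int]:
    """Partitions a client must read to see every message with X == x."""
    return sorted(p for (xv, _), p in index.items() if xv == x)


xs = ["a", "b", "c"]   # hypothetical values of property X
ys = ["1", "2"]        # hypothetical values of property Y

index = partition_index(xs, ys)
total = len(index)                     # len(xs) * len(ys) partitions in total
for_b = partitions_for_x(index, "b")   # one partition per Y value
print(total, for_b)
```

The total is the product of the number of distinct values per property, and a client selecting on X alone has to read one partition for every value of Y. This also presumes the values are known in advance, which Alvaro says is not the case for him.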
-- 
Grant Henke
Software Engineer | Cloudera
grant@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke
