kafka-users mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Taylor Gautier <tgaut...@gmail.com>
Subject Re: Using Kafka for "data" messages
Date Thu, 13 Jun 2013 15:39:56 GMT
I've been talking about this kind of architecture for years.

As you said it's an EDA architecture. You might also want to have a look at
Esper if you haven't already - it's a perfect complement to this strategy.

At my last job I built a relatively low latency site wide pub sub system
that showed real time updates on web clients using Kafka. At the time based
on our requirements and stack it took quite a bit of effort that we were
willing to invest.

Now with 0.8 out - or almost out? - at least quite a few of the issues we
had to workaround should be resolved.

I'd say its a pretty good choice - the biggest question - as with any
system - is durability vs latency.  I haven't had a chance to play with 0.8
but it looks promising in this regard.

As the current platform that I am building takes shape (it's an EDA/CEP
based architecture) Kafka figures in heavily to our plans.

One drawback though is a lack of a solid node client.


On Thursday, June 13, 2013, Josh Foure wrote:

>
> Hi all, my team is proposing a novel
> way of using Kafka and I am hoping someone can help do a sanity check on
> this:
>
> 1.  When a user logs
> into our website, we will create a “logged in” event message in Kafka
> containing the user id.
> 2.  30+ systems
> (consumers each in their own consumer groups) will consume this event and
> lookup data about this user id.  They
> will then publish all of this data back out into Kafka as a series of data
> messages.  One message may include the user’s name,
> another the user’s address, another the user’s last 10 searches, another
> their
> last 10 orders, etc.  The plan is that a
> single “logged in” event may trigger hundreds if not thousands of
> additional data
> messages.
> 3.  Another system,
> the “Product Recommendation” system, will have consumed the original
> “logged in”
> message and will also consume a subset of the data messages (realistically
> I
> think it would need to consume all of the data messages but would discard
> the
> ones it doesn’t need).  As the Product
> Recommendation consumes the data messages, it will process recommended
> products
> and publish out recommendation messages (that get more and more specific
> as it
> has consumed more and more data messages).
> 4.  The original
> website will consume the recommendation messages and show the
> recommendations to
> the user as it gets them.
>
> You don’t see many systems implemented this way but since
> Kafka has such a higher throughput than your typical MOM, this approach
> seems
> innovative.
>
> The benefits are:
>
> 1.  If we start
> collecting more information about the users, we can simply start publishing
> that in new data messages and consumers can start processing those messages
> whenever they want.  If we were doing
> this in a more traditional SOA approach the schemas would need to change
> every time
> we added a field but with this approach we can just create new messages
> without
> touching existing ones.
> 2.  We are looking to
> make our systems smaller so if we end up with more, smaller systems that
> each
> publish a small number of events, it becomes easier to make changes and
> test
> the changes.  If we were doing this in a
> more traditional SOA approach we would need to retest each consumer every
> time
> we changed our bigger SOA services.
>
> The downside appears to be:
>
> 1.  We may be
> publishing a large amount of data that never gets used but that everyone
> needs
> to consume to see if they need it before discarding it.
> 2.  The Product Recommendation
> system may need to wait until it consumes a number of messages and keep
> track
> of all the data internally before it can start processing.
> 3.  While we may be
> able to keep the messages somewhat small, the fact that they contain data
> will
> mean they will be bigger than your tradition EDA messages.
> 4.  It seems like we
> can do a lot of this using SOA (we already have an ESB than can do
> transformations to address consumers expecting an older version of the
> data).
>
> Any insight is appreciated.
> Thanks,
> Josh

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message