kafka-users mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Svante Karlsson <svante.karls...@csi.se>
Subject Re: Consultant Help
Date Fri, 02 Mar 2018 20:50:16 GMT
try https://www.confluent.io/ - that's what they do


2018-03-02 21:21 GMT+01:00 Matt Stone <mstone@nexeohr.com>:

> We are looking for a consultant or contractor that can come onsite to our
> Ogden, Utah location in the US, to help with a Kafka set up and maintenance
> project.  What we need is someone with the knowledge and experience to
> build out the Kafka environment from scratch.
> We are thinking they would need to be onsite for 6-12 months  to set it
> up, and mentor some of our team so they can get up to speed to do the
> maintenance once the contractor is gone.  If anyone has the experience
> setting up Kafka from scratch in a Linux environment, maintain node
> clusters, and help train others on the team how to do it, and you are
> interested in a long term project working at the client site, I would love
> to start up  a discussion, to see if we could use you for the role.
> I would also be interested in hearing about any consulting firms that
> might have resources that could help with this role.
> Matt Stone
> -----Original Message-----
> From: Matt Daum [mailto:matt@setfive.com]
> Sent: Friday, March 2, 2018 1:11 PM
> To: users@kafka.apache.org
> Subject: Re: Kafka Setup for Daily counts on wide array of keys
> Actually it looks like the better way would be to output the counts to a
> new topic then ingest that topic into the DB itself.  Is that the correct
> way?
> On Fri, Mar 2, 2018 at 9:24 AM, Matt Daum <matt@setfive.com> wrote:
> > I am new to Kafka but I think I have a good use case for it.  I am
> > trying to build daily counts of requests based on a number of
> > different attributes in a high throughput system (~1 million
> > requests/sec. across all  8 servers).  The different attributes are
> > unbounded in terms of values, and some will spread across 100's of
> > millions values.  This is my current through process, let me know
> > where I could be more efficient or if there is a better way to do it.
> >
> > I'll create an AVRO object "Impression" which has all the attributes
> > of the inbound request.  My application servers then will on each
> > request create and send this to a single kafka topic.
> >
> > I'll then have a consumer which creates a stream from the topic.  From
> > there I'll use the windowed timeframes and groupBy to group by the
> > attributes on each given day.  At the end of the day I'd need to read
> > out the data store to an external system for storage.  Since I won't
> > know all the values I'd need something similar to the KVStore.all()
> > but for WindowedKV Stores.  This appears that it'd be possible in 1.1
> > with this
> > commit: https://github.com/apache/kafka/commit/
> > 1d1c8575961bf6bce7decb049be7f10ca76bd0c5 .
> >
> > Is this the best approach to doing this?  Or would I be better using
> > the stream to listen and then an external DB like Aerospike to store
> > the counts and read out of it directly end of day.
> >
> > Thanks for the help!
> > Daum
> >

  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message