kafka-users mailing list archives

From Susheel Kumar <susheel2...@gmail.com>
Subject Re: Kafka as a database/repository question
Date Fri, 16 Dec 2016 18:17:24 GMT
Thanks, Hans, for the insight. Will use a compacted topic.

On Thu, Dec 15, 2016 at 3:53 PM, Hans Jespersen <hans@confluent.io> wrote:

> for #2 definitely use a compacted topic. Compaction will remove old
> messages and keep the last update for each key. To use this function you
> will need to publish messages as Key/Value pairs. Apache Kafka 0.10.1 has
> some important fixes to make compacted topics more reliable when scaling to
> large numbers of keys so make sure to use the latest release if this
> becomes a large amount of data.
>
> #3 sounds like a Kafka Sink Connector for Solr (something like this
> https://github.com/jcustenborder/kafka-connect-solr)
>
> #4 messages in compacted topics do not expire and are only removed when
> updated by a newer message of the same key.
>
> -hans
>
> /**
>  * Hans Jespersen, Principal Systems Engineer, Confluent Inc.
>  * hans@confluent.io (650)924-2670
>  */
>
> On Thu, Dec 15, 2016 at 10:16 AM, Kenny Gorman <kenny@eventador.io> wrote:
>
> > A couple thoughts..
> >
> > - If you plan on fetching old messages in a non-contiguous manner then
> > this may not be the best design. For instance, “give me messages from
> > Mondays for the last 3 quarters” is better served with a database. But if
> > you want to say “give me messages from the last month until now” that
> > works great.
> >
> > - I am not sure what you mean by updating messages. You would need to
> > have some sort of key and push in new messages with that key. Then when
> > you read by key, the application should understand that the latest is the
> > version it should use.
> >
> > - Alternatively, you can consume to something like a DB and select what
> > you want using regular SQL. We see this pattern a lot.
> >
> > - For storing messages indefinitely it’s mostly making sure the config
> > options are set appropriately and you have enough storage space. Set
> > replication to something that makes you comfortable, maybe take backups
> > as was mentioned.
> >
> > Hope this helps some
> >
> > Kenny Gorman
> > Founder
> > www.eventador.io
> >
> >
> > > On Dec 15, 2016, at 12:00 PM, Susheel Kumar <susheel2777@gmail.com> wrote:
> > >
> > > Hello Folks,
> > >
> > > I am going thru an existing design where Kafka is planned to be
> > > utilised in the below manner
> > >
> > >
> > >   1. Messages will be pushed to Kafka by producers
> > >   2. There will be updates to existing messages on an ongoing basis.  The
> > >   expectation is that all the updates are consolidated in Kafka and the
> > >   latest and greatest version/copy is kept
> > >   3. Consumers will read the messages from Kafka and push to Solr for
> > >   ingestion purposes
> > >   4. There will be no purging/removal of messages since it is expected to
> > >   replay the messages in the future and perform full re-ingestion.  So
> > >   messages will be kept in Kafka for an indefinite period, similar to a
> > >   database where data once stored remains there and can be used later in
> > >   the future.
> > >
> > >
> > > Do you see any pitfalls / any issue with this design, especially wrt
> > > storing the messages indefinitely?
> > >
> > >
> > > Thanks,
> > > Susheel
> >
> >
>
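
For anyone reading this thread later, here is a minimal sketch of the compaction behaviour Hans describes for #2 and #4: old messages for a key go away, the last update for each key stays. It is plain Python and does not talk to Kafka at all; the function name `compact` and the sample keys and values are made up for illustration.

```python
def compact(log):
    """Simulate what survives compaction of a keyed log.

    `log` is a list of (key, value) pairs in offset order.
    Returns (offset, key, value) tuples for the latest record of each key,
    sorted by offset, which is the order a consumer would read them in.
    """
    latest = {}
    for offset, (key, value) in enumerate(log):
        # A later record with the same key replaces the earlier one.
        latest[key] = (offset, value)
    return sorted((offset, key, value) for key, (offset, value) in latest.items())


# Hypothetical sample data: doc1 is published, then updated.
log = [
    ("doc1", "v1"),
    ("doc2", "v1"),
    ("doc1", "v2"),
]
compacted = compact(log)
for offset, key, value in compacted:
    print(offset, key, value)
```

In a real cluster the log cleaner runs in the background, so a consumer replaying the topic can still see older values for a key before the newest one. That is why, as Kenny says, the application should treat the latest message it reads for a key as the current version.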

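
Kenny's alternative (consume into a DB and query with SQL) can be sketched the same way. This is only an illustration under assumptions: the records are a hard-coded list standing in for whatever a consumer would read, SQLite stands in for "something like a DB", and the table name `docs`, the column names, and the helper functions are hypothetical.

```python
import sqlite3


def materialize(conn, records):
    """Upsert (key, value, updated_at) records so each key keeps its latest row."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS docs ("
        "key TEXT PRIMARY KEY, value TEXT, updated_at TEXT)"
    )
    # INSERT OR REPLACE overwrites the existing row when the key already exists.
    conn.executemany(
        "INSERT OR REPLACE INTO docs (key, value, updated_at) VALUES (?, ?, ?)",
        records,
    )
    conn.commit()


def updated_on_mondays(conn):
    """A non-contiguous query: rows whose last update fell on a Monday."""
    # SQLite's strftime('%w', ...) returns the weekday as '0' (Sunday) to '6'.
    return conn.execute(
        "SELECT key, value FROM docs "
        "WHERE strftime('%w', updated_at) = '1' ORDER BY key"
    ).fetchall()


conn = sqlite3.connect(":memory:")
# Hypothetical records in the order a consumer would see them.
records = [
    ("doc1", "v1", "2016-12-05"),
    ("doc2", "v1", "2016-12-13"),
    ("doc1", "v2", "2016-12-12"),
]
materialize(conn, records)
monday_rows = updated_on_mondays(conn)
print(monday_rows)
```

The upsert plays the same role as compaction (one row per key, latest wins), while the "Mondays" style of query that Kenny calls out as a poor fit for Kafka becomes an ordinary WHERE clause.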