kafka-users mailing list archives

From Damian Guy <damian....@gmail.com>
Subject Re: Initializing StateStores takes *really* long for large datasets
Date Fri, 25 Nov 2016 11:51:37 GMT
Hi Frank,

Is this on a restart of the application?

Thanks,
Damian

On Fri, 25 Nov 2016 at 11:09 Frank Lyaruu <flyaruu@gmail.com> wrote:

> Hi y'all,
>
> I have a reasonably simple Kafka Streams application, which merges about 20
> topics a few times.
> The thing is, some of those topic datasets are pretty big, about 10M
> messages. In total I've got
> about 200 GB worth of state in RocksDB; the largest topic is 38 GB.
>
> I had set MAX_POLL_INTERVAL_MS_CONFIG to one hour to cover the
> initialization time,
> but that does not seem nearly enough: I'm looking at startup times of more
> than two hours, and
> that starts to be a bit ridiculous.
>
> Any tips / experiences on how to deal with this case? Move away from RocksDB
> and use an external
> data store? Any tips on how to tune RocksDB to be a bit more useful
> here?
>
> regards, Frank
>
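
For readers following the thread: a minimal sketch of the timeout setting Frank describes, assuming the literal key "max.poll.interval.ms" (the string behind ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG). The class and method names are hypothetical, and plain java.util.Properties is used so the snippet runs without the Kafka client on the classpath. It only covers the timeout side of the question, not the restore time itself.

```java
import java.time.Duration;
import java.util.Properties;

// Hypothetical helper, not part of Kafka: builds a Properties object that
// carries the consumer's max poll interval.
final class PollIntervalSketch {
    // Literal key behind ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG.
    static final String MAX_POLL_INTERVAL_MS = "max.poll.interval.ms";

    // Stores the interval in milliseconds, as the config expects.
    static Properties withMaxPollInterval(Duration interval) {
        Properties props = new Properties();
        props.setProperty(MAX_POLL_INTERVAL_MS, Long.toString(interval.toMillis()));
        return props;
    }

    public static void main(String[] args) {
        // One hour, the value mentioned in the original message.
        Properties oneHour = withMaxPollInterval(Duration.ofHours(1));
        System.out.println(oneHour.getProperty(MAX_POLL_INTERVAL_MS)); // prints 3600000
    }
}
```

In a real application these properties would be merged with the rest of the streams configuration before the application is started; that wiring is left out here.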
