kafka-users mailing list archives

From R Krishna <krishna...@gmail.com>
Subject Re: Multiple streaming jobs on the same topic
Date Fri, 01 Apr 2016 16:43:15 GMT
Then, can you specify a size/percentage of cache per consumer group?
On Apr 1, 2016 9:09 AM, "Cees de Groot" <cees@pagerduty.com> wrote:

> One of Kafka's design ideas is to keep data in the JVM to a minimum,
> offloading caching to the OS. So at the Kafka level there's not much you
> can do: the old data is buffered by the OS (it has to be, in order to be
> read into userspace), and that reduces the amount of cache available to
> the other job.
>
> Buy more memory ;-)
>
> (Also, I think it's smart to tune _down_ the amount of memory you give to
> the Kafka JVM, to maximize the OS's buffering. You don't want large amounts
> of JVM memory filled with garbage contending with OS buffer cache filled
> with useful data.)
>
> On Fri, Apr 1, 2016 at 3:42 AM, Mayur Mohite <mayur.mohite@applift.com>
> wrote:
>
> > Hi,
> >
> > We have a Kafka cluster running in production and there are two Spark
> > Streaming jobs (J1 and J2) that fetch data from the same topic.
> >
> > We noticed that if one of the two jobs (say J1) starts reading data from
> > an old offset (the job had failed for 2 hours, and when we restarted it
> > after fixing the failure its offset was old), that data is read from disk
> > instead of from the OS cache.
> >
> > When this happens the other job's (J2) throughput is reduced even though
> > that job's offset is recent.
> > We believe that the recent data is most likely in memory, so we are not
> > sure why the other job's (J2) throughput is reduced.
> >
> > Did anyone come across such an issue in production? If yes, how did you
> > fix the issue?
> >
> > -Mayur
> >
>
>
>
> --
> Cees de Groot
> Principal Software Engineer
> PagerDuty, Inc.
>
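
The cache contention described in the quoted reply can be illustrated with a toy model. This is only a sketch under loud assumptions: the OS page cache is modelled as a plain LRU (real kernels use more elaborate policies), pages are just integers, and the names LRUPageCache and tail_hit_ratio are hypothetical, not part of Kafka or any library. It shows how a reader replaying old data (J1) can push the recent pages out of the cache before a reader near the tail (J2) gets to them.

```python
from collections import OrderedDict


class LRUPageCache:
    """Toy stand-in for the OS page cache (assumption: simple LRU eviction)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()

    def read(self, page):
        """Touch a page. Return True on a cache hit, False on a miss."""
        if page in self.pages:
            self.pages.move_to_end(page)  # mark as most recently used
            return True
        self.pages[page] = None  # miss: the page is loaded into the cache
        if len(self.pages) > self.capacity:
            self.pages.popitem(last=False)  # evict the least recently used
        return False


def tail_hit_ratio(steps, capacity, tail_lag, catch_up_pages_per_step):
    """Fraction of J2's reads served from the cache.

    Per step: the producer appends one page (which passes through the
    cache), J1 replays `catch_up_pages_per_step` old pages, and J2 reads
    the page that was appended `tail_lag` steps earlier.
    """
    cache = LRUPageCache(capacity)
    old_page = -1  # hypothetical: negative ids stand for old on-disk data
    hits = reads = 0
    for t in range(steps):
        cache.read(t)  # producer append
        for _ in range(catch_up_pages_per_step):
            cache.read(old_page)  # J1 pulls old data through the cache
            old_page -= 1
        if t >= tail_lag:
            reads += 1
            hits += cache.read(t - tail_lag)  # J2 reads recent data
    return hits / reads


if __name__ == "__main__":
    print(tail_hit_ratio(1000, 100, 10, 0))   # J1 idle
    print(tail_hit_ratio(1000, 100, 10, 20))  # J1 replaying old data
```

With these (arbitrary) numbers, J2 is served entirely from the cache while J1 is idle, and entirely from "disk" once J1 replays 20 old pages per step, because more than 100 pages pass through the cache between a page being written and J2 reading it. A larger cache, which is what "buy more memory" amounts to, moves that threshold.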
