samza-dev mailing list archives

From Zach Cox <zcox...@gmail.com>
Subject Re: Question on nullEnvelop
Date Fri, 06 Feb 2015 21:11:05 GMT
I just added a comment to https://issues.apache.org/jira/browse/SAMZA-506
with details on our current approach to clean shutdown in Samza 0.8.0.
Hopefully it's useful to others.
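
For readers following along without the JIRA comment, the shutdown-hook idea
Chris describes in the quoted message below can be sketched roughly as
follows. This is an illustration only, not the code attached to SAMZA-506.
The Coordinator interface is a hypothetical stand-in for the coordinator that
Samza passes to window(), and the class name, the latch, and the 5 second
wait are all assumptions. The hook blocks until the task reports that it has
closed, since the JVM exits as soon as its shutdown hooks return.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

class ShutdownFlagSketch {
    // Hypothetical stand-in for the coordinator Samza hands to window().
    interface Coordinator {
        void shutdown();
    }

    // Static flag set by the shutdown hook when the JVM receives SIGTERM.
    static final AtomicBoolean SHUTDOWN_REQUESTED = new AtomicBoolean(false);
    // Released once the task has finished closing.
    static final CountDownLatch CLOSED = new CountDownLatch(1);
    // Assumed upper bound on how long the hook waits; it must stay below
    // YARN's yarn.nodemanager.sleep-delay-before-sigkill.ms setting.
    static final long MAX_WAIT_MS = 5000;

    // Registers the hook: set the flag, then hold the JVM open while the
    // processing loop notices the flag and shuts down.
    static void installHook() {
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            SHUTDOWN_REQUESTED.set(true);
            try {
                CLOSED.await(MAX_WAIT_MS, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }));
    }

    // Body of a periodic window() call: ask the coordinator to shut down
    // if the flag is set. Returns true when a shutdown was requested.
    static boolean window(Coordinator coordinator) {
        if (SHUTDOWN_REQUESTED.get()) {
            coordinator.shutdown();
            return true;
        }
        return false;
    }

    // Called from the task's cleanup path so the hook can return.
    static void closed() {
        CLOSED.countDown();
    }

    public static void main(String[] args) {
        // Simulate the signal by setting the flag directly.
        SHUTDOWN_REQUESTED.set(true);
        boolean requested = window(() -> System.out.println("coordinator.shutdown called"));
        closed();
        System.out.println("shutdown requested: " + requested);
    }
}
```

In a real job, installHook() would run once at task initialization, and the
window interval would need to be short relative to the SIGKILL delay.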


On Fri, Feb 6, 2015 at 2:39 PM, Chris Riccomini <criccomini@apache.org>
wrote:

> Hey Jae,
>
> > If so, what's the best way to shut down the container without using
> > a command topic?
>
> YARN does send a SIGTERM before SIGKILL. The config in YARN to set the
> latency is here:
>
>   yarn.nodemanager.sleep-delay-before-sigkill.ms
>
> The default is 250ms. Samza does *not* currently handle the SIGTERM
> gracefully (it doesn't shut itself down). The ticket to do this is here:
>
>   https://issues.apache.org/jira/browse/SAMZA-506
>
> If you'd like to work on that patch, that should make it work. If not, yes,
> you'll have to use some form of a shutdown command. Zach (the guy who
> opened the JIRA) was able to hack around this himself by adding a shutdown
> hook. You could do something similar, if you want: add a shutdown hook that
> sets a variable, have window() check the variable every N ms, and call
> coordinator.shutdown if it's set to true. You'd probably also have to raise
> the delay to more than 250ms in YARN.
>
> Options:
>
> 1. Use a topic like samza_command.
> 2. Fix SAMZA-506.
> 3. Write a custom shutdown hook with a static variable.
>
> > Does it hurt overall processing performance? I don't think so, but I
> > want to confirm.
>
> Nope, shouldn't. It only sleeps during "idle" time (no messages available).
> When there are messages available, you shouldn't get null_envelopes (unless
> you have a custom MessageChooser that withholds available messages, which I
> doubt you do).
>
> Cheers,
> Chris
>
> On Fri, Feb 6, 2015 at 12:30 PM, Bae, Jae Hyeon <metacret@gmail.com>
> wrote:
>
> > I am consuming two topics, samza_input and samza_command.
> > samza_command carries control commands such as "shutdown,all"
> > because kill-yarn-job.sh does not gracefully shut down SamzaContainer.
> > Am I correct? If so, what's the best way to shut down the container
> > without using a command topic?
> >
> > 10ms explains why 50 null envelopes were consumed per second. Does it hurt
> > overall processing performance? I don't think so, but I want to confirm.
> >
> > Thank you
> > Best, Jae
> >
> > On Fri, Feb 6, 2015 at 12:16 PM, Chris Riccomini <criccomini@apache.org>
> > wrote:
> >
> > > Hey Jae,
> > >
> > > SamzaContainer polls for new messages by calling
> > > consumerMultiplexer.choose. In a case where there are no messages
> > > available, choose will return null. The next time choose is called, it
> > > will be invoked with a timeout (the default is 10ms). This time, the
> > > poll call will block until 1) the timeout is hit or 2) there is a new
> > > message available to process. This is to prevent a tight loop.
> > >
> > > > its frequency is too high, in my testing environment, it's more than
> > > > 50 per second.
> > >
> > > Why do you think this is too high? It either has to do this, or sleep
> > > for longer. The longer the container sleeps, the more latency is
> > > introduced when there *is* a message available. 10ms is what we use by
> > > default.
> > >
> > > Cheers,
> > > Chris
> > >
> > > On Fri, Feb 6, 2015 at 11:11 AM, Bae, Jae Hyeon <metacret@gmail.com>
> > > wrote:
> > >
> > > > Could you explain why consumerMultiplexer.choose returns null?
> > > >
> > > > Can it happen when there's no message in the kafka topic?
> > > >
> > > > If my theory is correct, its frequency is too high, in my testing
> > > > environment, it's more than 50 per second.
> > > >
> > > > Thank you
> > > > Best, Jae
> > > >
> > >
> >
>
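
The polling behavior Chris describes at the bottom of the thread (choose
returns null when nothing is available, and the next call blocks with a 10ms
timeout) can be sketched as follows. This is a simplified stand-in built on
java.util.concurrent.BlockingQueue, not Samza's actual implementation; the
class name, the publish method, and the use of String messages are all
assumptions made for illustration.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

class PollTimeoutSketch {
    // Idle timeout taken from the thread (10ms by default).
    static final long IDLE_TIMEOUT_MS = 10;

    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    // Remembers whether the previous choose() call found nothing.
    private boolean lastWasNull = false;

    // Hypothetical producer side: makes a message available.
    void publish(String message) {
        queue.add(message);
    }

    // Returns the next message, or null if none is available. After a null
    // result, the next call waits up to IDLE_TIMEOUT_MS instead of returning
    // immediately, which keeps an idle loop from spinning.
    String choose() {
        try {
            String message = lastWasNull
                ? queue.poll(IDLE_TIMEOUT_MS, TimeUnit.MILLISECONDS)
                : queue.poll();
            lastWasNull = (message == null);
            return message;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
    }

    public static void main(String[] args) {
        PollTimeoutSketch consumer = new PollTimeoutSketch();
        int nulls = 0;
        long end = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(200);
        // Idle loop: every call after the first blocks for about 10ms.
        while (System.nanoTime() < end) {
            if (consumer.choose() == null) {
                nulls++;
            }
        }
        System.out.println("null results in 200ms of idle time: " + nulls);
    }
}
```

With a 10ms timeout, an idle loop like this cannot return null much more
often than about 100 times per second, which is in line with the "more than
50 per second" that Jae observed. Because the wait ends as soon as a message
arrives, the timeout only costs time when there is nothing to process.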
