hadoop-yarn-dev mailing list archives

From Wangda Tan <wheele...@gmail.com>
Subject Re: Regarding a scenario where applications are being hung
Date Tue, 02 Sep 2014 01:36:13 GMT
Hi Naga,
AFAIK, there's no such timeout.
Since it's a new feature request, I'd suggest creating a JIRA and
moving the discussion there.

Thanks,
Wangda
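
[Editor's sketch] Since there is no such timeout on the YARN side, the check has to live in the AM for now. Below is a minimal sketch of what that could look like, assuming a hypothetical helper class `ContainerWaitWatchdog` that the AM calls from its allocation loop. None of these names come from the YARN API. A timeout of 0 disables the check, mirroring the default proposed in the quoted message.

```python
import time


class ContainerWaitWatchdog:
    """Hypothetical AM-side helper; not part of any YARN API."""

    def __init__(self, timeout_secs, clock=time.monotonic):
        # timeout_secs == 0 disables the check (the proposed default).
        self.timeout_secs = timeout_secs
        self.clock = clock
        self.last_progress = clock()

    def on_containers_allocated(self, count):
        # Reset the timer only when at least one container actually arrives.
        if count > 0:
            self.last_progress = self.clock()

    def timed_out(self):
        # True once no container has arrived for timeout_secs seconds.
        if self.timeout_secs == 0:
            return False
        return self.clock() - self.last_progress >= self.timeout_secs
```

The AM would call `on_containers_allocated` after each allocate response and fail the attempt with a clear message once `timed_out()` returns true. The `clock` parameter exists only so the logic can be exercised without waiting.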


On Tue, Sep 2, 2014 at 8:52 AM, Naganarasimha G R (Naga) <
garlanaganarasimha@huawei.com> wrote:

> Hi,
>
>     "AM can take action if it doesn't receive any container for some time."
>
> Can we have a generic timeout feature for all AMs on the YARN side, such
> that if no containers are assigned to an application for a defined period,
> YARN can time out the application attempt?
>
> The default can be 0, in which case the RM will not time out the app
> attempt, and the user can set their own timeout when submitting the application.
>
>
>
> Basically, we faced this issue in MR2 itself, and I was not able to find
> any such timeout in the MapReduce configuration.
>
> Regards,
> Naga
> ________________________________________
> From: Wangda Tan [wheeleast@gmail.com]
> Sent: Tuesday, September 02, 2014 07:59
> To: yarn-dev@hadoop.apache.org
> Subject: Re: Regarding a scenario where applications are being hung
>
> Hi Naga,
> When trying to allocate a container, the behavior is as follows.
> First, it checks the capacity of the queue. In your case, the capacity of
> the queue is 8G * 2 = 16G, so a 7G container will pass the check.
> Then it checks whether there's enough space on a node. In your case, it
> cannot pass this check, so the ResourceRequest will be skipped.
>
> There's no "timeout" for a ResourceRequest now. AM can take action if it
> doesn't receive any container for some time.
> Preemption is another story: it is used to reclaim resources for an
> under-satisfied queue from an over-satisfied queue. That's not your case either.
>
> Hope this helps,
> Wangda
>
>
>
> On Mon, Sep 1, 2014 at 11:11 PM, Naganarasimha G R (Naga) <
> garlanaganarasimha@huawei.com> wrote:
>
> > Hi Wangda,
> >
> >               Yes, it's the case where each AM is requesting 7 GB per container.
> > But can you describe why this is expected behavior?
> >
> > From the user's perspective, either the application request should not
> > have been accepted, or after some time both apps should have been killed
> > with a proper exception or log message, or one of the AM containers
> > should have been preempted, etc.
> >
> > Here, none of these is happening!
> >
> >
> > Regards,
> > Naga
> >
> >
> >
> > ________________________________________
> > From: Wangda Tan [wheeleast@gmail.com]
> > Sent: Monday, September 01, 2014 22:51
> > To: yarn-dev@hadoop.apache.org
> > Subject: Re: Regarding a scenario where applications are being hung
> >
> > Hi Naga,
> > According to the scenario you described, if each AM's request is for a
> > single 7 GB container (not 7 containers of 1G each), this is the
> > expected behavior.
> > Please let me know if you have more questions.
> >
> > Thanks,
> > Wangda
> >
> >
> > On Mon, Sep 1, 2014 at 10:46 PM, Naganarasimha G R (Naga) <
> > garlanaganarasimha@huawei.com> wrote:
> >
> > > Hi ,
> > >
> > >     I have a scenario that causes applications to hang, so I
> > > wanted to check whether it's a bug (if so, I will raise a JIRA).
> > >
> > >
> > >
> > > Consider a cluster with 2 NMs, each with 8 GB of resources.
> > >
> > > Two applications are launched in the default queue, and each AM
> > > takes 2 GB.
> > >
> > > The two AMs are placed on different NMs. Now each AM requests a
> > > container with 7 GB of memory.
> > >
> > > As only 6 GB is available on each NM, both applications
> > > hang forever.
> > >
> > >
> > >
> > > Is this a bug ?
> > >
> > >
> > >
> > > Regards,
> > >
> > > Naga
> > >
> > >
> > >
> > >
> >
>
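
[Editor's sketch] The two-step check described in the thread (queue capacity first, then free space on a single node) can be worked through with the numbers from the original scenario. This is a simplified model, not the scheduler's actual code: the function name `check_request` is hypothetical, and counting the memory already used by the AMs against the queue is an assumption.

```python
def check_request(request_gb, queue_capacity_gb, queue_used_gb, node_free_gb):
    """Simplified model of the two checks described in the thread."""
    # Step 1: the request must fit within the queue's capacity.
    if queue_used_gb + request_gb > queue_capacity_gb:
        return "rejected by queue check"
    # Step 2: one single node must have enough free memory for the container.
    if not any(free >= request_gb for free in node_free_gb):
        return "skipped: no node has enough free memory"
    return "allocatable"


# Scenario from the thread: 2 NMs with 8 GB each, one 2 GB AM on each NM.
QUEUE_CAPACITY_GB = 8 * 2               # 16 GB
AM_GB = 2
node_free_gb = [8 - AM_GB, 8 - AM_GB]   # 6 GB free on each NM
queue_used_gb = 2 * AM_GB               # 4 GB taken by the two AMs

# One 7 GB container passes the queue check but fits on neither node.
print(check_request(7, QUEUE_CAPACITY_GB, queue_used_gb, node_free_gb))
# → skipped: no node has enough free memory

# A 1 GB container (the "7 containers of 1G" reading) fits on a node.
print(check_request(1, QUEUE_CAPACITY_GB, queue_used_gb, node_free_gb))
# → allocatable
```

The cluster has 12 GB free in total, but no single node has 7 GB free, so the request is skipped on every scheduling pass and both applications wait indefinitely.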
