hbase-user mailing list archives

From Andrew Purtell <apurt...@apache.org>
Subject Re: WALPlayer kills many RS when play large number of WALs
Date Tue, 22 Jul 2014 17:20:39 GMT
> The node has only 15G memory.

EC2 m1.xlarge or m3.xlarge? You might find that some of the newer instance
types with more memory offer better price-performance. If you are on EC2 and
are colocating mapreduce with HBase, you'll want more RAM *and* vCPU, I think.
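As a back-of-the-envelope check using the numbers from this thread (a 15G
node, a 12G RegionServer heap, and 4 mappers + 4 reducers at 2G each):

```shell
# Heap budget for one worker node, using the figures reported in this thread:
# 15G RAM, 12G RegionServer heap, 4 map + 4 reduce slots, each -Xmx2048m.
NODE_RAM_GB=15
RS_HEAP_GB=12
TASK_SLOTS=8          # 4 map + 4 reduce
TASK_HEAP_GB=2
AGGREGATE=$((RS_HEAP_GB + TASK_SLOTS * TASK_HEAP_GB))
echo "aggregate Java heap: ${AGGREGATE}G on a ${NODE_RAM_GB}G node"
```

That is 28G of potential heap on a 15G node, and it does not even count the
DataNode and other daemon heaps, so under load the kernel OOM killer will
fire.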

> But will that cause a Java Heap Space problem for the mapreduce job when
> the WALPlayer reducer is running?

If you add "-XX:+HeapDumpOnOutOfMemoryError" to 'mapred.child.java.opts',
then there might be a retrievable heap dump left around on the worker nodes.
Not sure of the precise location offhand, pardon, it's been a while since
I've debugged mapreduce. You can then use jhat to analyze the heap dump. The
types of the top 10 or 20 most frequently allocated objects would be
interesting.
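For example (a sketch; the dump path is an assumption, and by default HotSpot
writes java_pid<pid>.hprof into the task's working directory):

```shell
# In the job configuration, append the flags to the existing child opts:
#   mapred.child.java.opts = -Xmx2048m -XX:+HeapDumpOnOutOfMemoryError \
#                            -XX:HeapDumpPath=/tmp/mr-heap-dumps
#
# Then, on the worker node that ran the failed task (file name illustrative):
jhat -port 7000 /tmp/mr-heap-dumps/java_pid12345.hprof
# Browse http://<worker>:7000 and check the instance-count histogram for the
# most frequently allocated types.
```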


On Tue, Jul 22, 2014 at 9:58 AM, Tianying Chang <tychang@gmail.com> wrote:

> Andrew
>
> Thanks for your answer! I think you are right. The node has only 15G
> memory. We configured it to run the RS with 12G, and then we configured 4
> mappers and 4 reducers on each node, each using 2G of memory. So that
> probably caused the RS to be killed by OOM.
> mapred.child.java.opts: -Xmx2048m
> I have another question. If I change the mappers/reducers per node to 1
> and lower mapred.child.java.opts to 512M, I think that will prevent the RS
> from being killed due to OOM. But will that cause a Java Heap Space problem
> for the mapreduce job when the WALPlayer reducer is running? I saw some
> posts saying that, to fix the Java Heap Space error for a mapreduce job,
> the recommendation is to increase mapred.child.java.opts.
>
> http://stackoverflow.com/questions/8464048/out-of-memory-error-in-hadoop
>
> Thanks
> Tian-Ying
>
>
> On Tue, Jul 22, 2014 at 9:14 AM, Andrew Purtell <apurtell@apache.org>
> wrote:
>
> > Accidentally hit send too soon.
> >
> > A good rule of thumb: the aggregate of all Java heaps (daemons like the
> > DataNode, RegionServer, NodeManager, etc., plus the max allowed number of
> > concurrent mapreduce tasks * the task heap setting) should fit into
> > available RAM.
> >
> > If you don't have enough available RAM, then you need to take steps to
> > reduce resource consumption. Limit the allowed number of concurrent
> > mapreduce tasks, reduce the heap size specified in
> > 'mapred.child.java.opts', or both.
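The task-slot and heap limits mentioned above map to settings like these in
an MRv1-era mapred-site.xml (property names from the Hadoop 1.x generation
this thread dates from; the values are illustrative, not a recommendation):

```xml
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>1</value> <!-- concurrent map tasks per TaskTracker -->
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>1</value> <!-- concurrent reduce tasks per TaskTracker -->
</property>
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx512m</value> <!-- per-task child JVM heap -->
</property>
```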
> >
> >
> > On Tue, Jul 22, 2014 at 9:12 AM, Andrew Purtell <apurtell@apache.org>
> > wrote:
> >
> > > You need to better manage the colocation of the mapreduce runtime. In
> > > other words, you are allowing mapreduce to grab too many node
> > > resources, resulting in activation of the kernel's OOM killer.
> > >
> > > A good rule of thumb is the aggregate of all Java heaps (daemons like
> > > the DataNode, RegionServer, NodeManager, etc. + the max allowed number
> > > of concurrent mapreduce tasks * task heap setting). Reduce the allowed
> > > mapreduce task concurrency.
> > >
> > >
> > > On Tue, Jul 22, 2014 at 8:15 AM, Tianying Chang <tychang@gmail.com>
> > wrote:
> > >
> > >> Hi
> > >>
> > >> I was running WALPlayer to output HFiles for a future bulkload. There
> > >> are 6200 hlogs, and the total size is about 400G.
> > >>
> > >> The mapreduce job finished, but I saw two bad things:
> > >> 1. More than half of the RSs died. I checked the syslog; it seems they
> > >> were killed by OOM. They also had a very high CPU spike the whole time
> > >> during WALPlayer:
> > >>
> > >> cpu user usage of 84.4% matches resource limit [cpu user usage>70.0%]
> > >>
> > >> 2. The mapreduce job also failed with a Java Heap Space error. My job
> > >> set the heap usage to 2G:
> > >> mapred.child.java.opts: -Xmx2048m
> > >> Does this mean WALPlayer cannot support this load with this kind of
> > >> setting?
> > >>
> > >> Thanks
> > >> Tian-Ying
> > >>
> >
> >
> >
>



-- 
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)
