hadoop-yarn-dev mailing list archives

From Allen Wittenauer ...@effectivemachines.com>
Subject Re: Apache Hadoop qbt Report: branch2+JDK7 on Linux/x86
Date Tue, 24 Oct 2017 15:27:59 GMT

> On Oct 23, 2017, at 12:50 PM, Allen Wittenauer <aw@effectivemachines.com> wrote:
> 
> 
> 
> With no other information or access to go on, my current hunch is that one of the HDFS
> unit tests is ballooning in memory size.  The easiest way to kill a Linux machine is to eat
> all of the RAM, thanks to overcommit and that’s what this “feels” like.
> 
> Someone should verify if 2.8.2 has the same issues before a release goes out …


	FWIW, I ran 2.8.2 last night and it has the same problems.

	Also: the node didn’t die!  Looking through the workspace (which the next run will wipe,
so these links won’t last), two sets of logs stand out:

https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/ws/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt

							and

https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/ws/sourcedir/hadoop-hdfs-project/hadoop-hdfs/

	It looks like my hunch is correct:  RAM usage in the HDFS unit tests is going through the roof.
 It’s also interesting how MANY log files there are.  Is surefire not picking up that jobs
are dying?  Maybe not, if memory is getting tight.

	Anyway, at this point, branch-2.8 and higher are probably fubar’d. Additionally, I’ve
filed YETUS-561 so that Yetus-controlled Docker containers can have their RAM limits set in
order to prevent more nodes from going catatonic.
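
	For illustration only: this message does not say how YETUS-561 will be implemented, so the
following is just a rough sketch of what capping a container's RAM could look like using Docker's
standard --memory and --memory-swap flags. The helper name docker_run_args, the image name, and
the 4g limit are all hypothetical.

```python
# Hypothetical helper: builds a `docker run` command line with a hard memory cap.
# The "4g" default and the image name below are illustrative assumptions,
# not values taken from YETUS-561 or from the Hadoop build.
def docker_run_args(image, command, mem_limit="4g"):
    return [
        "docker", "run", "--rm",
        "--memory", mem_limit,       # hard cap on the container's RAM
        "--memory-swap", mem_limit,  # same value as --memory: no additional swap
        image,
    ] + list(command)


args = docker_run_args("hadoop-build-env", ["mvn", "test"])
print(" ".join(args))
```

	With a cap like this, a runaway unit test would be expected to hit the container's own limit
rather than exhaust the host's RAM.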




