spark-user mailing list archives

From Steve Loughran <ste...@hortonworks.com>
Subject Re: Spark Job always cause a node to reboot
Date Fri, 05 Jun 2015 10:51:40 GMT

> On 4 Jun 2015, at 15:59, Chao Chen <kandy.cs@gmail.com> wrote:
> 
> But when I try to run the PageRank job from HiBench, it always causes a node to reboot
> in the middle of the run, for all of the Scala, Java, and Python versions. It works fine
> with the MapReduce version from the same benchmark.

Do you mean a real server reboot, without warning?

That's a serious problem. If it were just one server, I'd look at hardware problems: memory
especially, mixed CPUs in a dual-socket server, or even a potential HDD issue.
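
A quick first check on the suspect node (a sketch: smartmontools may not be
installed, and /dev/sda is only an example device):

  # Kernel log: machine-check exceptions point at CPU or memory faults
  dmesg | grep -i -e "machine check" -e mce

  # SMART health summary for a suspect disk
  smartctl -H /dev/sda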

If it's all servers, then it's an OS or filesystem problem.

As well as lowering vm.swappiness, turn off transparent huge pages in the kernel:
http://docs.hortonworks.com/HDPDocuments/Ambari-1.6.1.0/bk_using_Ambari_book/content/ambari-chap1-5-8.html
http://www.cloudera.com/content/cloudera/en/documentation/cdh4/v4-2-0/CDH4-Release-Notes/cdh4ki_topic_1_3.html
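
For reference, a minimal sketch of both settings (paths and values vary by
distro and kernel version, so treat these as examples, not exact commands):

  # Reduce the kernel's eagerness to swap; persist it in /etc/sysctl.conf
  sysctl -w vm.swappiness=1
  echo "vm.swappiness=1" >> /etc/sysctl.conf

  # Disable transparent huge pages (older RHEL kernels use
  # /sys/kernel/mm/redhat_transparent_hugepage/enabled instead)
  echo never > /sys/kernel/mm/transparent_hugepage/enabled
  echo never > /sys/kernel/mm/transparent_hugepage/defrag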

See also some Hadoop/HDFS notes on filesystem setup, about five years old now:
http://wiki.apache.org/hadoop/DiskSetup

Everyone still generally recommends ext3, and maybe ext4, mounted with noatime.
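
As an illustration, a hypothetical /etc/fstab entry for a data disk (device
and mount point are examples only):

  # mount the data disk without access-time updates
  /dev/sdb1  /data/1  ext4  defaults,noatime  0  0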


