spark-user mailing list archives

From agg212 <alexander_galaka...@brown.edu>
Subject Problems running TPC-H on Raspberry Pi Cluster
Date Wed, 10 Jul 2019 14:57:53 GMT
We are trying to benchmark TPC-H (scale factor 1) on a 13-node Raspberry Pi
3B+ cluster (1 master, 12 workers). Each node has 1 GB of RAM and a quad-core
processor and runs Ubuntu Server 18.04. The cluster uses the Spark
standalone scheduler, with the *.tbl files from TPC-H's dbgen tool stored in
HDFS.
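
To make the memory constraint on 1 GB nodes concrete, here is a rough sketch of how much of an executor heap Spark's unified memory manager leaves for execution and storage. It assumes the documented defaults (a fixed 300 MiB reservation, spark.memory.fraction = 0.6, spark.memory.storageFraction = 0.5). The heap sizes in the loop are hypothetical illustrations, not our actual settings, and the function names are made up for this example.

```python
# Rough estimate of Spark's unified memory region for a given executor heap.
# Assumption: default unified memory manager settings; real usable heap is
# slightly below -Xmx, so treat these numbers as upper bounds.

RESERVED_MIB = 300  # fixed reservation taken off the heap before the split


def unified_memory_mib(executor_heap_mib, memory_fraction=0.6):
    """Heap shared by execution and storage (spark.memory.fraction)."""
    usable = max(executor_heap_mib - RESERVED_MIB, 0)
    return usable * memory_fraction


def storage_floor_mib(executor_heap_mib, memory_fraction=0.6,
                      storage_fraction=0.5):
    """Part of the unified region where cached blocks are immune to
    eviction by execution (spark.memory.storageFraction)."""
    return unified_memory_mib(executor_heap_mib, memory_fraction) * storage_fraction


if __name__ == "__main__":
    # Hypothetical heap sizes that could fit on a node with 1 GB of RAM.
    for heap in (512, 640, 768):
        print(f"heap={heap} MiB  unified={unified_memory_mib(heap):.1f} MiB  "
              f"storage floor={storage_floor_mib(heap):.1f} MiB")
```

Under these assumptions, even a 768 MiB heap leaves well under 300 MiB for shuffles, joins, and caching combined, and the OS, the HDFS DataNode, and the Spark worker daemon must also fit in the same 1 GB.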

We are experiencing several failures when trying to run queries. Jobs fail
unpredictably, usually with one or more “DEAD/LOST” nodes shown in the
web UI. It appears that one or more nodes “hang” during query execution and
become unreachable or time out.

We have included our configuration parameters and the driver program
below. Any recommendations would be greatly appreciated.

-------------------------------------------

-------------------------------------------



Driver:
-------------------------------------------




--
Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/
