Hi community,

I am using Spark on YARN. When I submit a job, it runs for a long time, then I get an error message and the job retries.

It happens when I try to save the DataFrame as a table:

spark_df.write.option("path", "/nlb_datalake/golden_zone/webhose/sentiment").saveAsTable("news_summary_test", mode="overwrite") 

The error (after a long time) is:

Hive Session ID = be590d1b-ed5b-404b-bcb4-77cbb977a847
[Stage 2:> (0 + 16) / 16]
19/08/15 15:42:08 WARN BlockManagerMasterEndpoint: No more replicas available for rdd_9_2 !
19/08/15 15:42:08 WARN BlockManagerMasterEndpoint: No more replicas available for rdd_9_1 !
19/08/15 15:42:08 WARN BlockManagerMasterEndpoint: No more replicas available for rdd_9_4 !
19/08/15 15:42:08 WARN BlockManagerMasterEndpoint: No more replicas available for rdd_9_6 !
19/08/15 15:42:08 WARN BlockManagerMasterEndpoint: No more replicas available for rdd_9_7 !
19/08/15 15:42:08 WARN BlockManagerMasterEndpoint: No more replicas available for rdd_9_0 !
19/08/15 15:42:08 WARN BlockManagerMasterEndpoint: No more replicas available for rdd_9_5 !
19/08/15 15:42:08 WARN BlockManagerMasterEndpoint: No more replicas available for rdd_9_3 !
19/08/15 15:42:08 WARN YarnSchedulerBackend$YarnSchedulerEndpoint: Requesting driver to remove executor 2 for reason Container killed by YARN for exceeding memory limits. 9.1 GB of 9 GB physical memory used. Consider boosting spark.yarn.executor.memoryOverhead.
19/08/15 15:42:08 ERROR YarnScheduler: Lost executor 2 on nlb-srv-hd-08.i-lab.local: Container killed by YARN for exceeding memory limits. 9.1 GB of 9 GB physical memory used. Consider boosting spark.yarn.executor.memoryOverhead.
19/08/15 15:42:08 WARN TaskSetManager: Lost task 0.0 in stage 2.0 (TID 17, nlb-srv-hd-08.i-lab.local, executor 2): ExecutorLostFailure (executor 2 exited caused by one of the running tasks) Reason: Container killed by YARN for exceeding memory limits. 9.1 GB of 9 GB physical memory used. Consider boosting spark.yarn.executor.memoryOverhead.
19/08/15 15:42:08 WARN TaskSetManager: Lost task 5.0 in stage 2.0 (TID 26, nlb-srv-hd-08.i-lab.local, executor 2): ExecutorLostFailure (executor 2 exite
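
To make sense of the "9 GB" limit in the log, here is a rough sketch of how the container size is probably derived: executor memory plus memory overhead, rounded up to YARN's allocation step. The 8 GB executor memory and the 1024 MB allocation step are assumptions, not values read from the cluster config, and yarn_container_mb is only a hypothetical helper name. The 10% factor and 384 MB minimum are Spark's documented defaults for the overhead.

```python
import math


def yarn_container_mb(executor_memory_mb, overhead_mb=None,
                      overhead_factor=0.10, min_overhead_mb=384,
                      yarn_increment_mb=1024):
    """Estimate the YARN container size for one Spark executor, in MB.

    Hypothetical helper: the rounding step (yarn_increment_mb) is assumed
    to be 1024 MB; the real value depends on the cluster's scheduler config.
    """
    if overhead_mb is None:
        # Default overhead: 10% of executor memory, but at least 384 MB.
        overhead_mb = max(int(executor_memory_mb * overhead_factor),
                          min_overhead_mb)
    requested = executor_memory_mb + overhead_mb
    # Round the request up to the next multiple of the allocation step.
    return math.ceil(requested / yarn_increment_mb) * yarn_increment_mb


# Assumed 8 GB executor memory with default overhead: 8192 + 819 MB,
# rounded up to 9216 MB, i.e. the 9 GB limit seen in the log.
print(yarn_container_mb(8192))

# Same executor memory with an explicit 2 GB overhead: 10240 MB.
print(yarn_container_mb(8192, overhead_mb=2048))
```

If that reading is right, the executor only has about 0.8 GB of off-heap headroom before YARN kills it, so raising the overhead (as the log suggests) would lift the limit without changing the heap size.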

Do you have a rough idea which settings I should tweak?

Br,

Dennis