spark-user mailing list archives

From Vivek Mishra <vmis...@impetus.com>
Subject Running 100 GB at standalone node
Date Tue, 18 Apr 2017 09:10:28 GMT
Hi,
I am running an application on Spark v1.6.2 (in standalone mode) over roughly 100 GB of data.
Given below are my configurations:

Job configuration
spark.driver.memory=5g
spark.executor.memory=5g
spark.cores.max=4

spark-env.sh
export SPARK_WORKER_INSTANCES=3;
export SPARK_WORKER_MEMORY=5g;


There is one DataFrame that periodically unions and aggregates multiple CSV files as they
are streamed in. All goes well until the end, when I need to persist this DataFrame (using
Spark JDBC): the executor appears to hang indefinitely. I even tried dataframe.show(),
but no luck.
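For reference, the pipeline described above (periodic union and aggregation over incoming CSV files, followed by a JDBC write) might look roughly like the sketch below. This is only an illustration of the setup, not the original code: the file paths, column name, table name, and JDBC URL are all placeholders, and CSV reading assumes the spark-csv package that Spark 1.6.x typically used.

```scala
import org.apache.spark.sql.{DataFrame, SQLContext}
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical configuration mirroring the job settings above.
val conf = new SparkConf()
  .setAppName("csv-union-aggregate")
  .set("spark.executor.memory", "5g")
  .set("spark.cores.max", "4")
val sc = new SparkContext(conf)
val sqlContext = new SQLContext(sc) // Spark 1.6.x DataFrame entry point

// Read each incoming CSV batch (via the external spark-csv package)
// and fold it into a running union.
var combined: DataFrame = null
for (path <- Seq("/data/batch1.csv", "/data/batch2.csv")) { // placeholder paths
  val batch = sqlContext.read
    .format("com.databricks.spark.csv")
    .option("header", "true")
    .load(path)
  combined = if (combined == null) batch else combined.unionAll(batch)
}

// Placeholder aggregation over a hypothetical "key" column.
val aggregated = combined.groupBy("key").count()

// Persist the result over JDBC, as in the description above.
val props = new java.util.Properties()
aggregated.write.jdbc("jdbc:postgresql://host/db", "results_table", props)
```

Note that repeated unionAll calls like this keep growing the DataFrame's lineage, which is one place a final action (jdbc write or show) can become very expensive.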

I have read about this and tried multiple things, but nothing has worked so far.


Any suggestion would really help!

Sincerely,
-Vivek

