spark-user mailing list archives

Subject: Difference between collect() and take(n)
Date: Thu, 10 Jul 2014 12:50:22 GMT

Hi all,

I would like to know how collect() works and how it differs from take(). I am reading
a 330 MB file with 43 lakh (4.3 million) rows and 13 columns, and calling take(4300000) to save
the result to a variable works. The same does not work with collect(). Is there any difference
in how the two operations work?

Also, I wanted to set the Java heap size for my Spark program. I set it using
spark.executor.extraJavaOptions. Now I want to set the same for the worker. Can I do that
with SPARK_DAEMON_JAVA_OPTS? Is the following syntax correct?

SPARK_DAEMON_JAVA_OPTS="-XX:+UseCompressedOops -Xmx3g"

Thanks & Regards, 
Meethu M