spark-user mailing list archives

From MEETHU MATHEW <meethu2...@yahoo.co.in>
Subject Difference between collect() and take(n)
Date Thu, 10 Jul 2014 12:50:22 GMT
Hi all,

I want to know how collect() works and how it differs from take(). I am reading a 330 MB
file that has 43 lakh (4.3 million) rows and 13 columns. Calling take(4300000) and saving the
result to a variable works, but the same does not work with collect(). Is there any difference
in how the two operations work?
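
To make the comparison concrete, here is a toy pure-Python model of the two calls. It is only a sketch: it assumes an RDD can be pictured as a list of partitions, and the functions `collect` and `take` below are hypothetical stand-ins, not Spark's implementation. The idea it illustrates is that collect() brings every partition back to the driver, while take(n) can stop reading partitions once it has n elements. When n equals the total row count, as in take(4300000) above, both end up returning the same rows.

```python
# Toy model only: an "RDD" is a plain list of partitions (lists).
# These functions are illustrative stand-ins, not Spark's real code.

def collect(partitions):
    """Gather every element of every partition into one driver-side list."""
    result = []
    for part in partitions:
        result.extend(part)
    return result

def take(partitions, n):
    """Read partitions in order and stop once n elements are gathered.

    Returns the elements and the number of partitions that were read.
    """
    result = []
    scanned = 0
    for part in partitions:
        if len(result) >= n:
            break
        result.extend(part[: n - len(result)])
        scanned += 1
    return result, scanned

rows = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

all_rows = collect(rows)                  # reads all three partitions
first_four, scanned = take(rows, 4)       # stops after the second partition
everything, scanned_all = take(rows, 9)   # n == total size: reads everything
```

In this model, take(rows, 9) returns the same list as collect(rows), so any difference seen in practice would have to come from how the rows are fetched and held in memory, not from which rows are returned.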


I also want to set the Java heap size for my Spark program. I set it using spark.executor.extraJavaOptions
in spark-defaults.conf. Now I want to set the same for the worker. Can I do that with
SPARK_DAEMON_JAVA_OPTS? Is the following syntax correct?

SPARK_DAEMON_JAVA_OPTS="-XX:+UseCompressedOops -Xmx3g"


Thanks & Regards, 
Meethu M