spark-user mailing list archives

From Felix Cheung <felixcheun...@hotmail.com>
Subject Re: what does dapply actually do?
Date Thu, 19 Jan 2017 04:23:29 GMT
With Spark, processing is performed lazily. This means nothing much really happens
until you call an "action" - collect() is one example. Another way to trigger the work is to
write the output in a distributed manner - see write.df() in R.

With SparkR dapply(), passing the data from Spark to R so that your UDF can process it could
have significant overhead. Could you provide more information on your case?
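
To make the two points above concrete, here is a minimal sketch, assuming Spark 2.x with SparkR
installed and a local session. The faithful data set, the waiting_secs column and the output
path are assumptions chosen for illustration, not details from the original question. It shows
dapply() returning right away because only the plan is recorded, and write.df() acting as the
step that actually runs the UDF and writes CSV output without collect().

```r
library(SparkR)

# Start a local session (assumption: Spark 2.x with SparkR installed).
sparkR.session(master = "local[2]", appName = "dapply-lazy-sketch")

# Small example data; `faithful` ships with base R (two double columns).
df <- createDataFrame(faithful)

# dapply() needs the schema of the UDF's output up front.
schema <- structType(
  structField("eruptions", "double"),
  structField("waiting", "double"),
  structField("waiting_secs", "double")
)

# Returns almost immediately: only the plan is recorded, the UDF has not run yet.
result <- dapply(df, function(part) {
  # `part` is an ordinary R data.frame holding one partition.
  cbind(part, waiting_secs = part$waiting * 60)
}, schema)

# Writing is what triggers execution: the UDF runs on each partition and the
# output is written as part files under out_dir, so the rows are not pulled
# back to the driver.
out_dir <- file.path(tempdir(), "dapply_out")  # hypothetical output path
write.df(result, path = out_dir, source = "csv", mode = "overwrite", header = "true")

sparkR.session.stop()
```

The output is a directory of part files rather than a single .csv file. If the result is small
enough to fit in driver memory, collect() followed by write.csv() is the simpler route.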


_____________________________
From: Xiao Liu1 <liuxiao@us.ibm.com>
Sent: Wednesday, January 18, 2017 11:30 AM
Subject: what does dapply actually do?
To: <user@spark.apache.org>



Hi,
I'm really new and trying to learn SparkR. I have defined a relatively complicated user-defined
function, and used dapply() to apply the function to a SparkDataFrame. It was very fast. But
I am not sure what has actually been done by dapply(), because when I used collect() to see
the output, which is very simple, it took a long time to get the result. I suppose maybe I
don't need to use collect(), but without using it, how can I output the final results, say,
to a .csv file?
Thank you very much for the help.

Best Regards,
Xiao



From: Ninad Shringarpure <ninad@cloudera.com>
To: user <user@spark.apache.org>
Date: 01/18/2017 02:24 PM
Subject: Creating UUID using SparksSQL

________________________________



Hi Team,

Is there a standard way of generating a unique id for each row in Spark SQL? I am looking
for functionality similar to UUID generation in Hive.

Let me know if you need any additional information.

Thanks,
Ninad




