spark-user mailing list archives

From Adaryl Wakefield <adaryl.wakefi...@hotmail.com>
Subject SparklyR and the Tidyverse
Date Tue, 17 Oct 2017 23:31:36 GMT
I'm curious about the inner technical workings of SparklyR. Let's say you have:
titanic_train = spark_read_csv(sc, name = "titanic_train",
  path = "../Data/titanic_train.csv", header = TRUE, delimiter = ",",
  quote = "\"", escape = "\\", charset = "UTF-8", null_value = NULL,
  repartition = 0, memory = TRUE, overwrite = TRUE)

If I said:

ggplot(titanic_train,aes(Pclass)) + geom_bar(aes(fill=factor(Pclass)),alpha=0.5)

Is ggplot being executed in a distributed fashion or is something else going on here?


Adaryl "Bob" Wakefield, MBA
Principal
Mass Street Analytics, LLC
913.938.6685
www.massstreet.net
www.linkedin.com/in/bobwakefieldmba
Twitter: @BobLovesData (http://twitter.com/BobLovesData)


