spark-user mailing list archives

From Sun Rui <sunrise_...@163.com>
Subject Re: Using R code as part of a Spark Application
Date Thu, 30 Jun 2016 06:45:50 GMT
I would guess that the technology behind Azure R Server is about Revolution Enterprise DistributedR/ScaleR.
I don’t know the details, but note the statement in the “Step 6. Install R packages” section
of the given documentation page:
    However, if you need to install R packages on the worker nodes of the cluster, you must
    use a Script Action.

That implies that R should be installed on each worker node.

> On Jun 30, 2016, at 05:53, John Aherne <john.aherne@justenough.com> wrote:
> 
> I don't think R Server requires R on the executor nodes. I originally set up a SparkR
> cluster for our Data Scientist on Azure, which required that I install R on each node, but
> for the R Server setup there is an extra edge node with R Server that they connect to. From
> what little research I was able to do, it seems that there are some special functions in R
> Server that can distribute the work to the cluster.
> 
> Documentation is light and hard to find, but I found this helpful:
> https://blogs.msdn.microsoft.com/uk_faculty_connection/2016/05/10/r-server-for-hdinsight-running-on-microsoft-azure-cloud-data-science-challenges/
> 
> 
> 
> On Wed, Jun 29, 2016 at 3:29 PM, Sean Owen <sowen@cloudera.com> wrote:
> Oh, interesting: does this really mean the return of distributing R
> code from the driver to executors and running it remotely, or do I
> misunderstand? This would require having R on the executor nodes like
> it used to?
> 
> On Wed, Jun 29, 2016 at 5:53 PM, Xinh Huynh <xinh.huynh@gmail.com> wrote:
> > There is some new SparkR functionality coming in Spark 2.0, such as
> > "dapply". You could use SparkR to load a Parquet file and then run "dapply"
> > to apply a function to each partition of a DataFrame.
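> > With Spark 2.0 that could look like the sketch below (untested; the
> > parquet path and column names are hypothetical placeholders):

```r
library(SparkR)
sparkR.session()   # Spark 2.0+ entry point

# Load a Parquet file as a SparkR DataFrame
df <- read.df("/path/to/data.parquet", source = "parquet")

# dapply() needs the schema of the data.frame your function returns
schema <- structType(structField("value", "double"),
                     structField("doubled", "double"))

# The function receives each partition as an ordinary R data.frame
out <- dapply(df, function(part) {
  part$doubled <- part$value * 2
  part
}, schema)

head(collect(out))
```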
> >
> > Info about loading Parquet file:
> > http://people.apache.org/~pwendell/spark-releases/spark-2.0.0-rc1-docs/sparkr.html#from-data-sources
> >
> > API doc for "dapply":
> > http://people.apache.org/~pwendell/spark-releases/spark-2.0.0-rc1-docs/api/R/index.html
> >
> > Xinh
> >
> > On Wed, Jun 29, 2016 at 6:54 AM, sujeet jog <sujeet.jog@gmail.com> wrote:
> >>
> >> Try Spark's pipe() on RDDs: you can invoke the R script via pipe and push the
> >> data you want processed to the R script's stdin.
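> >> The R script on the receiving end of something like
> >> rdd.pipe("Rscript double.R") could look like this (an untested sketch;
> >> the script name and the doubling logic are just placeholders):

```r
# double.R -- run on each worker via Spark's pipe()
# Spark writes each element of a partition to this script's stdin, one per
# line; every line the script prints becomes an element of the result RDD.
input <- file("stdin", "r")
lines <- readLines(input)
close(input)
cat(as.numeric(lines) * 2, sep = "\n")
```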
> >>
> >>
> >> On Wed, Jun 29, 2016 at 7:10 PM, Gilad Landau <Gilad.Landau@clicktale.com> wrote:
> >>>
> >>> Hello,
> >>>
> >>>
> >>>
> >>> I want to use R code as part of a Spark application (the same way I would
> >>> do with Scala/Python). I want to be able to run R syntax as a map
> >>> function on a big Spark DataFrame loaded from a parquet file.
> >>>
> >>> Is this even possible, or is the only way to use R as part of RStudio
> >>> orchestration of our Spark cluster?
> >>>
> >>>
> >>>
> >>> Thanks for the help!
> >>>
> >>>
> >>>
> >>> Gilad
> >>>
> >>>
> >>
> >>
> >
> 
> ---------------------------------------------------------------------
> To unsubscribe e-mail: user-unsubscribe@spark.apache.org
> 
> 
> 
> 
> -- 
> John Aherne
> Big Data and SQL Developer
> 
> 
> Cell: +1 (303) 809-9718
> Email: john.aherne@justenough.com
> Skype: john.aherne.je
> Web: www.justenough.com
> 
> 

