spark-user mailing list archives

From Shivaram Venkataraman <shiva...@eecs.berkeley.edu>
Subject Re: Spark 1.4.0 - Using SparkR on EC2 Instance
Date Fri, 26 Jun 2015 17:13:15 GMT
In the demo I was using RStudio on the master node of the same cluster.
However, I had installed Spark under the user `rstudio` (i.e. in /home/rstudio),
which makes the permissions work correctly. You will still need to copy
the config files from /root/spark/conf after installing Spark, and it
might need some more manual tweaks.
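
A minimal sketch of the copy step, assuming Spark was unpacked to /home/rstudio/spark (that exact path is an assumption, as is the helper name `copy_spark_conf`; only /root/spark/conf and the `rstudio` user come from the setup described here). It just copies the config directory and optionally hands ownership to the given user; it does not attempt any of the further manual tweaks.

```shell
#!/bin/sh
# Hypothetical helper (not part of Spark or spark-ec2): copy an existing
# Spark conf directory into a per-user Spark install.
#   $1 = source conf dir          (e.g. /root/spark/conf)
#   $2 = destination Spark home   (e.g. /home/rstudio/spark, an assumed path)
#   $3 = optional owner           (e.g. rstudio); chown usually requires root
copy_spark_conf() {
    src_conf="$1"
    dest_home="$2"
    owner="${3:-}"

    # Refuse to continue if the source directory does not exist.
    [ -d "$src_conf" ] || { echo "missing config dir: $src_conf" >&2; return 1; }

    # Create <dest>/conf if needed, then copy the directory contents into it.
    mkdir -p "$dest_home/conf" || return 1
    cp -R "$src_conf"/. "$dest_home/conf"/ || return 1

    # Hand the copied files to the target user so RStudio can read them.
    if [ -n "$owner" ]; then
        chown -R "$owner" "$dest_home/conf" || return 1
    fi
}
```

Run as root on the master, this would be invoked as `copy_spark_conf /root/spark/conf /home/rstudio/spark rstudio`. The copied files may still need editing by hand afterwards.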

Thanks
Shivaram

On Fri, Jun 26, 2015 at 9:59 AM, Mark Stephenson <mark@redoakstrategic.com>
wrote:

> Thanks!
>
> In your demo video, were you using RStudio to hit a separate EC2 Spark
> cluster?  I noticed from your browser that you appeared to be using EC2
> at the time, so I was just curious.  That might be one of the
> possible workarounds: fire up a separate EC2 instance with RStudio Server
> that initializes the Spark context against a separate Spark cluster.
>
> On Jun 26, 2015, at 11:46 AM, Shivaram Venkataraman <
> shivaram@eecs.berkeley.edu> wrote:
>
> We don't have a documented way to use RStudio on EC2 right now. We have a
> ticket open at https://issues.apache.org/jira/browse/SPARK-8596 to
> discuss work-arounds and potential solutions for this.
>
> Thanks
> Shivaram
>
> On Fri, Jun 26, 2015 at 6:27 AM, RedOakMark <mark@redoakstrategic.com>
> wrote:
>
>> Good morning,
>>
>> I am having a bit of trouble finalizing the installation and usage of the
>> newest Spark version 1.4.0, deploying to an Amazon EC2 instance and using
>> RStudio to run on top of it.
>>
>> Using these instructions
>> (http://spark.apache.org/docs/latest/ec2-scripts.html) we can fire up an
>> EC2 instance (which we have been successful doing - we have gotten the
>> cluster to launch from the command line without an issue).  Then, I
>> installed RStudio Server on the same EC2 instance (the master) and
>> successfully logged into it (using the test/test user) through the web
>> browser.
>>
>> This is where I get stuck - within RStudio, when I try to reference/find the
>> folder where SparkR was installed, to load the SparkR library and initialize
>> a SparkContext, I get permissions errors on the folders, or the library
>> cannot be found because I cannot find the folder in which the library is
>> sitting.
>>
>> Has anyone successfully launched and utilized SparkR 1.4.0 in this way, with
>> RStudio Server running on top of the master instance?  Are we on the right
>> track, or should we manually launch a cluster and attempt to connect to it
>> from another instance running R?
>>
>> Thank you in advance!
>>
>> Mark
