spark-user mailing list archives

From: Ram Venkatesh <venkar.m...@gmail.com>
Subject: Re: SparkR in yarn-client mode needs sparkr.zip
Date: Sun, 25 Oct 2015 22:10:37 GMT
Ted Yu,

Agree that either picking up sparkr.zip if it already exists, or creating the
zip in a local scratch directory, would work. This code is called by the
client-side job submission logic, and the resulting zip is already added to
the local resources for the YARN job, so I don't think the directory needs to
be accessible by the 'yarn' user or from the cluster. Filed
https://issues.apache.org/jira/browse/SPARK-11304 for this issue.
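
To illustrate the fix proposed in SPARK-11304, here is a shell sketch of
building sparkr.zip in a per-user scratch directory instead of inside the
(often read-only) R/lib install directory. The directory names and stub
package contents below are placeholders for illustration, not actual Spark
paths:

```shell
# Stand-in for $SPARK_HOME/R/lib (illustrative only).
R_LIB_DIR="$(mktemp -d)"
mkdir -p "$R_LIB_DIR/SparkR"
echo 'stub package contents' > "$R_LIB_DIR/SparkR/DESCRIPTION"

# Per-user writable scratch location; no write access to R/lib is needed.
SCRATCH_DIR="$(mktemp -d)"
ZIP_FILE="$SCRATCH_DIR/sparkr.zip"

# Build the archive in scratch; from there it can be added to the YARN
# job's local resources just like the current in-place sparkr.zip.
python3 -m zipfile -c "$ZIP_FILE" "$R_LIB_DIR/SparkR"
ls -l "$ZIP_FILE"
```

Since the archive is shipped to the cluster as a YARN local resource anyway,
where it is created on the client machine should not matter.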

As a temporary workaround, I created a file called sparkr.zip in R/lib and
made it world-writable.
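
For reference, the workaround amounts to something like the following, run
once by a user with write access to R/lib (the path here is a placeholder
standing in for the real install directory):

```shell
# Stand-in for /usr/hdp/<version>/spark/R/lib (illustrative only).
SPARK_R_LIB="$(mktemp -d)"

# Pre-create sparkr.zip and make it world-writable, so any user's
# spark-submit can overwrite it when SparkR is launched.
touch "$SPARK_R_LIB/sparkr.zip"
chmod 666 "$SPARK_R_LIB/sparkr.zip"
ls -l "$SPARK_R_LIB/sparkr.zip"
```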

Thanks
Ram

On Sun, Oct 25, 2015 at 9:56 AM, Ted Yu <yuzhihong@gmail.com> wrote:

> In zipRLibraries():
>
>     // create a zip file from scratch, do not append to existing file.
>     val zipFile = new File(dir, name)
>
> I guess instead of creating sparkr.zip inside the R lib directory,
> the zip file could be created under some directory writable by the user
> launching the app and accessible by user 'yarn'.
>
> Cheers
>
> On Sun, Oct 25, 2015 at 8:29 AM, Ram Venkatesh <venkar.mail@gmail.com>
> wrote:
>
>> <newbie sparkr question, apologies if already answered>
>>
>> If you run sparkR in yarn-client mode, it fails with
>>
>> Exception in thread "main" java.io.FileNotFoundException:
>> /usr/hdp/2.3.2.1-12/spark/R/lib/sparkr.zip (Permission denied)
>>         at java.io.FileOutputStream.open0(Native Method)
>>         at java.io.FileOutputStream.open(FileOutputStream.java:270)
>>         at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
>>         at org.apache.spark.deploy.RPackageUtils$.zipRLibraries(RPackageUtils.scala:215)
>>         at org.apache.spark.deploy.SparkSubmit$.prepareSubmitEnvironment(SparkSubmit.scala:371)
>>         at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:153)
>>         at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
>>         at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>>
>> The behavior is the same with the pre-built spark-1.5.1-bin-hadoop2.6
>> distribution.
>>
>> Interestingly, if I run as a user with write permission to the R/lib
>> directory, it succeeds. However, sparkr.zip is recreated each time sparkR
>> is launched, so even if the file is already present it has to be writable
>> by the submitting user.
>>
>> A couple of questions:
>> 1. Can sparkr.zip be packaged once and placed in that location for
>> multiple users?
>> 2. If not, is this location configurable, so that each user can specify a
>> directory they can write to?
>>
>> Thanks!
>> Ram
>>
>>
>>
>> --
>> View this message in context:
>> http://apache-spark-user-list.1001560.n3.nabble.com/SparkR-in-yarn-client-mode-needs-sparkr-zip-tp25194.html
>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
>> For additional commands, e-mail: user-help@spark.apache.org
>>
>>
>
