spark-user mailing list archives

From Arun Mahadevan <ar...@apache.org>
Subject Re: Question about RDD pipe
Date Thu, 17 Jan 2019 23:28:41 GMT
Yes, the script should be present on all the executor nodes.

You can pass your script via spark-submit (e.g. --files script.sh), which
ships it to each executor's working directory, and then you should be able
to refer to it by a relative path (e.g. "./script.sh") in rdd.pipe.
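
As a minimal sketch of the above in PySpark (the file names script.sh and
job.py and the helper pipe_through_script are placeholders of mine, not
anything from your job), assuming the script reads lines on stdin and writes
lines to stdout:

```python
# Submit with the script shipped alongside the job (file names are placeholders):
#   spark-submit --files script.sh job.py


def pipe_through_script(sc, lines, command="./script.sh"):
    """Pipe each element of `lines` through `command` on the executors.

    `sc` is a SparkContext. `command` is resolved on the executor, not on
    the driver, which is why the script has to be shipped with --files.
    """
    # Each element is written to the command's stdin as one line; each line
    # the command prints to stdout becomes one element of the result.
    return sc.parallelize(lines).pipe(command).collect()


def main():
    # Imported here so the helper above can be read and tested without Spark.
    from pyspark import SparkContext

    sc = SparkContext.getOrCreate()
    try:
        for line in pipe_through_script(sc, ["a", "b", "c"]):
            print(line)
    finally:
        sc.stop()

# In job.py, call main() as the entry point.
```

Since your script already lives in HDFS, --files should also accept an
hdfs:// URI in place of a local path. If the executable bit does not survive
in your setup, passing "sh ./script.sh" as the command is a common workaround.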

- Arun

On Thu, 17 Jan 2019 at 14:18, Mkal <diomfeas@hotmail.com> wrote:

> Hi, I'm trying to run an external script on Spark using rdd.pipe(), and
> although it runs successfully on standalone, it throws an error on the
> cluster. The error comes from the executors, and it is: "Cannot run program
> "path/to/program": error=2, No such file or directory".
>
> Does the external script need to be available on all nodes in the cluster
> when using rdd.pipe()?
>
> What if I don't have permission to install anything on the nodes of the
> cluster? Is there any other way to make the script available to the worker
> nodes?
>
> (The external script is loaded in HDFS and is passed to the driver class
> through args)
