For your last point, spark-submit has:

if [ -z "${SPARK_HOME}" ]; then
  export SPARK_HOME="$(cd "`dirname "$0"`"/..; pwd)"
fi

This means the script works out the proper SPARK_HOME from its own location whenever the variable is not already set.
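The idiom in that snippet can be tried in isolation. Below is a minimal sketch, not taken from Spark itself: `resolve_home` is a hypothetical helper name, and the temporary `bundle/bin` layout only stands in for a real Spark directory. It resolves the same home directory no matter which working directory the script is called from, then prints (without running) how a wrapper might pass absolute classpath entries through the `--driver-class-path` and `--conf` options of spark-submit.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Spark): given the path of a script that
# lives in <home>/bin, print the absolute path of <home>.
resolve_home() {
  (cd "$(dirname "$1")"/.. >/dev/null && pwd)
}

# Throwaway layout for the demo: <tmp>/bundle/bin/spark-submit
demo_root="$(mktemp -d)"
mkdir -p "$demo_root/bundle/bin"
touch "$demo_root/bundle/bin/spark-submit"

# Resolve the home directory from two different working directories.
from_bin="$(cd "$demo_root/bundle/bin" && resolve_home ./spark-submit)"
from_root="$(cd / && resolve_home "$demo_root/bundle/bin/spark-submit")"
echo "resolved from bin/: $from_bin"
echo "resolved from /:    $from_root"

# Assumption: a wrapper shipped with the bundle could build absolute
# classpath options like this. The command is only printed, not executed.
echo "spark-submit --driver-class-path '$from_bin/lib/*'" \
     "--conf 'spark.executor.extraClassPath=$from_bin/lib/*'"

# Clean up the demo directory.
rm -rf "$demo_root"
```

Both echo lines should show the same directory. Whether the printed options are enough for the executors depends on an assumption this sketch does not check: that the bundle is unpacked at the same path on every worker node.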

FYI

On Wed, Mar 16, 2016 at 4:22 AM, Леонид Поляков <owispyo@gmail.com> wrote:

Hello, guys!

I’ve been developing a kind of framework on top of Spark, and my idea is to bundle the framework jars and some extra configs with Spark and hand the bundle over to other developers. They can then use it to run the usual Spark stuff, but with the extra flavor that the framework adds.

I’m trying to figure out how to properly set up the driver/executor classpath so that the framework classes are always loaded when you use the bundle.

I put the framework libs in the /lib folder for now, but will switch to something more specific later. I’m putting the following spark-defaults.conf into my bundle:

spark.executor.extraClassPath /home/user/Apps/spark-bundled/lib/*
spark.driver.extraClassPath lib/*

This seems to work, but I want to get rid of the absolute path in spark.executor.extraClassPath and use something relative, or refer to the Spark home somehow, since the libs are right there under /lib.

I’ve tried these settings for the executor, and they do not work:

spark.executor.extraClassPath $SPARK_HOME/lib/*
spark.executor.extraClassPath lib/*

I’ve found out that the working directory of an executor launched by a worker is something like $SPARK_HOME/work/app-20160316070310-0002/0, so this works:

spark.executor.extraClassPath ../../../lib/*

But that looks hacky and fragile.

Could you help me with this issue? Maybe there are some placeholders that I can use in the configs?

Let me know if you need any worker/master/driver logs.

P.S. The driver does not work if I am not in $SPARK_HOME when I execute spark-submit, e.g. if I do

cd bin
./spark-submit …

then the driver classpath is resolved relative to /bin, and lib/* or ./lib/* in the classpath no longer works, so I need $SPARK_HOME for the driver as well.

Thanks, Leonid