spark-user mailing list archives

From Arijit Tarafdar
Subject Questions on Python support with Spark
Date Fri, 09 Nov 2018 22:04:08 GMT
Hello All,

We have a requirement to run PySpark in standalone cluster mode and also to reference Python
libraries (egg/wheel) that are not local but are placed in distributed storage such as HDFS. From
the code, it looks like neither case is supported.

Questions are:

  1.  Why is PySpark supported only in standalone client mode?
  2.  Why does --py-files support only local files and not files stored in remote stores?

We would like to update the Spark code to support these scenarios, but first want to be aware
of any technical difficulties the community has faced while trying to support them.

Thanks, Arijit
