systemml-dev mailing list archives

From Matthias Boehm
Subject Re: spark hybrid mode on HDFS
Date Tue, 18 Jul 2017 02:49:25 GMT
Well, at a high level, resource negotiation and distributed storage are
orthogonal concepts. Yarn, Mesos, Standalone, and Kubernetes are resource
schedulers, which you select via the Spark master and a separate deploy
mode (client/cluster). Behind the HDFS API, you can also use various
alternative file system implementations such as HDFS, the local file
system, object stores (e.g., swift/s3), etc. At a bare minimum, you need
to have some hadoop jars in your classpath, which would already allow you
to run local/standalone with the local file system implementation.
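
To make the orthogonality concrete, here is a minimal sketch that builds
spark-submit argument lists in which the master and the storage URI vary
independently. The jar name, script name, host, port, and paths are
placeholders, and the SystemML flags (-f, -exec, -nvargs) are written as
I would expect them on the command line, so please check them against
your build.

```python
# Sketch only: SystemML.jar, script.dml, the host/port, and the paths are
# placeholders, not values taken from the original setup.
def spark_submit_args(master, input_uri, deploy_mode="client"):
    """Build a spark-submit argument list.

    The scheduler (master, deploy mode) and the storage location
    (input_uri) are independent parameters.
    """
    return [
        "spark-submit",
        "--master", master,
        "--deploy-mode", deploy_mode,
        "SystemML.jar",
        "-f", "script.dml",
        "-exec", "hybrid_spark",
        "-nvargs", "X=" + input_uri,
    ]


# Any scheduler can be combined with any file system scheme.
masters = ["local[*]", "yarn", "spark://localhost:7077"]
uris = ["file:///tmp/X.csv", "hdfs://localhost:9000/user/me/X.csv"]

for master in masters:
    for uri in uris:
        print(" ".join(spark_submit_args(master, uri)))
```

In particular, a local master combined with an hdfs:// path is a valid
combination, so Yarn is not a prerequisite for reading from or writing to
HDFS.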

Regarding the attached error, it looks like your Hadoop configuration uses
the local FS as the default file system implementation, but you're trying
to write to a filename with the prefix hdfs. It also looks like you're
running a stale version of SystemML (according to the line numbers in your
stacktrace). Note that up until SystemML 0.14 (inclusive), we always used
the default file system implementation, but in master, we create the
correct file system according to the scheme of the given file name (see
SYSTEMML-1696). So please try to (1) use a recent build of SystemML master,
or (2) reconfigure your Hadoop configuration (fs.defaultFS, normally set
in core-site.xml) to use hdfs as the default fs.
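
The difference between the two behaviors can be sketched as follows. This
is a simplified model in plain Python, not SystemML or Hadoop code: the
function names are hypothetical, and only the scheme lookup is modeled.

```python
from urllib.parse import urlparse


def fs_scheme_legacy(path, default_fs):
    """Model of the behavior up to 0.14: always use the default file
    system, regardless of any scheme in the path."""
    return urlparse(default_fs).scheme


def fs_scheme_by_path(path, default_fs):
    """Model of the behavior in master: use the scheme of the path and
    fall back to the default file system only if the path has none."""
    return urlparse(path).scheme or urlparse(default_fs).scheme


# Hypothetical setup matching the reported error: local FS as default,
# but an output file name with an hdfs prefix.
default_fs = "file:///"
out = "hdfs://localhost:9000/user/me/out.csv"

print(fs_scheme_legacy(out, default_fs))            # file (mismatch)
print(fs_scheme_by_path(out, default_fs))           # hdfs
print(fs_scheme_by_path("/tmp/out.csv", default_fs))  # file (fallback)
```

With the legacy model, the hdfs path is handed to the local file system
implementation, which is the kind of mismatch that produces the error.
Option (2) above corresponds to changing default_fs to an hdfs URI.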


On Sun, Jul 16, 2017 at 11:22 PM, Krishna Kalyan wrote:

> Hello All,
> I have some questions about running SystemML scripts on HDFS (with
> hybrid_spark execution mode).
> My Current Configuration:
> Standalone HDFS on OSX (version 2.8)
> and Spark Pre-Built for hadoop 2.7 (version 2.1.0)
> *jps* output from my system
> [image: Inline image 1]
> Both of them have been installed separately.
> As far as I understand, to enable hdfs support we need to run spark master
> on yarn-client | yarn-cluster. (Is this understanding correct?)
> My question:
> I don't have access to a cluster. Is there a way to set up yarn-client /
> yarn-cluster on my local system so that I can run SystemML scripts in
> hybrid_spark mode with HDFS? If yes, could you please point me to some
> documentation?
> Thank you so much,
> Krishna
> PS : sysout of what I have tried already attached below.
