spark-user mailing list archives

From Tobias Pfeiffer <>
Subject Re: Strange behavior of spark-shell while accessing hdfs
Date Wed, 12 Nov 2014 02:15:39 GMT

On Tue, Nov 11, 2014 at 2:04 PM, hmxxyy <> wrote:
> If I run bin/spark-shell without connecting a master, it can access a hdfs
> file on a remote cluster with kerberos authentication.


> However, if I start the master and slave on the same host and using
> bin/spark-shell --master spark://*.*.*.*:7077
> run the same commands

[... ]
> Client cannot
> authenticate via:[TOKEN, KERBEROS]; Host Details : local host is:
> "*.*.*.*.com/"; destination host is: "*.*.*.*":8020;

When you give no master, it defaults to "local[*]", so Spark runs inside
your local spark-shell process and will (implicitly?) authenticate to HDFS
from your local machine using local environment variables, key files etc.,
I guess.

When you give a "spark://*" master, your tasks run in separate executor
processes started by the worker (even when it is on the same host), where
you have not yet authenticated to HDFS, I think. I don't know how to solve
this, though; maybe some Kerberos token must be passed on to the Spark
cluster?
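
To make the two cases above concrete, here is a small shell sketch that reports where tasks would run for a given --master value. The function name describe_master is hypothetical and not part of Spark; it only mirrors the rule described above (no master means "local[*]") and says nothing about how Kerberos credentials are actually handled. It also ignores any master set through the MASTER environment variable or a config file.

```shell
# Hypothetical helper, not part of Spark: classify a --master value
# by where the tasks end up running.
describe_master() {
  # With no argument, fall back to local[*], the spark-shell default.
  master="${1:-local[*]}"
  case "$master" in
    local|local\[*\])
      # Local mode: everything runs inside the shell's own JVM.
      printf '%s: tasks run inside the spark-shell JVM\n' "$master" ;;
    spark://*)
      # Standalone mode: workers launch separate executor JVMs.
      printf '%s: tasks run in executor JVMs launched by the workers\n' "$master" ;;
    *)
      printf '%s: other cluster manager, not covered here\n' "$master" ;;
  esac
}
```

For example, describe_master with no argument reports the local case, while describe_master 'spark://master-host:7077' (master-host is a placeholder) reports the standalone case, which is the one that failed above.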

