spark-user mailing list archives

From Dave Ariens <dari...@blackberry.com>
Subject Re: Accessing Kerberos Secured HDFS Resources from Spark on Mesos
Date Fri, 26 Jun 2015 20:55:35 GMT
There are a few security-related issues that I am postponing for now. Once I get this
working I'll look at the security side. Likely I'll be encouraging users to submit their
jobs via Docker containers. Regardless, getting the user's keytab and principal name into the
working environment of the executor isn't hard; the hard part is being able to call the login
method before the HDFS resources are accessed.

See the gist below. That login completes successfully, but only on the driver. Once
that HDFS resource is read with the Avro input format and key, the tasks are inherently created
on the slaves, so they read from that HDFS resource within their own running environment
(JVM?), and any file system instantiations performed by Spark there aren't made by a
UserGroupInformation instance associated with the principal.

From: Marcelo Vanzin
Sent: Friday, June 26, 2015 4:20 PM
To: Tim Chen
Cc: Olivier Girardot; Dave Ariens; user@spark.apache.org
Subject: Re: Accessing Kerberos Secured HDFS Resources from Spark on Mesos


On Fri, Jun 26, 2015 at 1:13 PM, Tim Chen <tim@mesosphere.io> wrote:
So correct me if I'm wrong, sounds like all you need is a principal user name and also a keytab
file downloaded, right?

I'm not familiar with Mesos so don't know what kinds of features it has, but at the very least
it would need to start containers as the requesting users (like YARN does when running with
Kerberos enabled), to avoid users being able to read each other's credentials.

--
Marcelo
