spark-user mailing list archives

From Marcelo Vanzin
Subject Re: Accessing Kerberos Secured HDFS Resources from Spark on Mesos
Date Fri, 26 Jun 2015 21:28:21 GMT
On Fri, Jun 26, 2015 at 2:08 PM, Tim Chen wrote:

> Mesos does support running containers as specific users passed to it.
> Thanks for chiming in. What else does YARN do with Kerberos besides the
> keytab file and user?

The basic things I'd expect from a system to properly support Kerberos
would be:

- The cluster manager should authenticate users (like the YARN RM does)
before users can start applications.
- The cluster manager should use Kerberos to authenticate within itself
(e.g. a YARN NM connecting to the RM).
- Started applications are properly isolated (e.g. application runs as
requesting user, or in a separate container that cannot be accessed by
other applications in any way).

On top of that, for HDFS and other Hadoop services, the applications
themselves need to be aware that Kerberos is enabled and that they need to
do certain things. For example, they need to get delegation tokens for each
service they need (Spark on YARN supports that for HDFS and Hive) - you can
look for uses of "obtainTokensForNamenodes" as an example. And those tokens
need to be distributed to all executors securely (which you get when you
enable encrypted RPCs on YARN).
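
To make the token flow above concrete, here is a rough sketch in plain
Python. It is a toy model, not Spark or Hadoop code: ToyService,
issue_token, authenticate and obtain_tokens are hypothetical names made up
for illustration, and the HMAC-signed identifier is only a simplified
stand-in for how a service can validate a token without seeing the user's
Kerberos credentials again. It assumes the driver has already
authenticated with Kerberos, and that the tokens travel to executors over
an encrypted channel.

```python
import hashlib
import hmac
import json
import os
import time


class ToyService:
    """Hypothetical stand-in for a Kerberos-protected service (e.g. a NameNode)."""

    def __init__(self, name):
        self.name = name
        self._master_key = os.urandom(32)  # secret known only to the service

    def _sign(self, identifier):
        return hmac.new(self._master_key, identifier, hashlib.sha256).digest()

    def issue_token(self, kerberos_user, renewer, lifetime_s=3600):
        # Assumption: the caller was already authenticated via Kerberos.
        identifier = json.dumps(
            {
                "owner": kerberos_user,
                "renewer": renewer,
                "service": self.name,
                "expires": time.time() + lifetime_s,
            },
            sort_keys=True,
        ).encode("utf-8")
        return {"identifier": identifier, "password": self._sign(identifier)}

    def authenticate(self, token):
        # Token holders need no keytab: the service re-derives the signature.
        expected = self._sign(token["identifier"])
        if not hmac.compare_digest(expected, token["password"]):
            raise PermissionError("token was not issued by " + self.name)
        fields = json.loads(token["identifier"])
        if fields["expires"] < time.time():
            raise PermissionError("token expired")
        return fields["owner"]


def obtain_tokens(kerberos_user, renewer, services):
    """Driver side: fetch one token per service the application will use."""
    return {s.name: s.issue_token(kerberos_user, renewer) for s in services}


# Driver: collect tokens once, then ship `credentials` to every executor.
hdfs = ToyService("hdfs")
hive = ToyService("hive")
credentials = obtain_tokens("alice", "cluster-manager", [hdfs, hive])

# Executor: present the matching token to each service.
owner = hdfs.authenticate(credentials["hdfs"])
```

Anyone who can read `credentials` can act as the token owner until the
tokens expire, which is why both the secure distribution and the
per-user isolation matter.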

So if Mesos handles the above cases, you could probably adapt the code in
the YARN integration to work with Mesos too; the YARN code uses Hadoop
library features like UserGroupInformation to propagate tokens, which is
integrated into the YARN API itself, so there might be some extra work to
make it all work with Mesos.

> On Fri, Jun 26, 2015 at 1:20 PM, Marcelo Vanzin wrote:
>> On Fri, Jun 26, 2015 at 1:13 PM, Tim Chen wrote:
>>> So correct me if I'm wrong, but it sounds like all you need is a principal
>>> user name and a downloaded keytab file, right?
>> I'm not familiar with Mesos so don't know what kinds of features it has,
>> but at the very least it would need to start containers as the requesting
>> users (like YARN does when running with Kerberos enabled), to avoid users
>> being able to read each other's credentials.
>> --
>> Marcelo

