spark-issues mailing list archives

From "Patrick Wendell (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (SPARK-5158) Allow for keytab-based HDFS security in Standalone mode
Date Thu, 08 Jan 2015 21:25:34 GMT

     [ https://issues.apache.org/jira/browse/SPARK-5158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Wendell updated SPARK-5158:
-----------------------------------
    Description: 
There have been a handful of patches for allowing access to Kerberized HDFS clusters in standalone
mode. The main reason we haven't accepted these patches has been that they rely on insecure
distribution of token files from the driver to the other components.

As a simpler solution, I wonder if we should just provide a way to have the Spark driver and
executors independently log in and acquire credentials using a keytab. This would work for
users who have dedicated, single-tenant Spark clusters (i.e. they are willing to have a
keytab on every machine running Spark for their application). It wouldn't address all possible
deployment scenarios, but if it's simple I think it's worth considering.
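
To make the idea concrete, here is a minimal sketch of what "independently log in using a keytab" could look like inside each driver or executor JVM. It is only an illustration, not a proposed patch: Hadoop clients would normally go through UserGroupInformation.loginUserFromKeytab, whereas this sketch uses only the JDK's JAAS Kerberos login module so that it has no external dependencies. The class name, the "spark-keytab" context name, and any principal or keytab path passed in are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import javax.security.auth.Subject;
import javax.security.auth.login.AppConfigurationEntry;
import javax.security.auth.login.AppConfigurationEntry.LoginModuleControlFlag;
import javax.security.auth.login.Configuration;
import javax.security.auth.login.LoginContext;
import javax.security.auth.login.LoginException;

// Hypothetical helper: each process logs in from a keytab on its own machine,
// so no token files need to be shipped from the driver.
final class KeytabLogin {

    /** Builds an in-memory JAAS configuration that authenticates from a local keytab. */
    static Configuration keytabConfig(String principal, String keytabPath) {
        Map<String, String> opts = new HashMap<>();
        opts.put("useKeyTab", "true");    // read the key from a keytab file
        opts.put("keyTab", keytabPath);   // keytab must exist on this machine
        opts.put("principal", principal);
        opts.put("storeKey", "true");
        opts.put("doNotPrompt", "true");  // never fall back to a password prompt
        AppConfigurationEntry entry = new AppConfigurationEntry(
                "com.sun.security.auth.module.Krb5LoginModule",
                LoginModuleControlFlag.REQUIRED,
                opts);
        return new Configuration() {
            @Override
            public AppConfigurationEntry[] getAppConfigurationEntry(String name) {
                // The same single entry is returned for any context name.
                return new AppConfigurationEntry[] { entry };
            }
        };
    }

    /** Performs the Kerberos login; needs a reachable KDC and a valid keytab. */
    static Subject login(String principal, String keytabPath) throws LoginException {
        LoginContext ctx = new LoginContext(
                "spark-keytab", null, null, keytabConfig(principal, keytabPath));
        ctx.login();
        return ctx.getSubject();
    }
}
```

Under this sketch, the driver and every executor would call KeytabLogin.login at startup with the principal and the path of the keytab on their own machine, which is why the approach only suits clusters where a keytab can be placed on every node.
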

This would also work for Spark streaming jobs, which often run on dedicated hardware since
they are long-running services.

  was:
There have been a handful of patches for allowing access to Kerberized HDFS clusters in standalone
mode. The main reason we haven't accepted these patches have been that they rely on insecure
distribution of token files from the driver to the other components.

As a simpler solution, I wonder if we should just provide a way to have the Spark driver and
executors independently log in and acquire credentials using a keytab. This would work for
users who are build dedicated, single-tenant, Spark clusters (i.e. they are willing to have
a keytab on every machine running Spark for their application). It wouldn't address all possible
deployment scenarios, but if it's simple I think it's worth considering.

This would also work for Spark streaming jobs, which often run on dedicated hardware since
they are long-running services.


> Allow for keytab-based HDFS security in Standalone mode
> -------------------------------------------------------
>
>                 Key: SPARK-5158
>                 URL: https://issues.apache.org/jira/browse/SPARK-5158
>             Project: Spark
>          Issue Type: New Feature
>          Components: Spark Core
>            Reporter: Patrick Wendell
>            Assignee: Matthew Cheah
>            Priority: Critical
>
> There have been a handful of patches for allowing access to Kerberized HDFS clusters
> in standalone mode. The main reason we haven't accepted these patches has been that they
> rely on insecure distribution of token files from the driver to the other components.
> As a simpler solution, I wonder if we should just provide a way to have the Spark driver
> and executors independently log in and acquire credentials using a keytab. This would work
> for users who have dedicated, single-tenant Spark clusters (i.e. they are willing to have
> a keytab on every machine running Spark for their application). It wouldn't address all possible
> deployment scenarios, but if it's simple I think it's worth considering.
> This would also work for Spark streaming jobs, which often run on dedicated hardware
> since they are long-running services.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


