spark-user mailing list archives

From Nkechi Achara <nkach...@googlemail.com>
Subject Using Spark to retrieve a HDFS file protected by Kerberos
Date Tue, 22 Mar 2016 23:47:52 GMT
I am having issues setting up my Spark environment to read from a
Kerberized HDFS location.

At the moment I have tried to do the following:

import java.security.PrivilegedExceptionAction
import org.apache.hadoop.security.UserGroupInformation
import org.apache.spark.SparkContext

// Run the given block as the supplied user, if one is present.
def ugiDoAs[T](ugi: Option[UserGroupInformation])(code: => T): T = ugi match {
  case None => code
  case Some(u) => u.doAs(new PrivilegedExceptionAction[T] {
    override def run(): T = code
  })
}

val sparkConf =
  defaultSparkConf.setAppName("file-test").setMaster("yarn-client")

val sc = ugiDoAs(ugi) { new SparkContext(sparkConf) }

val file = sc.textFile("path")

It fails at the point of creating the SparkContext, with the following
error:

Exception in thread "main" org.apache.hadoop.security.AccessControlException: SIMPLE authentication is not enabled. Available:[TOKEN, KERBEROS]
  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
  at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
  at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
  at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
  at org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53)
  at org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:104)
  at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getClusterMetrics(ApplicationClientProtocolPBClientImpl.java:155)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:497)
  at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)


Has anyone got a simple example of how to allow Spark to connect to a
Kerberized HDFS location?

I know that Spark needs to be in YARN mode for this to work, but the
login method does not seem to be working in this respect. I do know that
the UserGroupInformation (ugi) object itself is valid, as I have used it
in the same class to connect to ZooKeeper and HBase.
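For reference, this is roughly the keytab-based login I have been
attempting before constructing the SparkContext. It is only a sketch:
the principal and keytab path are placeholders, and it assumes the
client-side Hadoop Configuration on the classpath is pointed at the
secure cluster.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.security.UserGroupInformation

// Placeholder values - substitute the real principal and keytab path.
val principal = "user@EXAMPLE.COM"
val keytab = "/path/to/user.keytab"

// Make sure the Hadoop libraries are told to use Kerberos before any
// login call; otherwise they fall back to SIMPLE authentication.
val hadoopConf = new Configuration()
hadoopConf.set("hadoop.security.authentication", "kerberos")
UserGroupInformation.setConfiguration(hadoopConf)

// Log in from the keytab. After this, the static login user carries
// Kerberos credentials, which new SparkContext(...) should pick up
// when talking to YARN and HDFS.
UserGroupInformation.loginUserFromKeytab(principal, keytab)
```

I have also seen that spark-submit takes --principal and --keytab
options in YARN mode, which presumably do this login (and ticket
renewal) for you, but I would like to understand the programmatic
route as well.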
