spark-user mailing list archives

From Aneela Saleem <ane...@platalytics.com>
Subject Re: Accessing HBase through Spark with Security enabled
Date Tue, 16 Aug 2016 10:04:37 GMT
Thanks Steve,

I have gone through its documentation, but I could not work out how to
install it. Can you help me?

On Mon, Aug 15, 2016 at 4:23 PM, Steve Loughran <stevel@hortonworks.com>
wrote:

>
> On 15 Aug 2016, at 08:29, Aneela Saleem <aneela@platalytics.com> wrote:
>
> Thanks Jacek!
>
> I have already set the hbase.security.authentication property to
> kerberos, since HBase with Kerberos is working fine.
>
> I tested again after correcting the typo but got the same error. Following
> is the code, please have a look:
>
> System.setProperty("java.security.krb5.conf", "/etc/krb5.conf");
> System.setProperty("java.security.auth.login.config",
> "/etc/hbase/conf/zk-jaas.conf");
> val hconf = HBaseConfiguration.create()
> val tableName = "emp"
> hconf.set("hbase.zookeeper.quorum", "hadoop-master")
> hconf.set(TableInputFormat.INPUT_TABLE, tableName)
> hconf.set("hbase.zookeeper.property.clientPort", "2181")
> hconf.set("hbase.master", "hadoop-master:60000")
> hconf.set("hadoop.security.authentication", "kerberos")
> hconf.addResource(new Path("/etc/hbase/conf/core-site.xml"))
> hconf.addResource(new Path("/etc/hbase/conf/hbase-site.xml"))
>
>
> Spark should be automatically picking those up from the classpath; adding
> them to your own hconf isn't going to have any effect on the HBase config
> used to extract the HBase token at YARN app launch. That all needs to be
> set up at the time the Spark cluster/app is launched.
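>
> As a concrete illustration of "set up at launch time": one common way to
> make sure the driver already holds Kerberos credentials before any
> HBase/token code runs is an explicit keytab login through Hadoop's
> UserGroupInformation API. A minimal sketch; the principal and keytab path
> below are hypothetical placeholders, not values from this thread:

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.security.UserGroupInformation

// Tell Hadoop's security layer that Kerberos is in use.
val conf = new Configuration()
conf.set("hadoop.security.authentication", "kerberos")
UserGroupInformation.setConfiguration(conf)

// Log in from a keytab so the process holds a Kerberos TGT regardless
// of any console kinit. Principal and path are placeholders.
UserGroupInformation.loginUserFromKeytab(
  "aneela@PLATALYTICS.COM",
  "/etc/security/keytabs/aneela.keytab")

// Confirm the login took effect before touching HBase.
println("Logged in as: " + UserGroupInformation.getLoginUser)
```

> This only sketches the driver side; on YARN the equivalent is usually done
> for you by passing --principal and --keytab at spark-submit time.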
>
> There's a little diagnostics tool, KDiag, which will be in future Hadoop
> versions; it's available as a standalone JAR for others to use:
>
> https://github.com/steveloughran/kdiag
>
> This may help verify things like your keytab/login details.
>
>
> val conf = new SparkConf()
> conf.set("spark.yarn.security.tokens.hbase.enabled", "true")
> conf.set("spark.authenticate", "true")
> conf.set("spark.authenticate.secret","None")
> val sc = new SparkContext(conf)
> val hBaseRDD = sc.newAPIHadoopRDD(hconf, classOf[TableInputFormat],
> classOf[org.apache.hadoop.hbase.io.ImmutableBytesWritable],
> classOf[org.apache.hadoop.hbase.client.Result])
>
> val count = hBaseRDD.count()
> print("HBase RDD count:" + count)
>
>
>
>
> On Sat, Aug 13, 2016 at 8:36 PM, Jacek Laskowski <jacek@japila.pl> wrote:
>
>> Hi Aneela,
>>
>> My (little to no) understanding of how to make it work is to use
>> hbase.security.authentication property set to kerberos (see [1]).
>>
>>
> Nobody understands Kerberos; you are not alone. And the more you
> understand of Kerberos, the less you want to.
>
>> Spark on YARN uses it to get the tokens for Hive, HBase et al (see
>> [2]). It happens when Client starts conversation to YARN RM (see [3]).
>>
>> You should not do that yourself (and BTW you've got a typo in
>> spark.yarn.security.tokens.habse.enabled setting). I think that the
>> entire code you pasted matches the code Spark's doing itself before
>> requesting resources from YARN.
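>>
>> For reference, a sketch of what the launch-time configuration looks like
>> with the key spelled correctly. This is a config fragment under the
>> assumption of a kerberized YARN cluster, not a complete program:

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Note the spelling: "hbase", not "habse". On a secure cluster this
// flag tells Spark's YARN Client to fetch an HBase delegation token
// itself at submission time, so no token code is needed in the job.
val conf = new SparkConf()
  .setAppName("hbase-kerberos-check")
  .set("spark.yarn.security.tokens.hbase.enabled", "true")

val sc = new SparkContext(conf)
```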
>>
>> Give it a shot and report back since I've never worked in such a
>> configuration and would love improving in this (security) area.
>> Thanks!
>>
>> [1] http://www.cloudera.com/documentation/enterprise/5-5-x/topics/cdh_sg_hbase_authentication.html#concept_zyz_vg5_nt__section_s1l_nwv_ls
>> [2] https://github.com/apache/spark/blob/master/yarn/src/main/scala/org/apache/spark/deploy/yarn/security/HBaseCredentialProvider.scala#L58
>> [3] https://github.com/apache/spark/blob/master/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala#L396
>>
>>
>
> [2] is the code from last week; SPARK-14743. The predecessor code was
> pretty similar though: make an RPC call to HBase to ask for an HBase
> delegation token to be handed off to the YARN app; it requires the user
> to be Kerberos-authenticated first.
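>
> Done by hand, that RPC looks roughly like the sketch below, using HBase's
> TokenUtil from the 1.x era. This is an illustration of the mechanism, not
> a copy of Spark's internal code, and it assumes the caller is already
> Kerberos-authenticated (kinit or keytab login):

```scala
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.security.token.TokenUtil
import org.apache.hadoop.security.Credentials

// Requires a Kerberos-authenticated caller; the RPC below fails with
// "No valid credentials provided" otherwise -- the error in this thread.
val hbaseConf = HBaseConfiguration.create()

// Ask HBase for a delegation token over an authenticated RPC...
val token = TokenUtil.obtainToken(hbaseConf)

// ...and stash it in the credentials that get shipped to the YARN app,
// so executors can talk to HBase without their own Kerberos tickets.
val creds = new Credentials()
creds.addToken(token.getService, token)
```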
>
>
> Pozdrawiam,
>> Jacek Laskowski
>>
>> >> > 2016-08-07 20:43:57,617 WARN [hconnection-0x24b5fa45-metaLookup-shared--pool2-t1] ipc.RpcClientImpl:
>> >> > Exception encountered while connecting to the server :
>> >> > javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
>> >> > 2016-08-07 20:43:57,619 ERROR [hconnection-0x24b5fa45-metaLookup-shared--pool2-t1] ipc.RpcClientImpl:
>> >> > SASL authentication failed. The most likely cause is missing or invalid credentials. Consider 'kinit'.
>> >> > javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
>> >> >       at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:212)
>> >> >       at org.apache.hadoop.hbase.security.HBaseSaslRpcClient.saslConnect(HBaseSaslRpcClient.java:179)
>> >> >       at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupSaslConnection(RpcClientImpl.java:617)
>> >> >       at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.access$700(RpcClientImpl.java:162)
>> >> >       at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$2.run(RpcClientImpl.java:743)
>> >> >       at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$2.run(RpcClientImpl.java:740)
>> >> >       at java.security.AccessController.doPrivileged(Native Method)
>> >> >       at javax.security.auth.Subject.doAs(Subject.java:415)
>> >> >       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>> >> >       at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupIOstreams(RpcClientImpl.java:740)
>> >> >       at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.writeRequest(RpcClientImpl.java:906)
>> >> >       at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.tracedWriteRequest(RpcClientImpl.java:873)
>> >> >       at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1241)
>> >> >       at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:227)
>> >> >       at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:336)
>> >> >       at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(ClientProtos.java:34094)
>> >> >       at org.apache.hadoop.hbase.client.ClientSmallScanner$SmallScannerCallable.call(ClientSmallScanner.java:201)
>> >> >       at org.apache.hadoop.hbase.client.ClientSmallScanner$SmallScannerCallable.call(ClientSmallScanner.java:180)
>> >> >       at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:210)
>> >> >       at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:360)
>> >> >       at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:334)
>> >> >       at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:136)
>> >> >       at org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture.run(ResultBoundedCompletionService.java:65)
>> >> >       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>> >> >       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>> >> >       at java.lang.Thread.run(Thread.java:745)
>> >> > Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
>> >> >       at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147)
>> >> >       at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:121)
>> >> >       at sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:187)
>> >> >       at sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:223)
>> >> >       at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212)
>> >> >       at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
>> >> >       at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:193)
>> >> >       ... 25 more
>> >> >
>> >> >
>> >> > I have Spark running on YARN with security enabled. I have kinit'd
>> >> > from the console and have provided the necessary principals and
>> >> > keytabs. Can you please help me find out the issue?
>> >> >
>> >> >
>> >> > Thanks
>> >
>> >
>>
>
>
>
