spark-user mailing list archives

From Aneela Saleem <ane...@platalytics.com>
Subject Re: Accessing HBase through Spark with Security enabled
Date Sun, 21 Aug 2016 21:01:10 GMT
Any update on this?

On Tuesday, 16 August 2016, Aneela Saleem <aneela@platalytics.com> wrote:

> Thanks Steve,
>
> I have gone through its documentation, but I could not figure out how to
> install it. Can you help me?
>
> On Mon, Aug 15, 2016 at 4:23 PM, Steve Loughran <stevel@hortonworks.com> wrote:
>
>>
>> On 15 Aug 2016, at 08:29, Aneela Saleem <aneela@platalytics.com> wrote:
>>
>> Thanks Jacek!
>>
>> I have already set the hbase.security.authentication property to
>> kerberos; HBase with Kerberos is working fine.
>>
>> I tested again after correcting the typo but got the same error. Here is
>> the code; please have a look:
>>
>> System.setProperty("java.security.krb5.conf", "/etc/krb5.conf");
>> System.setProperty("java.security.auth.login.config",
>> "/etc/hbase/conf/zk-jaas.conf");
>> val hconf = HBaseConfiguration.create()
>> val tableName = "emp"
>> hconf.set("hbase.zookeeper.quorum", "hadoop-master")
>> hconf.set(TableInputFormat.INPUT_TABLE, tableName)
>> hconf.set("hbase.zookeeper.property.clientPort", "2181")
>> hconf.set("hbase.master", "hadoop-master:60000")
>> hconf.set("hadoop.security.authentication", "kerberos")
>> hconf.addResource(new Path("/etc/hbase/conf/core-site.xml"))
>> hconf.addResource(new Path("/etc/hbase/conf/hbase-site.xml"))
>>
>>
>> Spark should be automatically picking those up from the classpath; adding
>> them to your own hconf isn't going to have any effect on the HBase config
>> used to extract the HBase token on YARN app launch. That all needs to be
>> set up at the time the Spark cluster/app is launched. If you are running
>>
>> There's a little diagnostics tool, kdiag, which will be in future Hadoop
>> versions. It's available as a standalone JAR for others to use:
>>
>> https://github.com/steveloughran/kdiag
>>
>> This may help verify things like your keytab/login details
>>
>>
>> val conf = new SparkConf()
>> conf.set("spark.yarn.security.tokens.hbase.enabled", "true")
>> conf.set("spark.authenticate", "true")
>> conf.set("spark.authenticate.secret","None")
>> val sc = new SparkContext(conf)
>> val hBaseRDD = sc.newAPIHadoopRDD(hconf, classOf[TableInputFormat],
>> classOf[org.apache.hadoop.hbase.io.ImmutableBytesWritable],
>> classOf[org.apache.hadoop.hbase.client.Result])
>>
>> val count = hBaseRDD.count()
>> print("HBase RDD count:" + count)
>>
>>
>>
>>
>> On Sat, Aug 13, 2016 at 8:36 PM, Jacek Laskowski <jacek@japila.pl> wrote:
>>
>>> Hi Aneela,
>>>
>>> My (little to no) understanding of how to make it work is to use
>>> hbase.security.authentication property set to kerberos (see [1]).
>>>
>>>
>> Nobody understands Kerberos; you are not alone. And the more you
>> understand of Kerberos, the less you want to.
>>
>>> Spark on YARN uses it to get the tokens for Hive, HBase et al (see
>>> [2]). It happens when Client starts a conversation with the YARN RM (see [3]).
>>>
>>> You should not do that yourself (and BTW you've got a typo in the
>>> spark.yarn.security.tokens.habse.enabled setting). I think that the
>>> entire code you pasted matches what Spark already does itself before
>>> requesting resources from YARN.
>>>
>>> Give it a shot and report back since I've never worked in such a
>>> configuration and would love improving in this (security) area.
>>> Thanks!
>>>
>>> [1] http://www.cloudera.com/documentation/enterprise/5-5-x/topics/cdh_sg_hbase_authentication.html#concept_zyz_vg5_nt__section_s1l_nwv_ls
>>> [2] https://github.com/apache/spark/blob/master/yarn/src/main/scala/org/apache/spark/deploy/yarn/security/HBaseCredentialProvider.scala#L58
>>> [3] https://github.com/apache/spark/blob/master/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala#L396
>>>
>>>
>>
>> [2] is the code from last week; SPARK-14743. The predecessor code was
>> pretty similar though: make an RPC call to HBase to ask for an HBase
>> delegation token to be handed off to the YARN app; it requires the user to
>> be Kerberos authenticated first.
>>
>>
>>> Regards,
>>> Jacek Laskowski
>>>
>>> >> > 2016-08-07 20:43:57,617 WARN [hconnection-0x24b5fa45-metaLookup-shared--pool2-t1] ipc.RpcClientImpl: Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
>>> >> > 2016-08-07 20:43:57,619 ERROR [hconnection-0x24b5fa45-metaLookup-shared--pool2-t1] ipc.RpcClientImpl: SASL authentication failed. The most likely cause is missing or invalid credentials. Consider 'kinit'.
>>> >> > javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
>>> >> >       at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:212)
>>> >> >       at org.apache.hadoop.hbase.security.HBaseSaslRpcClient.saslConnect(HBaseSaslRpcClient.java:179)
>>> >> >       at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupSaslConnection(RpcClientImpl.java:617)
>>> >> >       at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.access$700(RpcClientImpl.java:162)
>>> >> >       at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$2.run(RpcClientImpl.java:743)
>>> >> >       at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$2.run(RpcClientImpl.java:740)
>>> >> >       at java.security.AccessController.doPrivileged(Native Method)
>>> >> >       at javax.security.auth.Subject.doAs(Subject.java:415)
>>> >> >       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>>> >> >       at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupIOstreams(RpcClientImpl.java:740)
>>> >> >       at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.writeRequest(RpcClientImpl.java:906)
>>> >> >       at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.tracedWriteRequest(RpcClientImpl.java:873)
>>> >> >       at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1241)
>>> >> >       at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:227)
>>> >> >       at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:336)
>>> >> >       at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(ClientProtos.java:34094)
>>> >> >       at org.apache.hadoop.hbase.client.ClientSmallScanner$SmallScannerCallable.call(ClientSmallScanner.java:201)
>>> >> >       at org.apache.hadoop.hbase.client.ClientSmallScanner$SmallScannerCallable.call(ClientSmallScanner.java:180)
>>> >> >       at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:210)
>>> >> >       at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:360)
>>> >> >       at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:334)
>>> >> >       at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:136)
>>> >> >       at org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture.run(ResultBoundedCompletionService.java:65)
>>> >> >       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>> >> >       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>> >> >       at java.lang.Thread.run(Thread.java:745)
>>> >> > Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
>>> >> >       at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147)
>>> >> >       at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:121)
>>> >> >       at sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:187)
>>> >> >       at sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:223)
>>> >> >       at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212)
>>> >> >       at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
>>> >> >       at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:193)
>>> >> >       ... 25 more
>>> >> >
>>> >> >
>>> >> > I have Spark running on YARN with security enabled. I have kinit'd from
>>> >> > the console and have provided the necessary principals and keytabs. Can you
>>> >> > please help me find out the issue?
>>> >> >
>>> >> >
>>> >> > Thanks
>>> >
>>> >
>>>
>>
>>
>>
>
