spark-user mailing list archives

From <philipp.meyerhoe...@thomsonreuters.com>
Subject RE: HBase / Spark Kerberos problem
Date Thu, 19 May 2016 12:10:48 GMT
Thanks Tom & John!

Modifying spark-env.sh did the trick; the last line in the file is now:

export SPARK_DIST_CLASSPATH=$(paste -sd: "$SELF/classpath.txt"):`hbase classpath`:/etc/hbase/conf:/etc/hbase/conf/hbase-site.xml

Now o.a.s.d.y.Client logs “Added HBase security token to credentials” and the .count()
on my HBase RDD works fine.
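In case it helps anyone else, a quick sanity check (hypothetical commands, assuming the `hbase` CLI is on the PATH and spark-env.sh has been sourced) that the expanded classpath really does pick up the config and the client jars:

```shell
# Does `hbase classpath` expose the config dir and client jars?
hbase classpath | tr ':' '\n' | grep -E 'hbase-site\.xml|hbase-client'

# How many HBase entries ended up on the Spark distribution classpath?
echo "$SPARK_DIST_CLASSPATH" | tr ':' '\n' | grep -c hbase
```

If the first command prints nothing, the submitter will not see hbase-site.xml and token acquisition silently falls back to the insecure defaults.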

From: Ellis, Tom (Financial Markets IT) [mailto:Tom.Ellis@LloydsBanking.com] 
Sent: 19 May 2016 09:51
To: 'John Trengrove'; Meyerhoefer, Philipp (TR Technology & Ops)
Cc: user
Subject: RE: HBase / Spark Kerberos problem

Yeah, we ran into this issue. The key part is to have the HBase jars and the hbase-site.xml config
on the classpath of the Spark submitter.

We did it slightly differently from Y Bodnar, where we set the required jars and config on
the env var SPARK_DIST_CLASSPATH in our spark env file (rather than SPARK_CLASSPATH which
is deprecated).

With this and --principal/--keytab, if you turn on DEBUG logging for org.apache.spark.deploy.yarn
you should see "Added HBase security token to credentials."

Otherwise you should at least hopefully see the error where it fails to add the HBase tokens.
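Turning that logging on is a one-line change; a hypothetical log4j.properties entry (the path is install-specific, e.g. /etc/spark/conf/log4j.properties on CDH):

```properties
# Show token-acquisition detail from the YARN client at submit time
log4j.logger.org.apache.spark.deploy.yarn=DEBUG
```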

Check out the source of Client [1] and YarnSparkHadoopUtil [2] to see how obtainTokenForHBase
is done.

It's a bit confusing that it claims you haven't run kinit even when you call loginUserFromKeytab;
I haven't quite worked out the reason for that yet.

Cheers,

Tom Ellis
tellisnz@gmail.com

[1] https://github.com/apache/spark/blob/master/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala
[2] https://github.com/apache/spark/blob/master/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala
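For reference, the token acquisition in [2] goes through reflection so that Spark need not depend on HBase at compile time. A minimal sketch of that pattern (class and method names as in the Spark 1.5 source; behaviour simplified, and returning None where Spark would just log and skip):

```scala
// Sketch of Spark's reflective HBase token lookup (not the actual code).
// If the HBase classes are absent, or hbase-site.xml does not declare
// kerberos auth, no token is fetched -- which is exactly the failure
// mode when the submitter classpath is missing the jars or the config.
object HBaseTokenSketch {
  def obtainTokenForHBase(): Option[AnyRef] =
    try {
      val confCls  = Class.forName("org.apache.hadoop.hbase.HBaseConfiguration")
      val tokenCls = Class.forName("org.apache.hadoop.hbase.security.token.TokenUtil")
      // HBaseConfiguration.create() only sees hbase-site.xml if it is
      // on the classpath -- hence the SPARK_DIST_CLASSPATH fix above.
      val conf = confCls.getMethod("create").invoke(null)
      val auth = conf.getClass.getMethod("get", classOf[String])
        .invoke(conf, "hbase.security.authentication")
      if (auth == "kerberos")
        tokenCls.getMethods.find(_.getName == "obtainToken")
          .map(_.invoke(null, conf))
      else None  // insecure default: token silently not obtained
    } catch {
      case _: ClassNotFoundException => None  // HBase jars not on classpath
    }
}
```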


From: John Trengrove [mailto:john.trengrove@servian.com.au] 
Sent: 19 May 2016 08:09
To: philipp.meyerhoefer@thomsonreuters.com
Cc: user
Subject: Re: HBase / Spark Kerberos problem

Have you had a look at this issue?

https://issues.apache.org/jira/browse/SPARK-12279 

There is a comment by Y Bodnar on how they successfully got Kerberos and HBase working.

2016-05-18 18:13 GMT+10:00 <philipp.meyerhoefer@thomsonreuters.com>:
Hi all,

I have been puzzling over a Kerberos problem for a while now and wondered if anyone can help.

For spark-submit, I specify --keytab x --principal y, which creates my SparkContext fine.
Connections to Zookeeper Quorum to find the HBase master work well too.
But when it comes to a .count() action on the RDD, I am always presented with the stack trace
at the end of this mail.
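For completeness, the submission looks roughly like this (principal, keytab path, class and jar names here are placeholders, not my real ones):

```shell
spark-submit \
  --master yarn-cluster \
  --principal analyst@EXAMPLE.COM \
  --keytab /etc/security/keytabs/analyst.keytab \
  --class com.example.Analysis \
  my-analysis.jar
```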

We are using CDH5.5.2 (spark 1.5.0), and com.cloudera.spark.hbase.HBaseContext is a wrapper
around TableInputFormat/hadoopRDD (see https://github.com/cloudera-labs/SparkOnHBase), as
you can see in the stack trace.

Am I doing something obvious wrong here?
A similar flow, inside test code, works well, only going via spark-submit exposes this issue.

Code snippet (I have tried using the commented-out lines in various combinations, without
success):

    val conf = new SparkConf().
      set("spark.shuffle.consolidateFiles", "true").
      set("spark.kryo.registrationRequired", "false").
      set("spark.serializer", "org.apache.spark.serializer.KryoSerializer").
      set("spark.kryoserializer.buffer", "30m")
    val sc = new SparkContext(conf)
    val cfg = sc.hadoopConfiguration
//    cfg.addResource(new org.apache.hadoop.fs.Path("/etc/hbase/conf/hbase-site.xml"))
//    UserGroupInformation.getCurrentUser.setAuthenticationMethod(UserGroupInformation.AuthenticationMethod.KERBEROS)
//    cfg.set("hbase.security.authentication", "kerberos")
    val hc = new HBaseContext(sc, cfg)
    val scan = new Scan
    scan.setTimeRange(startMillis, endMillis)
    val matchesInRange = hc.hbaseRDD(MY_TABLE, scan, resultToMatch)
    val cnt = matchesInRange.count()
    log.info(s"matches in range $cnt")

Stack trace / log:

16/05/17 17:04:47 INFO SparkContext: Starting job: count at Analysis.scala:93
16/05/17 17:04:47 INFO DAGScheduler: Got job 0 (count at Analysis.scala:93) with 1 output
partitions
16/05/17 17:04:47 INFO DAGScheduler: Final stage: ResultStage 0(count at Analysis.scala:93)
16/05/17 17:04:47 INFO DAGScheduler: Parents of final stage: List()
16/05/17 17:04:47 INFO DAGScheduler: Missing parents: List()
16/05/17 17:04:47 INFO DAGScheduler: Submitting ResultStage 0 (MapPartitionsRDD[1] at map
at HBaseContext.scala:580), which has no missing parents
16/05/17 17:04:47 INFO MemoryStore: ensureFreeSpace(3248) called with curMem=428022, maxMem=244187136
16/05/17 17:04:47 INFO MemoryStore: Block broadcast_3 stored as values in memory (estimated
size 3.2 KB, free 232.5 MB)
16/05/17 17:04:47 INFO MemoryStore: ensureFreeSpace(2022) called with curMem=431270, maxMem=244187136
16/05/17 17:04:47 INFO MemoryStore: Block broadcast_3_piece0 stored as bytes in memory (estimated
size 2022.0 B, free 232.5 MB)
16/05/17 17:04:47 INFO BlockManagerInfo: Added broadcast_3_piece0 in memory on 10.6.164.40:33563
(size: 2022.0 B, free: 232.8 MB)
16/05/17 17:04:47 INFO SparkContext: Created broadcast 3 from broadcast at DAGScheduler.scala:861
16/05/17 17:04:47 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 0 (MapPartitionsRDD[1]
at map at HBaseContext.scala:580)
16/05/17 17:04:47 INFO YarnScheduler: Adding task set 0.0 with 1 tasks
16/05/17 17:04:47 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, hpg-dev-vm,
partition 0,PROCESS_LOCAL, 2208 bytes)
16/05/17 17:04:47 INFO BlockManagerInfo: Added broadcast_3_piece0 in memory on hpg-dev-vm:52698
(size: 2022.0 B, free: 388.4 MB)
16/05/17 17:04:48 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on hpg-dev-vm:52698
(size: 26.0 KB, free: 388.4 MB)
16/05/17 17:04:57 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, hpg-dev-vm): org.apache.hadoop.hbase.client.RetriesExhaustedException:
Can't get the location
        at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:308)
        at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:155)
        at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:63)
        at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)
        at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:314)
        at org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:289)
        at org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:161)
        at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:156)
        at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:888)
        at org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl.restart(TableRecordReaderImpl.java:90)
        at org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl.initialize(TableRecordReaderImpl.java:167)
        at org.apache.hadoop.hbase.mapreduce.TableRecordReader.initialize(TableRecordReader.java:138)
        at org.apache.hadoop.hbase.mapreduce.TableInputFormatBase$1.initialize(TableInputFormatBase.java:200)
        at org.apache.spark.rdd.NewHadoopRDD$$anon$1.<init>(NewHadoopRDD.scala:153)
        at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:124)
        at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:65)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
        at org.apache.spark.scheduler.Task.run(Task.scala:88)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Could not set up IO Streams to hpg-dev-vm /127.0.0.1:60020
        at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupIOstreams(RpcClientImpl.java:773)
        at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.writeRequest(RpcClientImpl.java:890)
        at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.tracedWriteRequest(RpcClientImpl.java:859)
        at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1193)
        at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:216)
        at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:300)
        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.get(ClientProtos.java:32627)
        at org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRowOrBefore(ProtobufUtil.java:1583)
        at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionInMeta(ConnectionManager.java:1293)
        at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1125)
        at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:299)
        ... 26 more
Caused by: java.lang.RuntimeException: SASL authentication failed. The most likely cause is
missing or invalid credentials. Consider 'kinit'.
        at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$1.run(RpcClientImpl.java:673)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
        at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.handleSaslConnectionFailure(RpcClientImpl.java:631)
        at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupIOstreams(RpcClientImpl.java:739)
        ... 36 more
Caused by: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException:
No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
        at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
        at org.apache.hadoop.hbase.security.HBaseSaslRpcClient.saslConnect(HBaseSaslRpcClient.java:179)
        at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupSaslConnection(RpcClientImpl.java:605)
        at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.access$600(RpcClientImpl.java:154)
        at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$2.run(RpcClientImpl.java:731)
        at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$2.run(RpcClientImpl.java:728)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
        at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupIOstreams(RpcClientImpl.java:728)
        ... 36 more
Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any
Kerberos tgt)
        at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147)
        at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:122)
        at sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:187)
        at sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:224)
        at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212)
        at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
        at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:192)
        ... 45 more

--
Philipp Meyerhoefer
Thomson Reuters
philipp.meyerhoefer@tr.com




