spark-user mailing list archives

From Sudhir Babu Pothineni <sbpothin...@gmail.com>
Subject Re: Spark standalone - reading kerberos hdfs
Date Fri, 08 Jan 2021 19:53:30 GMT
In the case of Spark on YARN, the Application Master shares the token.

I think in the case of Spark standalone the token is not shared with the
executors. Is there any example of how to obtain the HDFS delegation token
for the executors?
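For what it's worth, a rough sketch of fetching the HDFS delegation token
manually on the driver (untested here; `principal`, `keytab` and the
NameNode URI are placeholders, and actually shipping the credentials to
standalone executors is a separate problem):

```scala
import java.net.URI

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.FileSystem
import org.apache.hadoop.security.{Credentials, UserGroupInformation}

object TokenSketch {
  def fetchHdfsToken(principal: String, keytab: String): Credentials = {
    // Configure Hadoop for Kerberos and log in from the keytab
    // BEFORE touching HDFS.
    val conf = new Configuration()
    conf.set("hadoop.security.authentication", "kerberos")
    UserGroupInformation.setConfiguration(conf)
    UserGroupInformation.loginUserFromKeytab(principal, keytab)

    // Ask the NameNode for a delegation token and attach it to the
    // current user's credentials, so later HDFS calls in this JVM can
    // authenticate via TOKEN instead of a TGT. The renewer name below
    // is a placeholder.
    val creds = new Credentials()
    val fs = FileSystem.get(new URI("hdfs://namenode:8020"), conf)
    fs.addDelegationTokens("placeholder-renewer", creds)
    UserGroupInformation.getCurrentUser.addCredentials(creds)
    creds
  }
}
```

This only covers the driver side; the credentials would still have to be
serialized and added on each executor (which Spark on YARN does for you
via the Application Master, and standalone mode does not).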

On Fri, Jan 8, 2021 at 12:13 PM Gabor Somogyi <gabor.g.somogyi@gmail.com>
wrote:

> TGT is not enough, you need HDFS token which can be obtained by Spark.
> Please check the logs...
>
> On Fri, 8 Jan 2021, 18:51 Sudhir Babu Pothineni, <sbpothineni@gmail.com>
> wrote:
>
>> I spun up a Spark standalone cluster (spark.authenticate=false) and
>> submitted a job which reads from a remote kerberized HDFS:
>>
>> val spark = SparkSession.builder()
>>                   .master("spark://spark-standalone:7077")
>>                   .getOrCreate()
>>
>> UserGroupInformation.loginUserFromKeytab(principal, keytab)
>> val df = spark.read.parquet("hdfs://namenode:8020/test/parquet/")
>>
>> Ran into the following exception:
>>
>> Caused by:
>> java.io.IOException: java.io.IOException: Failed on local exception:
>> java.io.IOException: org.apache.hadoop.security.AccessControlException:
>> Client cannot authenticate via:[TOKEN, KERBEROS]; Host Details : local host
>> is: "..."; destination host is: "...":10346;
>>
>>
>> Any suggestions?
>>
>> Thanks
>> Sudhir
>>
>
