spark-user mailing list archives

From Greg Hill <greg.h...@RACKSPACE.COM>
Subject spark on yarn history server + hdfs permissions issue
Date Tue, 09 Sep 2014 19:30:16 GMT
I am running Spark on YARN with the HDP 2.1 technical preview.  I'm having trouble getting the
Spark history server to read the Spark event logs from HDFS because of a permissions problem.
Both sides are configured to write/read logs from:

hdfs:///apps/spark/events

The history server is running as user spark; the jobs are running as user lavaqe.  Both users
are in the hdfs group on all the nodes in the cluster.

That root logs folder is world-writable, but owned by the spark user:

drwxrwxrwx   - spark hdfs          0 2014-09-09 18:19 /apps/spark/events

All good so far.  Spark jobs create subfolders and put their event logs in there just fine.
The problem is that the history server, running as the spark user, cannot read those logs.
They're written as the user that initiates the job, but are still in the same hdfs group:

drwxrwx---   - lavaqe hdfs          0 2014-09-09 19:24 /apps/spark/events/spark-pi-1410290714996

The files are group readable/writable, but this is the error I get:

Permission denied: user=spark, access=READ_EXECUTE, inode="/apps/spark/events/spark-pi-1410290714996":lavaqe:hdfs:drwxrwx---
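
To make the error easier to reason about, here is a minimal sketch of a POSIX-style permission check applied to the fields in that message. It is a simplified model and not HDFS's actual implementation; the function names parse_denial and is_allowed are hypothetical, and the group sets passed in are assumptions for illustration. Under this model, exactly one permission class (owner, group, or other) is chosen per request, so the group bits only apply if whatever evaluates the request sees the user as a member of the directory's group.

```python
import re

# Maps each part of an access string such as READ_EXECUTE to its mode letter.
ACCESS_BITS = {"READ": "r", "WRITE": "w", "EXECUTE": "x"}


def parse_denial(message):
    """Extract user, access, path, owner, group and mode from a denial line."""
    pattern = (
        r'user=(?P<user>[^,]+), access=(?P<access>[A-Z_]+), '
        r'inode="(?P<path>[^"]+)":(?P<owner>[^:]+):(?P<group>[^:]+):(?P<mode>\S+)'
    )
    match = re.search(pattern, message)
    if match is None:
        raise ValueError("unrecognized message format")
    return match.groupdict()


def is_allowed(user, user_groups, owner, group, mode, access):
    """Simplified check: pick one class (owner, group, other) and test its bits."""
    perms = mode[1:]  # drop the leading file-type character, e.g. 'd'
    if user == owner:
        bits = perms[0:3]
    elif group in user_groups:
        bits = perms[3:6]
    else:
        bits = perms[6:9]
    needed = [ACCESS_BITS[part] for part in access.split("_")]
    return all(flag in bits for flag in needed)


denial = parse_denial(
    'Permission denied: user=spark, access=READ_EXECUTE, '
    'inode="/apps/spark/events/spark-pi-1410290714996":lavaqe:hdfs:drwxrwx---'
)

# Assumed group sets: the first treats spark as a member of hdfs, the second does not.
for groups in ({"hdfs"}, {"spark"}):
    allowed = is_allowed(denial["user"], groups, denial["owner"],
                         denial["group"], denial["mode"], denial["access"])
    print(sorted(groups), allowed)
```

In this model the request succeeds when spark is treated as a member of hdfs and fails when it is not, which is the outcome the error reports. HDFS resolves group membership on the NameNode through its configured group mapping rather than on the client, so the membership as seen there is one thing that may be worth checking.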

So, two questions, I guess:

1. Do group permissions just plain not work in hdfs or am I missing something?
2. Is there a way to tell Spark to log with more permissive permissions so the history server
can read the generated logs?

Greg
