I'm quite unfamiliar with Hadoop/HDFS auth mechanisms for now, but would like to investigate this issue later. Would you please open a JIRA for it? Thanks!
On 1/19/15 1:00 AM, Yi Tian wrote:
Is there any way to support multiple users executing SQL on one thrift server?
I think there are some problems with this in Spark 1.2.0, for example:
- Start thrift server with user A
- Connect to thrift server via beeline with user B
- Execute “insert into table dest select … from table src”
Then we found these items on HDFS:
drwxr-xr-x - B supergroup 0 2015-01-16 16:42 /tmp/hadoop/hive_2015-01-16_16-42-48_923_1860943684064616152-3/-ext-10000
drwxr-xr-x - B supergroup 0 2015-01-16 16:42 /tmp/hadoop/hive_2015-01-16_16-42-48_923_1860943684064616152-3/-ext-10000/_temporary
drwxr-xr-x - B supergroup 0 2015-01-16 16:42 /tmp/hadoop/hive_2015-01-16_16-42-48_923_1860943684064616152-3/-ext-10000/_temporary/0
drwxr-xr-x - A supergroup 0 2015-01-16 16:42 /tmp/hadoop/hive_2015-01-16_16-42-48_923_1860943684064616152-3/-ext-10000/_temporary/0/_temporary
drwxr-xr-x - A supergroup 0 2015-01-16 16:42 /tmp/hadoop/hive_2015-01-16_16-42-48_923_1860943684064616152-3/-ext-10000/_temporary/0/task_201501161642_0022_m_000000
-rw-r--r-- 3 A supergroup 2671 2015-01-16 16:42 /tmp/hadoop/hive_2015-01-16_16-42-48_923_1860943684064616152-3/-ext-10000/_temporary/0/task_201501161642_0022_m_000000/part-00000
You can see that all the temporary paths created on the driver side (thrift server side) are owned by user B, which is what we expected.
But all the output data created on the executor side is owned by user A, which is NOT what we expected.
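The ownership split described above can be checked mechanically. Below is a minimal sketch, assuming the listing has the same column layout as the output shown earlier (permissions, replication, owner, group, size, date, time, path). The names `LS_LINE`, `owners_by_path`, `foreign_owned`, and `SAMPLE` are hypothetical helpers made up for illustration; they are not part of Spark, Hive, or Hadoop.

```python
import re
from collections import defaultdict

# One line of an HDFS listing, assuming the column layout shown in this thread:
# permissions, replication (or "-"), owner, group, size, date, time, path.
LS_LINE = re.compile(
    r"^(?P<perms>[-dl][-rwxsStT]{9}\+?)\s+(?P<repl>\S+)\s+(?P<owner>\S+)\s+"
    r"(?P<group>\S+)\s+(?P<size>\d+)\s+(?P<date>\S+)\s+(?P<time>\S+)\s+"
    r"(?P<path>.+)$"
)


def owners_by_path(listing):
    """Group the paths of every parseable listing line by owner."""
    result = defaultdict(list)
    for line in listing.splitlines():
        match = LS_LINE.match(line.strip())
        if match:
            result[match.group("owner")].append(match.group("path"))
    return dict(result)


def foreign_owned(listing, expected_owner):
    """Return the paths that are NOT owned by the expected (connecting) user."""
    return [
        path
        for owner, paths in owners_by_path(listing).items()
        if owner != expected_owner
        for path in paths
    ]


# Three lines taken from the listing in this thread (users A and B as above).
BASE = "/tmp/hadoop/hive_2015-01-16_16-42-48_923_1860943684064616152-3/-ext-10000"
SAMPLE = "\n".join([
    f"drwxr-xr-x - B supergroup 0 2015-01-16 16:42 {BASE}",
    f"drwxr-xr-x - A supergroup 0 2015-01-16 16:42 {BASE}/_temporary/0/_temporary",
    f"-rw-r--r-- 3 A supergroup 2671 2015-01-16 16:42 "
    f"{BASE}/_temporary/0/task_201501161642_0022_m_000000/part-00000",
])

if __name__ == "__main__":
    # Print every path that the connecting user B does not own.
    for path in foreign_owned(SAMPLE, "B"):
        print(path)
```

Running this against the sample prints the two executor-side paths owned by A, which are the entries the driver, acting as B, would not be allowed to modify.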
The wrong owner of the output data causes
org.apache.hadoop.security.AccessControlException while the driver side is moving the output data into
Does anyone know how to resolve this problem?