spark-user mailing list archives

From Daniel Schulz <>
Subject Data Security on Spark-on-HDFS
Date Mon, 31 Aug 2015 10:02:05 GMT
Hi guys,

In a nutshell: does Spark check and respect user privileges when reading and writing data?

I am curious about data security when Spark runs on top of HDFS, perhaps through YARN.
Does Spark run its long-running JVM processes as a single Spark user that makes no distinction
between users when accessing data? If so, is there a shortcoming in using Spark, because the JVM
processes are already running and the launching user is therefore ignored when Spark accesses data
residing on HDFS? Or does Spark only read and write data that the user who launched the job
has access to?

What about local storage when running in standalone mode? And what about access calls to HBase
or Hive in that case?

Thanks for taking the time.

Best regards, Daniel.