ranger-dev mailing list archives

From "Kent Yao (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (RANGER-2128) Implement SparkSQL plugin
Date Sat, 18 May 2019 14:57:00 GMT

    [ https://issues.apache.org/jira/browse/RANGER-2128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16843180#comment-16843180 ]

Kent Yao commented on RANGER-2128:

{quote}What is the use case we are trying to solve here? Is it using the Spark catalog with
Ranger AuthZ? Or is this for the use case where there is no Hive metastore and Spark has its
own catalog (I believe this is the case with Hive3+ and in the more recent Spark2.3+ if I
remember correctly){quote}
We are adding a Spark SQL authorizer plugin here, which also supports row filtering and data
masking. It uses Spark's catalog for authorization and works for both so-called Hive tables
and Spark SQL data source tables.
{quote}Where is this plugin deployed? Will it work on kerberized clusters?{quote}
Spark has two deploy modes, client and cluster. Authorization happens in Spark's driver process,
which runs in the ApplicationMaster in cluster mode. To support both modes, I shaded all
jars into an uber jar, which should be placed in `SPARK_HOME`/jars.

It works on kerberized clusters.
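
The deployment step described above can be sketched as a small helper. This is only an illustration, not part of the plugin: the function name `install_plugin_jar` and the jar file name in the usage note are hypothetical, and the only thing taken from the comment is that the shaded jar goes into `SPARK_HOME`/jars.

```python
import os
import shutil
from pathlib import Path


def install_plugin_jar(uber_jar, spark_home=None):
    """Copy a shaded (uber) jar into <SPARK_HOME>/jars and return the new path.

    Hypothetical helper: the plugin itself does not ship such a script.
    """
    # Fall back to the SPARK_HOME environment variable when no path is given.
    spark_home = spark_home or os.environ.get("SPARK_HOME")
    if not spark_home:
        raise RuntimeError("SPARK_HOME is not set")

    jars_dir = Path(spark_home) / "jars"
    if not jars_dir.is_dir():
        raise FileNotFoundError(f"{jars_dir} is not a directory")

    source = Path(uber_jar)
    if not source.is_file():
        raise FileNotFoundError(f"{source} does not exist")

    # Keep the original file name so the jar is easy to identify later.
    target = jars_dir / source.name
    shutil.copy2(source, target)
    return target
```

For example, `install_plugin_jar("ranger-spark-plugin-uber.jar")` (an assumed file name) would copy the jar next to Spark's own jars, so the driver can load it in both client and cluster mode.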
{quote}What specifically are differences in what is supported between this plugin and the
current Ranger-Hive Authorizer? What versions of Spark, Hive and Ranger will this require?{quote}
This plugin works for the SQL statements that Spark supports. We are currently developing it
against Spark 2.3.2 and the Ranger master branch. The Hive version is not a concern.
{quote}There are pointers to another github project [https://github.com/yaooqinn/kyuubi] in
the thread above which appears to be an enhanced version of SparkThrift Server. It would
be good to understand whether that has Apache 2 compatible licenses and whether kyuubi has
to be included into an existing Spark deployment directly or via external dependencies for
this Spark SQL Ranger plugin to work?{quote}
Kyuubi is under an Apache 2 compatible license. Kyuubi does not need to be added to Spark's
libs; it can start on its own as long as SPARK_HOME is set correctly. If the Spark deployment
has this plugin installed, Kyuubi can use it directly.
{quote}If there is a requirement to have Kyuubi version deployed in a cluster on top of Spark2
then does anyone know whether there is any plan to add this directly into Spark2 project
first class? It becomes more difficult to certify against such clones of core services in
another Apache project if the mainstream Spark2 project is not supporting this enhanced version.{quote}

I am afraid the Spark PMC has no plan to add Kyuubi to Apache Spark. Spark applications, including
Spark's own Thrift Server, are single-"user" apps. Besides Kyuubi, many projects can provide
Spark with multi-tenancy, such as Apache Livy and Apache Zeppelin. Maybe I can donate Kyuubi
to the ASF too.


> Implement SparkSQL plugin
> -------------------------
>                 Key: RANGER-2128
>                 URL: https://issues.apache.org/jira/browse/RANGER-2128
>             Project: Ranger
>          Issue Type: New Feature
>          Components: plugins, Ranger
>    Affects Versions: 1.1.0
>            Reporter: t oo
>            Assignee: Kent Yao
>            Priority: Major
>             Fix For: 2.0.0
>         Attachments: support_ranger11.tgz
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
> Implement SparkSQL plugin

This message was sent by Atlassian JIRA