ranger-dev mailing list archives

From "Don Bosco Durai (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (RANGER-2128) Implement SparkSQL plugin
Date Mon, 25 Jun 2018 23:02:00 GMT

    [ https://issues.apache.org/jira/browse/RANGER-2128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16522919#comment-16522919 ]

Don Bosco Durai commented on RANGER-2128:

{quote}It has exposed Parser/Analyzer/Optimizer/Planner, which is so great for all users.
It also makes it easier for users to call our plug-in.

1. spark-authorizer is designed as an Optimizer Rule for Spark SQL and is executed after all
other default rules, because rules such as column pruning, projection pushdown, and so on
should be applied first.{quote}
I was wondering whether it would be difficult to migrate your extension to the official hook
provided by Spark. If we can do that, it might be easier to add Ranger features like dynamic
UDF and row-level filtering.
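
To make the ordering point above concrete, here is a minimal sketch in plain Python of why an authorization rule would run after column pruning: the check then only sees the columns the query actually reads. This is not Spark or spark-authorizer code. The `Plan` class, the rule functions, and the grants table are all hypothetical, and the toy pruning ignores columns used only in filters or joins.

```python
from dataclasses import dataclass, replace
from typing import Callable, Dict, FrozenSet, List, Tuple


@dataclass(frozen=True)
class Plan:
    """Toy logical plan: a single table scan (hypothetical, not a Spark class)."""
    table: str
    scanned: FrozenSet[str]  # columns read from the table
    output: FrozenSet[str]   # columns the query actually returns


Rule = Callable[[Plan], Plan]
Grants = Dict[Tuple[str, str], FrozenSet[str]]  # (user, table) -> readable columns


def prune_columns(plan: Plan) -> Plan:
    # Stand-in for column pruning: scan only what the output needs.
    return replace(plan, scanned=plan.scanned & plan.output)


def make_authorizer(user: str, grants: Grants) -> Rule:
    def authorize(plan: Plan) -> Plan:
        granted = grants.get((user, plan.table), frozenset())
        denied = plan.scanned - granted
        if denied:
            raise PermissionError(
                f"{user} may not read {sorted(denied)} from {plan.table}"
            )
        return plan  # the check never rewrites the plan in this sketch
    return authorize


def run_rules(plan: Plan, rules: List[Rule]) -> Plan:
    # Apply the rules in order, feeding each rule the previous rule's result.
    for rule in rules:
        plan = rule(plan)
    return plan


# Usage: the query returns name and dept, but the unoptimized plan scans salary too.
plan = Plan("emp", frozenset({"name", "dept", "salary"}), frozenset({"name", "dept"}))
authorize = make_authorizer("alice", {("alice", "emp"): frozenset({"name", "dept"})})
checked = run_rules(plan, [prune_columns, authorize])  # passes: salary is pruned first
```

Running the same two rules in the opposite order raises `PermissionError` in this toy model, because the authorizer would see `salary` in the scan even though the query never returns it.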
{quote}2. spark-authorizer has to access the Hive SessionState object, which is not accessible
to the Spark context classloader because Spark uses an isolated classloader to load the Hive
client jars.
2.1 spark-authorizer itself will rewrite the SessionState object the first time to do privileges{quote}
I checked that. It is a pretty good hack that works :) I had to update it to support custom
authentication. The current Ranger Hive Plugin uses Hadoop UGI, which only knows Kerberos
and Simple Auth.
{quote}2.2 kyuubi hacks Spark and turns off that classloader.{quote}
I went through your documentation; it seems you have added a lot of good features. Currently,
kyuubi is a custom build. Is it possible to integrate your extensions as an add-on to an existing
deployment? That way, users could deploy the default Thrift Server and enable your features
through some properties or code injection. We might then be able to support Livy as well with
the same code base.
{quote}3. spark-authorizer reuses the ranger hive plugin (0.5), which contains jersey
dependencies that are incompatible with Spark's.{quote}
There are a few limitations with Ranger 0.5; most notably, it doesn't support tag-based policies.
I was thinking we should just implement a first-class plugin for SparkSQL using Ranger 0.7
or 1.0. It could use the same Hive ServiceDef/Policies, but with a native implementation for
SparkSQL. That way, we would not depend on the Hive libraries and their limitations.

{quote}And what are the steps I should follow to contribute to Ranger?{quote}
I have added you as a contributor to Ranger. You should be able to assign Jira issues to yourself
and create new ones. I was thinking of splitting the work among those interested. Since you are
familiar with the Spark code, do you want to look into the new extensions and see how we can
implement basic authorization and advanced features like dynamic masking/UDF and row-level
filtering? I can look into tag-based policies and also see if I can extract your current Spark
Authorizer feature into a native SparkSQL Ranger plugin.
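
As a rough illustration of what row-level filtering and dynamic masking mean for query results, here is a small plain-Python sketch. Everything in it is hypothetical: the policy shape (a row predicate plus per-column mask functions), the `mask_show_last4` helper, and the `apply_policies` function are not Ranger or Spark APIs. A real plugin would presumably enforce this by rewriting the query plan rather than by post-processing rows.

```python
from typing import Callable, Dict, Iterable, List

Row = Dict[str, object]


def mask_show_last4(value: object) -> str:
    # Hypothetical mask: keep the last four characters, hide the rest.
    text = str(value)
    return "x" * max(len(text) - 4, 0) + text[-4:]


def apply_policies(
    rows: Iterable[Row],
    row_filter: Callable[[Row], bool],
    masks: Dict[str, Callable[[object], object]],
) -> List[Row]:
    """Drop rows the user may not see, then mask the restricted columns."""
    visible = []
    for row in rows:
        if not row_filter(row):
            continue  # row-level filter: the row is hidden entirely
        visible.append(
            {col: masks[col](val) if col in masks else val for col, val in row.items()}
        )
    return visible


# Usage: a user limited to the "eng" department, with ssn partially masked.
rows = [
    {"name": "a", "dept": "eng", "ssn": "123456789"},
    {"name": "b", "dept": "hr", "ssn": "987654321"},
]
result = apply_policies(rows, lambda r: r["dept"] == "eng", {"ssn": mask_show_last4})
```

The filter runs before the masks here so that the row predicate sees unmasked values; whether a real implementation should evaluate filters on masked or unmasked data is a design decision this sketch does not settle.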

Give me your thoughts and suggestions.

> Implement SparkSQL plugin
> -------------------------
>                 Key: RANGER-2128
>                 URL: https://issues.apache.org/jira/browse/RANGER-2128
>             Project: Ranger
>          Issue Type: New Feature
>          Components: plugins, Ranger
>    Affects Versions: 1.1.0
>            Reporter: t oo
>            Priority: Major
>             Fix For: 1.1.0
> Implement SparkSQL plugin

This message was sent by Atlassian JIRA
