sentry-commits mailing list archives

From "Gregory Chanan (JIRA)" <>
Subject [jira] [Commented] (SENTRY-27) Refactor to be able to support different provider backends (e.g. db vs file)
Date Mon, 07 Oct 2013 23:58:42 GMT


Gregory Chanan commented on SENTRY-27:

I spent a bit of time working on this and here's what I found:

There are 3 important concepts:
- GroupMapping (e.g. hadoop/local file)
- PolicyEngine (e.g. hive or solr)
- ProviderBackend (e.g. local fs or database)

Today these are all coupled (e.g. HadoopGroupResourceAuthorizationProvider uses the Hadoop
Group Mapping and the Hive PolicyEngine, which in turn uses the File ProviderBackend).  Now, let's
say we want to split these up.

The problem is in the interaction between the PolicyEngine and the ProviderBackend.  Basically,
the PolicyEngine relies on the ProviderBackend (to get the roles), but the ProviderBackend
depends on the PolicyEngine (to get the validators), and both of these happen in their constructors.
So, the classic way to solve this is to pull one of these dependencies out of the constructor.

I propose changing the interface to the ProviderBackend.  Today, this is done via the SimplePolicyParser,
which does all the parsing in the constructor (and thus requires the validators at construction
time).  The parsing could instead be done later, as the result of a function call that takes the roleValidators.
I'm not sure of a good backend-agnostic name here ("parse" seems file specific -- maybe "process").
One issue I'm concerned about is that the SimplePolicyParser has some code designed to be
threadsafe (e.g. AtomicReferences).  I can't see the reason for this right now, because everything
happens in the constructor.  But if multiple threads can call "process" there may be an issue.
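
To make the proposal concrete, here is a rough sketch of what pulling the validators out of the
backend's constructor might look like.  Every name in it (RoleValidator, ProviderBackend.process,
InMemoryBackend, SketchPolicyEngine) is a hypothetical stand-in for illustration, not the actual
Sentry API, and the in-memory map merely stands in for a policy file or database.  The backend is
constructed without validators; the engine hands them over afterwards via "process".  Each call
builds a fresh map and publishes it through an AtomicReference, which is one possible way to keep
readers safe if several threads call "process".

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for the validators that the PolicyEngine supplies.
interface RoleValidator {
    void validate(String role, String privilege);
}

// Hypothetical backend interface: validators arrive via process(), not the constructor.
interface ProviderBackend {
    void process(List<? extends RoleValidator> validators);
    Set<String> getPrivileges(String role);
}

// Assumption: an in-memory map stands in for a policy file or a database.
final class InMemoryBackend implements ProviderBackend {
    private final Map<String, Set<String>> raw;
    // Readers always see a complete snapshot, even while process() runs elsewhere.
    private final AtomicReference<Map<String, Set<String>>> validated =
        new AtomicReference<>(Collections.<String, Set<String>>emptyMap());

    InMemoryBackend(Map<String, Set<String>> raw) {
        this.raw = raw;
    }

    @Override
    public void process(List<? extends RoleValidator> validators) {
        Map<String, Set<String>> result = new HashMap<>();
        for (Map.Entry<String, Set<String>> entry : raw.entrySet()) {
            for (String privilege : entry.getValue()) {
                for (RoleValidator validator : validators) {
                    validator.validate(entry.getKey(), privilege);
                }
            }
            result.put(entry.getKey(),
                Collections.unmodifiableSet(new HashSet<>(entry.getValue())));
        }
        // Publish the new snapshot in one atomic step.
        validated.set(Collections.unmodifiableMap(result));
    }

    @Override
    public Set<String> getPrivileges(String role) {
        Set<String> privileges = validated.get().get(role);
        return privileges == null ? Collections.<String>emptySet() : privileges;
    }
}

// Hypothetical engine: it receives an already-built backend, then passes in its validators.
final class SketchPolicyEngine {
    private final ProviderBackend backend;

    SketchPolicyEngine(ProviderBackend backend) {
        this.backend = backend;
        RoleValidator nonEmpty = (role, privilege) -> {
            if (privilege.isEmpty()) {
                throw new IllegalArgumentException("Empty privilege for role " + role);
            }
        };
        backend.process(Collections.singletonList(nonEmpty));
    }

    Set<String> getPrivileges(String role) {
        return backend.getPrivileges(role);
    }
}

final class DeferredProcessingDemo {
    public static void main(String[] args) {
        Map<String, Set<String>> raw = new HashMap<>();
        raw.put("analyst", new HashSet<>(Arrays.asList("server=server1->db=sales")));
        ProviderBackend backend = new InMemoryBackend(raw);        // no validators needed yet
        SketchPolicyEngine engine = new SketchPolicyEngine(backend); // validators supplied here
        System.out.println(engine.getPrivileges("analyst"));
    }
}
```

With this shape neither constructor needs the other object's output, so the cycle goes away.
Whether "process" should be callable more than once, and by more than one thread, is still the
open question above.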

Am I missing something, [~shreepadma]?  Why is there thread-safe code in SimplePolicyParser?
 What do you think of this proposal?

> Refactor to be able to support different provider backends (e.g. db vs file)
> ----------------------------------------------------------------------------
>                 Key: SENTRY-27
>                 URL:
>             Project: Sentry
>          Issue Type: Sub-task
>    Affects Versions: 1.2.0
>            Reporter: Gregory Chanan
>            Assignee: Gregory Chanan
>             Fix For: 1.3.0
> see this review request:
> Here's the part that is relevant to this JIRA (note: this is just refactoring to be able
to support other backends, this is not support for other backends).
> The issue right now is the sentry-provider-file and sentry-provider-hive have things
that are both backend-specific (e.g. PolicyFileConstants) and backend-agnostic (e.g. HadoopMappingService).
 Let's take a specific use case: we want to use the HadoopGroupResourceAuthorizationProvider
via a database-backend.  If today I specify the AuthorizationProvider to be say "org.apache.sentry.provider.hive.HadoopGroupResourceAuthorizationProvider"
it will automatically be file-backed because it automatically instantiates a SimpleHivePolicyEngine
which in turn uses the file.SimplePolicyParser.  So, we should separate out the specification
of the AuthorizationProvider and the PolicyEngine.  So I would specify the AuthorizationProvider
to be "org.apache.sentry.provider.hive.HadoopGroupResourceAuthorizationProvider" and the PolicyEngine
to be "org.apache.sentry.policyengine.db.DBPolicyEngine" (that example sort of sucks because
it has db twice -- maybe dbbackend.DBPolicyEngine).  I think that is pretty similar to what
you are saying -- just using "PolicyEngine" or "Policy" instead of "Permission Implementation".
> In my mind, there are 6 different file types here, if we assume support for file/db
backends and db/solr services:
> Policy-Engine Specific {common, db, solr}
> Non-Policy-Engine Specific {common, file, db}
> Now I don't have a huge preference for where this should all go, except that the policy-engine
specific stuff for db and solr should be in their own package, to avoid pulling in a bunch
of dependencies if they aren't needed.  So this could be something like:
> sentry-policy/sentry-policy-db
> sentry-policy/sentry-policy-solr
> sentry-policy/sentry-policy-common
> sentry-provider/sentry-provider-backend-file
> sentry-provider/sentry-provider-backend-db
> sentry-provider/sentry-provider-common
> Or we could just throw all the "common" stuff into core.

This message was sent by Atlassian JIRA