ranger-dev mailing list archives

From Madhan Neethiraj <mad...@apache.org>
Subject Re: Help: Tag based policy for non-Atlas solution
Date Sun, 06 Sep 2020 21:50:32 GMT
Smit,

 

I understand the reasoning for leveraging the existing Ranger tag-sync and tag-store implementation
instead of going with a custom context enricher. While this is feasible, it will require the use
of internal APIs, which could change in future releases. If you still want to provide an alternate
source for tags, I suggest extending org.apache.ranger.tagsync.model.AbstractTagSource,
similar to AtlasTagSource, and registering it with the following configuration in ranger-tagsync-site.xml:

ranger.tagsync.source.<name-of-your-source>=true

ranger.tagsync.source.<name-of-your-source>.class=<implementation-class-name>
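As a rough sketch, assuming ranger-tagsync-site.xml follows the usual Hadoop-style property layout, the two settings above might be written as shown here. The source name "customstore" and the class name com.example.tagsync.CustomStoreTagSource are hypothetical placeholders, not names defined by Ranger.

```xml
<configuration>
  <!-- "customstore" is a hypothetical source name chosen for illustration -->
  <property>
    <name>ranger.tagsync.source.customstore</name>
    <value>true</value>
  </property>
  <!-- hypothetical class that extends AbstractTagSource -->
  <property>
    <name>ranger.tagsync.source.customstore.class</name>
    <value>com.example.tagsync.CustomStoreTagSource</value>
  </property>
</configuration>
```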

 

Hope this helps.

 

Madhan

 

From: Smit Shah <smits@zillowgroup.com>
Date: Tuesday, September 1, 2020 at 4:06 PM
To: Madhan Neethiraj <madhan@apache.org>, "dev@ranger.apache.org" <dev@ranger.apache.org>
Cc: "abhay@apache.org" <abhay@apache.org>, "bganesan@apache.org" <bganesan@apache.org>
Subject: Re: Help: Tag based policy for non-Atlas solution

 

Hi Madhan, 

Thank you for writing back with your suggestion.

I would like to get more insight on a few options, plus one general question, based on your
suggestion and some further investigation.

 

Option A: The solution you suggested (it’s really helpful)
With this we will not be leveraging the ranger-tagsync process or the tag-related tables
(ranger.x_tag*) that Ranger maintains. I can think of two challenges for us to tackle:
1. Given our high request volume, the end-point that retrieves tags for a resource needs to be
highly available, fast, and able to handle concurrent requests.
2. If the end-point or our tag store is down, the lookup will fail and we have to either deny
the resource request or let it pass through.
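
One way to picture the second challenge is a lookup wrapper that keeps the last tags it saw for each resource and applies an explicit choice when the tag store is unreachable. This is only a minimal sketch under stated assumptions: CachedTagLookup, OnFailure, and the Function standing in for the external tag-store call are all hypothetical and are not part of Ranger. Here "pass through" is modelled as evaluating the request with no tags, and "deny" as an empty result that the caller turns into a denial.

```java
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical helper, not part of Ranger: wraps an external tag-store lookup
// with a last-known-good cache and an explicit policy for outages.
class CachedTagLookup {
    public enum OnFailure { DENY, PASS_THROUGH }

    // Stand-in for the call to the external tag store (assumption).
    private final Function<String, Set<String>> remoteLookup;
    private final OnFailure onFailure;
    private final Map<String, Set<String>> lastKnown = new ConcurrentHashMap<>();

    public CachedTagLookup(Function<String, Set<String>> remoteLookup, OnFailure onFailure) {
        this.remoteLookup = remoteLookup;
        this.onFailure = onFailure;
    }

    // Returns the tags to evaluate policies with, or Optional.empty() when the
    // caller should deny the request because no tag information is available.
    public Optional<Set<String>> tagsFor(String resource) {
        try {
            Set<String> tags = Set.copyOf(remoteLookup.apply(resource));
            lastKnown.put(resource, tags);
            return Optional.of(tags);
        } catch (RuntimeException e) {
            // Tag store unreachable: prefer the last tags seen for this resource.
            Set<String> cached = lastKnown.get(resource);
            if (cached != null) {
                return Optional.of(cached);
            }
            if (onFailure == OnFailure.PASS_THROUGH) {
                return Optional.of(Set.<String>of());
            }
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        CachedTagLookup lookup = new CachedTagLookup(
                resource -> { throw new IllegalStateException("tag store unreachable"); },
                OnFailure.DENY);
        System.out.println(lookup.tagsFor("sales.customers").isPresent());
    }
}
```

The cache only softens short outages; it does nothing for the availability and latency requirements in the first challenge, and stale tags remain a risk that a real design would need to bound.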
 

Option B: Leveraging ranger-tagsync process

Similar to how Ranger listens to Atlas’s Kafka topic, we can create an Apache Kafka topic
for our tag store's change notifications and let the ranger-tagsync process listen to it. We can
then skip Option A.

Many of the property names defined in install.properties are specific to Atlas, so I am not
sure whether ranger-tagsync is designed specifically for Atlas.
Can you think of any challenges here?

Option C: Storing our tags directly inside Ranger's internal tag store
There are end-points provided by Ranger that we can leverage. So, instead of implementing a
context enricher (Option A), we can store our tags inside Ranger's tag-store and let Ranger
work the normal way.

Can you think of any challenges here?




General question:

Do Ranger plugins also keep a cached copy of Ranger's internal tag-store, in addition to
policies? I am trying to see whether there are benefits to putting our tag details inside Ranger's tag-store.





Overall, Option B seems like the better option to me, if it is possible to implement.

 

 

SMIT SHAH
SDE, Big Data
Pronouns: he/him/his
 

 

From: Madhan Neethiraj <madhan@apache.org>
Date: Monday, August 31, 2020 at 1:28 AM
To: Smit Shah <smits@zillowgroup.com>, "dev@ranger.apache.org" <dev@ranger.apache.org>
Cc: "madhan@apache.org" <madhan@apache.org>, "abhay@apache.org" <abhay@apache.org>,
"bganesan@apache.org" <bganesan@apache.org>
Subject: Re: Help: Tag based policy for non-Atlas solution

 

Smit,

 

I suggest implementing a context enricher that retrieves tags from your tag store and sets
the tags for the resource in the request context, with a call to
RangerAccessRequestUtil.setRequestTagsInContext(context, tags). The tag service-def should be
updated to register this context enricher instead of the current enricher implementation
(RangerAdminTagRetriever).

 

Hope this helps.

 

Madhan

 

 

 

From: Smit Shah <smits@zillowgroup.com>
Date: Wednesday, August 26, 2020 at 3:59 PM
To: "dev@ranger.apache.org" <dev@ranger.apache.org>
Cc: "madhan@apache.org" <madhan@apache.org>, "abhay@apache.org" <abhay@apache.org>,
"bganesan@apache.org" <bganesan@apache.org>
Subject: Help: Tag based policy for non-Atlas solution

 

cc: Team Members who created Confluence wiki pages that I have referred

 

Hi Apache Ranger Dev Team, 

I am Smit Shah, working at Zillow as a Data Engineer. My team is working on Data Governance
around Apache Hive. We came across Apache Ranger, and one of the key features we like is Tag
Based Policies, which we are really interested in leveraging. :)

While going through the documentation for Tag Based Policies, I found that Tag Sync has
native support for Apache Atlas. Our team already has its own tag store and is trying to
avoid adding another layer. So I am checking with the team whether there are any examples/blogs/documentation
that you can share which can help us to:
1. Store tags
2. Make tag based policies work in Apache Ranger for a non-Apache Atlas solution

Some web-pages that I came across during my initial investigation: 
1. Context enrichers – Not sure if this is important for my use-case
2. Installing Tag Synchronizer – How to make this work for a non-Atlas solution
3. Ranger API – This might be needed for storing tags; for example, we could create a service
that takes data from our tag store and calls this end-point to store it in Ranger in the required
format.


Your help/details will be really helpful to us. Sending an email seemed like the best way to reach
out to the team. Thank you very much in advance. :)

 

SMIT SHAH
SDE, Big Data
Pronouns: he/him/his
 

