ranger-dev mailing list archives

From "Jason-Morries Adam (Jira)" <j...@apache.org>
Subject [jira] [Comment Edited] (RANGER-3099) Ranger hdfs policies not syncing automatically
Date Sat, 22 May 2021 13:21:00 GMT

    [ https://issues.apache.org/jira/browse/RANGER-3099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17349734#comment-17349734 ]

Jason-Morries Adam edited comment on RANGER-3099 at 5/22/21, 1:20 PM:
----------------------------------------------------------------------

An AWS Support Engineer was able to replicate this issue and found a solution. Below you can
find the details:

In particular, by inspecting the Ranger code

[https://github.com/apache/ranger/blob/master/agents-common/src/main/java/org/apache/ranger/admin/client/RangerAdminRESTClient.java#L129]

we found that this was the line of code responsible for the wrong behavior:
{code:java}
final boolean isSecureMode = user != null && UserGroupInformation.isSecurityEnabled();{code}
In specific, the "UserGroupInformation.isSecurityEnabled()" call returns false even if the
plugin was deployed on a kerberized env (i.e., I had the "hadoop.security.authentication"
property setup to "kerberos" in /etc/hadoop/conf/core-site.xml and so the call should have
returned "true"). As this looked to me as a bug I recompiled the plugin substituting in the
class "RangerAdminRESTClient.java" the following line of code:
{code:java}
final boolean isSecureMode = user != null && UserGroupInformation.isSecurityEnabled();{code}
with:
{code:java}
//final boolean isSecureMode = user != null && UserGroupInformation.isSecurityEnabled();
final boolean test_value = UserGroupInformation.isSecurityEnabled();
LOG.info("MYCUSTOMLOG test_value - " + test_value); final boolean isSecureMode = true; LOG.info("MYCUSTOMLOG
isSecureMode - " + isSecureMode); {code}
With this change we basically hardcoded the plugin to always use the secured endpoint.

After performing this change and deploying the recompiled Hive/HDFS plugins, everything
worked as expected.
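
For anyone hitting the same behavior, it is worth double-checking first what the Hadoop configuration on the plugin side actually declares before patching anything. A minimal sketch, assuming the commands are run on the EMR master where the NameNode/HiveServer2 plugin runs (hdfs getconf should read the same core-site.xml the plugin JVM picks up):
{code:java}
# Should print "kerberos" on a kerberized cluster; if it prints "simple",
# UserGroupInformation.isSecurityEnabled() will legitimately return false.
hdfs getconf -confKey hadoop.security.authentication

# Double-check the file on disk as well (print a couple of lines of context)
grep -A2 "hadoop.security.authentication" /etc/hadoop/conf/core-site.xml
{code}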

Please note this change applies only to the plugins deployed on the EMR master instance,
not to the ranger-admin and usersync tools deployed on the Ranger server.
----
In the meantime, below is some info about my configuration/architecture and the
mandatory prerequisites needed to have everything working.
######### Network Architecture #########

 * EMR cluster and Ranger server in the same VPC but in 2 different subnets.
 * The DHCP option set assigns the same domain to the EMR nodes and to the Ranger server (a quick AWS CLI sanity check is sketched below).
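
As a sanity check, something like the following can confirm both subnets share the VPC and that the VPC's DHCP option set hands out the expected domain (a rough sketch; the subnet and DHCP option set IDs are placeholders):
{code:java}
# Both subnets should report the same VpcId
aws ec2 describe-subnets --subnet-ids subnet-aaaa1111 subnet-bbbb2222 --query 'Subnets[].VpcId'

# Inspect the domain-name handed out by the DHCP option set attached to that VPC
aws ec2 describe-dhcp-options --dhcp-options-ids dopt-12345678
{code}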

######### Important prerequisites #########

 # Hostnames should be cross-resolvable. For example, the following commands should all work both
on the master and on the Ranger server:
{code:java}
$ hostname -f
$ nslookup $(hostname -f)
$ nslookup <ranger_fully_qualified_hostname>
$ nslookup <EMR_master_fully_qualified_hostname>{code}

 # The Ranger instance and the EMR cluster should be able to communicate. Example:

 * on the Ranger server SecurityGroup I opened all the inbound traffic from the EMR master
 * on the EMR master SecurityGroup I opened all the inbound traffic from the Ranger server (see the AWS CLI sketch below)
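
For reference, the same thing can be done with the AWS CLI. A more targeted variant (a hedged sketch with placeholder security group IDs) would be to open only the ports you actually need, e.g. the Ranger admin REST port 6080 from the master to the Ranger server; I simply opened all inbound traffic in both directions while testing, as listed above.
{code:java}
# sg-11111111 = Ranger server security group, sg-22222222 = EMR master security group (placeholders)
aws ec2 authorize-security-group-ingress --group-id sg-11111111 \
  --protocol tcp --port 6080 --source-group sg-22222222
{code}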

 # The Ranger server should have in /etc/hadoop/conf the same files as the ones present on
the master (this and the next two steps are sketched together after this list)

 # Install the kerberos client on the Ranger server => sudo yum install krb5-workstation

 # The Ranger server should have in /etc/krb5.conf the same file as the one present on the
master
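
A combined sketch of the last three steps, assuming SSH access from the Ranger server to the master as the hadoop user (hostnames and paths are placeholders):
{code:java}
# On the Ranger server
sudo yum install -y krb5-workstation

# Mirror the Hadoop client configuration from the EMR master
scp -r hadoop@<EMR_master_fully_qualified_hostname>:/etc/hadoop/conf /tmp/hadoop-conf
sudo mkdir -p /etc/hadoop/conf
sudo cp -r /tmp/hadoop-conf/. /etc/hadoop/conf/

# Mirror the Kerberos client configuration from the EMR master
scp hadoop@<EMR_master_fully_qualified_hostname>:/etc/krb5.conf /tmp/krb5.conf
sudo cp /tmp/krb5.conf /etc/krb5.conf
{code}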

######### General overview #########

 * The Ranger server will retrieve the users from the AD server (i.e., UserSync via LDAP). In
order to do this you need to collect the following info; here are some dummy values as
an example:
{code:java}
ldap_ip_address="rootdomain.com"
ldap_server_url="ldap://$ldap_ip_address"
ldap_base_dn="DC=ROOTDOMAIN,DC=COM"
ldap_bind_user_dn="CN=BindUser,CN=Users,DC=ROOTDOMAIN,DC=COM"
ldap_bind_password="MyStrongPa55word"{code}
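
Before feeding these values to UserSync, a quick bind test with ldapsearch can save a lot of debugging. A minimal sketch using the dummy values above (the sAMAccountName filter and the requested attributes are just examples; ldapsearch comes from the openldap-clients package on Amazon Linux/RHEL):
{code:java}
ldapsearch -x -H "ldap://rootdomain.com" \
  -D "CN=BindUser,CN=Users,DC=ROOTDOMAIN,DC=COM" -w "MyStrongPa55word" \
  -b "DC=ROOTDOMAIN,DC=COM" "(sAMAccountName=*)" cn sAMAccountName | head -n 40
{code}
If the bind DN or password is wrong you will get an LDAP error (typically code 49 for invalid credentials) instead of user entries.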

 * On the Ranger server you have to create the ranger user (i.e., sudo useradd ranger) and
assign the password "ranger" to it (i.e., passwd ranger)

 * On the EMR master we have to create all the principals for Ranger. For example, let's say your
Kerberos realm is MYEMRDOMAIN.COM and your Ranger fully qualified hostname is "ip-7-0-3-163.myemrdomain.com".
You have to create on the EMR master KDC (i.e., sudo kadmin.local) the principals for:

 - HTTP/ip-7-0-3-163.myemrdomain.com@MYEMRDOMAIN.COM
 - rangeradmin/ip-7-0-3-163.myemrdomain.com@MYEMRDOMAIN.COM
 - rangerlookup/ip-7-0-3-163.myemrdomain.com@MYEMRDOMAIN.COM
 - rangerusersync/ip-7-0-3-163.myemrdomain.com@MYEMRDOMAIN.COM

and then you have to export them to the related keytabs:
 - rangerspnego.keytab
 - rangeradmin.keytab
 - rangerlookup.keytab
 - rangerusersync.keytab

For example, on the master:
{code:java}
mkdir /home/hadoop/keytabs/
sudo kadmin.local
addprinc -randkey HTTP/ip-7-0-3-163.myemrdomain.com@MYEMRDOMAIN.COM
xst -k /home/hadoop/keytabs/rangerspnego.keytab HTTP/ip-7-0-3-163.myemrdomain.com@MYEMRDOMAIN.COM
addprinc -randkey rangeradmin/ip-7-0-3-163.myemrdomain.com@MYEMRDOMAIN.COM
xst -k /home/hadoop/keytabs/rangeradmin.keytab rangeradmin/ip-7-0-3-163.myemrdomain.com@MYEMRDOMAIN.COM
addprinc -randkey rangerlookup/ip-7-0-3-163.myemrdomain.com@MYEMRDOMAIN.COM
xst -k /home/hadoop/keytabs/rangerlookup.keytab rangerlookup/ip-7-0-3-163.myemrdomain.com@MYEMRDOMAIN.COM
addprinc -randkey rangerusersync/ip-7-0-3-163.myemrdomain.com@MYEMRDOMAIN.COM
xst -k /home/hadoop/keytabs/rangerusersync.keytab rangerusersync/ip-7-0-3-163.myemrdomain.com@MYEMRDOMAIN.COM
{code}
You then have to copy the keytabs to the Ranger server in a dedicated location (e.g., /etc/),
make them owned by the ranger user (chown ranger <keytab>) and readable only by it (chmod
700 <keytab>).
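
For example, something along these lines (a sketch; it assumes the keytabs were exported to /home/hadoop/keytabs/ on the master as above and that SSH from the Ranger server to the master works):
{code:java}
# On the Ranger server
scp hadoop@<EMR_master_fully_qualified_hostname>:/home/hadoop/keytabs/*.keytab /tmp/
sudo mv /tmp/ranger*.keytab /etc/
sudo chown ranger /etc/ranger*.keytab
sudo chmod 700 /etc/ranger*.keytab
{code}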

Once you have them, check that they work. For example:
{code:java}
sudo su - ranger
kinit HTTP/ip-7-0-3-163.myemrdomain.com@MYEMRDOMAIN.COM -kt /etc/rangerspnego.keytab
klist
kinit rangeradmin/ip-7-0-3-163.myemrdomain.com@MYEMRDOMAIN.COM -kt /etc/rangeradmin.keytab
klist
kinit rangerlookup/ip-7-0-3-163.myemrdomain.com@MYEMRDOMAIN.COM -kt /etc/rangerlookup.keytab
klist
kinit rangerusersync/ip-7-0-3-163.myemrdomain.com@MYEMRDOMAIN.COM -kt /etc/rangerusersync.keytab
klist
{code}
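
If a kinit fails with a "key not found in keytab" type of error, it can help to list what a keytab actually contains (klist -kt prints the principals and key versions without authenticating), for example:
{code:java}
klist -kt /etc/rangeradmin.keytab
{code}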
 


> Ranger hdfs policies not syncing automatically
> ----------------------------------------------
>
>                 Key: RANGER-3099
>                 URL: https://issues.apache.org/jira/browse/RANGER-3099
>             Project: Ranger
>          Issue Type: Bug
>          Components: plugins, Ranger
>    Affects Versions: 2.1.0
>         Environment: AWS EMR, Windows AD
>            Reporter: Anoop Kumar K M
>            Priority: Blocker
>
> Hi,
> We are trying to implement Ranger 2.1.0 on top of AWS EMR 6.1.0.
> EMR 6.1.0 has Hadoop 3. The cluster is Kerberos enabled.
> I have installed Ranger on a separate EC2 machine and was able to install the HDFS plugin on
EMR.
> But the problem is that for policies to be applied, both the Ranger server and the HDFS NameNode
must be restarted. After I restart both, the policies become effective.
> Ranger admin logs shows below error.
> ==========
> 2020-11-30 10:57:42,397 [http-bio-6080-exec-9] INFO org.apache.ranger.common.RESTErrorUtil
(RESTErrorUtil.java:345) - Request failed. loginId=null, logMessage=Unauthenticated access
not allowed javax.ws.rs.WebApplicationException at org.apache.ranger.common.RESTErrorUtil.createRESTException(RESTErrorUtil.java:337)
=========
>  
> Namenode logs show below error.
> ==========
>  
> 2020-12-02 13:32:53,863 ERROR org.apache.ranger.admin.client.RangerAdminRESTClient (Thread-29):
Error getting Roles; service not found. secureMode=false, user=hdfs/ip-10-98-84-189.eu-west-1.compute.internal@EU-WEST-1.COMPUTE.INTERNAL
(auth:KERBEROS), response=404, serviceName=hadoopdev, lastKnownRoleVersion=-1, lastActivationTimeInMillis=1606746562885
>  
> 2020-12-02 13:32:53,863 WARN org.apache.ranger.admin.client.RangerAdminRESTClient (Thread-29):
Received 404 error code with body:[null], Ignoring
>  2020-12-02 13:32:53,863 INFO org.apache.ranger.admin.client.RangerAdminRESTClient (Thread-29):
Skip Securetrue
>  2020-12-02 13:32:53,869 WARN org.apache.ranger.admin.client.RangerAdminRESTClient (Thread-29):
Error getting policies. secureMode=false, user=hdfs/ip-10-98-84-189.eu-west-1.compute.internal@EU-WEST-1.COMPUTE.INTERNAL
(auth:KERBEROS), response=\{"httpStatusCode":400,"statusCode":0}, serviceName=hadoopdev
> ==========
>  
> Under kerberos config in install.properties of ranger I have the below settings
>  
> --------------Kerberos Config -----------------
>  spnego_principal=HTTP/ip-10-98-84-189.eu-west-1.compute.internal@EU-WEST-1.COMPUTE.INTERNAL
>  spnego_keytab=/etc/security/keytabs/spnego.keytab
>  token_valid=30
>  cookie_domain=ip-10-98-84-189.eu-west-1.compute.internal
>  cookie_path=/
>  admin_principal=rangeradmin/ip-10-98-84-189.eu-west-1.compute.internal@EU-WEST-1.COMPUTE.INTERNAL
>  admin_keytab=/etc/security/keytabs/rangeradmin.keytab
>  lookup_principal=rangerlookup/ip-10-98-84-189.eu-west-1.compute.internal@EU-WEST-1.COMPUTE.INTERNAL
>  lookup_keytab=/etc/security/keytabs/rangerlookup.keytab
>  hadoop_conf=/etc/hadoop/conf
>  
> In the ranger console for the service config I have given below property
>  
> policy.download.auth.users = hdfs@EU-WEST-1.COMPUTE.INTERNAL
>  
> Not sure what I am missing. Any input on this would be a great help.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
