knox-commits mailing list archives

From su...@apache.org
Subject svn commit: r1681785 [5/8] - in /knox: site/ site/books/knox-0-7-0/ trunk/ trunk/books/0.7.0/ trunk/books/0.7.0/dev-guide/
Date Tue, 26 May 2015 16:07:09 GMT
Added: knox/trunk/books/0.7.0/book_troubleshooting.md
URL: http://svn.apache.org/viewvc/knox/trunk/books/0.7.0/book_troubleshooting.md?rev=1681785&view=auto
==============================================================================
--- knox/trunk/books/0.7.0/book_troubleshooting.md (added)
+++ knox/trunk/books/0.7.0/book_troubleshooting.md Tue May 26 16:07:07 2015
@@ -0,0 +1,320 @@
+<!---
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+--->
+
+## Troubleshooting ##
+
+### Finding Logs ###
+
+When things aren't working the first thing you need to do is examine the diagnostic logs.
+Depending upon how you are running the gateway these diagnostic logs will be output to different locations.
+
+#### java -jar bin/gateway.jar ####
+
+When the gateway is run this way the diagnostic output is written directly to the console.
+If you want to capture that output you will need to redirect the console output to a file using OS specific techniques.
+
+    java -jar bin/gateway.jar > gateway.log
+
+#### bin/gateway.sh start ####
+
+When the gateway is run this way the diagnostic output is written to /var/log/knox/knox.out and /var/log/knox/knox.err.
+Typically only knox.out will have content.
+
+
+### Increasing Logging ###
+
+The `log4j.properties` file in `{GATEWAY_HOME}/conf` can be used to change the granularity of the logging done by Knox.
+The Knox server must be restarted in order for these changes to take effect.
+There are various useful loggers pre-populated but commented out.
+
+    # Increase the debugging of Apache Knox itself.
+    log4j.logger.org.apache.hadoop.gateway=DEBUG
+    # Increase the debugging of Apache Shiro.
+    log4j.logger.org.apache.shiro=DEBUG
+    # Increase the debugging of Apache HTTP components.
+    log4j.logger.org.apache.http=DEBUG
+    # Increase the debugging of the Apache HTTP client component.
+    log4j.logger.org.apache.http.client=DEBUG
+    # Increase the debugging of Apache HTTP headers.
+    log4j.logger.org.apache.http.headers=DEBUG
+    # Increase the debugging of Apache HTTP wire traffic.
+    log4j.logger.org.apache.http.wire=DEBUG
+
+
+### LDAP Server Connectivity Issues ###
+
+If the gateway cannot contact the configured LDAP server you will see errors in the gateway diagnostic output.
+
+    13/11/15 16:30:17 DEBUG authc.BasicHttpAuthenticationFilter: Attempting to execute login with headers [Basic Z3Vlc3Q6Z3Vlc3QtcGFzc3dvcmQ=]
+    13/11/15 16:30:17 DEBUG ldap.JndiLdapRealm: Authenticating user 'guest' through LDAP
+    13/11/15 16:30:17 DEBUG ldap.JndiLdapContextFactory: Initializing LDAP context using URL [ldap://localhost:33389] and principal [uid=guest,ou=people,dc=hadoop,dc=apache,dc=org] with pooling disabled
+    13/11/15 16:30:17 DEBUG servlet.SimpleCookie: Added HttpServletResponse Cookie [rememberMe=deleteMe; Path=/gateway/vaultservice; Max-Age=0; Expires=Thu, 14-Nov-2013 21:30:17 GMT]
+    13/11/15 16:30:17 DEBUG authc.BasicHttpAuthenticationFilter: Authentication required: sending 401 Authentication challenge response.
+
+The client should see something along the lines of:
+
+    HTTP/1.1 401 Unauthorized
+    WWW-Authenticate: BASIC realm="application"
+    Content-Length: 0
+    Server: Jetty(8.1.12.v20130726)
+
+Resolving this will require ensuring that the LDAP server is running and that connection information is correct.
+The LDAP server connection information is configured in the cluster's topology file (e.g. {GATEWAY_HOME}/deployments/sandbox.xml).
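+
+As a rough sketch, the relevant entries in the topology file's authentication provider might look like the following.
+The parameter names are an assumption based on the Shiro LDAP realm naming and do not appear in this section; the values are taken from the log output above.
+Compare them with the entries in your own topology file.
+
+    <param>
+        <!-- Assumed parameter name; the URL must match the host and port the LDAP server listens on. -->
+        <name>main.ldapRealm.contextFactory.url</name>
+        <value>ldap://localhost:33389</value>
+    </param>
+    <param>
+        <!-- Assumed parameter name; {0} is replaced with the login name to form the bind DN. -->
+        <name>main.ldapRealm.userDnTemplate</name>
+        <value>uid={0},ou=people,dc=hadoop,dc=apache,dc=org</value>
+    </param>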
+
+
+### Hadoop Cluster Connectivity Issues ###
+
+If the gateway cannot contact one of the services in the configured Hadoop cluster you will see errors in the gateway diagnostic output.
+
+    13/11/18 18:49:45 WARN hadoop.gateway: Connection exception dispatching request: http://localhost:50070/webhdfs/v1/?user.name=guest&op=LISTSTATUS org.apache.http.conn.HttpHostConnectException: Connection to http://localhost:50070 refused
+    org.apache.http.conn.HttpHostConnectException: Connection to http://localhost:50070 refused
+    	at org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:190)
+    	at org.apache.http.impl.conn.ManagedClientConnectionImpl.open(ManagedClientConnectionImpl.java:294)
+    	at org.apache.http.impl.client.DefaultRequestDirector.tryConnect(DefaultRequestDirector.java:645)
+    	at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:480)
+    	at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
+    	at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
+    	at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784)
+    	at org.apache.hadoop.gateway.dispatch.HttpClientDispatch.executeRequest(HttpClientDispatch.java:99)
+
+The resulting behavior will differ by client.
+For the client DSL executing the {GATEWAY_HOME}/samples/ExampleWebHdfsLs.groovy sample the output will look like this.
+
+    Caught: org.apache.hadoop.gateway.shell.HadoopException: org.apache.hadoop.gateway.shell.ErrorResponse: HTTP/1.1 500 Server Error
+    org.apache.hadoop.gateway.shell.HadoopException: org.apache.hadoop.gateway.shell.ErrorResponse: HTTP/1.1 500 Server Error
+      at org.apache.hadoop.gateway.shell.AbstractRequest.now(AbstractRequest.java:72)
+      at org.apache.hadoop.gateway.shell.AbstractRequest$now.call(Unknown Source)
+      at ExampleWebHdfsLs.run(ExampleWebHdfsLs.groovy:28)
+
+When executing requests via cURL the output might look similar to the following example.
+
+    Set-Cookie: JSESSIONID=16xwhpuxjr8251ufg22f8pqo85;Path=/gateway/sandbox;Secure
+    Content-Type: text/html;charset=ISO-8859-1
+    Cache-Control: must-revalidate,no-cache,no-store
+    Content-Length: 21856
+    Server: Jetty(8.1.12.v20130726)
+
+    <html>
+    <head>
+    <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"/>
+    <title>Error 500 Server Error</title>
+    </head>
+    <body><h2>HTTP ERROR 500</h2>
+
+Resolving this will require ensuring that the Hadoop services are running and that connection information is correct.
+Basic Hadoop connectivity can be evaluated using cURL as described elsewhere.
+Otherwise the Hadoop cluster connection information is configured in the cluster's topology file (e.g. {GATEWAY_HOME}/deployments/sandbox.xml).
+
+### HTTP vs HTTPS protocol issues ###
+When Knox is configured to accept requests over SSL and is presented with a request over plain HTTP, the client receives an error like the one shown below.
+
+    curl -i -k -u guest:guest-password -X GET 'http://localhost:8443/gateway/sandbox/webhdfs/v1/?op=LISTSTATUS'
+
+The following error is returned:
+
+    curl: (52) Empty reply from server
+
+This is the default behavior of the Jetty SSL listener. While the credentials for the default authentication provider continue to be a username and password, we do not want to encourage sending these in clear text. Since preemptively sending BASIC credentials is a common pattern with REST APIs, it would be unwise to redirect to an HTTPS listener, because the password would already have been sent in clear text.
+
+To resolve this issue, there are two options:
+
+1. Change the scheme in the URL to https and deal with any trust relationship issues with the presented server certificate.
+2. Disable SSL in gateway-site.xml. This is not encouraged for the reasons described above.
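+
+For the first option, the request shown above can be reissued with only the scheme changed.
+This is a sketch that assumes the same sandbox topology and demo credentials; the `-k` flag skips certificate verification and is suitable only for testing.
+
+    curl -i -k -u guest:guest-password -X GET 'https://localhost:8443/gateway/sandbox/webhdfs/v1/?op=LISTSTATUS'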
+
+### Check Hadoop Cluster Access via cURL ###
+
+When you are experiencing connectivity issues it can be helpful to "bypass" the gateway and invoke the Hadoop REST APIs directly.
+This can easily be done using the cURL command line utility or many other REST/HTTP clients.
+Exactly how to use cURL depends on the configuration of your Hadoop cluster.
+In general, however, you will use a command line like the one that follows.
+
+    curl -ikv -X GET 'http://namenode-host:50070/webhdfs/v1/?op=LISTSTATUS'
+
+If you are using Sandbox the WebHDFS or NameNode port will be mapped to localhost so this command can be used.
+
+    curl -ikv -X GET 'http://localhost:50070/webhdfs/v1/?op=LISTSTATUS'
+
+If you are using a cluster secured with Kerberos you will need to have used `kinit` to authenticate to the KDC.
+Then the command below should verify that WebHDFS in the Hadoop cluster is accessible.
+
+    curl -ikv --negotiate -u : -X GET 'http://localhost:50070/webhdfs/v1/?op=LISTSTATUS'
+
+
+### Authentication Issues ###
+The following log information is available when you enable debug level logging for Shiro. This can be done within the conf/log4j.properties file. Note the "Password not correct for user" message.
+
+    13/11/15 16:37:15 DEBUG authc.BasicHttpAuthenticationFilter: Attempting to execute login with headers [Basic Z3Vlc3Q6Z3Vlc3QtcGFzc3dvcmQw]
+    13/11/15 16:37:15 DEBUG ldap.JndiLdapRealm: Authenticating user 'guest' through LDAP
+    13/11/15 16:37:15 DEBUG ldap.JndiLdapContextFactory: Initializing LDAP context using URL [ldap://localhost:33389] and principal [uid=guest,ou=people,dc=hadoop,dc=apache,dc=org] with pooling disabled
+    2013-11-15 16:37:15,899 INFO  Password not correct for user 'uid=guest,ou=people,dc=hadoop,dc=apache,dc=org'
+    2013-11-15 16:37:15,899 INFO  Authenticator org.apache.directory.server.core.authn.SimpleAuthenticator@354c78e3 failed to authenticate: BindContext for DN 'uid=guest,ou=people,dc=hadoop,dc=apache,dc=org', credentials <0x67 0x75 0x65 0x73 0x74 0x2D 0x70 0x61 0x73 0x73 0x77 0x6F 0x72 0x64 0x30 >
+    2013-11-15 16:37:15,899 INFO  Cannot bind to the server
+    13/11/15 16:37:15 DEBUG servlet.SimpleCookie: Added HttpServletResponse Cookie [rememberMe=deleteMe; Path=/gateway/vaultservice; Max-Age=0; Expires=Thu, 14-Nov-2013 21:37:15 GMT]
+    13/11/15 16:37:15 DEBUG authc.BasicHttpAuthenticationFilter: Authentication required: sending 401 Authentication challenge response.
+
+The client will likely see something along the lines of:
+
+    HTTP/1.1 401 Unauthorized
+    WWW-Authenticate: BASIC realm="application"
+    Content-Length: 0
+    Server: Jetty(8.1.12.v20130726)
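+
+The Basic credentials in the first log line are only Base64 encoded, so decoding them shows exactly what the client sent.
+A minimal sketch, assuming a `base64` utility that accepts `--decode` is available:
+
+    echo 'Z3Vlc3Q6Z3Vlc3QtcGFzc3dvcmQw' | base64 --decode
+
+For the header above this prints `guest:guest-password0`, which matches the credential bytes in the bind failure message and shows that the password was sent with a trailing `0`.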
+
+#### Using ldapsearch to verify LDAP connectivity and credentials ####
+
+If your authentication to Knox fails and you believe you are using correct credentials, you can try to verify the connectivity and credentials using ldapsearch, assuming you are using an LDAP directory for authentication.
+
+Assuming you are using the default values that come out of the box with Knox, your ldapsearch command would look like the following.
+
+    ldapsearch -h localhost -p 33389 -D "uid=guest,ou=people,dc=hadoop,dc=apache,dc=org" -w guest-password -b "uid=guest,ou=people,dc=hadoop,dc=apache,dc=org" "objectclass=*"
+
+This should produce output like the following
+
+    # extended LDIF
+    #
+    # LDAPv3
+    # base <uid=guest,ou=people,dc=hadoop,dc=apache,dc=org> with scope subtree
+    # filter: objectclass=*
+    # requesting: ALL
+    #
+    
+    # guest, people, hadoop.apache.org
+    dn: uid=guest,ou=people,dc=hadoop,dc=apache,dc=org
+    objectClass: organizationalPerson
+    objectClass: person
+    objectClass: inetOrgPerson
+    objectClass: top
+    uid: guest
+    cn: Guest
+    sn: User
+    userpassword:: Z3Vlc3QtcGFzc3dvcmQ=
+    
+    # search result
+    search: 2
+    result: 0 Success
+    
+    # numResponses: 2
+    # numEntries: 1
+
+In a more general form the ldapsearch command would be
+
+    ldapsearch -h {HOST} -p {PORT} -D {DN of binding user} -w {bind password} -b {DN of binding user} "objectclass=*"
+
+### Hostname Resolution Issues ###
+
+The deployments/sandbox.xml topology file has the host mapping feature enabled.
+This is required due to the way networking is set up in the Sandbox VM.
+Specifically the VM's internal hostname is sandbox.hortonworks.com.
+Since this hostname cannot be resolved to the actual VM, Knox needs to map that hostname to something resolvable.
+
+If, for example, host mapping is disabled but the Sandbox VM is still used, you will see an error in the diagnostic output similar to the one below.
+
+    13/11/18 19:11:35 WARN hadoop.gateway: Connection exception dispatching request: http://sandbox.hortonworks.com:50075/webhdfs/v1/user/guest/example/README?op=CREATE&namenoderpcaddress=sandbox.hortonworks.com:8020&user.name=guest&overwrite=false java.net.UnknownHostException: sandbox.hortonworks.com
+    java.net.UnknownHostException: sandbox.hortonworks.com
+    	at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
+
+On the other hand, if you are migrating from the Sandbox based configuration to a cluster you have deployed, you may see a similar error.
+In this case, however, you may need to disable host mapping.
+This can be done by modifying the topology file (e.g. deployments/sandbox.xml) for the cluster.
+
+    ...
+    <provider>
+        <role>hostmap</role>
+        <name>static</name>
+        <enabled>false</enabled>
+        <param><name>localhost</name><value>sandbox,sandbox.hortonworks.com</value></param>
+    </provider>
+    ....
+
+
+### Job Submission Issues - HDFS Home Directories ###
+
+If you see an error like the following in your console while submitting a job using the Groovy shell, it is likely that the authenticated user does not have a home directory on HDFS.
+
+    Caught: org.apache.hadoop.gateway.shell.HadoopException: org.apache.hadoop.gateway.shell.ErrorResponse: HTTP/1.1 403 Forbidden
+    org.apache.hadoop.gateway.shell.HadoopException: org.apache.hadoop.gateway.shell.ErrorResponse: HTTP/1.1 403 Forbidden
+
+You would also see this error if you try a file operation on the home directory of the authenticating user.
+
+The error looks a little different, as shown below, if you attempt the operation with cURL.
+
+    {"RemoteException":{"exception":"AccessControlException","javaClassName":"org.apache.hadoop.security.AccessControlException","message":"Permission denied: user=tom, access=WRITE, inode=\"/user\":hdfs:hdfs:drwxr-xr-x"}}
+
+#### Resolution ####
+
+Create the home directory for the user on HDFS.
+The home directory is typically of the form `/user/{userid}` and should be owned by the user.
+The `hdfs` user can create such a directory and make the user the owner of the directory.
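+
+A minimal sketch of those steps, assuming shell access to a cluster node where the `hdfs` OS account exists and using the user `tom` from the error message above:
+
+    # Create the home directory as the HDFS superuser.
+    sudo -u hdfs hdfs dfs -mkdir -p /user/tom
+    # Make the user the owner of the new directory.
+    sudo -u hdfs hdfs dfs -chown tom /user/tom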
+
+
+### Job Submission Issues - OS Accounts ###
+
+If the Hadoop cluster is not secured with Kerberos, the user submitting a job need not have an OS account on the Hadoop NodeManagers.
+
+If the Hadoop cluster is secured with Kerberos, the user submitting the job should have an OS account on the Hadoop NodeManagers.
+
+In either case, if the user does not have such an OS account, the user's file permissions are based on user ownership of files or on the "other" permission in the "ugo" POSIX permissions.
+The user does not get any file permission as a member of any group if you are using the default hadoop.security.group.mapping.
+
+TODO: add sample error message from running test on secure cluster with missing OS account
+
+### HBase Issues ###
+
+If you experience problems running the HBase samples with the Sandbox VM it may be necessary to restart HBase and Stargate.
+This can sometimes occur when the Sandbox VM is restarted from a saved state.
+If the client hangs after emitting the last line in the sample output below you are most likely affected.
+
+    System version : {...}
+    Cluster version : 0.96.0.2.0.6.0-76-hadoop2
+    Status : {...}
+    Creating table 'test_table'...
+
+HBase and Stargate can be restarted using the following commands on the Hadoop Sandbox VM.
+You will need to ssh into the VM in order to run these commands.
+
+    sudo -u hbase /usr/lib/hbase/bin/hbase-daemon.sh stop master
+    sudo -u hbase /usr/lib/hbase/bin/hbase-daemon.sh start master
+    sudo -u hbase /usr/lib/hbase/bin/hbase-daemon.sh restart rest -p 60080
+
+
+### SSL Certificate Issues ###
+
+Clients that do not trust the certificate presented by the server will behave in different ways.
+A browser will typically warn you of the inability to trust the received certificate and give you an opportunity to add an exception for the particular certificate.
+Curl will present you with the following message and instructions for turning off certificate verification:
+
+    curl performs SSL certificate verification by default, using a "bundle" 
+     of Certificate Authority (CA) public keys (CA certs). If the default
+     bundle file isn't adequate, you can specify an alternate file
+     using the --cacert option.
+    If this HTTPS server uses a certificate signed by a CA represented in
+     the bundle, the certificate verification probably failed due to a
+     problem with the certificate (it might be expired, or the name might
+     not match the domain name in the URL).
+    If you'd like to turn off curl's verification of the certificate, use
+     the -k (or --insecure) option.
+
+
+### SPNego Authentication Issues ###
+
+Calls from Knox to a secure Hadoop cluster fail with SPNego authentication problems
+if there was a TGT for Knox in the disk cache when Knox was started.
+
+You are likely to run into this situation on developer machines where a developer may have run `kinit` for some testing.
+
+Workaround: clear the TGT for Knox from the disk cache (calling `kdestroy` would do it) before starting Knox.
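+
+A minimal sketch of the workaround, assuming the commands are run as the OS user that starts Knox and that the standard Kerberos client tools are installed:
+
+    # List any tickets currently held in the credential cache.
+    klist
+    # Remove the cached tickets, then start Knox.
+    kdestroy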
+
+### Filing Bugs ###
+
+Bugs can be filed using [Jira][jira].
+Please include the output of the command below in the Environment section.
+Also include the version of Hadoop being used in the same section.
+
+    cd {GATEWAY_HOME}
+    java -jar bin/gateway.jar -version
+

Added: knox/trunk/books/0.7.0/config.md
URL: http://svn.apache.org/viewvc/knox/trunk/books/0.7.0/config.md?rev=1681785&view=auto
==============================================================================
--- knox/trunk/books/0.7.0/config.md (added)
+++ knox/trunk/books/0.7.0/config.md Tue May 26 16:07:07 2015
@@ -0,0 +1,478 @@
+<!---
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+--->
+
+### Configuration ###
+
+### Related Cluster Configuration ###
+
+The following configuration changes must be made to your cluster to allow Apache Knox to
+dispatch requests to the various service components on behalf of end users.
+
+#### Grant Proxy privileges for Knox user in `core-site.xml` on Hadoop master nodes ####
+
+Update `core-site.xml` and add the following lines towards the end of the file.
+
+Replace FQDN_OF_KNOX_HOST with the fully qualified domain name of the host running the gateway.
+You can usually find this by running `hostname -f` on that host.
+
+You could use * for local developer testing if the Knox host does not have a static IP.
+
+    <property>
+        <name>hadoop.proxyuser.knox.groups</name>
+        <value>users</value>
+    </property>
+    <property>
+        <name>hadoop.proxyuser.knox.hosts</name>
+        <value>FQDN_OF_KNOX_HOST</value>
+    </property>
+
+#### Grant proxy privilege for Knox in `webhcat-site.xml` on Hadoop master nodes ####
+
+Update `webhcat-site.xml` and add the following lines towards the end of the file.
+
+Replace FQDN_OF_KNOX_HOST with the right value for your cluster.
+You could use * for local developer testing if the Knox host does not have a static IP.
+
+    <property>
+        <name>webhcat.proxyuser.knox.groups</name>
+        <value>users</value>
+    </property>
+    <property>
+        <name>webhcat.proxyuser.knox.hosts</name>
+        <value>FQDN_OF_KNOX_HOST</value>
+    </property>
+
+#### Grant proxy privilege for Knox in `oozie-site.xml` on Oozie host ####
+
+Update `oozie-site.xml` and add the following lines towards the end of the file.
+
+Replace FQDN_OF_KNOX_HOST with the right value for your cluster.
+You could use * for local developer testing if the Knox host does not have a static IP.
+
+    <property>
+       <name>oozie.service.ProxyUserService.proxyuser.knox.groups</name>
+       <value>users</value>
+    </property>
+    <property>
+       <name>oozie.service.ProxyUserService.proxyuser.knox.hosts</name>
+       <value>FQDN_OF_KNOX_HOST</value>
+    </property>
+
+#### Enable HTTP transport mode and user substitution in Hive Server2 ####
+
+Update `hive-site.xml` and set the following properties on Hive Server2 hosts.
+Some of the properties may already be in the hive-site.xml. 
+Ensure that the values match the ones below.
+
+    <property>
+      <name>hive.server2.allow.user.substitution</name>
+      <value>true</value>
+    </property>
+
+    <property>
+      <name>hive.server2.transport.mode</name>
+      <value>http</value>
+      <description>Server transport mode. "binary" or "http".</description>
+    </property>
+
+    <property>
+      <name>hive.server2.thrift.http.port</name>
+      <value>10001</value>
+      <description>Port number when in HTTP mode.</description>
+    </property>
+
+    <property>
+      <name>hive.server2.thrift.http.path</name>
+      <value>cliservice</value>
+      <description>Path component of URL endpoint when in HTTP mode.</description>
+    </property>
+
+#### Topology Descriptors ####
+
+The topology descriptor files provide the gateway with per-cluster configuration information.
+This includes configuration for both the providers within the gateway and the services within the Hadoop cluster.
+These files are located in `{GATEWAY_HOME}/conf/topologies`.
+The general outline of this document looks like this.
+
+    <topology>
+        <gateway>
+            <provider>
+            </provider>
+        </gateway>
+        <service>
+        </service>
+    </topology>
+
+There are typically multiple `<provider>` and `<service>` elements.
+
+/topology
+: Defines the provider configuration and service topology for a single Hadoop cluster.
+
+/topology/gateway
+: Groups all of the provider elements.
+
+/topology/gateway/provider
+: Defines the configuration of a specific provider for the cluster.
+
+/topology/service
+: Defines the location of a specific Hadoop service within the Hadoop cluster.
+
+##### Provider Configuration #####
+
+Provider configuration is used to customize the behavior of a particular gateway feature.
+The general outline of a provider element looks like this.
+
+    <provider>
+        <role>authentication</role>
+        <name>ShiroProvider</name>
+        <enabled>true</enabled>
+        <param>
+            <name></name>
+            <value></value>
+        </param>
+    </provider>
+
+/topology/gateway/provider
+: Groups information for a specific provider.
+
+/topology/gateway/provider/role
+: Defines the role of a particular provider.
+There are a number of pre-defined roles used by out-of-the-box provider plugins for the gateway.
+These roles are: authentication, identity-assertion, authorization, rewrite and hostmap.
+
+/topology/gateway/provider/name
+: Defines the name of the provider for which this configuration applies.
+There can be multiple provider implementations for a given role.
+The name is used to identify which particular provider is being configured.
+Typically each topology descriptor should contain only one provider for each role but there are exceptions.
+
+/topology/gateway/provider/enabled
+: Allows a particular provider to be enabled or disabled via `true` or `false` respectively.
+When a provider is disabled any filters associated with that provider are excluded from the processing chain.
+
+/topology/gateway/provider/param
+: These elements are used to supply provider configuration.
+There can be zero or more of these per provider.
+
+/topology/gateway/provider/param/name
+: The name of a parameter to pass to the provider.
+
+/topology/gateway/provider/param/value
+: The value of a parameter to pass to the provider.
+
+##### Service Configuration #####
+
+Service configuration is used to specify the location of services within the Hadoop cluster.
+The general outline of a service element looks like this.
+
+    <service>
+        <role>WEBHDFS</role>
+        <url>http://localhost:50070/webhdfs</url>
+    </service>
+
+/topology/service
+: Provides information about a particular service within the Hadoop cluster.
+Not all services are necessarily exposed as gateway endpoints.
+
+/topology/service/role
+: Identifies the role of this service.
+Currently supported roles are: WEBHDFS, WEBHCAT, WEBHBASE, OOZIE, HIVE, NAMENODE, JOBTRACKER, RESOURCEMANAGER.
+Additional service roles can be supported via plugins.
+
+/topology/service/url
+: The URL identifying the location of a particular service within the Hadoop cluster.
+
+#### Hostmap Provider ####
+
+The purpose of the Hostmap provider is to handle situations where hosts are known by one name within the cluster and another name externally.
+This frequently occurs when virtual machines are used and in particular when using cloud hosting services.
+Currently, the Hostmap provider is configured as part of the topology file.
+The basic structure is shown below.
+
+    <topology>
+        <gateway>
+            ...
+            <provider>
+                <role>hostmap</role>
+                <name>static</name>
+                <enabled>true</enabled>
+                <param><name>external-host-name</name><value>internal-host-name</value></param>
+            </provider>
+            ...
+        </gateway>
+        ...
+    </topology>
+
+This mapping is required because the Hadoop services running within the cluster are unaware that they are being accessed from outside the cluster.
+Therefore URLs returned as part of REST API responses will typically contain internal host names.
+Since clients outside the cluster will be unable to resolve those host names they must be mapped to external host names.
+
+##### Hostmap Provider Example - EC2 #####
+
+Consider an EC2 example where two VMs have been allocated.
+Each VM has an external host name by which it can be accessed via the internet.
+However the EC2 VM is unaware of this external host name and instead is configured with the internal host name.
+
+    External HOSTNAMES:
+    ec2-23-22-31-165.compute-1.amazonaws.com
+    ec2-23-23-25-10.compute-1.amazonaws.com
+
+    Internal HOSTNAMES:
+    ip-10-118-99-172.ec2.internal
+    ip-10-39-107-209.ec2.internal
+
+The Hostmap configuration required to allow access from outside the Hadoop cluster via the Apache Knox Gateway would be this.
+
+    <topology>
+        <gateway>
+            ...
+            <provider>
+                <role>hostmap</role>
+                <name>static</name>
+                <enabled>true</enabled>
+                <param>
+                    <name>ec2-23-22-31-165.compute-1.amazonaws.com</name>
+                    <value>ip-10-118-99-172.ec2.internal</value>
+                </param>
+                <param>
+                    <name>ec2-23-23-25-10.compute-1.amazonaws.com</name>
+                    <value>ip-10-39-107-209.ec2.internal</value>
+                </param>
+            </provider>
+            ...
+        </gateway>
+        ...
+    </topology>
+
+##### Hostmap Provider Example - Sandbox #####
+
+The Hortonworks Sandbox 2.x poses a different challenge for host name mapping.
+This version of the Sandbox uses port mapping to make the Sandbox VM appear as though it is accessible via localhost.
+However the Sandbox VM is internally configured to consider sandbox.hortonworks.com as the host name.
+So from the perspective of a client accessing Sandbox the external host name is localhost.
+The Hostmap configuration required to allow access to Sandbox from the host operating system is this.
+
+    <topology>
+        <gateway>
+            ...
+            <provider>
+                <role>hostmap</role>
+                <name>static</name>
+                <enabled>true</enabled>
+                <param><name>localhost</name><value>sandbox,sandbox.hortonworks.com</value></param>
+            </provider>
+            ...
+        </gateway>
+        ...
+    </topology>
+
+##### Hostmap Provider Configuration #####
+
+Details about each provider configuration element are enumerated below.
+
+topology/gateway/provider/role
+: The role for a Hostmap provider must always be `hostmap`.
+
+topology/gateway/provider/name
+: The Hostmap provider supplied out-of-the-box is selected via the name `static`.
+
+topology/gateway/provider/enabled
+: Host mapping can be enabled or disabled by providing `true` or `false`.
+
+topology/gateway/provider/param
+: Host mapping is configured by providing parameters for each external to internal mapping.
+
+topology/gateway/provider/param/name
+: The parameter names represent the external host names associated with the internal host names provided by the value element.
+This can be a comma separated list of host names that all represent the same physical host.
+When mapping from internal to external host name the first external host name in the list is used.
+
+topology/gateway/provider/param/value
+: The parameter values represent the internal host names associated with the external host names provided by the name element.
+This can be a comma separated list of host names that all represent the same physical host.
+When mapping from external to internal host names the first internal host name in the list is used.
+
+
+#### Logging ####
+
+If necessary you can enable additional logging by editing the `log4j.properties` file in the `conf` directory.
+Changing the rootLogger value from `ERROR` to `DEBUG` will generate a large amount of debug logging.
+A number of useful, finer-grained loggers are also provided in the file.
+
+
+#### Java VM Options ####
+
+TODO - Java VM options doc.
+
+
+#### Persisting the Master Secret ####
+
+The master secret is required to start the server.
+This secret is used to access secured artifacts by the gateway instance.
+Keystores, trust stores and credential stores are all protected with the master secret.
+
+You may persist the master secret by supplying the *\-persist-master* switch at startup.
+This will result in a warning indicating that persisting the secret is less secure than providing it at startup.
+We do make some provisions in order to protect the persisted password.
+
+It is encrypted with AES 128 bit encryption and where possible the file permissions are set to only be accessible by the user that the gateway is running as.
+
+After persisting the secret, ensure that the file at config/security/master has the appropriate permissions set for your environment.
+This is probably the most important layer of defense for the master secret.
+Do not assume that the encryption is sufficient protection.
+
+A specific user should be created to run the gateway; this user should be the only user with permissions for the persisted master file.
+
+See the Knox CLI section for descriptions of the command line utilities related to the master secret.
+
+#### Management of Security Artifacts ####
+
+There are a number of artifacts that are used by the gateway in ensuring the security of wire level communications, access to protected resources and the encryption of sensitive data.
+These artifacts can be managed from outside of the gateway instances or generated and populated by the gateway instance itself.
+
+The following is a description of how this is coordinated with both standalone (development, demo, etc) gateway instances and instances as part of a cluster of gateways in mind.
+
+Upon start of the gateway server we:
+
+1. Look for an identity store at `data/security/keystores/gateway.jks`.
+   The identity store contains the certificate and private key used to represent the identity of the server for SSL connections and signature creation.
+    * If there is no identity store we create one and generate a self-signed certificate for use in standalone/demo mode.
+      The certificate is stored with an alias of gateway-identity.
+    * If an identity store is found then we ensure that it can be loaded using the provided master secret and that there is an alias called gateway-identity.
+2. Look for a credential store at `data/security/keystores/__gateway-credentials.jceks`.
+   This credential store is used to store secrets/passwords that are used by the gateway.
+   For instance, this is where the passphrase for accessing the gateway-identity certificate is kept.
+    * If there is no credential store found then we create one and populate it with a generated passphrase for the alias `gateway-identity-passphrase`.
+      This is coordinated with the population of the self-signed cert into the identity-store.
+    * If a credential store is found then we ensure that it can be loaded using the provided master secret and that the expected aliases have been populated with secrets.
+
+Upon deployment of a Hadoop cluster topology within the gateway we:
+
+1. Look for a credential store for the topology. For instance, we have a sample topology that gets deployed out of the box.  We look for `data/security/keystores/sandbox-credentials.jceks`. This topology specific credential store is used for storing secrets/passwords that are used for encrypting sensitive data with topology specific keys.
+    * If no credential store is found for the topology being deployed then one is created for it.
+      Population of the aliases is delegated to the configured providers within the system that will require the use of a secret for a particular task.
+      They may programmatically set the value of the secret or choose to have the value for the specified alias generated through the AliasService.
+    * If a credential store is found then we ensure that it can be loaded with the provided master secret and the configured providers have the opportunity to ensure that the aliases are populated and if not to populate them.
+
+By leveraging the algorithm described above we can provide a window of opportunity for management of these artifacts in a number of ways.
+
+1. Using a single gateway instance as a master instance the artifacts can be generated or placed into the expected location and then replicated across all of the slave instances before startup.
+2. Using an NFS mount as a central location for the artifacts would provide a single source of truth without the need to replicate them over the network. Of course, NFS mounts have their own challenges.
+3. Using the KnoxCLI to create and manage the security artifacts.
+
+See the Knox CLI section for descriptions of the command line utilities related to the security artifact management.
+
+#### Keystores ####
+In order to provide your own certificate for use by the gateway, you will need to either import an existing key pair into a Java keystore or generate a self-signed cert using the Java keytool.
+
+##### Importing a key pair into a Java keystore #####
+One way to accomplish this is to start with a PKCS12 store for your key pair and then convert it to a Java keystore or JKS.
+
+The following example uses openssl to create a PKCS12 encoded store from your provided certificate and private key that are in PEM format.
+
+    openssl pkcs12 -export -in cert.pem -inkey key.pem > server.p12
+
+The next example converts the PKCS12 store into a Java keystore (JKS). It should prompt you for the keystore and key passwords for the destination keystore. You must use the master-secret for the keystore password and keep track of the password that you use for the key passphrase.
+
+    keytool -importkeystore -srckeystore {server.p12} -destkeystore gateway.jks -srcstoretype pkcs12
+
+While using this approach a couple of important things to be aware of:
+
+1. the alias MUST be "gateway-identity". You may need to change it using keytool after the import of the PKCS12 store. You can use keytool to do this - for example: 
+
+    keytool -changealias -alias "1" -destalias "gateway-identity" -keystore gateway.jks -storepass {knoxpw}
+    
+2. the name of the expected identity keystore for the gateway MUST be gateway.jks
+3. the passwords for the keystore and the imported key may both be set to the master secret for the gateway install. You can change the key passphrase after import using keytool as well. You may need to do this in order to provision the password in the credential store as described later in this section. For example:
+
+    keytool -keypasswd -alias gateway-identity -keystore gateway.jks
+
+NOTE: The password for the keystore as well as that of the imported key may be the master secret for the gateway instance or you may set the gateway-identity-passphrase alias using the Knox CLI to the actual key passphrase. See the Knox CLI section for details.
+
+The following will allow you to provision the passphrase for the private key that you set during keystore creation above - it will prompt you for the actual passphrase.
+
+    bin/knoxcli.sh create-alias gateway-identity-passphrase
+
+##### Generating a self-signed cert for use in testing or development environments #####
+
+    keytool -genkey -keyalg RSA -alias gateway-identity -keystore gateway.jks \
+        -storepass {master-secret} -validity 360 -keysize 2048
+
+Keytool will prompt you for a number of elements that will comprise the distinguished name (DN) within your certificate.
+
+*NOTE:* When it prompts you for your First and Last name be sure to type in the hostname of the machine that your gateway instance will be running on. This is used by clients during hostname verification to ensure that the presented certificate matches the hostname that was used in the URL for the connection - so they need to match.
+
+*NOTE:* When it prompts for the key password just press enter to ensure that it is the same as the keystore password, which, as described earlier, must match the master secret for the gateway instance. Alternatively, you can set it to another passphrase - take note of it and set the gateway-identity-passphrase alias to that passphrase using the Knox CLI.
+
+See the Knox CLI section for descriptions of the command line utilities related to the management of the keystores.
+
+##### Using a CA Signed Key Pair #####
+For certain deployments a certificate key pair that is signed by a trusted certificate authority is required. There are a number of different ways in which these certificates are acquired and can be converted and imported into the Apache Knox keystore.
+
+The following steps have been used to do this and are provided here for guidance in your installation.
+You may have to adjust according to your environment.
+
+General steps:
+
+1. stop gateway and back up all files in /var/lib/knox/data/security/keystores  
+gateway.sh stop
+2. create new master key for knox and persist, the master key will be referred to in following steps as $master-key  
+knoxcli.sh create-master -force
+3.  create identity keystore gateway.jks. cert in alias gateway-identity  
+    * cd /var/lib/knox/data/security/keystores  
+    * keytool -genkeypair -alias gateway-identity -keyalg RSA -keysize 1024 -dname "CN=$fqdn_knox,OU=hdp,O=sdge" -keypass $keypass -keystore gateway.jks -storepass $master-key -validity 300  
+NOTE: above $fqdn_knox is the hostname of the knox host. adjust validity as needed. some may choose $keypass to be the same as $master-key
+4. create credential store to store the $keypass in step 3.  this creates __gateway-credentials.jceks file  
+    * knoxcli.sh create-alias gateway-identity-passphrase --value $keypass
+5. generate a certificate signing request from the gateway.jks  
+    * keytool -keystore gateway.jks -storepass $master-key -alias gateway-identity -certreq -file knox.csr
+6. send the knox.csr file to the CA and get back the signed certificate, referred to as knox.signed in the following steps. You also need the CA cert, which can normally be requested through an openssl command or a web browser (or you can ask the CA to send a copy).
+7. import both the CA certificate (referred to as corporateCA.cer) and the signed knox certificate back into gateway.jks  
+    * keytool -keystore gateway.jks -storepass $master-key -alias $hwhq -import -file corporateCA.cer  
+    * keytool -keystore gateway.jks -storepass $master-key -alias gateway-identity -import -file knox.signed  
+Note: use any alias appropriate for the corporate CA.
+8. restart gateway. check gateway.log to see that gateway started properly and clusters are deployed. Can check the timestamp on cluster deployment files 
+    * ls -alrt /var/lib/knox/data/deployment
+9. verify that clients can use the CA cert to access Knox (which is the goal of using a publicly signed cert)  
+    * curl --cacert supwin12ad.cer -u hdptester:hadoop -X GET 'https://$fqdn_knox:8443/gateway/$topologyname/webhdfs/v1/tmp?op=LISTSTATUS'
+or can verify through client browser which already has the corporate CA cert installed.
+
+##### Credential Store #####
+Whenever you provide your own keystore with either a self-signed cert or an issued certificate signed by a trusted authority, you will need to set an alias for the gateway-identity-passphrase or create an empty credential store. This is necessary for the current release in order for the system to determine the correct password for the keystore and the key.
+
+The credential stores in Knox use the JCEKS keystore type as it allows for the storage of general secrets in addition to certificates.
+
+Keytool may be used to create credential stores but the Knox CLI section details how to create aliases. These aliases are managed within credential stores which are created by the CLI as needed. The simplest approach is to create the gateway-identity-passphrase alias with the Knox CLI. This will create the credential store if it doesn't already exist and add the key passphrase.
+
+See the Knox CLI section for descriptions of the command line utilities related to the management of the credential stores.
+
+##### Provisioning of Keystores #####
+Once you have created these keystores you must move them into place for the gateway to discover them and use them to represent its identity for SSL connections. This is done by copying the keystores to the `{GATEWAY_HOME}/data/security/keystores` directory for your gateway install.
+
+#### Summary of Secrets to be Managed ####
+
+1. Master secret - the same for all gateway instances in a cluster of gateways
+2. All security related artifacts are protected with the master secret
+3. Secrets used by the gateway itself are stored within the gateway credential store and are the same across all gateway instances in the cluster of gateways
+4. Secrets used by providers within cluster topologies are stored in topology specific credential stores and are the same for the same topology across the cluster of gateway instances.
+   However, they are specific to the topology - so secrets for one hadoop cluster are different from those of another.
+   This allows for fail-over from one gateway instance to another even when encryption is being used while not allowing the compromise of one encryption key to expose the data for all clusters.
+
+NOTE: the SSL certificate will need special consideration depending on the type of certificate. Wildcard certs may be able to be shared across all gateway instances in a cluster.
+When certs are dedicated to specific machines the gateway identity store will not be able to be blindly replicated as host name verification problems will ensue.
+Obviously, trust-stores will need to be taken into account as well.
+

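Editorial note on the host mapping parameters documented in the hunk above (topology/gateway/provider/param/name and topology/gateway/provider/param/value): the lookup rules they describe can be sketched in Python as follows. This is a minimal illustration, not the gateway's implementation. The function names `parse_host_mapping` and `resolve`, the example host names, and the choice to return an unmapped host unchanged are all assumptions made for this example.

```python
def parse_host_mapping(params):
    """Build lookup tables from host-mapping params.

    `params` maps each param name (comma separated external host names)
    to its value (comma separated internal host names).
    """
    external_to_internal = {}
    internal_to_external = {}
    for name, value in params.items():
        externals = [h.strip() for h in name.split(",") if h.strip()]
        internals = [h.strip() for h in value.split(",") if h.strip()]
        if not externals or not internals:
            continue  # nothing usable in this param
        # External -> internal: the first internal host name in the list is used.
        for host in externals:
            external_to_internal[host] = internals[0]
        # Internal -> external: the first external host name in the list is used.
        for host in internals:
            internal_to_external[host] = externals[0]
    return external_to_internal, internal_to_external


def resolve(table, host):
    """Return the mapped host, or the host itself if unmapped (assumption)."""
    return table.get(host, host)


# Hypothetical mapping: two external names and two internal names
# that all refer to the same physical host.
ext_to_int, int_to_ext = parse_host_mapping(
    {"ext1.example.com,ext2.example.com": "int1.internal,int2.internal"}
)
```

With this mapping, `resolve(ext_to_int, "ext2.example.com")` yields the first internal name and `resolve(int_to_ext, "int2.internal")` yields the first external name, mirroring the "first host name in the list is used" rule in the text.
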
Added: knox/trunk/books/0.7.0/config_advanced_ldap.md
URL: http://svn.apache.org/viewvc/knox/trunk/books/0.7.0/config_advanced_ldap.md?rev=1681785&view=auto
==============================================================================
--- knox/trunk/books/0.7.0/config_advanced_ldap.md (added)
+++ knox/trunk/books/0.7.0/config_advanced_ldap.md Tue May 26 16:07:07 2015
@@ -0,0 +1,243 @@
+### Advanced LDAP Authentication
+
+The default configuration computes the bind DN for the incoming user based on userDnTemplate.
+This does not work in enterprises where users could belong to multiple branches of the LDAP tree.
+You could instead enable an advanced configuration that computes the bind DN of the incoming user with an LDAP search.
+
+#### Problem with  userDnTemplate based Authentication 
+
+UserDnTemplate based authentication uses the configuration parameter ldapRealm.userDnTemplate.
+A typical value of userDnTemplate would look like uid={0},ou=people,dc=hadoop,dc=apache,dc=org.
+ 
+To compute the bind DN of the client, we replace the placeholder {0} with the login id provided by the client.
+For example, if the login id provided by the client is "guest",  
+the computed bind DN would be uid=guest,ou=people,dc=hadoop,dc=apache,dc=org.
+ 
+This keeps configuration simple.
+
+However, this does not work if users belong to different branches of the LDAP DIT.
+For example, if there are some users under ou=people,dc=hadoop,dc=apache,dc=org 
+and some users under ou=contractors,dc=hadoop,dc=apache,dc=org,  
+we cannot come up with a userDnTemplate that would work for all the users.
+
+#### Using advanced LDAP Authentication
+
+With advanced LDAP authentication, we find the bind DN of the user by searching the LDAP directory
+instead of interpolating the bind DN from userDnTemplate.
+
+
+#### Example search filter to find the client bind DN
+ 
+Assuming:
+
+    ldapRealm.userSearchAttributeName=uid
+    ldapRealm.userObjectClass=person
+    client specified login id = "guest"
+
+the LDAP filter used to search for the bind DN would be:
+
+    (&(uid=guest)(objectclass=person))
+
+This could find the bind DN to be:
+
+    uid=guest,ou=people,dc=hadoop,dc=apache,dc=org
+
+Please note that the userSearchAttributeName need not be part of the bind DN.
+
+For example, you could use:
+
+    ldapRealm.userSearchAttributeName=email
+    ldapRealm.userObjectClass=person
+    client specified login id = "bill.clinton@gmail.com"
+
+The LDAP filter used to search for the bind DN would be:
+
+    (&(email=bill.clinton@gmail.com)(objectclass=person))
+
+This could find the bind DN to be:
+
+    uid=billc,ou=contractors,dc=hadoop,dc=apache,dc=org
+
+#### Example provider configuration to use advanced LDAP authentication
+
+The example configuration appears verbose due to the presence of liberal comments 
+and illustration of optional parameters and default values.
+The configuration that you would use could be much shorter if you rely on default values.
+
+<provider>
+
+	<role>authentication</role>
+	<name>ShiroProvider</name>
+	<enabled>true</enabled>
+
+	<param>
+		<name>main.ldapRealm</name>
+		<value>org.apache.hadoop.gateway.shirorealm.KnoxLdapRealm</value>
+	</param>
+
+	<param>
+		<name>main.ldapContextFactory</name>
+		<value>org.apache.hadoop.gateway.shirorealm.KnoxLdapContextFactory
+		</value>
+	</param>
+
+	<param>
+		<name>main.ldapRealm.contextFactory</name>
+		<value>$ldapContextFactory</value>
+	</param>
+
+	<!-- update the value based on your ldap directory protocol, host and port -->
+	<param>
+		<name>main.ldapRealm.contextFactory.url</name>
+		<value>ldap://hdp.example.com:389</value>
+	</param>
+
+	<!-- optional, default value: simple
+	     Update the value based on mechanisms supported by your ldap directory -->
+	<param>
+		<name>main.ldapRealm.contextFactory.authenticationMechanism</name>
+		<value>simple</value>
+	</param>
+
+	<!-- optional, default value: {0}
+       update the value based on your ldap DIT(directory information tree).
+       ignored if value is defined for main.ldapRealm.userSearchAttributeName -->
+	<param>
+		<name>main.ldapRealm.userDnTemplate</name>
+		<value>uid={0},ou=people,dc=hadoop,dc=apache,dc=org</value>
+	</param>
+
+	<!-- optional, default value: null
+	     If you specify a value for this attribute, userDnTemplate 
+		   specified above would be ignored and user bind DN would be computed using
+		   ldap search
+	     update the value based on your ldap DIT(directory information tree)
+	     value of search attribute should identify the user uniquely -->
+	<param>
+		<name>main.ldapRealm.userSearchAttributeName</name>
+		<value>uid</value>
+	</param>
+
+	<!-- optional, default value: false  
+	     If the value is true, groups in which user is a member are looked up 
+	     from LDAP and made available  for service level authorization checks -->
+	<param>
+		<name>main.ldapRealm.authorizationEnabled</name>
+		<value>true</value>
+	</param>
+
+	<!-- bind DN used to search for groups and user bind DN.  
+	     Required if a value is defined for main.ldapRealm.userSearchAttributeName
+	     or if the value of main.ldapRealm.authorizationEnabled is true -->
+	<param>
+		<name>main.ldapRealm.contextFactory.systemUsername</name>
+		<value>uid=guest,ou=people,dc=hadoop,dc=apache,dc=org</value>
+	</param>
+
+	<!-- password for systemUserName.
+	     Required if a value is defined for main.ldapRealm.userSearchAttributeName
+       or if the value of main.ldapRealm.authorizationEnabled is true -->
+	<param>
+		<name>main.ldapRealm.contextFactory.systemPassword</name>
+		<value>${ALIAS=ldcSystemPassword}</value>
+	</param>
+
+	<!-- optional, default value: simple
+	     Update the value based on mechanisms supported by your ldap directory -->
+	<param>
+		<name>main.ldapRealm.contextFactory.systemAuthenticationMechanism</name>
+		<value>simple</value>
+	</param>
+
+	<!-- optional, default value: person
+	     Objectclass to identify user entries in ldap, used to build search 
+		   filter to search for user bind DN -->
+	<param>
+		<name>main.ldapRealm.userObjectClass</name>
+		<value>person</value>
+	</param>
+
+	<!-- search base used to search for user bind DN and groups -->
+	<param>
+		<name>main.ldapRealm.searchBase</name>
+		<value>dc=hadoop,dc=apache,dc=org</value>
+	</param>
+
+	<!-- search base used to search for user bind DN.
+	     Defaults to the value of main.ldapRealm.searchBase. 
+	     If main.ldapRealm.userSearchAttributeName is defined, 
+	     value for main.ldapRealm.searchBase or main.ldapRealm.userSearchBase 
+	     should be defined -->
+	<param>
+		<name>main.ldapRealm.userSearchBase</name>
+		<value>dc=hadoop,dc=apache,dc=org</value>
+	</param>
+
+	<!-- search base used to search for groups.
+	     Defaults to the value of main.ldapRealm.searchBase.
+		   If value of main.ldapRealm.authorizationEnabled is true,
+	     value for main.ldapRealm.searchBase or main.ldapRealm.groupSearchBase should be defined -->
+	<param>
+		<name>main.ldapRealm.groupSearchBase</name>
+		<value>dc=hadoop,dc=apache,dc=org</value>
+	</param>
+
+	<!-- optional, default value: groupOfNames
+	     Objectclass to identify group entries in ldap, used to build search 
+       filter to search for group entries -->
+	<param>
+		<name>main.ldapRealm.groupObjectClass</name>
+		<value>groupOfNames</value>
+	</param>
+  
+	<!-- optional, default value: member
+	     If value is memberUrl, we treat found groups as dynamic groups -->
+	<param>
+		<name>main.ldapRealm.memberAttribute</name>
+		<value>member</value>
+	</param>
+
+	<!-- optional, default value: uid={0}
+       Ignored if value is defined for main.ldapRealm.userSearchAttributeName -->
+  <param>
+    <name>main.ldapRealm.memberAttributeValueTemplate</name>
+    <value>uid={0},ou=people,dc=hadoop,dc=apache,dc=org</value>
+  </param>
+  
+	<!-- optional, default value: cn -->
+	<param>
+		<name>main.ldapRealm.groupIdAttribute</name>
+		<value>cn</value>
+	</param>
+
+	<param>
+		<name>urls./**</name>
+		<value>authcBasic</value>
+	</param>
+
+	<!-- optional, default value: 30min -->
+	<param>
+		<name>sessionTimeout</name>
+		<value>30</value>
+	</param>
+
+</provider>
+
+#### Special note on parameter main.ldapRealm.contextFactory.systemPassword
+
+The value for this could have one of the following two formats:
+
+plaintextpassword
+${ALIAS=ldcSystemPassword}
+
+The first format specifies the password in plain text in the provider configuration.
+Use of this format should be limited to testing and troubleshooting.
+
+We strongly recommend using the second format, ${ALIAS=ldcSystemPassword},
+in production. This format uses an alias for the password stored in the credential store.
+In the example ${ALIAS=ldcSystemPassword},
+ldcSystemPassword is the alias for the password stored in the credential store.
+
+Assuming the plain text password is "hadoop", and your topology file name is "hdp.xml",
+you would use the following command to create the right password alias in the credential store.
+
+$gateway_home/bin/knoxcli.sh  create-alias ldcSystemPassword --cluster hdp --value hadoop
+
+
+
+

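Editorial note on config_advanced_ldap.md above: the two ways of arriving at a bind DN (template substitution versus building a search filter) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the KnoxLdapRealm code. The function names are hypothetical, and the RFC 4515 style escaping of the login id is an addition for safety in this sketch; the source does not say whether the gateway escapes filter values.

```python
def bind_dn_from_template(user_dn_template, login_id):
    """Simple mode: replace the {0} placeholder with the login id."""
    return user_dn_template.replace("{0}", login_id)


def escape_filter_value(value):
    """Escape characters that are special in LDAP search filters (RFC 4515).

    Assumption of this sketch: the login id is escaped before it is
    placed in the filter.
    """
    escaped = []
    for ch in value:
        if ch in "\\*()\0":
            escaped.append("\\%02x" % ord(ch))
        else:
            escaped.append(ch)
    return "".join(escaped)


def user_search_filter(search_attribute, object_class, login_id):
    """Advanced mode: build the filter used to search for the user's entry."""
    return "(&(%s=%s)(objectclass=%s))" % (
        search_attribute,
        escape_filter_value(login_id),
        object_class,
    )


# Values taken from the examples in the document.
template_dn = bind_dn_from_template(
    "uid={0},ou=people,dc=hadoop,dc=apache,dc=org", "guest"
)
uid_filter = user_search_filter("uid", "person", "guest")
email_filter = user_search_filter("email", "person", "bill.clinton@gmail.com")
```

In the advanced mode the filter only locates the entry; the bind DN is whatever DN the directory returns for the matching entry, which is why the search attribute does not have to appear in the DN.
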
Added: knox/trunk/books/0.7.0/config_audit.md
URL: http://svn.apache.org/viewvc/knox/trunk/books/0.7.0/config_audit.md?rev=1681785&view=auto
==============================================================================
--- knox/trunk/books/0.7.0/config_audit.md (added)
+++ knox/trunk/books/0.7.0/config_audit.md Tue May 26 16:07:07 2015
@@ -0,0 +1,78 @@
+<!---
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+--->
+
+### Audit ###
+
+The Audit facility within the Knox Gateway tracks actions that are executed by Knox per user request or that are produced by Knox internal events such as topology deployment.
+The Knox Audit module is based on [Apache log4j](http://logging.apache.org/log4j/1.2/).
+
+#### Configuration needed ####
+
+Out of the box, the Knox Gateway includes preconfigured auditing capabilities. To change its configuration please read the following sections.
+
+#### Where audit logs go ####
+
+The audit module is preconfigured to write audit records to the log file `/var/log/knox/gateway-audit.log`.
+
+This behavior can be changed in the `conf/gateway-log4j.properties` file. `log4j.appender.auditfile.*` properties determine this behavior. For detailed information read [Apache log4j](http://logging.apache.org/log4j/1.2/).
+
+#### Audit format ####
+
+Out of the box, the audit record format is defined by org.apache.hadoop.gateway.audit.log4j.layout.AuditLayout.
+Its structure is as follows:
+
+	EVENT_PUBLISHING_TIME ROOT_REQUEST_ID|PARENT_REQUEST_ID|REQUEST_ID|LOGGER_NAME|TARGET_SERVICE_NAME|USER_NAME|PROXY_USER_NAME|SYSTEM_USER_NAME|ACTION|RESOURCE_TYPE|RESOURCE_NAME|OUTCOME|LOGGING_MESSAGE
+
+The audit record format can be changed by setting `log4j.appender.auditfile.layout` property in `conf/gateway-log4j.properties` to another class that extends org.apache.log4j.Layout or its subclasses.
+
+For detailed information read [Apache log4j](http://logging.apache.org/log4j/1.2/).
+
+##### How to interpret audit log #####
+
+Component | Description
+---------|-----------
+EVENT_PUBLISHING_TIME|Time when audit record was published.
+ROOT_REQUEST_ID|The root request ID if this is a sub-request. Currently it is empty.
+PARENT_REQUEST_ID|The parent request ID if this is a sub-request. Currently it is empty.
+REQUEST_ID|A unique value representing the current, active request. If the current request id value is different from the current parent request id value then the current request id value is moved to the parent request id before it is replaced by the provided request id. If the root request id is not set it will be set with the first non-null value of either the parent request id or the passed request id.
+LOGGER_NAME|The name of the logger
+TARGET_SERVICE_NAME|Name of Hadoop service. Can be empty if audit record is not linked to any Hadoop service, for example, audit record for topology deployment.
+USER_NAME|Name of user that initiated session with Knox
+PROXY_USER_NAME|Mapped user name. For detailed information read #[Identity Assertion].
+SYSTEM_USER_NAME|Currently empty.
+ACTION|Type of action that was executed. The following actions are defined: authentication, authorization, redeploy, deploy, undeploy, identity-mapping, dispatch, access.
+RESOURCE_TYPE|Type of resource for which the action was executed. The following resource types are defined: uri, topology, principal.
+RESOURCE_NAME|Name of the resource. For a resource of type topology it is the name of the topology. For a resource of type uri it is the inbound or dispatch request path. For a resource of type principal it is the name of the mapped user.
+OUTCOME|Action result type. The following outcomes are defined: success, failure, unavailable.
+LOGGING_MESSAGE|Logging message. Contains additional tracking information.
+
+#### Audit log rotation ####
+
+Audit logging is preconfigured with `org.apache.log4j.DailyRollingFileAppender`.
+[Apache log4j](http://logging.apache.org/log4j/1.2/) contains information about other Appenders.
+
+#### How to change audit level or disable it ####
+
+Audit configuration is stored in the `conf/gateway-log4j.properties` file.
+
+All audit messages are logged at `INFO` level and this behavior can't be changed.
+
+To change the audit configuration, modify the `log4j.logger.audit*` and `log4j.appender.auditfile*` properties in the `conf/gateway-log4j.properties` file.
+
+Their meaning can be found in [Apache log4j](http://logging.apache.org/log4j/1.2/).
+
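+Because audit messages are logged at `INFO`, auditing can be disabled by setting the log level for the audit logger or appender to a level above `INFO` (for example `OFF`).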

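Editorial note on the audit record format in config_audit.md above: a record in that layout can be split into named fields with a short Python sketch like the one below. It is only an illustration. The sample line is made up for this example (it is not real gateway output), and the sketch assumes that request ids contain no spaces and that only the trailing logging message may itself contain a `|` character.

```python
# Field names after EVENT_PUBLISHING_TIME, in the order given by the layout.
FIELDS = [
    "ROOT_REQUEST_ID", "PARENT_REQUEST_ID", "REQUEST_ID", "LOGGER_NAME",
    "TARGET_SERVICE_NAME", "USER_NAME", "PROXY_USER_NAME", "SYSTEM_USER_NAME",
    "ACTION", "RESOURCE_TYPE", "RESOURCE_NAME", "OUTCOME", "LOGGING_MESSAGE",
]


def parse_audit_record(line):
    """Split one audit line into a dict keyed by the documented field names."""
    # Limit the split so that any '|' inside the message stays in the message.
    parts = line.rstrip("\n").split("|", len(FIELDS) - 1)
    if len(parts) != len(FIELDS):
        raise ValueError("expected %d '|' separated fields" % len(FIELDS))
    # The first part holds the timestamp, a space, then the root request id.
    # The timestamp may contain spaces, so split on the last space only.
    event_time, _, root_request_id = parts[0].rpartition(" ")
    record = {"EVENT_PUBLISHING_TIME": event_time}
    record.update(zip(FIELDS, [root_request_id] + parts[1:]))
    return record


# Hypothetical sample record, constructed to match the documented layout.
SAMPLE = (
    "15/05/26 16:07:07 |||audit|WEBHDFS|guest|||access|uri|"
    "/gateway/sandbox/webhdfs/v1/tmp|success|Response status: 200"
)
record = parse_audit_record(SAMPLE)
```

Fields that the document describes as currently empty (for example ROOT_REQUEST_ID and SYSTEM_USER_NAME) simply come back as empty strings.
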
Added: knox/trunk/books/0.7.0/config_authn.md
URL: http://svn.apache.org/viewvc/knox/trunk/books/0.7.0/config_authn.md?rev=1681785&view=auto
==============================================================================
--- knox/trunk/books/0.7.0/config_authn.md (added)
+++ knox/trunk/books/0.7.0/config_authn.md Tue May 26 16:07:07 2015
@@ -0,0 +1,161 @@
+<!---
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+--->
+
+### Authentication ###
+
+There are two types of providers supported in Knox for establishing a user's identity:
+
+1. Authentication Providers
+2. Federation Providers
+
+Authentication providers directly accept a user's credentials and validate them against some particular user store. Federation providers, on the other hand, validate a token that has been issued for the user by a trusted Identity Provider (IdP).
+
+The current release of Knox ships with an authentication provider based on the Apache Shiro project and is initially configured for BASIC authentication against an LDAP store. This has been specifically tested against Apache Directory Server and Active Directory.
+
+This section will cover the general approach to leveraging Shiro within the bundled provider including:
+
+1. General mapping of provider config to shiro.ini config
+2. Specific configuration for the bundled BASIC/LDAP configuration
+3. Some tips into what may need to be customized for your environment
+4. How to set up the use of LDAP over SSL or LDAPS
+
+#### General Configuration for Shiro Provider ####
+
+As is described in the configuration section of this document, providers have a name-value based configuration - as is the common pattern in the rest of Hadoop.
+
+The following example shows the format of the configuration for a given provider:
+
+    <provider>
+        <role>authentication</role>
+        <name>ShiroProvider</name>
+        <enabled>true</enabled>
+        <param>
+            <name>{name}</name>
+            <value>{value}</value>
+        </param>
+    </provider>
+
+Conversely, the Shiro provider currently expects a shiro.ini file in the web-inf directory of the cluster specific web application.
+
+The following example illustrates a configuration of the bundled BASIC/LDAP authentication config in a shiro.ini file:
+
+	[urls]
+	/**=authcBasic
+	[main]
+	ldapRealm=org.apache.shiro.realm.ldap.JndiLdapRealm
+	ldapRealm.contextFactory.authenticationMechanism=simple
+	ldapRealm.contextFactory.url=ldap://localhost:33389
+	ldapRealm.userDnTemplate=uid={0},ou=people,dc=hadoop,dc=apache,dc=org
+
+In order to fit into the context of an INI file format, at deployment time we interrogate the parameters provided in the provider configuration and parse the INI section out of the parameter names. The following provider config illustrates this approach. Notice that the section names in the above shiro.ini match the beginning of the param names that are in the following config:
+
+    <gateway>
+        <provider>
+            <role>authentication</role>
+            <name>ShiroProvider</name>
+            <enabled>true</enabled>
+            <param>
+                <name>main.ldapRealm</name>
+                <value>org.apache.shiro.realm.ldap.JndiLdapRealm</value>
+            </param>
+            <param>
+                <name>main.ldapRealm.userDnTemplate</name>
+                <value>uid={0},ou=people,dc=hadoop,dc=apache,dc=org</value>
+            </param>
+            <param>
+                <name>main.ldapRealm.contextFactory.url</name>
+                <value>ldap://localhost:33389</value>
+            </param>
+            <param>
+                <name>main.ldapRealm.contextFactory.authenticationMechanism</name>
+                <value>simple</value>
+            </param>
+            <param>
+                <name>urls./**</name>
+                <value>authcBasic</value>
+            </param>
+        </provider>
+    </gateway>
+
+This is how we currently configure Shiro for BASIC/LDAP authentication. The same approach may be used to configure other authentication mechanisms or variations on this one. However, we have not tested additional uses of it for this release.
+
+#### LDAP Configuration ####
+
+This section discusses the LDAP configuration used above for the Shiro Provider. Some of these configuration elements will need to be customized to reflect your deployment environment.
+
+**main.ldapRealm** - this element indicates the fully qualified class name of the Shiro realm to be used in authenticating the user. The class name provided by default in the sample is `org.apache.shiro.realm.ldap.JndiLdapRealm`. This implementation provides the ability to authenticate but by default has authorization disabled. In order to provide authorization - which is seen by Shiro as dependent on an LDAP schema that is specific to each organization - an extension of JndiLdapRealm is generally used to override and implement the doGetAuthorizationInfo method. In this particular release we are providing a simple authorization provider that can be used along with the Shiro authentication provider.
+
+**main.ldapRealm.userDnTemplate** - in order to bind a simple username to an LDAP server that generally requires a full distinguished name (DN), we must provide the template into which the simple username will be inserted. This template allows for the creation of a DN by injecting the simple username into the user-identifying portion of the DN (the `uid` attribute in the sample; other directories commonly use `cn`). **This element will need to be customized to reflect your deployment environment.** The template provided in the sample is only an example and is valid only within the LDAP schema distributed with Knox, which is represented by the users.ldif file in the {GATEWAY_HOME}/conf directory.
+
+**main.ldapRealm.contextFactory.url** - this element is the URL that represents the host and port of the LDAP server. It also includes the scheme of the protocol to use. This may be either ldap or ldaps depending on whether you are communicating with the LDAP server over SSL (highly recommended). **This element will need to be customized to reflect your deployment environment.**
+
+**main.ldapRealm.contextFactory.authenticationMechanism** - this element indicates the type of authentication that should be performed against the LDAP server. The current default value is `simple` which indicates a simple bind operation. This element should not need to be modified and no mechanism other than a simple bind has been tested for this particular release.
+
+**urls./**** - this element represents a single URL_Ant_Path_Expression, and its value is the Shiro filter chain to apply to matching paths. This particular sample indicates that all paths into the application have the same Shiro filter chain applied. The paths are relative to the application context path. The use of the value `authcBasic` here indicates that BASIC authentication is expected for every path into the application. To add a Shiro filter to that chain that validates that the request is secure (isSecure()) and made over SSL, change the value to `ssl, authcBasic`. It is not likely that you need to change this element for your environment.
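+
+As a sketch of the change just described, the `urls./**` param with the SSL filter added would look like the following. It only restates the `ssl, authcBasic` value from the text above; the bundled sample uses `authcBasic` alone.
+
+```xml
+<param>
+    <name>urls./**</name>
+    <!-- Shiro applies the filters in the order listed: ssl first, then authcBasic -->
+    <value>ssl, authcBasic</value>
+</param>
+```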
+
+#### Active Directory - Special Note ####
+
+Use the LDAP configuration documented above to authenticate against Active Directory as well.
+
+Some Active Directory specific things to keep in mind:
+
+A typical AD main.ldapRealm.userDnTemplate value looks slightly different, such as
+
+    cn={0},cn=users,DC=lab,DC=sample,dc=com
+
+Compare this with a typical Apache DS main.ldapRealm.userDnTemplate value and note the difference:
+
+    uid={0},ou=people,dc=hadoop,dc=apache,dc=org
+
+If your AD is configured to authenticate based on just the cn and password and does not require a user DN, you do not have to specify a value for main.ldapRealm.userDnTemplate.
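+
+In topology form, the Active Directory template shown above would be supplied as a ShiroProvider param. This is a minimal sketch; the DN components (`lab`, `sample`, `com`) are the illustrative values from this section and must be replaced with those of your own domain.
+
+```xml
+<param>
+    <name>main.ldapRealm.userDnTemplate</name>
+    <!-- Example AD template from this section; adjust the DC components for your domain -->
+    <value>cn={0},cn=users,DC=lab,DC=sample,dc=com</value>
+</param>
+```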
+
+
+#### LDAP over SSL (LDAPS) Configuration ####
+In order to communicate with your LDAP server over SSL (again, highly recommended), you will need to modify the topology file in a couple of ways and possibly provision some keying material.
+
+1. **main.ldapRealm.contextFactory.url** must be changed to have the `ldaps` protocol scheme and the port must be the SSL listener port on your LDAP server.
+2. Identity certificate (keypair) provisioned to LDAP server - your LDAP server specific documentation should indicate what is required for providing a cert or keypair to represent the LDAP server identity to connecting clients.
+3. Trusting the LDAP Server's public key - if the LDAP Server's identity certificate is issued by a well known and trusted certificate authority and is already represented in the JRE's cacerts truststore then you don't need to do anything for trusting the LDAP server's cert. If, however, the cert is self-signed or issued by an untrusted authority you will need to either add it to the cacerts keystore or to another truststore that you may direct Knox to utilize through a system property.
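+
+For step 1, the changed param might look like the following sketch. The host name `ldap.example.com` is a placeholder and port 636 is only the conventional LDAPS port; both are assumptions, so use the SSL listener address of your own LDAP server.
+
+```xml
+<param>
+    <name>main.ldapRealm.contextFactory.url</name>
+    <!-- Assumed host and port; replace with your LDAP server's SSL listener -->
+    <value>ldaps://ldap.example.com:636</value>
+</param>
+```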
+
+#### Session Configuration ####
+
+Knox maps each cluster topology to a web application and leverages standard JavaEE session management.
+
+To configure the session idle timeout for the topology, specify a value for the sessionTimeout parameter of the ShiroProvider in your topology file.  If you do not specify a value for this parameter, it defaults to 30 minutes.
+
+The definition would look like the following in the topology file:
+
+    ...
+    <provider>
+        <role>authentication</role>
+        <name>ShiroProvider</name>
+        <enabled>true</enabled>
+        <param>
+            <!--
+            Session timeout in minutes. This is really idle timeout.
+            Defaults to 30 minutes, if the property value is not defined.
+            Current client authentication will expire if client idles
+            continuously for more than this value
+            -->
+            <name>sessionTimeout</name>
+            <value>30</value>
+        </param>
+    </provider>
+    ...
+
+
+At present, ShiroProvider in Knox leverages the JavaEE session to maintain authentication state for a user across requests using the JSESSIONID cookie.  So, a client that has authenticated with Knox can pass the JSESSIONID cookie with repeated requests, as long as the session has not timed out, instead of submitting userid/password with every request.  Presenting a valid session cookie in place of userid/password also performs better because additional credential store lookups are avoided.
+
+
+

Added: knox/trunk/books/0.7.0/config_authz.md
URL: http://svn.apache.org/viewvc/knox/trunk/books/0.7.0/config_authz.md?rev=1681785&view=auto
==============================================================================
--- knox/trunk/books/0.7.0/config_authz.md (added)
+++ knox/trunk/books/0.7.0/config_authz.md Tue May 26 16:07:07 2015
@@ -0,0 +1,349 @@
+<!---
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+--->
+
+### Authorization ###
+
+#### Service Level Authorization ####
+
+The Knox Gateway has an out-of-the-box authorization provider that allows administrators to restrict access to the individual services within a Hadoop cluster.
+
+This provider utilizes a simple and familiar pattern of using ACLs to protect Hadoop resources by specifying users, groups and IP addresses that are permitted access.
+
+Note: In the examples below \{serviceName\} represents a real service name (e.g. WEBHDFS) and would be replaced with these values in an actual configuration.
+
+##### Usecases #####
+
+###### USECASE-1: Restrict access to specific Hadoop services to specific Users
+
+    <param>
+        <name>{serviceName}.acl</name>
+        <value>guest;*;*</value>
+    </param>
+
+###### USECASE-2: Restrict access to specific Hadoop services to specific Groups
+
+    <param>
+        <name>{serviceName}.acl</name>
+        <value>*;admins;*</value>
+    </param>
+
+###### USECASE-3: Restrict access to specific Hadoop services to specific Remote IPs
+
+    <param>
+        <name>{serviceName}.acl</name>
+        <value>*;*;127.0.0.1</value>
+    </param>
+
+###### USECASE-4: Restrict access to specific Hadoop services to specific Users OR users within specific Groups
+
+    <param>
+        <name>{serviceName}.acl.mode</name>
+        <value>OR</value>
+    </param>
+    <param>
+        <name>{serviceName}.acl</name>
+        <value>guest;admin;*</value>
+    </param>
+
+###### USECASE-5: Restrict access to specific Hadoop services to specific Users OR users from specific Remote IPs
+
+    <param>
+        <name>{serviceName}.acl.mode</name>
+        <value>OR</value>
+    </param>
+    <param>
+        <name>{serviceName}.acl</name>
+        <value>guest;*;127.0.0.1</value>
+    </param>
+
+###### USECASE-6: Restrict access to specific Hadoop services to users within specific Groups OR from specific Remote IPs
+
+    <param>
+        <name>{serviceName}.acl.mode</name>
+        <value>OR</value>
+    </param>
+    <param>
+        <name>{serviceName}.acl</name>
+        <value>*;admin;127.0.0.1</value>
+    </param>
+
+###### USECASE-7: Restrict access to specific Hadoop services to specific Users OR users within specific Groups OR from specific Remote IPs
+
+    <param>
+        <name>{serviceName}.acl.mode</name>
+        <value>OR</value>
+    </param>
+    <param>
+        <name>{serviceName}.acl</name>
+        <value>guest;admin;127.0.0.1</value>
+    </param>
+
+###### USECASE-8: Restrict access to specific Hadoop services to specific Users AND users within specific Groups
+
+    <param>
+        <name>{serviceName}.acl</name>
+        <value>guest;admin;*</value>
+    </param>
+
+###### USECASE-9: Restrict access to specific Hadoop services to specific Users AND users from specific Remote IPs
+
+    <param>
+        <name>{serviceName}.acl</name>
+        <value>guest;*;127.0.0.1</value>
+    </param>
+
+###### USECASE-10: Restrict access to specific Hadoop services to users within specific Groups AND from specific Remote IPs
+
+    <param>
+        <name>{serviceName}.acl</name>
+        <value>*;admins;127.0.0.1</value>
+    </param>
+
+###### USECASE-11: Restrict access to specific Hadoop services to specific Users AND users within specific Groups AND from specific Remote IPs
+
+    <param>
+        <name>{serviceName}.acl</name>
+        <value>guest;admins;127.0.0.1</value>
+    </param>
+
+#### Configuration ####
+
+ACLs are bound to services within the topology descriptors by introducing the authorization provider with configuration like:
+
+    <provider>
+        <role>authorization</role>
+        <name>AclsAuthz</name>
+        <enabled>true</enabled>
+    </provider>
+
+The above configuration enables the authorization provider but does not indicate any ACLs yet, and therefore there is no restriction on accessing the Hadoop services. In order to indicate the resources to be protected and the specific users, groups or IP addresses to grant access, we need to provide parameters like the following:
+
+    <param>
+        <name>{serviceName}.acl</name>
+        <value>username[,*|username...];group[,*|group...];ipaddr[,*|ipaddr...]</value>
+    </param>
+    
+where `{serviceName}` would need to be the name of a configured Hadoop service within the topology.
+
+NOTE: ipaddr is unique among the parts of the ACL in that you are able to specify a wildcard within an ipaddr to indicate that the remote address must begin with the string prior to the asterisk within the ipaddr acl. For instance:
+
+    <param>
+        <name>{serviceName}.acl</name>
+        <value>*;*;192.168.*</value>
+    </param>
+    
+This indicates that the request must come from an IP address that begins with '192.168.' in order to be granted access.
+
+Note also that configuration without any ACLs defined is equivalent to:
+
+    <param>
+        <name>{serviceName}.acl</name>
+        <value>*;*;*</value>
+    </param>
+
+meaning: all users, groups and IPs have access.
+Each of the elements of the acl param supports multiple values via a comma-separated list and the `*` wildcard to match any.
+
+For instance:
+
+    <param>
+        <name>webhdfs.acl</name>
+        <value>hdfs;admin;127.0.0.2,127.0.0.3</value>
+    </param>
+
+This configuration indicates that ALL of the following must be satisfied to be granted access:
+
+1. the user is "hdfs" AND
+2. the user is in the "admin" group AND
+3. the request is coming from either 127.0.0.2 or 127.0.0.3
+
+This allows us to craft policy that restricts the members of a large group to a subset that should have access.
+Removing a user from the group causes access to be denied even though their username is still in the ACL.
+
+An additional configuration element may be used to alter the processing of the ACL to be OR instead of the default AND behavior:
+
+    <param>
+        <name>{serviceName}.acl.mode</name>
+        <value>OR</value>
+    </param>
+
+This processing behavior requires that the effective user satisfy at least one of the parts of the ACL definition in order to be granted access.
+For instance:
+
+    <param>
+        <name>webhdfs.acl</name>
+        <value>hdfs,guest;admin;127.0.0.2,127.0.0.3</value>
+    </param>
+
+With the OR mode in effect, this configuration indicates that ONE of the following must be satisfied to be granted access:
+
+1. the user is "hdfs" or "guest" OR
+2. the user is in the "admin" group OR
+3. the request is coming from 127.0.0.2 or 127.0.0.3
+
+You may also set the ACL processing mode at the top level for the topology. This essentially sets the default for the managed cluster.
+It may then be overridden at the service level as well.
+
+    <param>
+        <name>acl.mode</name>
+        <value>OR</value>
+    </param>
+
+#### Other Related Configuration ####
+
+The principal mapping aspect of the identity assertion provider is important to understand in order to fully utilize the authorization features of this provider.
+
+This feature allows us to map the authenticated principal to a run-as or impersonated principal to be asserted to the Hadoop services in the backend.
+When a principal mapping is defined that results in an impersonated principal being created, the impersonated principal is then the effective principal.
+If there is no mapping to another principal then the authenticated or primary principal is then the effective principal.
+Principal mapping has actually been available in the identity assertion provider from the beginning of Knox and is documented fully in the Identity Assertion section of this guide.
+
+    <param>
+        <name>principal.mapping</name>
+        <value>{primaryPrincipal}[,...]={impersonatedPrincipal}[;...]</value>
+    </param>
+
+For instance:
+
+    <param>
+        <name>principal.mapping</name>
+        <value>guest=hdfs</value>
+    </param>
+
+In addition, we allow the administrator to map groups to effective principals. This is done through another param within the identity assertion provider:
+
+    <param>
+        <name>group.principal.mapping</name>
+        <value>{userName[,*|userName...]}={groupName[,groupName...]}[;...]</value>
+    </param>
+
+For instance:
+
+    <param>
+        <name>group.principal.mapping</name>
+        <value>*=users;hdfs=admin</value>
+    </param>
+
+This configuration indicates that all (*) authenticated users are members of the "users" group and that user "hdfs" is a member of the "admin" group. Group principal mapping has been added along with the authorization provider described in this document.
+
+For more information on principal and group principal mapping see the Identity Assertion section of this guide.
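+
+As a small sketch of how group principal mapping feeds the ACL check, assuming the semantics described in this document, the following pairs the group mapping above with a group-only ACL. With these params, user "hdfs" is mapped to the "admin" group and passes the `webhdfs.acl` check, while other users are mapped only to "users" and are denied. The provider layout mirrors the examples in this guide; the ACL value itself is illustrative.
+
+```xml
+<provider>
+    <role>identity-assertion</role>
+    <name>Default</name>
+    <enabled>true</enabled>
+    <param>
+        <name>group.principal.mapping</name>
+        <!-- every user is in "users"; hdfs is additionally in "admin" -->
+        <value>*=users;hdfs=admin</value>
+    </param>
+</provider>
+<provider>
+    <role>authorization</role>
+    <name>AclsAuthz</name>
+    <enabled>true</enabled>
+    <param>
+        <name>webhdfs.acl</name>
+        <!-- any user name, but the user must be in the "admin" group; any remote IP -->
+        <value>*;admin;*</value>
+    </param>
+</provider>
+```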
+
+These additional mapping capabilities are used together with the authorization ACL policy.
+An example of a full topology that illustrates these together is below.
+
+    <topology>
+        <gateway>
+            <provider>
+                <role>authentication</role>
+                <name>ShiroProvider</name>
+                <enabled>true</enabled>
+                <param>
+                    <name>main.ldapRealm</name>
+                    <value>org.apache.shiro.realm.ldap.JndiLdapRealm</value>
+                </param>
+                <param>
+                    <name>main.ldapRealm.userDnTemplate</name>
+                    <value>uid={0},ou=people,dc=hadoop,dc=apache,dc=org</value>
+                </param>
+                <param>
+                    <name>main.ldapRealm.contextFactory.url</name>
+                    <value>ldap://localhost:33389</value>
+                </param>
+                <param>
+                    <name>main.ldapRealm.contextFactory.authenticationMechanism</name>
+                    <value>simple</value>
+                </param>
+                <param>
+                    <name>urls./**</name>
+                    <value>authcBasic</value>
+                </param>
+            </provider>
+            <provider>
+                <role>identity-assertion</role>
+                <name>Default</name>
+                <enabled>true</enabled>
+                <param>
+                    <name>principal.mapping</name>
+                    <value>guest=hdfs;</value>
+                </param>
+                <param>
+                    <name>group.principal.mapping</name>
+                    <value>*=users;hdfs=admin</value>
+                </param>
+            </provider>
+            <provider>
+                <role>authorization</role>
+                <name>AclsAuthz</name>
+                <enabled>true</enabled>
+                <param>
+                    <name>acl.mode</name>
+                    <value>OR</value>
+                </param>
+                <param>
+                    <name>webhdfs.acl.mode</name>
+                    <value>AND</value>
+                </param>
+                <param>
+                    <name>webhdfs.acl</name>
+                    <value>hdfs;admin;127.0.0.2,127.0.0.3</value>
+                </param>
+                <param>
+                    <name>webhcat.acl</name>
+                    <value>hdfs;admin;127.0.0.2,127.0.0.3</value>
+                </param>
+            </provider>
+            <provider>
+                <role>hostmap</role>
+                <name>static</name>
+                <enabled>true</enabled>
+                <param>
+                    <name>localhost</name>
+                    <value>sandbox,sandbox.hortonworks.com</value>
+                </param>
+            </provider>
+        </gateway>
+
+        <service>
+            <role>JOBTRACKER</role>
+            <url>rpc://localhost:8050</url>
+        </service>
+
+        <service>
+            <role>WEBHDFS</role>
+            <url>http://localhost:50070/webhdfs</url>
+        </service>
+
+        <service>
+            <role>WEBHCAT</role>
+            <url>http://localhost:50111/templeton</url>
+        </service>
+
+        <service>
+            <role>OOZIE</role>
+            <url>http://localhost:11000/oozie</url>
+        </service>
+
+        <service>
+            <role>WEBHBASE</role>
+            <url>http://localhost:60080</url>
+        </service>
+
+        <service>
+            <role>HIVE</role>
+            <url>http://localhost:10001/cliservice</url>
+        </service>
+    </topology>

Added: knox/trunk/books/0.7.0/config_ha.md
URL: http://svn.apache.org/viewvc/knox/trunk/books/0.7.0/config_ha.md?rev=1681785&view=auto
==============================================================================
--- knox/trunk/books/0.7.0/config_ha.md (added)
+++ knox/trunk/books/0.7.0/config_ha.md Tue May 26 16:07:07 2015
@@ -0,0 +1,124 @@
+<!---
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+--->
+
+### High Availability ###
+
+#### Configure Knox instances ####
+
+All Knox instances must be synced to use the same topology credential keystores.
+These files are located at {GATEWAY_HOME}/conf/security/keystores/{TOPOLOGY_NAME}-credentials.jceks.
+They are generated after the first topology deployment.
+Currently these files can only be synced manually; there is no automation tool.
+Here are the steps to sync the topology credential keystores:
+
+1. Choose the Knox instance that will be the source for the topology credential keystores. Let's call it the keystores master.
+1. Replace the topology credential keystores in the other Knox instances with those from the keystores master.
+1. Restart the Knox instances.
+
+#### High Availability with Apache HTTP Server + mod_proxy + mod_proxy_balancer ####
+
+##### 1 - Requirements #####
+
+###### openssl-devel ######
+
+openssl-devel is required for Apache Module mod_ssl.
+
+    sudo yum install openssl-devel
+
+###### Apache HTTP Server ######
+
+Apache HTTP Server 2.4.6 or later is required. See this document for installing and setting up Apache HTTP Server: http://httpd.apache.org/docs/2.4/install.html
+
+Hint: pass --enable-ssl to the ./configure command to build the Apache module mod_ssl.
+
+###### Apache Module mod_proxy ######
+
+See this document for setting up Apache Module mod_proxy: http://httpd.apache.org/docs/2.4/mod/mod_proxy.html
+
+###### Apache Module mod_proxy_balancer ######
+
+See this document for setting up Apache Module mod_proxy_balancer: http://httpd.apache.org/docs/2.4/mod/mod_proxy_balancer.html
+
+###### Apache Module mod_ssl ######
+
+See this document for setting up Apache Module mod_ssl: http://httpd.apache.org/docs/2.4/mod/mod_ssl.html
+
+##### 2 - Configuration example #####
+
+###### Generate certificate for Apache HTTP Server ######
+
+See this document for an example: http://www.akadia.com/services/ssh_test_certificate.html
+
+By convention, Apache HTTP Server and Knox certificates are put into the /etc/apache2/ssl/ folder.
+
+###### Update Apache HTTP Server configuration file ######
+
+This file is located under {APACHE_HOME}/conf/httpd.conf.
+
+The following directives have to be added or uncommented in the configuration file:
+
+* LoadModule proxy_module modules/mod_proxy.so
+* LoadModule proxy_http_module modules/mod_proxy_http.so
+* LoadModule proxy_balancer_module modules/mod_proxy_balancer.so
+* LoadModule ssl_module modules/mod_ssl.so
+* LoadModule lbmethod_byrequests_module modules/mod_lbmethod_byrequests.so
+* LoadModule lbmethod_bytraffic_module modules/mod_lbmethod_bytraffic.so
+* LoadModule lbmethod_bybusyness_module modules/mod_lbmethod_bybusyness.so
+* LoadModule lbmethod_heartbeat_module modules/mod_lbmethod_heartbeat.so
+* LoadModule slotmem_shm_module modules/mod_slotmem_shm.so
+
+The following lines also have to be added to the file. Replace the placeholders (${...}) with real data:
+
+    Listen 443
+    <VirtualHost *:443>
+       SSLEngine On
+       SSLProxyEngine On
+       SSLCertificateFile ${PATH_TO_CERTIFICATE_FILE}
+       SSLCertificateKeyFile ${PATH_TO_CERTIFICATE_KEY_FILE}
+       SSLProxyCACertificateFile ${PATH_TO_PROXY_CA_CERTIFICATE_FILE}
+
+       ProxyRequests Off
+       ProxyPreserveHost Off
+
+       Header add Set-Cookie "ROUTEID=.%{BALANCER_WORKER_ROUTE}e; path=/" env=BALANCER_ROUTE_CHANGED
+       <Proxy balancer://mycluster>
+         BalancerMember ${HOST_#1} route=1
+         BalancerMember ${HOST_#2} route=2
+         ...
+         BalancerMember ${HOST_#N} route=N
+
+         ProxySet failontimeout=On lbmethod=${LB_METHOD} stickysession=ROUTEID 
+       </Proxy>
+
+       ProxyPass / balancer://mycluster/
+       ProxyPassReverse / balancer://mycluster/
+    </VirtualHost>
+
+Note:
+
+* SSLProxyEngine enables SSL between Apache HTTP Server and Knox instances;
+* SSLCertificateFile and SSLCertificateKeyFile have to point to the certificate data of Apache HTTP Server. Users will use this certificate when communicating with Apache HTTP Server;
+* SSLProxyCACertificateFile has to point to Knox certificates.
+
+###### Start/stop Apache HTTP Server ######
+
+    {APACHE_HOME}/bin/apachectl -k start
+    {APACHE_HOME}/bin/apachectl -k stop
+
+###### Verify ######
+
+Use the Knox samples to verify the setup.


