knox-commits mailing list archives

From m...@apache.org
Subject svn commit: r1850181 [6/13] - in /knox: site/books/knox-1-3-0/ site/books/knox-1-3-0/adminui/ trunk/books/1.3.0/ trunk/books/1.3.0/dev-guide/ trunk/books/1.3.0/img/ trunk/books/1.3.0/img/adminui/
Date Wed, 02 Jan 2019 17:31:31 GMT
Added: knox/trunk/books/1.3.0/book_limitations.md
URL: http://svn.apache.org/viewvc/knox/trunk/books/1.3.0/book_limitations.md?rev=1850181&view=auto
==============================================================================
--- knox/trunk/books/1.3.0/book_limitations.md (added)
+++ knox/trunk/books/1.3.0/book_limitations.md Wed Jan  2 17:31:29 2019
@@ -0,0 +1,39 @@
+<!---
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+--->
+
+## Limitations ##
+
+
+### Secure Oozie POST/PUT Request Payload Size Restriction ###
+
+With one exception there are no known size limits for request or response payloads that pass through the gateway.
+The exception involves POST or PUT request payload sizes for Oozie in a Kerberos secured Hadoop cluster.
+In this one case there is currently a 4 KB payload size limit for the first request made to the Hadoop cluster.
+This is a result of how the gateway negotiates a trust relationship between itself and the cluster via SPNEGO.
+There is an undocumented configuration setting to modify this limit's value if required.
+In the future this will be made more easily configurable and at that time it will be documented.
+
+### Group Membership Propagation ###
+
+Groups that are acquired via Shiro Group Lookup and/or Identity Assertion Group Principal
Mapping are not propagated to the Hadoop services.
+Therefore, groups used for Service Level Authorization policy may not match those acquired
within the cluster via GroupMappingServiceProvider plugins.
+
+### Knox Consumer Restriction ###
+
+Consumption of Kafka messages via Knox is not supported at this time.
+The Confluent Kafka REST Proxy that Knox relies upon is stateful when used for consumption of messages.
+

Added: knox/trunk/books/1.3.0/book_service-details.md
URL: http://svn.apache.org/viewvc/knox/trunk/books/1.3.0/book_service-details.md?rev=1850181&view=auto
==============================================================================
--- knox/trunk/books/1.3.0/book_service-details.md (added)
+++ knox/trunk/books/1.3.0/book_service-details.md Wed Jan  2 17:31:29 2019
@@ -0,0 +1,97 @@
+<!---
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
+
+## Service Details ##
+
+In the sections that follow, the integrations currently available out of the box with the gateway will be described.
+In general these sections will include examples that demonstrate how to access each of these services via the gateway.
+In many cases this will include both the use of [cURL][curl] as a REST API client as well as the use of the Knox Client DSL.
+You may notice that there are some minor differences between using the REST API of a given service directly and using it via the gateway.
+In general this is necessary in order to achieve the goal of not leaking internal Hadoop cluster details to the client.
+
+Keep in mind that the gateway uses a plugin model for supporting Hadoop services.
+Check back with the [Apache Knox][site] site for the latest news on plugin availability.
+You can also create your own custom plugin to extend the capabilities of the gateway.
+
+These are the current Hadoop services with built-in support.
+
+* #[WebHDFS]
+* #[WebHCat]
+* #[Oozie]
+* #[HBase]
+* #[Hive]
+* #[Yarn]
+* #[Kafka]
+* #[Storm]
+* #[Solr]
+* #[Avatica]
+* #[Livy Server]
+* #[Elasticsearch]
+
+### Assumptions
+
+This document assumes a few things about your environment in order to simplify the examples.
+
+* The JVM is executable as simply `java`.
+* The Apache Knox Gateway is installed and functional.
+* The example commands are executed within the context of the `GATEWAY_HOME` current directory.
+The `GATEWAY_HOME` directory is the directory within the Apache Knox Gateway installation
that contains the README file and the bin, conf and deployments directories.
+* The [cURL][curl] command line HTTP client utility is installed and functional.
+* A few examples optionally require the use of commands from a standard Groovy installation.
+These examples are optional but to try them you will need Groovy [installed](http://groovy.codehaus.org/Installing+Groovy).
+* The default configuration for all of the samples is set up for use with Hortonworks' [Sandbox][sandbox] version 2.
+
+### Customization
+
+Using these samples with other Hadoop installations will require changes to the steps described here as well as changes to referenced sample scripts.
+This will also likely require changes to the gateway's default configuration.
+In particular host names, ports, user names and passwords may need to be changed to match your environment.
+These changes may need to be made to the gateway configuration and also to the Groovy sample script files in the distribution.
+All of the values that may need to be customized in the sample scripts can be found together at the top of each of these files.
+
+### cURL
+
+The cURL HTTP client command line utility is used extensively in the examples for each service.
+In particular this form of the cURL command line is used repeatedly.
+
+    curl -i -k -u guest:guest-password ...
+
+The option `-i` (aka `--include`) is used to output HTTP response header information.
+This will be important when the content of the HTTP Location header is required for subsequent
requests.
+
+The option `-k` (aka `--insecure`) is used to avoid any issues resulting from the use of
demonstration SSL certificates.
+
+The option `-u` (aka `--user`) is used to provide the credentials to be used when the client
is challenged by the gateway.
+
+Keep in mind that the samples do not use the cookie features of cURL for the sake of simplicity.
+Therefore each request via cURL will result in an authentication.
+
+<<service_webhdfs.md>>
+<<service_webhcat.md>>
+<<service_oozie.md>>
+<<service_hbase.md>>
+<<service_hive.md>>
+<<service_yarn.md>>
+<<service_kafka.md>>
+<<service_storm.md>>
+<<service_solr.md>>
+<<service_config.md>>
+<<service_default_ha.md>>
+<<service_avatica.md>>
+<<service_livy.md>>
+<<service_elasticsearch.md>>
+<<service_service_test.md>>

Added: knox/trunk/books/1.3.0/book_topology_port_mapping.md
URL: http://svn.apache.org/viewvc/knox/trunk/books/1.3.0/book_topology_port_mapping.md?rev=1850181&view=auto
==============================================================================
--- knox/trunk/books/1.3.0/book_topology_port_mapping.md (added)
+++ knox/trunk/books/1.3.0/book_topology_port_mapping.md Wed Jan  2 17:31:29 2019
@@ -0,0 +1,35 @@
+#### Topology Port Mapping ####
+This feature allows a topology to be mapped to a port, so that a specific topology listens on a configured port.
+It routes URLs to these port-mapped topologies without the additional context that the gateway uses for differentiating one Hadoop cluster from another,
+just like the #[Default Topology URLs] feature, but on a dedicated port.
+
+The configuration for Topology Port Mapping goes in the `gateway-site.xml` file and uses the property name and value model.
+The format for the property name is `gateway.port.mapping.{topologyName}` and the value is the port number that this topology will listen on.
+
+In the following example, the topology `development` will listen on 9443 (if the port is
not already taken).
+
+      <property>
+          <name>gateway.port.mapping.development</name>
+          <value>9443</value>
+          <description>Topology and Port mapping</description>
+      </property>
+
+With the above configuration, WebHDFS can be accessed using any of the following URLs:
+
+     https://{gateway-host}:9443/webhdfs
+     https://{gateway-host}:9443/{gateway-path}/development/webhdfs
+     https://{gateway-host}:{gateway-port}/{gateway-path}/development/webhdfs
+
+All of the above URLs are valid for the configuration described above.
+
+This feature is turned on by default. To turn it off, set the property `gateway.port.mapping.enabled` to `false`, e.g.
+
+     <property>
+         <name>gateway.port.mapping.enabled</name>
+         <value>false</value>
+         <description>Enable/Disable port mapping feature.</description>
+     </property>
+
+If a topology mapped port is in use by another topology or process, an ERROR message is logged and gateway startup continues as normal.
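+
As a sketch of the property format described above, the shell helper below prints a `gateway.port.mapping.{topologyName}` property for a given topology name and port. The function name `port_mapping_property` is hypothetical and not part of Knox; the output is only meant to be pasted into `gateway-site.xml` by hand.

```shell
#!/bin/sh
# Hypothetical helper (not part of Knox): print a port mapping property
# for gateway-site.xml, given a topology name and a port number.
port_mapping_property() {
    topology="$1"
    port="$2"
    printf '<property>\n'
    printf '    <name>gateway.port.mapping.%s</name>\n' "$topology"
    printf '    <value>%s</value>\n' "$port"
    printf '    <description>Topology and Port mapping</description>\n'
    printf '</property>\n'
}

# Example: the `development` topology from the text, listening on 9443.
port_mapping_property development 9443
```

The helper does not check whether the port is free; as noted above, a conflict is only reported as an ERROR in the gateway log at startup.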

Added: knox/trunk/books/1.3.0/book_troubleshooting.md
URL: http://svn.apache.org/viewvc/knox/trunk/books/1.3.0/book_troubleshooting.md?rev=1850181&view=auto
==============================================================================
--- knox/trunk/books/1.3.0/book_troubleshooting.md (added)
+++ knox/trunk/books/1.3.0/book_troubleshooting.md Wed Jan  2 17:31:29 2019
@@ -0,0 +1,320 @@
+<!---
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+--->
+
+## Troubleshooting ##
+
+### Finding Logs ###
+
+When things aren't working the first thing you need to do is examine the diagnostic logs.
+Depending upon how you are running the gateway these diagnostic logs will be output to different
locations.
+
+#### java -jar bin/gateway.jar ####
+
+When the gateway is run this way the diagnostic output is written directly to the console.
+If you want to capture that output you will need to redirect the console output to a file
using OS specific techniques.
+
+    java -jar bin/gateway.jar > gateway.log
+
+#### bin/gateway.sh start ####
+
+When the gateway is run this way the diagnostic output is written to `{GATEWAY_HOME}/log/knox.out`
and `{GATEWAY_HOME}/log/knox.err`.
+Typically only knox.out will have content.
+
+
+### Increasing Logging ###
+
+The `log4j.properties` file in `{GATEWAY_HOME}/conf` can be used to change the granularity of the logging done by Knox.
+The Knox server must be restarted in order for these changes to take effect.
+There are various useful loggers pre-populated but commented out.
+
+    # Use this logger to increase the debugging of Apache Knox itself.
+    log4j.logger.org.apache.knox.gateway=DEBUG
+    # Use this logger to increase the debugging of Apache Shiro.
+    log4j.logger.org.apache.shiro=DEBUG
+    # Use this logger to increase the debugging of Apache HTTP components.
+    log4j.logger.org.apache.http=DEBUG
+    # Use this logger to increase the debugging of Apache HTTP client component.
+    log4j.logger.org.apache.http.client=DEBUG
+    # Use this logger to increase the debugging of Apache HTTP header.
+    log4j.logger.org.apache.http.headers=DEBUG
+    # Use this logger to increase the debugging of Apache HTTP wire traffic.
+    log4j.logger.org.apache.http.wire=DEBUG
+
+
+### LDAP Server Connectivity Issues ###
+
+If the gateway cannot contact the configured LDAP server you will see errors in the gateway
diagnostic output.
+
+    13/11/15 16:30:17 DEBUG authc.BasicHttpAuthenticationFilter: Attempting to execute login with headers [Basic Z3Vlc3Q6Z3Vlc3QtcGFzc3dvcmQ=]
+    13/11/15 16:30:17 DEBUG ldap.JndiLdapRealm: Authenticating user 'guest' through LDAP
+    13/11/15 16:30:17 DEBUG ldap.JndiLdapContextFactory: Initializing LDAP context using URL [ldap://localhost:33389] and principal [uid=guest,ou=people,dc=hadoop,dc=apache,dc=org] with pooling disabled
+    13/11/15 16:30:17 DEBUG servlet.SimpleCookie: Added HttpServletResponse Cookie [rememberMe=deleteMe; Path=/gateway/vaultservice; Max-Age=0; Expires=Thu, 14-Nov-2013 21:30:17 GMT]
+    13/11/15 16:30:17 DEBUG authc.BasicHttpAuthenticationFilter: Authentication required: sending 401 Authentication challenge response.
+
+The client should see something along the lines of:
+
+    HTTP/1.1 401 Unauthorized
+    WWW-Authenticate: BASIC realm="application"
+    Content-Length: 0
+    Server: Jetty(8.1.12.v20130726)
+
+Resolving this will require ensuring that the LDAP server is running and that connection
information is correct.
+The LDAP server connection information is configured in the cluster's topology file (e.g. `{GATEWAY_HOME}/deployments/sandbox.xml`).
+
+
+### Hadoop Cluster Connectivity Issues ###
+
+If the gateway cannot contact one of the services in the configured Hadoop cluster you will
see errors in the gateway diagnostic output.
+
+    13/11/18 18:49:45 WARN knox.gateway: Connection exception dispatching request: http://localhost:50070/webhdfs/v1/?user.name=guest&op=LISTSTATUS org.apache.http.conn.HttpHostConnectException: Connection to http://localhost:50070 refused
+    org.apache.http.conn.HttpHostConnectException: Connection to http://localhost:50070 refused
+      at org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:190)
+      at org.apache.http.impl.conn.ManagedClientConnectionImpl.open(ManagedClientConnectionImpl.java:294)
+      at org.apache.http.impl.client.DefaultRequestDirector.tryConnect(DefaultRequestDirector.java:645)
+      at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:480)
+      at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
+      at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
+      at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784)
+      at org.apache.knox.gateway.dispatch.HttpClientDispatch.executeRequest(HttpClientDispatch.java:99)
+
+The resulting behavior on the client will differ by client.
+For the client DSL executing the `{GATEWAY_HOME}/samples/ExampleWebHdfsLs.groovy` sample the output will look like this.
+
+    Caught: org.apache.knox.gateway.shell.HadoopException: org.apache.knox.gateway.shell.ErrorResponse: HTTP/1.1 500 Server Error
+    org.apache.knox.gateway.shell.HadoopException: org.apache.knox.gateway.shell.ErrorResponse: HTTP/1.1 500 Server Error
+      at org.apache.knox.gateway.shell.AbstractRequest.now(AbstractRequest.java:72)
+      at org.apache.knox.gateway.shell.AbstractRequest$now.call(Unknown Source)
+      at ExampleWebHdfsLs.run(ExampleWebHdfsLs.groovy:28)
+
+When executing requests via cURL the output might look similar to the following example.
+
+    Set-Cookie: JSESSIONID=16xwhpuxjr8251ufg22f8pqo85;Path=/gateway/sandbox;Secure
+    Content-Type: text/html;charset=ISO-8859-1
+    Cache-Control: must-revalidate,no-cache,no-store
+    Content-Length: 21856
+    Server: Jetty(8.1.12.v20130726)
+
+    <html>
+    <head>
+    <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"/>
+    <title>Error 500 Server Error</title>
+    </head>
+    <body><h2>HTTP ERROR 500</h2>
+
+Resolving this will require ensuring that the Hadoop services are running and that connection
information is correct.
+Basic Hadoop connectivity can be evaluated using cURL as described elsewhere.
+Otherwise the Hadoop cluster connection information is configured in the cluster's topology
file (e.g. `{GATEWAY_HOME}/deployments/sandbox.xml`).
+
+### HTTP vs HTTPS protocol issues ###
+When Knox is configured to accept requests over SSL and is presented with a request over
plain HTTP, the client is presented with an error such as seen in the following:
+
+    curl -i -k -u guest:guest-password -X GET 'http://localhost:8443/gateway/sandbox/webhdfs/v1/?op=LISTSTATUS'
+
+The following error is returned:
+
+    curl: (52) Empty reply from server
+
+This is the default behavior of the Jetty SSL listener. While the credentials for the default authentication provider continue to be a username and password, we do not want to encourage sending these in clear text. Since preemptively sending BASIC credentials is a common pattern with REST APIs, it would be unwise to redirect to an HTTPS listener, as the password would already have been sent in clear text.
+
+To resolve this issue, there are two options:
+
+1. Change the scheme in the URL to https and deal with any trust relationship issues with the presented server certificate.
+2. Disable SSL in gateway-site.xml. This is not encouraged for the reasons described above.
+
+### Check Hadoop Cluster Access via cURL ###
+
+When you are experiencing connectivity issues it can be helpful to "bypass" the gateway and invoke the Hadoop REST APIs directly.
+This can easily be done using the cURL command line utility or many other REST/HTTP clients.
+Exactly how to use cURL depends on the configuration of your Hadoop cluster.
+In general, however, you will use a command line like the one that follows.
+
+    curl -ikv -X GET 'http://namenode-host:50070/webhdfs/v1/?op=LISTSTATUS'
+
+If you are using Sandbox the WebHDFS or NameNode port will be mapped to localhost so this
command can be used.
+
+    curl -ikv -X GET 'http://localhost:50070/webhdfs/v1/?op=LISTSTATUS'
+
+If you are using a cluster secured with Kerberos you will need to have used `kinit` to authenticate
to the KDC.
+Then the command below should verify that WebHDFS in the Hadoop cluster is accessible.
+
+    curl -ikv --negotiate -u : -X GET 'http://localhost:50070/webhdfs/v1/?op=LISTSTATUS'
+
+
+### Authentication Issues ###
+The following log information is available when you enable debug level logging for Shiro. This can be done within the `conf/log4j.properties` file. Note the "Password not correct for user" message.
+
+    13/11/15 16:37:15 DEBUG authc.BasicHttpAuthenticationFilter: Attempting to execute login with headers [Basic Z3Vlc3Q6Z3Vlc3QtcGFzc3dvcmQw]
+    13/11/15 16:37:15 DEBUG ldap.JndiLdapRealm: Authenticating user 'guest' through LDAP
+    13/11/15 16:37:15 DEBUG ldap.JndiLdapContextFactory: Initializing LDAP context using URL [ldap://localhost:33389] and principal [uid=guest,ou=people,dc=hadoop,dc=apache,dc=org] with pooling disabled
+    2013-11-15 16:37:15,899 INFO  Password not correct for user 'uid=guest,ou=people,dc=hadoop,dc=apache,dc=org'
+    2013-11-15 16:37:15,899 INFO  Authenticator org.apache.directory.server.core.authn.SimpleAuthenticator@354c78e3 failed to authenticate: BindContext for DN 'uid=guest,ou=people,dc=hadoop,dc=apache,dc=org', credentials <0x67 0x75 0x65 0x73 0x74 0x2D 0x70 0x61 0x73 0x73 0x77 0x6F 0x72 0x64 0x30 >
+    2013-11-15 16:37:15,899 INFO  Cannot bind to the server
+    13/11/15 16:37:15 DEBUG servlet.SimpleCookie: Added HttpServletResponse Cookie [rememberMe=deleteMe; Path=/gateway/vaultservice; Max-Age=0; Expires=Thu, 14-Nov-2013 21:37:15 GMT]
+    13/11/15 16:37:15 DEBUG authc.BasicHttpAuthenticationFilter: Authentication required: sending 401 Authentication challenge response.
+
+The client will likely see something along the lines of:
+
+    HTTP/1.1 401 Unauthorized
+    WWW-Authenticate: BASIC realm="application"
+    Content-Length: 0
+    Server: Jetty(8.1.12.v20130726)
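+
When reading a debug log like the one above, it can help to decode the Base64 token from the `Basic` header to see exactly which credentials the client sent. The following is a minimal sketch; the function name `decode_basic_token` is hypothetical, and it assumes a `base64` utility that accepts `--decode` (GNU coreutils and recent macOS do).

```shell
#!/bin/sh
# Decode the Base64 token of a Basic authentication header into "user:password".
# Assumption: the local `base64` utility supports the --decode option.
decode_basic_token() {
    printf '%s' "$1" | base64 --decode
}

# Token taken from the failing login in the log above.
decode_basic_token 'Z3Vlc3Q6Z3Vlc3QtcGFzc3dvcmQw'
echo
# prints: guest:guest-password0
```

The trailing `0` shows that the password sent was not `guest-password`, which agrees with the hex-dumped credentials in the LDAP server's log message.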
+
+#### Using ldapsearch to verify LDAP connectivity and credentials
+
+If your authentication to Knox fails and you believe you are using correct credentials, you can verify the connectivity and credentials using ldapsearch, assuming you are using an LDAP directory for authentication.
+
+Assuming you are using the default values that come out of the box with Knox, your ldapsearch command would look like the following:
+
+    ldapsearch -h localhost -p 33389 -D "uid=guest,ou=people,dc=hadoop,dc=apache,dc=org" -w guest-password -b "uid=guest,ou=people,dc=hadoop,dc=apache,dc=org" "objectclass=*"
+
+This should produce output like the following
+
+    # extended LDIF
+    
+    LDAPv3
+    base <uid=guest,ou=people,dc=hadoop,dc=apache,dc=org> with scope subtree
+    filter: objectclass=*
+    requesting: ALL
+    
+    
+    # guest, people, hadoop.apache.org
+    dn: uid=guest,ou=people,dc=hadoop,dc=apache,dc=org
+    objectClass: organizationalPerson
+    objectClass: person
+    objectClass: inetOrgPerson
+    objectClass: top
+    uid: guest
+    cn: Guest
+    sn: User
+    userpassword:: Z3Vlc3QtcGFzc3dvcmQ=
+    
+    # search result
+    search: 2
+    result: 0 Success
+    
+    # numResponses: 2
+    # numEntries: 1
+
+In a more general form the ldapsearch command would be
+
+    ldapsearch -h {HOST} -p {PORT} -D {DN of binding user} -w {bind password} -b {DN of binding user} "objectclass=*"
+
+### Hostname Resolution Issues ###
+
+The deployments/sandbox.xml topology file has the host mapping feature enabled.
+This is required due to the way networking is set up in the Sandbox VM.
+Specifically the VM's internal hostname is sandbox.hortonworks.com.
+Since this hostname cannot be resolved to the actual VM, Knox needs to map that hostname to something resolvable.
+
+If, for example, host mapping is disabled but the Sandbox VM is still used, you will see an error in the diagnostic output similar to the one below.
+
+    13/11/18 19:11:35 WARN knox.gateway: Connection exception dispatching request: http://sandbox.hortonworks.com:50075/webhdfs/v1/user/guest/example/README?op=CREATE&namenoderpcaddress=sandbox.hortonworks.com:8020&user.name=guest&overwrite=false java.net.UnknownHostException: sandbox.hortonworks.com
+    java.net.UnknownHostException: sandbox.hortonworks.com
+      at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
+
+On the other hand, if you are migrating from the Sandbox based configuration to a cluster you have deployed, you may see a similar error.
+However, in this case you may need to disable host mapping.
+This can be done by modifying the topology file (e.g. deployments/sandbox.xml) for the cluster.
+
+    ...
+    <provider>
+        <role>hostmap</role>
+        <name>static</name>
+        <enabled>false</enabled>
+        <param><name>localhost</name><value>sandbox,sandbox.hortonworks.com</value></param>
+    </provider>
+    ...
+
+
+### Job Submission Issues - HDFS Home Directories ###
+
+If you see an error like the following in your console while submitting a job using the Groovy shell, it is likely that the authenticated user does not have a home directory on HDFS.
+
+    Caught: org.apache.knox.gateway.shell.HadoopException: org.apache.knox.gateway.shell.ErrorResponse: HTTP/1.1 403 Forbidden
+    org.apache.knox.gateway.shell.HadoopException: org.apache.knox.gateway.shell.ErrorResponse: HTTP/1.1 403 Forbidden
+
+You would also see this error if you try a file operation on the home directory of the authenticating user.
+
+The error would look a little different, as shown below, if you are attempting the operation with cURL.
+
+    {"RemoteException":{"exception":"AccessControlException","javaClassName":"org.apache.hadoop.security.AccessControlException","message":"Permission denied: user=tom, access=WRITE, inode=\"/user\":hdfs:hdfs:drwxr-xr-x"}}
+
+#### Resolution
+
+Create the home directory for the user on HDFS.
+The home directory is typically of the form `/user/{userid}` and should be owned by the user.
+The user `hdfs` can create such a directory and make the user the owner of the directory.
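+
The resolution above can be sketched with the standard `hdfs dfs` commands. The helper below only prints the commands rather than running them; the function name `print_home_dir_commands` is hypothetical, and it assumes the `hdfs` CLI is available on a cluster node and that `sudo -u hdfs` is permitted there.

```shell
#!/bin/sh
# Print (do not run) the commands that would create an HDFS home directory.
# Assumptions: the `hdfs` CLI is on the PATH of a cluster node and the
# local account may sudo to the `hdfs` superuser; adjust for your cluster.
print_home_dir_commands() {
    user="$1"
    echo "sudo -u hdfs hdfs dfs -mkdir -p /user/$user"
    echo "sudo -u hdfs hdfs dfs -chown $user /user/$user"
}

# Example for the user `tom` from the error message above.
print_home_dir_commands tom
```

Review the printed commands and run them on a cluster node once they match your environment.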
+
+
+### Job Submission Issues - OS Accounts ###
+
+If the Hadoop cluster is not secured with Kerberos, the user submitting a job need not have
an OS account on the Hadoop NodeManagers.
+
+If the Hadoop cluster is secured with Kerberos, the user submitting the job should have an
OS account on Hadoop NodeManagers.
+
+In either case, if the user does not have such an OS account, the user's file permissions are based on user ownership of files or the "other" permission in the "ugo" POSIX permissions.
+The user does not get any file permission as a member of any group if you are using the default `hadoop.security.group.mapping`.
+
+TODO: add sample error message from running test on secure cluster with missing OS account
+
+### HBase Issues ###
+
+If you experience problems running the HBase samples with the Sandbox VM it may be necessary to restart HBase and the HBase REST API.
+This can sometimes occur when the Sandbox VM is restarted from a saved state.
+If the client hangs after emitting the last line in the sample output below you are most likely affected.
+
+    System version : {...}
+    Cluster version : 0.96.0.2.0.6.0-76-hadoop2
+    Status : {...}
+    Creating table 'test_table'...
+
+HBase and the HBase REST API can be restarted using the following commands on the Hadoop Sandbox VM.
+You will need to ssh into the VM in order to run these commands.
+
+    sudo -u hbase /usr/lib/hbase/bin/hbase-daemon.sh stop master
+    sudo -u hbase /usr/lib/hbase/bin/hbase-daemon.sh start master
+    sudo -u hbase /usr/lib/hbase/bin/hbase-daemon.sh restart rest
+
+
+### SSL Certificate Issues ###
+
+Clients that do not trust the certificate presented by the server will behave in different
ways.
+A browser will typically warn you of the inability to trust the received certificate and
give you an opportunity to add an exception for the particular certificate.
+cURL will present you with the following message and instructions for turning off certificate verification:
+
+    curl performs SSL certificate verification by default, using a "bundle" 
+     of Certificate Authority (CA) public keys (CA certs). If the default
+     bundle file isn't adequate, you can specify an alternate file
+     using the --cacert option.
+    If this HTTPS server uses a certificate signed by a CA represented in
+     the bundle, the certificate verification probably failed due to a
+     problem with the certificate (it might be expired, or the name might
+     not match the domain name in the URL).
+    If you'd like to turn off curl's verification of the certificate, use
+     the -k (or --insecure) option.
+
+
+### SPNego Authentication Issues ###
+
+Calls from Knox to a secure Hadoop cluster fail with SPNego authentication problems
+if there was a TGT for Knox in the disk cache when Knox was started.
+
+You are likely to run into this situation on developer machines where the developer may have run `kinit` for some testing.
+
+Workaround: clear the Knox TGT from the disk cache (calling `kdestroy` would do it) before starting Knox.
+
+### Filing Bugs ###
+
+Bugs can be filed using [Jira][jira].
+Please include the results of the command below in the Environment section.
+Also include the version of Hadoop being used in the same section.
+
+    cd {GATEWAY_HOME}
+    java -jar bin/gateway.jar -version
+

Added: knox/trunk/books/1.3.0/book_ui_service_details.md
URL: http://svn.apache.org/viewvc/knox/trunk/books/1.3.0/book_ui_service_details.md?rev=1850181&view=auto
==============================================================================
--- knox/trunk/books/1.3.0/book_ui_service_details.md (added)
+++ knox/trunk/books/1.3.0/book_ui_service_details.md Wed Jan  2 17:31:29 2019
@@ -0,0 +1,479 @@
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
+
+## UI Service Details ##
+
+In the sections that follow, the integrations for proxying various UIs currently available
out of the box with the
+gateway will be described. These sections will include examples that demonstrate how to access
each of these services
+via the gateway.
+
+These are the current Hadoop services with built-in support for their UIs.
+
+* #[Name Node UI]
+* #[Job History UI]
+* #[Oozie UI]
+* #[HBase UI]
+* #[Yarn UI]
+* #[Spark UI]
+* #[Ambari UI]
+* #[Ranger Admin Console]
+* #[Atlas UI]
+* #[Zeppelin UI]
+* #[Nifi UI]
+
+### Assumptions
+
+This section assumes an environment setup similar to the one in the REST services section #[Service Details].
+
+### Name Node UI ###
+
+The Name Node UI is available on the same host and port combination that WebHDFS is available
on. As mentioned in the
+WebHDFS REST service configuration section, the values for the host and port can be obtained
from the following
+properties in hdfs-site.xml
+
+    <property>
+        <name>dfs.namenode.http-address</name>
+        <value>sandbox.hortonworks.com:50070</value>
+    </property>
+    <property>
+        <name>dfs.namenode.https-address</name>
+        <value>sandbox.hortonworks.com:50470</value>
+    </property>
+
+The values above need to be reflected in each topology descriptor file deployed to the gateway.
+The gateway by default includes a sample topology descriptor file `{GATEWAY_HOME}/deployments/sandbox.xml`.
+The values in this sample are configured to work with an installed Sandbox VM.
+
+    <service>
+        <role>HDFSUI</role>
+        <url>http://sandbox.hortonworks.com:50070</url>
+    </service>
+
+In addition to the service configuration for HDFSUI, the REST service configuration for WEBHDFS
is also required.
+
+    <service>
+        <role>NAMENODE</role>
+        <url>hdfs://sandbox.hortonworks.com:8020</url>
+    </service>
+    <service>
+        <role>WEBHDFS</role>
+        <url>http://sandbox.hortonworks.com:50070/webhdfs</url>
+    </service>
+
+By default the gateway is configured to use the HTTP endpoint for WebHDFS in the Sandbox.
+This could alternatively be configured to use the HTTPS endpoint by providing the correct
address.
+
+#### Name Node UI URL Mapping ####
+
+For Name Node UI URLs, the mapping of Knox Gateway accessible HDFS UI URLs to direct HDFS UI URLs is:
+
+| ------- | ----------------------------------------------------------------------------- |
+| Gateway | `https://{gateway-host}:{gateway-port}/{gateway-path}/{cluster-name}/hdfs` |
+| Cluster | `http://{webhdfs-host}:50070/`                                             |
+
+For example, to browse the file system using the NameNode UI, the URL in a web browser would be:
+
+    http://sandbox.hortonworks.com:50070/explorer.html#
+
+Using the gateway to access the same page, the URL would be (where the gateway host:port is 'localhost:8443'):
+
+    https://localhost:8443/gateway/sandbox/hdfs/explorer.html#
+
+
+### Job History UI ###
+
+The Job History UI service can be configured in a topology by adding the following snippet. The values in this sample
+are configured to work with an installed Sandbox VM.
+
+    <service>
+        <role>JOBHISTORYUI</role>
+        <url>http://sandbox.hortonworks.com:19888</url>
+    </service>
+
+The values for the host and port can be obtained from the following property in mapred-site.xml
+
+    <property>
+        <name>mapreduce.jobhistory.webapp.address</name>
+        <value>sandbox.hortonworks.com:19888</value>
+    </property>
+
+
+
+#### Job History UI URL Mapping ####
+
+For Job History UI URLs, the mapping of Knox Gateway accessible Job History UI URLs to direct Job History UI URLs is:
+
+| ------- | -------------------------------------------------------------------------------- |
+| Gateway | `https://{gateway-host}:{gateway-port}/{gateway-path}/{cluster-name}/jobhistory` |
+| Cluster | `http://{jobhistory-host}:19888/jobhistory`                                      |
+
+
+### Oozie UI ###
+
+The Oozie UI service can be configured in a topology by adding the following snippet. The values in this sample
+are configured to work with an installed Sandbox VM.
+
+    <service>
+        <role>OOZIEUI</role>
+        <url>http://sandbox.hortonworks.com:11000/oozie</url>
+    </service>
+
+The value for the URL can be obtained from the following property in oozie-site.xml
+
+    <property>
+        <name>oozie.base.url</name>
+        <value>http://sandbox.hortonworks.com:11000/oozie</value>
+    </property>
+
+
+
+#### Oozie UI URL Mapping ####
+
+For Oozie UI URLs, the mapping of Knox Gateway accessible Oozie UI URLs to direct Oozie UI URLs is:
+
+| ------- | ---------------------------------------------------------------------------- |
+| Gateway | `https://{gateway-host}:{gateway-port}/{gateway-path}/{cluster-name}/oozie/` |
+| Cluster | `http://{oozie-host}:11000/oozie/`                                           |
+
+
+### HBase UI ###
+
+The HBase UI service can be configured in a topology by adding the following snippet. The values in this sample
+are configured to work with an installed Sandbox VM.
+
+    <service>
+        <role>HBASEUI</role>
+        <url>http://sandbox.hortonworks.com:16010</url>
+    </service>
+
+The values for the host and port can be obtained from the following properties in hbase-site.xml.
+Because the bindAddress is 0.0.0.0, the hostname of the HBase master is used in the service URL above.
+
+    <property>
+        <name>hbase.master.info.bindAddress</name>
+        <value>0.0.0.0</value>
+    </property>
+    <property>
+        <name>hbase.master.info.port</name>
+        <value>16010</value>
+    </property>
+
+#### HBase UI URL Mapping ####
+
+For HBase UI URLs, the mapping of Knox Gateway accessible HBase UI URLs to direct HBase Master UI URLs is:
+
+| ------- | ---------------------------------------------------------------------------------- |
+| Gateway | `https://{gateway-host}:{gateway-port}/{gateway-path}/{cluster-name}/hbase/webui/` |
+| Cluster | `http://{hbase-master-host}:16010/`                                                |
+
+### YARN UI ###
+
+The YARN UI service can be configured in a topology by adding the following snippet. The values in this sample
+are configured to work with an installed Sandbox VM.
+
+    <service>
+        <role>YARNUI</role>
+        <url>http://sandbox.hortonworks.com:8088</url>
+    </service>
+
+The values for the host and port can be obtained from the following property in yarn-site.xml
+
+    <property>
+        <name>yarn.resourcemanager.webapp.address</name>
+        <value>sandbox.hortonworks.com:8088</value>
+    </property>
+
+#### YARN UI URL Mapping ####
+
+For Resource Manager UI URLs, the mapping of Knox Gateway accessible Resource Manager UI URLs to direct Resource Manager
+UI URLs is:
+
+| ------- | -------------------------------------------------------------------------- |
+| Gateway | `https://{gateway-host}:{gateway-port}/{gateway-path}/{cluster-name}/yarn` |
+| Cluster | `http://{resource-manager-host}:8088/cluster`                              |
+
+### Spark UI ###
+
+The Spark History UI service can be configured in a topology by adding the following snippet. The values in this sample
+are configured to work with an installed Sandbox VM.
+
+
+    <service>
+        <role>SPARKHISTORYUI</role>
+        <url>http://sandbox.hortonworks.com:18080/</url>
+    </service>
+
+Please note that for Spark versions older than 2.4.0 you need to set `spark.ui.proxyBase` to
+`/{gateway-path}/{cluster-name}/sparkhistory` (for more information, please refer to SPARK-24209).
+
+Moreover, before Spark 2.3.1 there is a bug in Spark that prevents the UI from working properly when the proxy is accessed
+using a link that does not end with "/" (SPARK-23644). So if the list of applications is empty when you access the UI
+through the gateway, please check that there is a "/" at the end of the URL you are using.
+
+#### Spark History UI URL Mapping ####
+
+For Spark History UI URLs, the mapping of Knox Gateway accessible Spark History UI URLs to direct Spark History
+UI URLs is:
+
+| ------- | ---------------------------------------------------------------------------------- |
+| Gateway | `https://{gateway-host}:{gateway-port}/{gateway-path}/{cluster-name}/sparkhistory` |
+| Cluster | `http://{spark-history-host}:18080`                                                |
+
+
+### Ambari UI ###
+
+Ambari UI has functionality around provisioning and managing services in a Hadoop cluster. This UI can now be used
+behind the Knox gateway.
+
+To enable this functionality, a topology file needs to have the following configuration (for AMBARIUI and AMBARIWS):
+
+    <service>
+        <role>AMBARIUI</role>
+        <url>http://<hostname>:<port></url>
+    </service>
+
+    <service>
+        <role>AMBARIWS</role>
+        <url>ws://<hostname>:<port></url>
+    </service>
+
+The default Ambari HTTP port is 8080. Note that the UI service also requires the Ambari REST API service and the
+Ambari WebSocket service to be enabled to function properly. An example of a more complete topology is given below.
+ 
+
+#### Ambari UI URL Mapping ####
+
+For Ambari UI URLs, the mapping of Knox Gateway accessible URLs to direct Ambari UI URLs is the following.
+
+| ------- | ----------------------------------------------------------------------------- |
+| Gateway | `https://{gateway-host}:{gateway-port}/{gateway-path}/{cluster-name}/ambari/` |
+| Cluster | `http://{ambari-host}:{ambari-port}/`                                         |
+
+#### Example Topology ####
+ 
+The Ambari UI service may require a separate topology file due to its requirements around authentication. In this case
+Knox passes the authentication challenge and credentials through to the service.
+
+
+    <topology>
+        <gateway>
+            <provider>
+                <role>authentication</role>
+                <name>Anonymous</name>
+                <enabled>true</enabled>
+            </provider>
+            <provider>
+                <role>identity-assertion</role>
+                <name>Default</name>
+                <enabled>false</enabled>
+            </provider>
+        </gateway>
+        <service>
+            <role>AMBARI</role>
+            <url>http://localhost:8080</url>
+        </service>
+    
+        <service>
+            <role>AMBARIUI</role>
+            <url>http://localhost:8080</url>
+        </service>
+        <service>
+            <role>AMBARIWS</role>
+            <url>ws://localhost:8080</url>
+        </service>
+    </topology>
+    
+Please look at JIRA issue [KNOX-705] for a known issue with this release.
+
+### Ranger Admin Console ###
+
+The Ranger Admin console can now be used behind the Knox gateway.
+
+To enable this functionality, a topology file needs to have the following configuration:
+
+    <service>
+        <role>RANGERUI</role>
+        <url>http://<hostname>:<port></url>
+    </service>
+
+The default Ranger HTTP port is 6080. Note that the UI service also requires the Ranger REST API service
+to be enabled to function properly. An example of a more complete topology is given below.
+ 
+
+#### Ranger Admin Console URL Mapping ####
+
+For Ranger Admin console URLs, the mapping of Knox Gateway accessible URLs to direct Ranger Admin console URLs is the following.
+
+| ------- | ----------------------------------------------------------------------------- |
+| Gateway | `https://{gateway-host}:{gateway-port}/{gateway-path}/{cluster-name}/ranger/` |
+| Cluster | `http://{ranger-host}:{ranger-port}/`                                         |
+
+#### Example Topology ####
+ 
+The Ranger UI service may require a separate topology file due to its requirements around authentication. In this case
+Knox passes the authentication challenge and credentials through to the service.
+
+
+    <topology>
+        <gateway>
+            <provider>
+                <role>authentication</role>
+                <name>Anonymous</name>
+                <enabled>true</enabled>
+            </provider>
+            <provider>
+                <role>identity-assertion</role>
+                <name>Default</name>
+                <enabled>false</enabled>
+            </provider>
+        </gateway>
+        <service>
+            <role>RANGER</role>
+            <url>http://localhost:6080</url>
+        </service>
+    
+        <service>
+            <role>RANGERUI</role>
+            <url>http://localhost:6080</url>
+        </service>
+    </topology>
+
+### Atlas Rest API ###
+
+The Atlas Rest API can now be used behind the Knox gateway.
+To enable this functionality, a topology file needs to have the following configuration.
+
+    <service>
+        <role>ATLAS-API</role>
+        <url>http://<ATLAS_HOST>:<ATLAS_PORT></url>
+    </service>
+
+The default Atlas HTTP port is 21000. Note that the UI service also requires the Atlas REST API
+service to be enabled to function properly. An example of a more complete topology is given below.
+
+#### Atlas Rest API URL Mapping ####
+
+For Atlas Rest URLs, the mapping of Knox Gateway accessible URLs to direct Atlas Rest URLs is the following.
+
+| ------- | ------------------------------------------------------------------------ |
+| Gateway | `https://{gateway-host}:{gateway-port}/{gateway-path}/{topology}/atlas/` |
+| Cluster | `http://{atlas-host}:{atlas-port}/`                                      |
+
+
+
+Access the Atlas API using a cURL call:
+
+     curl -i -k -L -u admin:admin -X GET \
+               'https://knox-gateway:8443/gateway/{topology}/atlas/api/atlas/v2/types/typedefs?type=classification&_=1495442879421'
+
+
+### Atlas UI ###
+
+In addition to the Atlas REST API, from this release there is the ability to access some of the functionality via a web UI.
+The initial functionality is very limited and serves more as a starting point/placeholder. The details are below.
+
+#### Atlas UI URL Mapping ####
+
+The URL mapping for the Atlas UI is:
+
+| ------- | ---------------------------------------------------------------------------------- |
+| Gateway | `https://{gateway-host}:{gateway-port}/{gateway-path}/{topology}/atlas/index.html` |
+
+#### Example Topology for Atlas ####
+
+                <topology>
+                    <gateway>
+                        <provider>
+                            <role>authentication</role>
+                            <name>Anonymous</name>
+                            <enabled>true</enabled>
+                        </provider>
+                        <provider>
+                            <role>identity-assertion</role>
+                            <name>Default</name>
+                            <enabled>false</enabled>
+                        </provider>
+                    </gateway>
+
+                    <service>
+                        <role>ATLAS-API</role>
+                        <url>http://<ATLAS_HOST>:<ATLAS_PORT></url>
+                    </service>
+
+                    <service>
+                        <role>ATLAS</role>
+                        <url>http://<ATLAS_HOST>:<ATLAS_PORT></url>
+                    </service>
+                </topology>
+
+Note: This feature allows for 'anonymous' authentication, essentially bypassing any LDAP or other authentication done by Knox
+and allowing the proxied service to do the actual authentication.
+
+
+### Zeppelin UI ###
+Apache Knox can be used to proxy Zeppelin UI and also supports WebSocket protocol used by
Zeppelin. 
+
+The URL mapping for the Zeppelin UI is:
+
+| ------- | -------------------------------------------------------------------------------------
|
+|Gateway  |  `https://{gateway-host}:{gateway-port}/{gateway-path}/{topology}/zeppelin/`
+
+By default the WebSocket functionality is disabled. It needs to be enabled for the Zeppelin UI to work properly, which can be
+done by setting the `gateway.websocket.feature.enabled` property to 'true' in the `<KNOX-HOME>/conf/gateway-site.xml` file, e.g.:
+
+    <property>
+        <name>gateway.websocket.feature.enabled</name>
+        <value>true</value>
+        <description>Enable/Disable websocket feature.</description>
+    </property>
+
+An example service definition for Zeppelin in a topology file is as follows. Note that both the ZEPPELINWS
+and ZEPPELINUI service declarations are required.
+
+    <service>
+        <role>ZEPPELINWS</role>
+        <url>ws://<ZEPPELIN_HOST>:<ZEPPELIN_PORT>/ws</url>
+    </service>
+
+    <service>
+        <role>ZEPPELINUI</role>
+        <url>http://<ZEPPELIN_HOST>:<ZEPPELIN_PORT></url>
+    </service>
+
+Knox also supports secure Zeppelin UIs. For secure UIs, the Zeppelin certificate needs to be provisioned
+into the Knox truststore.
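+
+For a secure Zeppelin instance, the service declarations might look like the following sketch. The `https` and `wss`
+schemes are assumptions based on standard TLS conventions rather than values taken from this guide, and the host and
+port placeholders are the same as in the declarations above.
+
+    <!-- Assumption: Zeppelin serves TLS on the same port for both HTTP and WebSocket traffic -->
+    <service>
+        <role>ZEPPELINWS</role>
+        <url>wss://<ZEPPELIN_HOST>:<ZEPPELIN_PORT>/ws</url>
+    </service>
+
+    <service>
+        <role>ZEPPELINUI</role>
+        <url>https://<ZEPPELIN_HOST>:<ZEPPELIN_PORT></url>
+    </service>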
+
+### Nifi UI ###
+
+You can use the Apache Knox Gateway to provide authentication and access security for your NiFi services.
+
+The Gateway can be configured for Nifi by modifying the topology XML file.
+
+In the topology XML file, add the following with the correct hostname and port:
+
+    <service>
+      <role>NIFI</role>
+      <url><NIFI_HTTP_SCHEME>://<NIFI_HOST>:<NIFI_HTTP_SCHEME_PORT></url>
+      <param name="useTwoWaySsl" value="true"/>
+    </service>
+
+Note the setting of the useTwoWaySsl param above. Nifi requires mutual authentication
+via SSL and this param tells the dispatch to present a client cert to the server.
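+
+As an illustration only, a filled-in declaration for a NiFi instance served over HTTPS could look like this sketch.
+The host `nifi.example.com` and the port `9443` are hypothetical values, not defaults documented in this guide.
+
+    <!-- Hypothetical host and port; substitute the values of your own NiFi instance -->
+    <service>
+      <role>NIFI</role>
+      <url>https://nifi.example.com:9443</url>
+      <param name="useTwoWaySsl" value="true"/>
+    </service>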
+
\ No newline at end of file


