knox-commits mailing list archives

From lmc...@apache.org
Subject svn commit: r1787275 - in /knox: site/books/knox-0-12-0/user-guide.html trunk/books/0.12.0/book.md trunk/books/0.12.0/book_client-details.md
Date Fri, 17 Mar 2017 01:16:35 GMT
Author: lmccay
Date: Fri Mar 17 01:16:34 2017
New Revision: 1787275

URL: http://svn.apache.org/viewvc?rev=1787275&view=rev
Log:
Adding docs for KnoxShell to Client Details in 0.12.0

Modified:
    knox/site/books/knox-0-12-0/user-guide.html
    knox/trunk/books/0.12.0/book.md
    knox/trunk/books/0.12.0/book_client-details.md

Modified: knox/site/books/knox-0-12-0/user-guide.html
URL: http://svn.apache.org/viewvc/knox/site/books/knox-0-12-0/user-guide.html?rev=1787275&r1=1787274&r2=1787275&view=diff
==============================================================================
--- knox/site/books/knox-0-12-0/user-guide.html (original)
+++ knox/site/books/knox-0-12-0/user-guide.html Fri Mar 17 01:16:34 2017
@@ -67,7 +67,15 @@
   </ul></li>
   <li><a href="#Websocket+Support">Websocket Support</a></li>
   <li><a href="#Audit">Audit</a></li>
-  <li><a href="#Client+Details">Client Details</a></li>
+  <li><a href="#Client+Details">Client Details</a>
+  <ul>
+    <li><a href="#Client+Quickstart">Client Quickstart</a></li>
+    <li><a href="#Client+Token+Sessions">Client Token Sessions</a>
+    <ul>
+      <li><a href="#Server+Setup">Server Setup</a></li>
+    </ul></li>
+    <li><a href="#Client+DSL+and+SDK+Details">Client DSL and SDK Details</a></li>
+  </ul></li>
   <li><a href="#Service+Details">Service Details</a>
   <ul>
     <li><a href="#WebHDFS">WebHDFS</a></li>
@@ -3114,7 +3122,185 @@ APACHE_HOME/bin/apachectl -k stop
       <td>Logging message. Contains additional tracking information.</td>
     </tr>
   </tbody>
-</table><h4><a id="Audit+log+rotation">Audit log rotation</a> <a
href="#Audit+log+rotation"><img src="markbook-section-link.png"/></a></h4><p>Audit
logging is preconfigured with <code>org.apache.log4j.DailyRollingFileAppender</code>.
<a href="http://logging.apache.org/log4j/1.2/">Apache log4j</a> contains information
about other Appenders.</p><h4><a id="How+to+change+the+audit+level+or+disable+it">How
to change the audit level or disable it</a> <a href="#How+to+change+the+audit+level+or+disable+it"><img
src="markbook-section-link.png"/></a></h4><p>All audit messages are logged
at <code>INFO</code> level and this behavior can&rsquo;t be changed.</p><p>Disabling
auditing can be done by decreasing the log level for the Audit appender or setting it to <code>OFF</code>.</p><h2><a
id="Client+Details">Client Details</a> <a href="#Client+Details"><img src="markbook-section-link.png"/></a></h2><p>Hadoop
requires a client that can be used to interact remotely with the services provided by Hadoop
cluster. This will also be true when using the Apache Knox Gateway to provide perimeter
security and centralized access for these services. The two primary existing clients for Hadoop
are the CLI (i.e. Command Line Interface, hadoop) and <a href="http://gethue.com/">Hue</a>
(i.e. Hadoop User Experience). For several reasons however, neither of these clients can <em>currently</em>
be used to access Hadoop services via the Apache Knox Gateway.</p><p>This led
to thinking about a very simple client that could help people use and evaluate the gateway.
The list below outlines the general requirements for such a client.</p>
+</table><h4><a id="Audit+log+rotation">Audit log rotation</a> <a
href="#Audit+log+rotation"><img src="markbook-section-link.png"/></a></h4><p>Audit
logging is preconfigured with <code>org.apache.log4j.DailyRollingFileAppender</code>.
<a href="http://logging.apache.org/log4j/1.2/">Apache log4j</a> contains information
about other Appenders.</p><h4><a id="How+to+change+the+audit+level+or+disable+it">How
to change the audit level or disable it</a> <a href="#How+to+change+the+audit+level+or+disable+it"><img
src="markbook-section-link.png"/></a></h4><p>All audit messages are logged
at <code>INFO</code> level and this behavior can&rsquo;t be changed.</p><p>Disabling
auditing can be done by decreasing the log level for the Audit appender or setting it to <code>OFF</code>.</p><h2><a
id="Client+Details">Client Details</a> <a href="#Client+Details"><img src="markbook-section-link.png"/></a></h2><p>The
KnoxShell release artifact provides a small footprint client environment that removes all
unnecessary server dependencies, configuration, binary scripts, etc. It is comprised of a couple of
different things that empower different sorts of users.</p>
+<ul>
+  <li>A set of SDK type classes for providing access to Hadoop resources over HTTP</li>
+  <li>A Groovy based DSL for scripting access to Hadoop resources based on the underlying
SDK classes</li>
+  <li>KnoxShell Token-based Sessions to provide a CLI SSO session for executing multiple
scripts</li>
+</ul><p>The following sections provide an overview and quickstart for the KnoxShell.</p><h3><a
id="Client+Quickstart">Client Quickstart</a> <a href="#Client+Quickstart"><img
src="markbook-section-link.png"/></a></h3><p>The following installation
and setup instructions should get you started with using the KnoxShell very quickly.</p>
+<ol>
+  <li><p>Download a knoxshell-x.x.x.zip or tar file and unzip it in your preferred
location {GATEWAY_CLIENT_HOME}</p>
+  <pre><code>home:knoxshell-0.12.0 larry$ ls -l
+total 296
+-rw-r--r--@  1 larry  staff  71714 Mar 14 14:06 LICENSE
+-rw-r--r--@  1 larry  staff    164 Mar 14 14:06 NOTICE
+-rw-r--r--@  1 larry  staff  71714 Mar 15 20:04 README
+drwxr-xr-x@ 12 larry  staff    408 Mar 15 21:24 bin
+drwxr--r--@  3 larry  staff    102 Mar 14 14:06 conf
+drwxr-xr-x+  3 larry  staff    102 Mar 15 12:41 logs
+drwxr-xr-x@ 18 larry  staff    612 Mar 14 14:18 samples
+</code></pre>
+  <table>
+    <thead>
+      <tr>
+        <th>Directory </th>
+        <th>Description </th>
+      </tr>
+    </thead>
+    <tbody>
+      <tr>
+        <td>bin </td>
+        <td>contains the main knoxshell jar and related shell scripts</td>
+      </tr>
+      <tr>
+        <td>conf </td>
+        <td>only contains log4j config</td>
+      </tr>
+      <tr>
+        <td>logs </td>
+        <td>contains the knoxshell.log file</td>
+      </tr>
+      <tr>
+        <td>samples </td>
+        <td>has numerous examples to help you get started</td>
+      </tr>
+    </tbody>
+  </table></li>
+  <li><p>cd {GATEWAY_CLIENT_HOME}</p></li>
+  <li>Get/set up the truststore for the target Knox instance or fronting load balancer
+  <ul>
+    <li>if you have access to the server you may use the command knoxcli.sh export-cert
--type JKS</li>
+    <li>copy the resulting gateway-client-identity.jks to your user home directory</li>
+  </ul></li>
+  <li><p>Execute an example script from the {GATEWAY_CLIENT_HOME}/samples
directory - for instance:</p>
+  <ul>
+    <li>bin/knoxshell.sh samples/ExampleWebHdfsLs.groovy</li>
+  </ul>
+  <pre><code>home:knoxshell-0.12.0 larry$ bin/knoxshell.sh samples/ExampleWebHdfsLs.groovy
+Enter username: guest
+Enter password:
+[app-logs, apps, mapred, mr-history, tmp, user]
+</code></pre></li>
+</ol><p>At this point, you should have seen output similar to the above, probably with
different directories listed. Take a look at the sample that we ran above:</p>
+<pre><code>import groovy.json.JsonSlurper
+import org.apache.hadoop.gateway.shell.Hadoop
+import org.apache.hadoop.gateway.shell.hdfs.Hdfs
+
+import org.apache.hadoop.gateway.shell.Credentials
+
+gateway = &quot;https://localhost:8443/gateway/sandbox&quot;
+
+credentials = new Credentials()
+credentials.add(&quot;ClearInput&quot;, &quot;Enter username: &quot;, &quot;user&quot;)
+                .add(&quot;HiddenInput&quot;, &quot;Enter pas&quot; + &quot;sword:
&quot;, &quot;pass&quot;)
+credentials.collect()
+
+username = credentials.get(&quot;user&quot;).string()
+pass = credentials.get(&quot;pass&quot;).string()
+
+session = Hadoop.login( gateway, username, pass )
+
+text = Hdfs.ls( session ).dir( &quot;/&quot; ).now().string
+json = (new JsonSlurper()).parseText( text )
+println json.FileStatuses.FileStatus.pathSuffix
+session.shutdown()
+</code></pre><p>Some things to note about this sample:</p>
+<ol>
+  <li>the gateway URL is hardcoded
+  <ul>
+    <li>alternatives would be passing it as an argument to the script, using an environment
variable or prompting for it with a ClearInput credential collector</li>
+  </ul></li>
+  <li>credential collectors are used to gather credentials or other input from various
sources. In this sample the HiddenInput and ClearInput collectors prompt the user for the
input with the provided prompt text and the values are acquired by a subsequent get call with
the provided name value.</li>
+  <li>The Hadoop.login method establishes a login session of sorts which will need
to be provided to the various API classes as an argument.</li>
+  <li>the response text is easily retrieved as a string and can be parsed by the JsonSlurper
or whatever you like</li>
+</ol><h3><a id="Client+Token+Sessions">Client Token Sessions</a>
<a href="#Client+Token+Sessions"><img src="markbook-section-link.png"/></a></h3><p>Building
on the Quickstart above, we will drill into some of the token session details here and walk
through another sample.</p><p>Unlike the quickstart, token sessions require the
server to be configured in specific ways to allow the use of token sessions/federation.</p><h4><a
id="Server+Setup">Server Setup</a> <a href="#Server+Setup"><img src="markbook-section-link.png"/></a></h4>
+<ol>
+  <li><p>KnoxToken service should be added to your sandbox.xml topology - see
the <a href="#KnoxToken+Configuration">KnoxToken Configuration Section</a></p>
+  <pre><code>&lt;service&gt;
+   &lt;role&gt;KNOXTOKEN&lt;/role&gt;
+   &lt;param&gt;
+      &lt;name&gt;knox.token.ttl&lt;/name&gt;
+      &lt;value&gt;36000000&lt;/value&gt;
+   &lt;/param&gt;
+   &lt;param&gt;
+      &lt;name&gt;knox.token.audiences&lt;/name&gt;
+      &lt;value&gt;tokenbased&lt;/value&gt;
+   &lt;/param&gt;
+   &lt;param&gt;
+      &lt;name&gt;knox.token.target.url&lt;/name&gt;
+      &lt;value&gt;https://localhost:8443/gateway/tokenbased&lt;/value&gt;
+   &lt;/param&gt;
+&lt;/service&gt;
+</code></pre></li>
+  <li><p>tokenbased.xml topology to accept tokens as federation tokens for access
to exposed resources with JWTProvider <a href="#JWT+Provider">JWT Provider</a></p>
+  <pre><code>&lt;provider&gt;
+   &lt;role&gt;federation&lt;/role&gt;
+   &lt;name&gt;JWTProvider&lt;/name&gt;
+   &lt;enabled&gt;true&lt;/enabled&gt;
+   &lt;param&gt;
+       &lt;name&gt;knox.token.audiences&lt;/name&gt;
+       &lt;value&gt;tokenbased&lt;/value&gt;
+   &lt;/param&gt;
+&lt;/provider&gt;
+</code></pre></li>
+  <li>Use the KnoxShell token commands to establish and manage your session
+  <ul>
+    <li>bin/knoxshell.sh init <a href="https://localhost:8443/gateway/sandbox">https://localhost:8443/gateway/sandbox</a>
to acquire a token and cache in user home directory</li>
+    <li>bin/knoxshell.sh list to display the details of the cached token, the expiration
time and optionally the target url</li>
+    <li>bin/knoxshell.sh destroy to remove the cached session token and terminate the
session</li>
+  </ul></li>
+  <li><p>Execute a script that can take advantage of the token credential collector
and target url</p>
+  <pre><code>import groovy.json.JsonSlurper
+import java.util.HashMap
+import java.util.Map
+import org.apache.hadoop.gateway.shell.Credentials
+import org.apache.hadoop.gateway.shell.Hadoop
+import org.apache.hadoop.gateway.shell.hdfs.Hdfs
+
+credentials = new Credentials()
+credentials.add(&quot;KnoxToken&quot;, &quot;none: &quot;, &quot;token&quot;)
+credentials.collect()
+
+token = credentials.get(&quot;token&quot;).string()
+
+gateway = System.getenv(&quot;KNOXSHELL_TOPOLOGY_URL&quot;)
+if (gateway == null || gateway.equals(&quot;&quot;)) {
+  gateway = credentials.get(&quot;token&quot;).getTargetUrl()
+}
+
+println &quot;&quot;
+println &quot;*****************************GATEWAY INSTANCE**********************************&quot;
+println gateway
+println &quot;*******************************************************************************&quot;
+println &quot;&quot;
+
+headers = new HashMap()
+headers.put(&quot;Authorization&quot;, &quot;Bearer &quot; + token)
+
+session = Hadoop.login( gateway, headers )
+
+if (args.length &gt; 0) {
+  dir = args[0]
+} else {
+  dir = &quot;/&quot;
+}
+
+text = Hdfs.ls( session ).dir( dir ).now().string
+json = (new JsonSlurper()).parseText( text )
+statuses = json.get(&quot;FileStatuses&quot;);
+
+println statuses
+
+session.shutdown()
+</code></pre></li>
+</ol><p>Note the following about the above sample script:</p>
+<ol>
+  <li>use of the KnoxToken credential collector</li>
+  <li>use of the targetUrl from the credential collector</li>
+  <li>optional override of the target url with environment variable</li>
+  <li>the passing of the headers map to the session creation in Hadoop.login</li>
+  <li>the passing of an argument for the ls command for the path to list or default
to &ldquo;/&rdquo;</li>
+</ol><p>Also note that there is no reason to prompt for username and password
as long as the token has not been destroyed or expired. There is also no hardcoded endpoint
for using the token - it is specified in the token cache or overridden by environment variable.</p><h2><a
id="Client+DSL+and+SDK+Details">Client DSL and SDK Details</a> <a href="#Client+DSL+and+SDK+Details"><img
src="markbook-section-link.png"/></a></h2><p>The lack of any formal SDK
or client for REST APIs in Hadoop led to thinking about a very simple client that could help
people use and evaluate the gateway. The list below outlines the general requirements for
such a client.</p>
 <ul>
   <li>Promote the evaluation and adoption of the Apache Knox Gateway</li>
   <li>Simple to deploy and use on data worker desktops for access to remote Hadoop
clusters</li>

Modified: knox/trunk/books/0.12.0/book.md
URL: http://svn.apache.org/viewvc/knox/trunk/books/0.12.0/book.md?rev=1787275&r1=1787274&r2=1787275&view=diff
==============================================================================
--- knox/trunk/books/0.12.0/book.md (original)
+++ knox/trunk/books/0.12.0/book.md Fri Mar 17 01:16:34 2017
@@ -68,6 +68,10 @@
 * #[Websocket Support]
 * #[Audit]
 * #[Client Details]
+    * #[Client Quickstart]
+    * #[Client Token Sessions]
+        * #[Server Setup]
+    * #[Client DSL and SDK Details]
 * #[Service Details]
     * #[WebHDFS]
     * #[WebHCat]

Modified: knox/trunk/books/0.12.0/book_client-details.md
URL: http://svn.apache.org/viewvc/knox/trunk/books/0.12.0/book_client-details.md?rev=1787275&r1=1787274&r2=1787275&view=diff
==============================================================================
--- knox/trunk/books/0.12.0/book_client-details.md (original)
+++ knox/trunk/books/0.12.0/book_client-details.md Fri Mar 17 01:16:34 2017
@@ -16,13 +16,180 @@
 --->
 
 ## Client Details ##
+The KnoxShell release artifact provides a small footprint client environment that removes
all unnecessary server dependencies, configuration, binary scripts, etc. It is comprised of a
couple of different things that empower different sorts of users.
 
-Hadoop requires a client that can be used to interact remotely with the services provided
by Hadoop cluster.
-This will also be true when using the Apache Knox Gateway to provide perimeter security and
centralized access for these services.
-The two primary existing clients for Hadoop are the CLI (i.e. Command Line Interface, hadoop)
and [Hue](http://gethue.com/) (i.e. Hadoop User Experience).
-For several reasons however, neither of these clients can _currently_ be used to access Hadoop
services via the Apache Knox Gateway.
+* A set of SDK type classes for providing access to Hadoop resources over HTTP
+* A Groovy based DSL for scripting access to Hadoop resources based on the underlying SDK
classes
+* KnoxShell Token-based Sessions to provide a CLI SSO session for executing multiple scripts
 
-This led to thinking about a very simple client that could help people use and evaluate the
gateway.
+The following sections provide an overview and quickstart for the KnoxShell.
+
+### Client Quickstart ###
+The following installation and setup instructions should get you started with using the KnoxShell
very quickly.
+
+1. Download a knoxshell-x.x.x.zip or tar file and unzip it in your preferred location {GATEWAY_CLIENT_HOME}
+
+        home:knoxshell-0.12.0 larry$ ls -l
+        total 296
+        -rw-r--r--@  1 larry  staff  71714 Mar 14 14:06 LICENSE
+        -rw-r--r--@  1 larry  staff    164 Mar 14 14:06 NOTICE
+        -rw-r--r--@  1 larry  staff  71714 Mar 15 20:04 README
+        drwxr-xr-x@ 12 larry  staff    408 Mar 15 21:24 bin
+        drwxr--r--@  3 larry  staff    102 Mar 14 14:06 conf
+        drwxr-xr-x+  3 larry  staff    102 Mar 15 12:41 logs
+        drwxr-xr-x@ 18 larry  staff    612 Mar 14 14:18 samples
+        
+    |Directory    | Description |
+    |-------------|-------------|
+    |bin          |contains the main knoxshell jar and related shell scripts|
+    |conf         |only contains log4j config|
+    |logs         |contains the knoxshell.log file|
+    |samples      |has numerous examples to help you get started|
+
+2. cd {GATEWAY_CLIENT_HOME}
+3. Get/set up the truststore for the target Knox instance or fronting load balancer
+    - if you have access to the server you may use the command knoxcli.sh export-cert --type
JKS
+    - copy the resulting gateway-client-identity.jks to your user home directory
+4. Execute an example script from the {GATEWAY_CLIENT_HOME}/samples directory - for instance:
+    - bin/knoxshell.sh samples/ExampleWebHdfsLs.groovy
+    
+        home:knoxshell-0.12.0 larry$ bin/knoxshell.sh samples/ExampleWebHdfsLs.groovy
+        Enter username: guest
+        Enter password:
+        [app-logs, apps, mapred, mr-history, tmp, user]
+
+At this point, you should have seen output similar to the above, probably with different
directories listed. Take a look at the sample that we ran above:
+
+    import groovy.json.JsonSlurper
+    import org.apache.hadoop.gateway.shell.Hadoop
+    import org.apache.hadoop.gateway.shell.hdfs.Hdfs
+
+    import org.apache.hadoop.gateway.shell.Credentials
+
+    gateway = "https://localhost:8443/gateway/sandbox"
+
+    credentials = new Credentials()
+    credentials.add("ClearInput", "Enter username: ", "user")
+                    .add("HiddenInput", "Enter pas" + "sword: ", "pass")
+    credentials.collect()
+
+    username = credentials.get("user").string()
+    pass = credentials.get("pass").string()
+
+    session = Hadoop.login( gateway, username, pass )
+
+    text = Hdfs.ls( session ).dir( "/" ).now().string
+    json = (new JsonSlurper()).parseText( text )
+    println json.FileStatuses.FileStatus.pathSuffix
+    session.shutdown()
+
+Some things to note about this sample:
+
+1. the gateway URL is hardcoded
+    - alternatives would be passing it as an argument to the script, using an environment
variable or prompting for it with a ClearInput credential collector
+2. credential collectors are used to gather credentials or other input from various sources.
In this sample the HiddenInput and ClearInput collectors prompt the user for the input with
the provided prompt text and the values are acquired by a subsequent get call with the provided
name value.
+3. The Hadoop.login method establishes a login session of sorts which will need to be provided
to the various API classes as an argument.
+4. the response text is easily retrieved as a string and can be parsed by the JsonSlurper
or whatever you like
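+The JsonSlurper call in the sample simply turns the WebHDFS response body into a navigable
structure, so the same extraction can be done in any language. A minimal Python sketch of the
equivalent parsing (the response text here is a hypothetical LISTSTATUS-shaped body, not output
from a live gateway):

```python
import json

# Hypothetical WebHDFS LISTSTATUS response body, shaped like what
# Hdfs.ls( session ).dir( "/" ).now().string returns in the Groovy sample.
text = '''{"FileStatuses": {"FileStatus": [
  {"pathSuffix": "app-logs", "type": "DIRECTORY"},
  {"pathSuffix": "tmp", "type": "DIRECTORY"},
  {"pathSuffix": "user", "type": "DIRECTORY"}]}}'''

parsed = json.loads(text)
# Equivalent of: println json.FileStatuses.FileStatus.pathSuffix
names = [s["pathSuffix"] for s in parsed["FileStatuses"]["FileStatus"]]
print(names)  # ['app-logs', 'tmp', 'user']
```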
+
+### Client Token Sessions ###
+Building on the Quickstart above, we will drill into some of the token session details here
and walk through another sample.
+
+Unlike the quickstart, token sessions require the server to be configured in specific ways
to allow the use of token sessions/federation.
+
+#### Server Setup ####
+1. KnoxToken service should be added to your sandbox.xml topology - see the [KnoxToken Configuration
Section](#KnoxToken+Configuration)
+
+        <service>
+           <role>KNOXTOKEN</role>
+           <param>
+              <name>knox.token.ttl</name>
+              <value>36000000</value>
+           </param>
+           <param>
+              <name>knox.token.audiences</name>
+              <value>tokenbased</value>
+           </param>
+           <param>
+              <name>knox.token.target.url</name>
+              <value>https://localhost:8443/gateway/tokenbased</value>
+           </param>
+        </service>
+
+2. tokenbased.xml topology to accept tokens as federation tokens for access to exposed resources
with JWTProvider [JWT Provider](#JWT+Provider)
+
+        <provider>
+           <role>federation</role>
+           <name>JWTProvider</name>
+           <enabled>true</enabled>
+           <param>
+               <name>knox.token.audiences</name>
+               <value>tokenbased</value>
+           </param>
+        </provider>
+
+3. Use the KnoxShell token commands to establish and manage your session
+    - bin/knoxshell.sh init https://localhost:8443/gateway/sandbox to acquire a token and
cache in user home directory
+    - bin/knoxshell.sh list to display the details of the cached token, the expiration time
and optionally the target url
+    - bin/knoxshell.sh destroy to remove the cached session token and terminate the session
+
+4. Execute a script that can take advantage of the token credential collector and target
url
+
+        import groovy.json.JsonSlurper
+        import java.util.HashMap
+        import java.util.Map
+        import org.apache.hadoop.gateway.shell.Credentials
+        import org.apache.hadoop.gateway.shell.Hadoop
+        import org.apache.hadoop.gateway.shell.hdfs.Hdfs
+
+        credentials = new Credentials()
+        credentials.add("KnoxToken", "none: ", "token")
+        credentials.collect()
+
+        token = credentials.get("token").string()
+
+        gateway = System.getenv("KNOXSHELL_TOPOLOGY_URL")
+        if (gateway == null || gateway.equals("")) {
+          gateway = credentials.get("token").getTargetUrl()
+        }
+
+        println ""
+        println "*****************************GATEWAY INSTANCE**********************************"
+        println gateway
+        println "*******************************************************************************"
+        println ""
+
+        headers = new HashMap()
+        headers.put("Authorization", "Bearer " + token)
+
+        session = Hadoop.login( gateway, headers )
+
+        if (args.length > 0) {
+          dir = args[0]
+        } else {
+          dir = "/"
+        }
+
+        text = Hdfs.ls( session ).dir( dir ).now().string
+        json = (new JsonSlurper()).parseText( text )
+        statuses = json.get("FileStatuses");
+
+        println statuses
+
+        session.shutdown()
+
+Note the following about the above sample script:
+
+1. use of the KnoxToken credential collector
+2. use of the targetUrl from the credential collector
+3. optional override of the target url with environment variable
+4. the passing of the headers map to the session creation in Hadoop.login
+5. the passing of an argument for the ls command for the path to list or default to "/"
+
+Also note that there is no reason to prompt for username and password as long as the token
has not been destroyed or expired.
+There is also no hardcoded endpoint for using the token - it is specified in the token cache
or overridden by environment variable.
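+Two details above are easy to miss: the knox.token.ttl value is expressed in milliseconds, and
the token travels in a standard Authorization header with the Bearer scheme. A brief Python
sketch of both (the token string is a placeholder, not a real JWT):

```python
# knox.token.ttl is in milliseconds; the sample topology uses 36000000.
ttl_ms = 36000000
ttl_hours = ttl_ms / (1000 * 60 * 60)
print(ttl_hours)  # 10.0 -> cached tokens remain valid for ten hours

# The Groovy sample sends the token as: Authorization: Bearer <token>.
# The value below is a placeholder, not a real token.
token = "eyJhbGciOiJSUzI1NiJ9.placeholder.signature"
headers = {"Authorization": "Bearer " + token}
print(headers["Authorization"][:7])  # prints "Bearer "
```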
+
+## Client DSL and SDK Details ##
+
+The lack of any formal SDK or client for REST APIs in Hadoop led to thinking about a very
simple client that could help people use and evaluate the gateway.
 The list below outlines the general requirements for such a client.
 
 * Promote the evaluation and adoption of the Apache Knox Gateway


