knox-commits mailing list archives

From pzamp...@apache.org
Subject svn commit: r1834176 - in /knox: site/books/knox-1-1-0/user-guide.html trunk/books/1.1.0/config_ha.md
Date Sat, 23 Jun 2018 00:30:39 GMT
Author: pzampino
Date: Sat Jun 23 00:30:39 2018
New Revision: 1834176

URL: http://svn.apache.org/viewvc?rev=1834176&view=rev
Log:
Documented ZooKeeper-based Knox HA support

Modified:
    knox/site/books/knox-1-1-0/user-guide.html
    knox/trunk/books/1.1.0/config_ha.md

Modified: knox/site/books/knox-1-1-0/user-guide.html
URL: http://svn.apache.org/viewvc/knox/site/books/knox-1-1-0/user-guide.html?rev=1834176&r1=1834175&r2=1834176&view=diff
==============================================================================
--- knox/site/books/knox-1-1-0/user-guide.html (original)
+++ knox/site/books/knox-1-1-0/user-guide.html Sat Jun 23 00:30:39 2018
@@ -3220,12 +3220,21 @@ exit
 </code></pre><p>Copy knox.service.keytab created on KDC host on to your
Knox host <code>{GATEWAY_HOME}/conf/knox.service.keytab</code></p>
 <pre><code>chown knox knox.service.keytab
 chmod 400 knox.service.keytab
-</code></pre><h4><a id="Update+`krb5.conf`+at+`{GATEWAY_HOME}/conf/krb5.conf`+on+Knox+host">Update
<code>krb5.conf</code> at <code>{GATEWAY_HOME}/conf/krb5.conf</code>
on Knox host</a> <a href="#Update+`krb5.conf`+at+`{GATEWAY_HOME}/conf/krb5.conf`+on+Knox+host"><img
src="markbook-section-link.png"/></a></h4><p>You could copy the <code>{GATEWAY_HOME}/templates/krb5.conf</code>
file provided in the Knox binary download and customize it to suit your cluster.</p><h4><a
id="Update+`krb5JAASLogin.conf`+at+`/etc/knox/conf/krb5JAASLogin.conf`+on+Knox+host">Update
<code>krb5JAASLogin.conf</code> at <code>/etc/knox/conf/krb5JAASLogin.conf</code>
on Knox host</a> <a href="#Update+`krb5JAASLogin.conf`+at+`/etc/knox/conf/krb5JAASLogin.conf`+on+Knox+host"><img
src="markbook-section-link.png"/></a></h4><p>You could copy the <code>{GATEWAY_HOME}/templates/krb5JAASLogin.conf</code>
file provided in the Knox binary download and customize it to suit your cluster.</p><h4><a
id="Update+`gateway-site.xml`+on+Knox+host">Update <code>gateway-site.xml</code> on Knox host</a>
<a href="#Update+`gateway-site.xml`+on+Knox+host"><img src="markbook-section-link.png"/></a></h4><p>Update
<code>conf/gateway-site.xml</code> in your Knox installation and set the value
of <code>gateway.hadoop.kerberos.secured</code> to true.</p><h4><a
id="Restart+Knox">Restart Knox</a> <a href="#Restart+Knox"><img src="markbook-section-link.png"/></a></h4><p>After
you do the above configurations and restart Knox, Knox would use SPNego to authenticate with
Hadoop services and Oozie. There is no change in the way you make calls to Knox whether you
use Curl or Knox DSL.</p><h3><a id="High+Availability">High Availability</a>
<a href="#High+Availability"><img src="markbook-section-link.png"/></a></h3><p>This
describes how Knox itself can be made highly available.</p><h4><a id="Configure+Knox+instances">Configure
Knox instances</a> <a href="#Configure+Knox+instances"><img src="markbook-section-link.png"/></a></h4><p>All
Knox instances must be synced to use the same topology credential keystores. These files
are located under <code>{GATEWAY_HOME}/conf/security/keystores/{TOPOLOGY_NAME}-credentials.jceks</code>.
They are generated after the first topology deployment. Currently these files need to be synced
manually. Here are the steps to sync topologies credentials keystores:</p>
+</code></pre><h4><a id="Update+`krb5.conf`+at+`{GATEWAY_HOME}/conf/krb5.conf`+on+Knox+host">Update
<code>krb5.conf</code> at <code>{GATEWAY_HOME}/conf/krb5.conf</code>
on Knox host</a> <a href="#Update+`krb5.conf`+at+`{GATEWAY_HOME}/conf/krb5.conf`+on+Knox+host"><img
src="markbook-section-link.png"/></a></h4><p>You could copy the <code>{GATEWAY_HOME}/templates/krb5.conf</code>
file provided in the Knox binary download and customize it to suit your cluster.</p><h4><a
id="Update+`krb5JAASLogin.conf`+at+`/etc/knox/conf/krb5JAASLogin.conf`+on+Knox+host">Update
<code>krb5JAASLogin.conf</code> at <code>/etc/knox/conf/krb5JAASLogin.conf</code>
on Knox host</a> <a href="#Update+`krb5JAASLogin.conf`+at+`/etc/knox/conf/krb5JAASLogin.conf`+on+Knox+host"><img
src="markbook-section-link.png"/></a></h4><p>You could copy the <code>{GATEWAY_HOME}/templates/krb5JAASLogin.conf</code>
file provided in the Knox binary download and customize it to suit your cluster.</p><h4><a
id="Update+`gateway-site.xml`+on+Knox+host">Update <code>gateway-site.xml</code> on Knox host</a>
<a href="#Update+`gateway-site.xml`+on+Knox+host"><img src="markbook-section-link.png"/></a></h4><p>Update
<code>conf/gateway-site.xml</code> in your Knox installation and set the value
of <code>gateway.hadoop.kerberos.secured</code> to true.</p><h4><a
id="Restart+Knox">Restart Knox</a> <a href="#Restart+Knox"><img src="markbook-section-link.png"/></a></h4><p>After
you do the above configurations and restart Knox, Knox would use SPNego to authenticate with
Hadoop services and Oozie. There is no change in the way you make calls to Knox whether you
use Curl or Knox DSL.</p><h3><a id="High+Availability">High Availability</a>
<a href="#High+Availability"><img src="markbook-section-link.png"/></a></h3><p>This
describes how Knox itself can be made highly available.</p><p>All Knox instances
must be synced to use the same topology credential keystores. These files are located under
<code>{GATEWAY_HOME}/conf/security/keystores/{TOPOLOGY_NAME}-credentials.jceks</code>. They are generated after the first
topology deployment.</p><p>In addition to these topology-specific credentials,
gateway credentials and topologies must also be kept in-sync for Knox to operate in an HA
manner.</p><h4><a id="Manually+Synchronize+Knox+Instances">Manually Synchronize
Knox Instances</a> <a href="#Manually+Synchronize+Knox+Instances"><img src="markbook-section-link.png"/></a></h4><p>Here
are the steps to manually sync topology credential keystores:</p>
 <ol>
   <li>Choose a Knox instance that will be the source for topology credential keystores.
Let&rsquo;s call it <em>keystores master</em></li>
   <li>Replace the topology credential keystores in the other Knox instances with topology
credential keystores from the <em>keystores master</em></li>
   <li>Restart Knox instances</li>
-</ol><h4><a id="High+Availability+with+Apache+HTTP+Server+++mod_proxy+++mod_proxy_balancer">High
Availability with Apache HTTP Server + mod_proxy + mod_proxy_balancer</a> <a href="#High+Availability+with+Apache+HTTP+Server+++mod_proxy+++mod_proxy_balancer"><img
src="markbook-section-link.png"/></a></h4><h5><a id="1+-+Requirements">1
- Requirements</a> <a href="#1+-+Requirements"><img src="markbook-section-link.png"/></a></h5><h6><a
id="openssl-devel">openssl-devel</a> <a href="#openssl-devel"><img src="markbook-section-link.png"/></a></h6><p>openssl-devel
is required for Apache Module mod_ssl.</p>
+</ol><p>Manually synchronizing the gateway credentials and topologies involves
using ssh/scp to copy the topology-related files to all the participating Knox instances,
and running the Knox CLI on each participating instance to define the credential aliases.</p><p>This
manual process can be tedious and error-prone. As such, ZooKeeper-based HA is recommended
to simplify the management of these deployments.</p><h4><a id="High+Availability+with+Apache+ZooKeeper">High
Availability with Apache ZooKeeper</a> <a href="#High+Availability+with+Apache+ZooKeeper"><img
src="markbook-section-link.png"/></a></h4><p>Rather than manually keeping
Knox HA instances in sync (in terms of credentials and topology), Knox can get its
state from Apache ZooKeeper. By configuring all the Knox instances to monitor the same ZooKeeper
ensemble, they can be kept in-sync by modifying the topology-related configuration and/or
credential aliases at only one of the instances (using the Admin UI, Admin API, or Knox CLI).</p><h5><a id="What+is+Automatically+Synchronized+Across+Instances?">What
is Automatically Synchronized Across Instances?</a> <a href="#What+is+Automatically+Synchronized+Across+Instances?"><img
src="markbook-section-link.png"/></a></h5>
+<ul>
+  <li>Provider Configurations</li>
+  <li>Descriptors</li>
+  <li>Credential Aliases</li>
+</ul><p>When a provider configuration or descriptor is added to or updated in the
ZooKeeper ensemble, all of the participating Knox instances will get the change, and the affected
topologies will be [re]generated and [re]deployed. Similarly, if one of these is deleted,
the affected topologies will be deleted and undeployed.</p><p>When provider configurations
and descriptors are added, modified or removed using the Admin UI or API (when the Knox instance
is configured to monitor a ZooKeeper ensemble), then those changes will be automatically reflected
in the associated ZooKeeper ensemble. Those changes will subsequently be consumed by all the
other Knox instances monitoring that ensemble. By using the Admin UI or API, ssh/scp access
to the Knox hosts can be avoided completely for the purpose of effecting topology changes.</p><p>Similarly,
when the Knox CLI is used to create or delete a gateway alias (when the Knox instance is configured
to monitor a ZooKeeper ensemble), that alias change is reflected in the ZooKeeper ensemble,
and all other Knox instances monitoring that ensemble
will apply the change.</p><h5><a id="What+is+NOT+Automatically+Synchronized+Across+Instances?">What
is NOT Automatically Synchronized Across Instances?</a> <a href="#What+is+NOT+Automatically+Synchronized+Across+Instances?"><img
src="markbook-section-link.png"/></a></h5>
+<ul>
+  <li>Topologies (XML)</li>
+  <li>Gateway config (e.g., gateway-site, gateway-logging, etc&hellip;)</li>
+</ul><p>If you&rsquo;re creating/modifying topology XML files directly, then
there is no automated support for keeping these in sync across Knox HA instances.</p><p>However,
if the Knox instances are running in an Apache Ambari-managed cluster, there is limited support
for keeping topology XML files and gateway configuration synchronized across those instances.</p><p><br></p><h4><a
id="High+Availability+with+Apache+HTTP+Server+++mod_proxy+++mod_proxy_balancer">High Availability
with Apache HTTP Server + mod_proxy + mod_proxy_balancer</a> <a href="#High+Availability+with+Apache+HTTP+Server+++mod_proxy+++mod_proxy_balancer"><img
src="markbook-section-link.png"/></a></h4><h5><a id="1+-+Requirements">1
- Requirements</a> <a href="#1+-+Requirements"><img src="markbook-section-link.png"/></a></h5><h6><a
id="openssl-devel">openssl-devel</a> <a href="#openssl-devel"><img src="markbook-section-link.png"/></a></h6><p>openssl-devel
is required for Apache Module mod_ssl.</p>
 <pre><code>sudo yum install openssl-devel
 </code></pre><h6><a id="Apache+HTTP+Server">Apache HTTP Server</a>
<a href="#Apache+HTTP+Server"><img src="markbook-section-link.png"/></a></h6><p>Apache
HTTP Server 2.4.6 or later is required. See this document for installing and setting up Apache
HTTP Server: <a href="http://httpd.apache.org/docs/2.4/install.html">http://httpd.apache.org/docs/2.4/install.html</a></p><p>Hint:
pass <code>--enable-ssl</code> to the <code>./configure</code> command
to enable the generation of the Apache Module <em>mod_ssl</em>.</p><h6><a
id="Apache+Module+mod_proxy">Apache Module mod_proxy</a> <a href="#Apache+Module+mod_proxy"><img
src="markbook-section-link.png"/></a></h6><p>See this document for setting
up Apache Module mod_proxy: <a href="http://httpd.apache.org/docs/2.4/mod/mod_proxy.html">http://httpd.apache.org/docs/2.4/mod/mod_proxy.html</a></p><h6><a
id="Apache+Module+mod_proxy_balancer">Apache Module mod_proxy_balancer</a> <a
href="#Apache+Module+mod_proxy_balancer"><img src="markbook-section-link.png"/></a></h6><p>See this document for setting up Apache Module
mod_proxy_balancer: <a href="http://httpd.apache.org/docs/2.4/mod/mod_proxy_balancer.html">http://httpd.apache.org/docs/2.4/mod/mod_proxy_balancer.html</a></p><h6><a
id="Apache+Module+mod_ssl">Apache Module mod_ssl</a> <a href="#Apache+Module+mod_ssl"><img
src="markbook-section-link.png"/></a></h6><p>See this document for setting
up Apache Module mod_ssl: <a href="http://httpd.apache.org/docs/2.4/mod/mod_ssl.html">http://httpd.apache.org/docs/2.4/mod/mod_ssl.html</a></p><h5><a
id="2+-+Configuration+example">2 - Configuration example</a> <a href="#2+-+Configuration+example"><img
src="markbook-section-link.png"/></a></h5><h6><a id="Generate+certificate+for+Apache+HTTP+Server">Generate
certificate for Apache HTTP Server</a> <a href="#Generate+certificate+for+Apache+HTTP+Server"><img
src="markbook-section-link.png"/></a></h6><p>See this document for an
example: <a href="http://www.akadia.com/services/ssh_test_certificate.html">http://www.akadia.com/services/ssh_test_certificate.html</a></p><p>By
convention, Apache HTTP Server and Knox certificates are put into /etc/apache2/ssl/ folder.</p><h6><a
id="Update+Apache+HTTP+Server+configuration+file">Update Apache HTTP Server configuration
file</a> <a href="#Update+Apache+HTTP+Server+configuration+file"><img src="markbook-section-link.png"/></a></h6><p>This
file is located under {APACHE_HOME}/conf/httpd.conf.</p><p>Following directives
have to be added or uncommented in the configuration file:</p>
 <ul>

Modified: knox/trunk/books/1.1.0/config_ha.md
URL: http://svn.apache.org/viewvc/knox/trunk/books/1.1.0/config_ha.md?rev=1834176&r1=1834175&r2=1834176&view=diff
==============================================================================
--- knox/trunk/books/1.1.0/config_ha.md (original)
+++ knox/trunk/books/1.1.0/config_ha.md Sat Jun 23 00:30:39 2018
@@ -19,18 +19,55 @@
 
 This describes how Knox itself can be made highly available.
 
-#### Configure Knox instances ####
-
 All Knox instances must be synced to use the same topology credential keystores.
 These files are located under `{GATEWAY_HOME}/conf/security/keystores/{TOPOLOGY_NAME}-credentials.jceks`.
 They are generated after the first topology deployment.
-Currently these files need to be synced manually.
-Here are the steps to sync topologies credentials keystores:
+
+In addition to these topology-specific credentials, gateway credentials and topologies must
also be kept in-sync for Knox to operate in an HA manner.
+
+#### Manually Synchronize Knox Instances ####
+
+Here are the steps to manually sync topology credential keystores:
 
 1. Choose a Knox instance that will be the source for topology credential keystores. Let's
call it _keystores master_
 2. Replace the topology credential keystores in the other Knox instances with topology credential
keystores from the _keystores master_
 3. Restart Knox instances
 
+Manually synchronizing the gateway credentials and topologies involves using ssh/scp to copy
the topology-related files to all the participating Knox instances, and running the Knox CLI
on each participating instance to define the credential aliases.
+
+This manual process can be tedious and error-prone. As such, ZooKeeper-based HA is recommended
to simplify the management of these deployments.
+
+#### High Availability with Apache ZooKeeper ####
+
+Rather than manually keeping Knox HA instances in sync (in terms of credentials and topology),
Knox can get its state from Apache ZooKeeper.
+By configuring all the Knox instances to monitor the same ZooKeeper ensemble, they can be
kept in-sync by modifying the topology-related
+configuration and/or credential aliases at only one of the instances (using the Admin UI,
Admin API, or Knox CLI).
+
+##### What is Automatically Synchronized Across Instances?
+
+* Provider Configurations
+* Descriptors
+* Credential Aliases
+
+When a provider configuration or descriptor is added to or updated in the ZooKeeper ensemble,
all of the participating Knox instances will get the change, and the affected topologies will
be [re]generated and [re]deployed. Similarly, if one of these is deleted, the affected topologies
will be deleted and undeployed.
+
+When provider configurations and descriptors are added, modified or removed using the Admin
UI or API (when the Knox instance is configured to monitor a ZooKeeper ensemble), then those
changes will be automatically reflected in the associated ZooKeeper ensemble. Those changes
will subsequently be consumed by all the other Knox instances monitoring that ensemble.
+By using the Admin UI or API, ssh/scp access to the Knox hosts can be avoided completely
for the purpose of effecting topology changes.
+
+Similarly, when the Knox CLI is used to create or delete a gateway alias (when the Knox instance
is configured to monitor a ZooKeeper ensemble), that alias change is reflected in the ZooKeeper
ensemble, and all other Knox instances monitoring that ensemble will apply the change.
+
+
+##### What is NOT Automatically Synchronized Across Instances?
+
+* Topologies (XML)
+* Gateway config (e.g., gateway-site, gateway-logging, etc...)
+
+If you're creating/modifying topology XML files directly, then there is no automated support
for keeping these in sync across Knox HA instances.
+
+However, if the Knox instances are running in an Apache Ambari-managed cluster, there is
limited support for keeping topology XML files and gateway configuration synchronized across
those instances.
+
+<br>
+
 #### High Availability with Apache HTTP Server + mod_proxy + mod_proxy_balancer ####
 
 ##### 1 - Requirements #####
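
The manual keystore synchronization described in the config_ha.md hunk above (choose a keystores master, then replace the keystores on the other instances) could be sketched as the dry-run helper below. It is only an illustration and is not part of this commit: the function name `print_keystore_sync`, the `/opt/knox` install path, the `knox` ssh user, and the host names are all assumptions. It only prints the `scp` commands and executes nothing. Restarting the instances and defining credential aliases with the Knox CLI are not covered.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints the scp commands for a manual keystore sync.
# Nothing is copied. Paths, the "knox" ssh user, and host names are
# hypothetical placeholders, not values taken from the Knox documentation.

# Usage: print_keystore_sync GATEWAY_HOME HOST...
# Run on the "keystores master"; emits one scp command per target host that
# would copy every {TOPOLOGY_NAME}-credentials.jceks file to the same
# directory on that host.
print_keystore_sync() {
  local gateway_home="$1"; shift
  local keystore_dir="${gateway_home}/conf/security/keystores"
  local host
  for host in "$@"; do
    # The glob stays literal here because the string is quoted.
    echo "scp ${keystore_dir}/*-credentials.jceks knox@${host}:${keystore_dir}/"
  done
}

# Example invocation with two hypothetical target hosts.
print_keystore_sync "/opt/knox" knox-host-2 knox-host-3
```

Reviewing the printed commands before running them makes it harder to overwrite the wrong instance's keystores, which is the kind of error the commit text warns about when it calls the manual process tedious and error-prone.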


