sqoop-commits mailing list archives

From a..@apache.org
Subject svn commit: r1641708 [2/12] - in /sqoop/site/trunk/content: ./ resources/docs/1.99.4/ resources/docs/1.99.4/_sources/ resources/docs/1.99.4/_static/ resources/docs/1.99.4/css/ resources/docs/1.99.4/src/ resources/docs/1.99.4/src/main/ resources/docs/1....
Date Tue, 25 Nov 2014 21:56:41 GMT
Added: sqoop/site/trunk/content/resources/docs/1.99.4/ConnectorDevelopment.html
URL: http://svn.apache.org/viewvc/sqoop/site/trunk/content/resources/docs/1.99.4/ConnectorDevelopment.html?rev=1641708&view=auto
==============================================================================
--- sqoop/site/trunk/content/resources/docs/1.99.4/ConnectorDevelopment.html (added)
+++ sqoop/site/trunk/content/resources/docs/1.99.4/ConnectorDevelopment.html Tue Nov 25 21:56:40 2014
@@ -0,0 +1,468 @@
+
+
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
+  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
+
+
+<html xmlns="http://www.w3.org/1999/xhtml">
+  <head>
+    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
+    
+    <title>Sqoop 2 Connector Development &mdash; Apache Sqoop  documentation</title>
+    
+    <link rel="stylesheet" href="_static/haiku.css" type="text/css" />
+    <link rel="stylesheet" href="_static/pygments.css" type="text/css" />
+    <link rel="stylesheet" href="_static/print.css" type="text/css" />
+    
+    <script type="text/javascript">
+      var DOCUMENTATION_OPTIONS = {
+        URL_ROOT:    '',
+        VERSION:     '',
+        COLLAPSE_INDEX: false,
+        FILE_SUFFIX: '.html',
+        HAS_SOURCE:  true
+      };
+    </script>
+    <script type="text/javascript" src="_static/jquery.js"></script>
+    <script type="text/javascript" src="_static/underscore.js"></script>
+    <script type="text/javascript" src="_static/doctools.js"></script>
+    <script type="text/javascript" src="_static/theme_extras.js"></script>
+    <link rel="top" title="Apache Sqoop  documentation" href="index.html" /> 
+  </head>
+  <body>
+      <div class="header"><img class="rightlogo" src="_static/sqoop-logo.png" alt="Logo"/><h1 class="heading"><a href="index.html">
+          <span>Apache Sqoop  documentation</span></a></h1>
+        <h2 class="heading"><span>Sqoop 2 Connector Development</span></h2>
+      </div>
+      <div class="topnav">
+      
+        <p>
+        <a class="uplink" href="index.html">Contents</a>
+        </p>
+
+      </div>
+      <div class="content">
+        
+        
+  <div class="section" id="sqoop-2-connector-development">
+<h1><a class="toc-backref" href="#id2">Sqoop 2 Connector Development</a><a class="headerlink" href="#sqoop-2-connector-development" title="Permalink to this headline">¶</a></h1>
+<p>This document describes how to implement a connector in Sqoop 2, using code samples from one of the built-in connectors (<tt class="docutils literal"><span class="pre">GenericJdbcConnector</span></tt>) as a reference. Sqoop 2 jobs support extraction from and/or loading to different data sources. Sqoop 2 connectors encapsulate the job lifecycle operations for extracting data from and/or loading data to
+different data sources. Each connector focuses primarily on a particular data source and on a custom implementation for reading and/or writing its data efficiently in a distributed environment.</p>
+<div class="contents topic" id="contents">
+<p class="topic-title first">Contents</p>
+<ul class="simple">
+<li><a class="reference internal" href="#sqoop-2-connector-development" id="id2">Sqoop 2 Connector Development</a><ul>
+<li><a class="reference internal" href="#what-is-a-sqoop-connector" id="id3">What is a Sqoop Connector?</a><ul>
+<li><a class="reference internal" href="#when-do-we-add-a-new-connector" id="id4">When do we add a new connector?</a></li>
+</ul>
+</li>
+<li><a class="reference internal" href="#connector-implementation" id="id5">Connector Implementation</a><ul>
+<li><a class="reference internal" href="#from" id="id6">From</a><ul>
+<li><a class="reference internal" href="#initializer-and-destroyer" id="id7">Initializer and Destroyer</a></li>
+<li><a class="reference internal" href="#partitioner" id="id8">Partitioner</a></li>
+<li><a class="reference internal" href="#extractor" id="id9">Extractor</a></li>
+</ul>
+</li>
+<li><a class="reference internal" href="#to" id="id10">To</a><ul>
+<li><a class="reference internal" href="#id1" id="id11">Initializer and Destroyer</a></li>
+<li><a class="reference internal" href="#loader" id="id12">Loader</a></li>
+</ul>
+</li>
+</ul>
+</li>
+<li><a class="reference internal" href="#configurables" id="id13">Configurables</a><ul>
+<li><a class="reference internal" href="#configurable-registration" id="id14">Configurable registration</a></li>
+<li><a class="reference internal" href="#configurations" id="id15">Configurations</a><ul>
+<li><a class="reference internal" href="#empty-configuration" id="id16">Empty Configuration</a></li>
+</ul>
+</li>
+<li><a class="reference internal" href="#configuration-resourcebundle" id="id17">Configuration ResourceBundle</a></li>
+<li><a class="reference internal" href="#validations-for-configs-and-inputs" id="id18">Validations for Configs and Inputs</a></li>
+</ul>
+</li>
+<li><a class="reference internal" href="#sqoop-2-mapreduce-job-execution-lifecycle-with-connector-api" id="id19">Sqoop 2 MapReduce Job Execution Lifecycle with Connector API</a></li>
+</ul>
+</li>
+</ul>
+</div>
+<div class="section" id="what-is-a-sqoop-connector">
+<h2><a class="toc-backref" href="#id3">What is a Sqoop Connector?</a><a class="headerlink" href="#what-is-a-sqoop-connector" title="Permalink to this headline">¶</a></h2>
+<p>Connectors provide the facility to interact with many data sources and can therefore be used to transfer data between them in Sqoop. The connector implementation provides the logic to read from and/or write to the data source that it represents. For instance, the <tt class="docutils literal"><span class="pre">GenericJdbcConnector</span></tt> encapsulates the logic to read from and/or write to JDBC-enabled relational data sources. The part of the connector that reads from a data source and transfers the data to the internal Sqoop format is called FROM, and the part that writes data to a data source by transferring it from the Sqoop format is called TO. In order to interact with these data sources, the connector provides one or more config classes, each with its own input fields.</p>
+<p>Broadly, we support two main config types for connectors: the link type, represented by the enum <tt class="docutils literal"><span class="pre">ConfigType.LINK</span></tt>, and the job type, represented by the enum <tt class="docutils literal"><span class="pre">ConfigType.JOB</span></tt>. The link config represents the properties needed to physically connect to the data source. The job config represents the properties required to read from and/or write to a particular dataset in the data source it connects to. If a connector supports both reading and writing, it provides both the <tt class="docutils literal"><span class="pre">FromJobConfig</span></tt> and <tt class="docutils literal"><span class="pre">ToJobConfig</span></tt> objects. Each of these config objects is custom to its connector and can have one or more inputs associated with each of the Link, FromJob and ToJob config types. Hence we call connectors configurables, i.e., entities that can provide configs for interacting with the data source they represent. As connectors evolve over time to support new features in their data sources, the configs and inputs will change as well. Thus the connector API also provides methods for upgrading the config and input names and the data related to these data sources across different versions.</p>
+<p>Connectors implement the logic for the various stages of the extract/load process using the connector API described below. While extracting/reading data from the data source, the main stages are <tt class="docutils literal"><span class="pre">Initializer</span></tt>, <tt class="docutils literal"><span class="pre">Partitioner</span></tt>, <tt class="docutils literal"><span class="pre">Extractor</span></tt> and <tt class="docutils literal"><span class="pre">Destroyer</span></tt>. While loading/writing data to the data source, the main stages currently supported are <tt class="docutils literal"><span class="pre">Initializer</span></tt>, <tt class="docutils literal"><span class="pre">Loader</span></tt> and <tt class="docutils literal"><span class="pre">Destroyer</span></tt>. Each stage has its own set of responsibilities, which are explained in detail below. Since connectors understand the internals of the data source they represent, they work in tandem with the execution engines supported by Sqoop, such as MapReduce or (in the future) Spark, to accomplish this process as efficiently as possible.</p>
+<div class="section" id="when-do-we-add-a-new-connector">
+<h3><a class="toc-backref" href="#id4">When do we add a new connector?</a><a class="headerlink" href="#when-do-we-add-a-new-connector" title="Permalink to this headline">¶</a></h3>
+<p>You add a new connector when you need to extract/read data from a new data source, or load/write
+data into a new data source that is not supported yet in Sqoop 2.
+In addition to the connector API, Sqoop 2 also has a submission and execution engine interface.
+At the moment the only supported engine is MapReduce, but we may support additional engines in the future such as Spark. Since many parallel execution engines are capable of reading/writing data, there may be a question of whether adding support for a new data source should be done through the connector or the execution engine API.</p>
+<p><strong>Our guidelines are as follows:</strong> Connectors should manage all data extraction (reading) from and/or loading (writing) into a data source. The submission and execution engines together manage the job submission and execution life cycle to read/write data from/to data sources in the most efficient way possible. If you need to support a new data store and the details of linking to it, and you do not care how the reading/writing process itself is executed, then you are looking to add a connector, and you should continue reading the Connector API details below to contribute new connectors to Sqoop 2.</p>
+</div>
+</div>
+<div class="section" id="connector-implementation">
+<h2><a class="toc-backref" href="#id5">Connector Implementation</a><a class="headerlink" href="#connector-implementation" title="Permalink to this headline">¶</a></h2>
+<p>The <tt class="docutils literal"><span class="pre">SqoopConnector</span></tt> class defines an API for the connectors that must be implemented by the connector developers. Each Connector must extend <tt class="docutils literal"><span class="pre">SqoopConnector</span></tt> and override the methods shown below.</p>
+<div class="highlight-none"><div class="highlight"><pre>public abstract String getVersion();
+public abstract ResourceBundle getBundle(Locale locale);
+public abstract Class getLinkConfigurationClass();
+public abstract Class getJobConfigurationClass(Direction direction);
+public abstract From getFrom();
+public abstract To getTo();
+public abstract ConnectorConfigurableUpgrader getConfigurableUpgrader();
+</pre></div>
+</div>
+<p>Connectors can optionally override the following methods:</p>
+<div class="highlight-none"><div class="highlight"><pre>public List&lt;Direction&gt; getSupportedDirections();
+public Class&lt;? extends IntermediateDataFormat&lt;?&gt;&gt; getIntermediateDataFormat();
+</pre></div>
+</div>
+<p>The <tt class="docutils literal"><span class="pre">getFrom</span></tt> method returns a <a class="reference internal" href="#from">From</a> instance,
+which is a <tt class="docutils literal"><span class="pre">Transferable</span></tt> entity that encapsulates the operations
+needed to read from the data source that the connector represents.</p>
+<p>The <tt class="docutils literal"><span class="pre">getTo</span></tt> method returns a <a class="reference internal" href="#to">To</a> instance,
+which is a <tt class="docutils literal"><span class="pre">Transferable</span></tt> entity that encapsulates the operations
+needed to write to the data source that the connector represents.</p>
+<p>Methods such as <tt class="docutils literal"><span class="pre">getBundle</span></tt>, <tt class="docutils literal"><span class="pre">getLinkConfigurationClass</span></tt> and <tt class="docutils literal"><span class="pre">getJobConfigurationClass</span></tt>
+are related to <a class="reference internal" href="#configurations">Configurations</a>.</p>
+<p>A connector represents a data source and can support reading FROM its data source, writing TO its data source, or both. The <tt class="docutils literal"><span class="pre">getSupportedDirections</span></tt> method returns the list of directions that the connector implements. This should be a subset of the values in the <tt class="docutils literal"><span class="pre">Direction</span></tt> enum we provide:</p>
+<div class="highlight-none"><div class="highlight"><pre>public List&lt;Direction&gt; getSupportedDirections() {
+    return Arrays.asList(new Direction[]{
+        Direction.FROM,
+        Direction.TO
+    });
+}
+</pre></div>
+</div>
+<div class="section" id="from">
+<h3><a class="toc-backref" href="#id6">From</a><a class="headerlink" href="#from" title="Permalink to this headline">¶</a></h3>
+<p>The <tt class="docutils literal"><span class="pre">getFrom</span></tt> method returns a <a class="reference internal" href="#from">From</a> instance, which is a <tt class="docutils literal"><span class="pre">Transferable</span></tt> entity that encapsulates the operations needed to read from the data source the connector represents. The built-in <tt class="docutils literal"><span class="pre">GenericJdbcConnector</span></tt> defines <tt class="docutils literal"><span class="pre">From</span></tt> like this:</p>
+<div class="highlight-none"><div class="highlight"><pre>private static final From FROM = new From(
+      GenericJdbcFromInitializer.class,
+      GenericJdbcPartitioner.class,
+      GenericJdbcExtractor.class,
+      GenericJdbcFromDestroyer.class);
+...
+
+@Override
+public From getFrom() {
+  return FROM;
+}
+</pre></div>
+</div>
+<div class="section" id="initializer-and-destroyer">
+<h4><a class="toc-backref" href="#id7">Initializer and Destroyer</a><a class="headerlink" href="#initializer-and-destroyer" title="Permalink to this headline">¶</a></h4>
+<p>The Initializer is instantiated before the Sqoop job is submitted to the execution engine and performs preparations such as connecting to the data source, creating temporary tables or adding dependent jar files. Initializers are executed as the first step in the Sqoop job lifecycle. Here is the <tt class="docutils literal"><span class="pre">Initializer</span></tt> API:</p>
+<div class="highlight-none"><div class="highlight"><pre>public abstract void initialize(InitializerContext context, LinkConfiguration linkConfiguration,
+    JobConfiguration jobConfiguration);
+
+public List&lt;String&gt; getJars(InitializerContext context, LinkConfiguration linkConfiguration,
+    JobConfiguration jobConfiguration);
+
+public abstract Schema getSchema(InitializerContext context, LinkConfiguration linkConfiguration,
+    JobConfiguration jobConfiguration);
+</pre></div>
+</div>
+<p>In addition to the initialize() method, where the job execution preparation activities occur, the <tt class="docutils literal"><span class="pre">Initializer</span></tt> must also implement the getSchema() method for the direction it supports. The Sqoop system uses getSchema() to match the data extracted/read by the <tt class="docutils literal"><span class="pre">From</span></tt> instance of the connector with the data loaded/written to the data source of the <tt class="docutils literal"><span class="pre">To</span></tt> instance of the connector. In the case of a relational or columnar database, the returned Schema object includes a collection of columns with their data types. If the data source is schema-less, such as a file, an empty Schema can be returned (i.e., a Schema object without any columns).</p>
+<p>NOTE: Sqoop 2 currently does not support extract and load between two connectors that both represent schema-less data sources. We expect that at least one of the <tt class="docutils literal"><span class="pre">From</span></tt> instance and the <tt class="docutils literal"><span class="pre">To</span></tt> instance of the connectors in the Sqoop job will have a schema. If both <tt class="docutils literal"><span class="pre">From</span></tt> and <tt class="docutils literal"><span class="pre">To</span></tt> have an associated non-empty schema, Sqoop 2 will load data by column name, i.e., data in column &#8220;A&#8221; in the <tt class="docutils literal"><span class="pre">From</span></tt> instance of the connector for the job will be loaded to column &#8220;A&#8221; in the <tt class="docutils literal"><span class="pre">To</span></tt> instance of the connector for that job.</p>
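+<p>The idea of loading by column name can be illustrated with a small, self-contained sketch. This is not Sqoop code: the class <tt class="docutils literal"><span class="pre">ColumnNameMatcher</span></tt> and its <tt class="docutils literal"><span class="pre">reorder</span></tt> method are hypothetical names, and filling unmatched target columns with null is an assumption made only for this illustration.</p>

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustration only: a hypothetical helper, not part of the Sqoop code base.
final class ColumnNameMatcher {
  private ColumnNameMatcher() { }

  /**
   * Reorders one record from the FROM column order into the TO column order,
   * matching columns by name. A TO column with no FROM counterpart is set to
   * null (an assumption of this sketch, not documented Sqoop behaviour).
   */
  static Object[] reorder(List<String> fromColumns, List<String> toColumns, Object[] record) {
    if (fromColumns.size() != record.length) {
      throw new IllegalArgumentException("record length does not match FROM columns");
    }
    // Index each FROM column name by its position in the record.
    Map<String, Integer> positions = new HashMap<>();
    for (int i = 0; i < fromColumns.size(); i++) {
      positions.put(fromColumns.get(i), i);
    }
    // Build the output in TO order, looking each column up by name.
    Object[] result = new Object[toColumns.size()];
    for (int i = 0; i < toColumns.size(); i++) {
      Integer position = positions.get(toColumns.get(i));
      result[i] = (position == null) ? null : record[position];
    }
    return result;
  }
}
```

+<p>With FROM columns (A, B) and TO columns (B, A), the value read from column A ends up in the second slot of the output record, regardless of the order in which the source produced it.</p>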
+<p><tt class="docutils literal"><span class="pre">Destroyer</span></tt> is instantiated after the execution engine finishes its processing. It is the last step in the Sqoop job lifecycle, so it performs pending cleanup tasks such as dropping temporary tables and closing connections. The term destroyer is a little misleading: in the case of the <tt class="docutils literal"><span class="pre">TO</span></tt> instance of the connector, this is also the phase where the final output commits to the data source can happen.</p>
+</div>
+<div class="section" id="partitioner">
+<h4><a class="toc-backref" href="#id8">Partitioner</a><a class="headerlink" href="#partitioner" title="Permalink to this headline">¶</a></h4>
+<p>The <tt class="docutils literal"><span class="pre">Partitioner</span></tt> creates <tt class="docutils literal"><span class="pre">Partition</span></tt> instances ranging from 1..N. N is driven by configuration, and the default number of partitions is set to 10 in the Sqoop code.</p>
+<p><tt class="docutils literal"><span class="pre">Partitioner</span></tt> must implement the <tt class="docutils literal"><span class="pre">getPartitions</span></tt> method of the <tt class="docutils literal"><span class="pre">Partitioner</span></tt> API:</p>
+<div class="highlight-none"><div class="highlight"><pre>public abstract List&lt;Partition&gt; getPartitions(PartitionerContext context,
+    LinkConfiguration linkConfiguration, FromJobConfiguration jobConfiguration);
+</pre></div>
+</div>
+<p><tt class="docutils literal"><span class="pre">Partition</span></tt> instances are passed to the <a class="reference internal" href="#extractor">Extractor</a> as an argument of the <tt class="docutils literal"><span class="pre">extract</span></tt> method.
+The <a class="reference internal" href="#extractor">Extractor</a> uses the given partition to determine which portion of the data to extract.</p>
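+<p>One way a partitioner could carve up the work is sketched below, assuming the data can be addressed by a numeric key with known minimum and maximum values. The class <tt class="docutils literal"><span class="pre">RangeSplitter</span></tt> is hypothetical and is not the algorithm used by <tt class="docutils literal"><span class="pre">GenericJdbcPartitioner</span></tt>; it only shows how an inclusive key range can be divided into at most N contiguous portions, one per partition.</p>

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Sqoop: splits an inclusive key range into
// contiguous sub-ranges that a partitioner could wrap in Partition objects.
final class RangeSplitter {
  private RangeSplitter() { }

  /**
   * Splits [min, max] into at most maxPartitions sub-ranges of near-equal size.
   * Each element of the result is a two-element array {lower, upper}, both inclusive.
   * Assumes max - min + 1 does not overflow a long.
   */
  static List<long[]> split(long min, long max, int maxPartitions) {
    if (max < min || maxPartitions < 1) {
      throw new IllegalArgumentException("need min <= max and maxPartitions >= 1");
    }
    long total = max - min + 1;                   // number of keys in the range
    long count = Math.min(maxPartitions, total);  // never create empty partitions
    long base = total / count;                    // minimum keys per partition
    long remainder = total % count;               // first 'remainder' partitions get one extra key

    List<long[]> ranges = new ArrayList<>();
    long start = min;
    for (long i = 0; i < count; i++) {
      long size = base + (i < remainder ? 1 : 0);
      long end = start + size - 1;
      ranges.add(new long[] {start, end});
      start = end + 1;
    }
    return ranges;
  }
}
```

+<p>For example, splitting the keys 1 to 10 into three partitions yields the ranges 1 to 4, 5 to 7 and 8 to 10. An extractor that receives one of these ranges would then read only the rows whose key falls inside it.</p>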
+<p>There is no convention for Partition classes other than being <tt class="docutils literal"><span class="pre">Writable</span></tt> and <tt class="docutils literal"><span class="pre">toString()</span></tt> -able. Here is the <tt class="docutils literal"><span class="pre">Partition</span></tt> API:</p>
+<div class="highlight-none"><div class="highlight"><pre>public abstract class Partition {
+  public abstract void readFields(DataInput in) throws IOException;
+  public abstract void write(DataOutput out) throws IOException;
+  public abstract String toString();
+}
+</pre></div>
+</div>
+<p>Connectors can implement custom <tt class="docutils literal"><span class="pre">Partition</span></tt> classes. <tt class="docutils literal"><span class="pre">GenericJdbcPartitioner</span></tt> is one such example: it returns <tt class="docutils literal"><span class="pre">GenericJdbcPartition</span></tt> objects.</p>
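+<p>A custom partition could look roughly like the following sketch. To keep the example compilable without Sqoop on the classpath, it declares a local stand-in for the <tt class="docutils literal"><span class="pre">Partition</span></tt> base class, copied from the API shown above. <tt class="docutils literal"><span class="pre">KeyRangePartition</span></tt> and its <tt class="docutils literal"><span class="pre">roundTrip</span></tt> helper are hypothetical names introduced only for illustration.</p>

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Stand-in for Sqoop's Partition base class, mirroring the API listed above,
// so that this sketch compiles on its own.
abstract class Partition {
  public abstract void readFields(DataInput in) throws IOException;
  public abstract void write(DataOutput out) throws IOException;
  public abstract String toString();
}

// Hypothetical partition describing an inclusive range of numeric keys.
class KeyRangePartition extends Partition {
  private long lower;
  private long upper;

  // No-arg constructor: lets an empty instance be filled in by readFields().
  KeyRangePartition() { }

  KeyRangePartition(long lower, long upper) {
    this.lower = lower;
    this.upper = upper;
  }

  @Override
  public void readFields(DataInput in) throws IOException {
    // Fields must be read in exactly the order write() emitted them.
    lower = in.readLong();
    upper = in.readLong();
  }

  @Override
  public void write(DataOutput out) throws IOException {
    out.writeLong(lower);
    out.writeLong(upper);
  }

  @Override
  public String toString() {
    return lower + " <= key <= " + upper;
  }

  // Serializes a partition to bytes and rebuilds a copy from them.
  static KeyRangePartition roundTrip(KeyRangePartition original) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      original.write(new DataOutputStream(bytes));
      KeyRangePartition copy = new KeyRangePartition();
      copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
      return copy;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

+<p>The important property is symmetry: whatever <tt class="docutils literal"><span class="pre">write</span></tt> emits, <tt class="docutils literal"><span class="pre">readFields</span></tt> must consume in the same order, so that a partition created during partitioning can be rebuilt where the extractor runs.</p>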
+</div>
+<div class="section" id="extractor">
+<h4><a class="toc-backref" href="#id9">Extractor</a><a class="headerlink" href="#extractor" title="Permalink to this headline">¶</a></h4>
+<p>The Extractor (the E in ETL) extracts data from a given data source.
+<tt class="docutils literal"><span class="pre">Extractor</span></tt> must implement the <tt class="docutils literal"><span class="pre">extract</span></tt> method of the <tt class="docutils literal"><span class="pre">Extractor</span></tt> API:</p>
+<div class="highlight-none"><div class="highlight"><pre>public abstract void extract(ExtractorContext context,
+                             LinkConfiguration linkConfiguration,
+                             JobConfiguration jobConfiguration,
+                             SqoopPartition partition);
+</pre></div>
+</div>
+<p>The <tt class="docutils literal"><span class="pre">extract</span></tt> method extracts data from the data source using the link and job configuration properties and writes it to the <tt class="docutils literal"><span class="pre">DataWriter</span></tt> (provided by the extractor context) in the default <a class="reference external" href="https://cwiki.apache.org/confluence/display/SQOOP/Sqoop2+Intermediate+representation">Intermediate representation</a>.</p>
+<p>Extractors use the writer provided by the ExtractorContext to send a record through the Sqoop system:</p>
+<div class="highlight-none"><div class="highlight"><pre>context.getDataWriter().writeArrayRecord(array);
+</pre></div>
+</div>
+<p>The extractor must iterate through the given partition in the <tt class="docutils literal"><span class="pre">extract</span></tt> method.</p>
+<div class="highlight-none"><div class="highlight"><pre>while (resultSet.next()) {
+  ...
+  context.getDataWriter().writeArrayRecord(array);
+  ...
+}
+</pre></div>
+</div>
+</div>
+</div>
+<div class="section" id="to">
+<h3><a class="toc-backref" href="#id10">To</a><a class="headerlink" href="#to" title="Permalink to this headline">¶</a></h3>
+<p>The <tt class="docutils literal"><span class="pre">getTo</span></tt> method returns a <tt class="docutils literal"><span class="pre">To</span></tt> instance, which is a <tt class="docutils literal"><span class="pre">Transferable</span></tt> entity that encapsulates the operations needed to write data to the data source the connector represents. The built-in <tt class="docutils literal"><span class="pre">GenericJdbcConnector</span></tt> defines <tt class="docutils literal"><span class="pre">To</span></tt> like this:</p>
+<div class="highlight-none"><div class="highlight"><pre>private static final To TO = new To(
+      GenericJdbcToInitializer.class,
+      GenericJdbcLoader.class,
+      GenericJdbcToDestroyer.class);
+...
+
+@Override
+public To getTo() {
+  return TO;
+}
+</pre></div>
+</div>
+<div class="section" id="id1">
+<h4><a class="toc-backref" href="#id11">Initializer and Destroyer</a><a class="headerlink" href="#id1" title="Permalink to this headline">¶</a></h4>
+<p>The <a class="reference internal" href="#initializer-and-destroyer">Initializer</a> and <a class="reference internal" href="#initializer-and-destroyer">Destroyer</a> of a <tt class="docutils literal"><span class="pre">To</span></tt> instance are used in a similar way to those of a <tt class="docutils literal"><span class="pre">From</span></tt> instance.
+Refer to the previous section for more details.</p>
+</div>
+<div class="section" id="loader">
+<h4><a class="toc-backref" href="#id12">Loader</a><a class="headerlink" href="#loader" title="Permalink to this headline">¶</a></h4>
+<p>A loader (the L in ETL) receives data from the <tt class="docutils literal"><span class="pre">From</span></tt> instance of the Sqoop connector associated with the Sqoop job and then loads it to the <tt class="docutils literal"><span class="pre">To</span></tt> instance of the connector associated with the same Sqoop job.</p>
+<p><tt class="docutils literal"><span class="pre">Loader</span></tt> must implement the <tt class="docutils literal"><span class="pre">load</span></tt> method of the <tt class="docutils literal"><span class="pre">Loader</span></tt> API:</p>
+<div class="highlight-none"><div class="highlight"><pre>public abstract void load(LoaderContext context,
+                          LinkConfiguration linkConfiguration,
+                          JobConfiguration jobConfiguration) throws Exception;
+</pre></div>
+</div>
+<p>The <tt class="docutils literal"><span class="pre">load</span></tt> method reads data from the <tt class="docutils literal"><span class="pre">DataReader</span></tt> (provided by the context) in the default <a class="reference external" href="https://cwiki.apache.org/confluence/display/SQOOP/Sqoop2+Intermediate+representation">Intermediate representation</a> and loads it to the data source.</p>
+<p>The loader must iterate in the <tt class="docutils literal"><span class="pre">load</span></tt> method until the data from the <tt class="docutils literal"><span class="pre">DataReader</span></tt> is exhausted:</p>
+<div class="highlight-none"><div class="highlight"><pre>while ((array = context.getDataReader().readArrayRecord()) != null) {
+  ...
+}
+</pre></div>
+</div>
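+<p>The read-until-null loop above can be combined with batching, which many data sources handle better than record-at-a-time writes. The sketch below assumes a local stand-in interface for the reader that models only the <tt class="docutils literal"><span class="pre">readArrayRecord</span></tt> call used in the loop; <tt class="docutils literal"><span class="pre">BatchingLoader</span></tt>, <tt class="docutils literal"><span class="pre">drain</span></tt> and <tt class="docutils literal"><span class="pre">readerOf</span></tt> are hypothetical names, not Sqoop API.</p>

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Stand-in for the reader returned by context.getDataReader(). Only the one
// method used in the loop above is modelled; it returns null when exhausted.
interface DataReader {
  Object[] readArrayRecord();
}

// Hypothetical helper showing the loader loop with simple batching.
final class BatchingLoader {
  private BatchingLoader() { }

  /** Drains the reader and groups the records into batches of at most batchSize. */
  static List<List<Object[]>> drain(DataReader reader, int batchSize) {
    if (batchSize < 1) {
      throw new IllegalArgumentException("batchSize must be at least 1");
    }
    List<List<Object[]>> batches = new ArrayList<>();
    List<Object[]> current = new ArrayList<>();
    Object[] array;
    // Same termination condition as the loop above: stop on the first null.
    while ((array = reader.readArrayRecord()) != null) {
      current.add(array);
      if (current.size() == batchSize) {
        batches.add(current);          // a real loader would write the batch here
        current = new ArrayList<>();
      }
    }
    if (!current.isEmpty()) {
      batches.add(current);            // flush the final, partially filled batch
    }
    return batches;
  }

  /** Builds an in-memory reader, useful for trying the loop without a Sqoop job. */
  static DataReader readerOf(List<Object[]> records) {
    Iterator<Object[]> it = records.iterator();
    return () -> it.hasNext() ? it.next() : null;
  }
}
```

+<p>The final flush matters: without it, the records left in a partially filled batch when the reader runs dry would never reach the data source.</p>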
+<p>NOTE: We do not yet support a stage that lets connector developers control how the loading/writing of data is balanced across multiple loaders. In the future we may add this to the connector API so that custom logic can balance the loading across multiple reducers.</p>
+</div>
+</div>
+</div>
+<div class="section" id="configurables">
+<h2><a class="toc-backref" href="#id13">Configurables</a><a class="headerlink" href="#configurables" title="Permalink to this headline">¶</a></h2>
+<div class="section" id="configurable-registration">
+<h3><a class="toc-backref" href="#id14">Configurable registration</a><a class="headerlink" href="#configurable-registration" title="Permalink to this headline">¶</a></h3>
+<p>Connectors are one of the configurables currently supported in Sqoop. Sqoop 2 registers connector definitions from a file named <tt class="docutils literal"><span class="pre">sqoopconnector.properties</span></tt>, which each connector implementation must provide in order to become available in Sqoop.</p>
+<div class="highlight-none"><div class="highlight"><pre># Generic JDBC Connector Properties
+org.apache.sqoop.connector.class = org.apache.sqoop.connector.jdbc.GenericJdbcConnector
+org.apache.sqoop.connector.name = generic-jdbc-connector
+</pre></div>
+</div>
+</div>
+<div class="section" id="configurations">
+<h3><a class="toc-backref" href="#id15">Configurations</a><a class="headerlink" href="#configurations" title="Permalink to this headline">¶</a></h3>
+<p>Implementations of <tt class="docutils literal"><span class="pre">SqoopConnector</span></tt> override methods such as <tt class="docutils literal"><span class="pre">getLinkConfigurationClass</span></tt> and <tt class="docutils literal"><span class="pre">getJobConfigurationClass</span></tt> to return the configuration classes.</p>
+<div class="highlight-none"><div class="highlight"><pre>@Override
+public Class getLinkConfigurationClass() {
+  return LinkConfiguration.class;
+}
+
+@Override
+public Class getJobConfigurationClass(Direction direction) {
+  switch (direction) {
+    case FROM:
+      return FromJobConfiguration.class;
+    case TO:
+      return ToJobConfiguration.class;
+    default:
+      return null;
+  }
+}
+</pre></div>
+</div>
+<p>Configurations are represented by annotations defined in the <tt class="docutils literal"><span class="pre">org.apache.sqoop.model</span></tt> package.
+Annotations such as <tt class="docutils literal"><span class="pre">ConfigurationClass</span></tt>, <tt class="docutils literal"><span class="pre">ConfigClass</span></tt>, <tt class="docutils literal"><span class="pre">Config</span></tt> and <tt class="docutils literal"><span class="pre">Input</span></tt>
+are provided for defining configuration objects for each connector.</p>
+<p><tt class="docutils literal"><span class="pre">&#64;ConfigurationClass</span></tt> is a marker annotation for <tt class="docutils literal"><span class="pre">ConfigurationClasses</span></tt> that hold a group or list of <tt class="docutils literal"><span class="pre">ConfigClasses</span></tt> annotated with the marker <tt class="docutils literal"><span class="pre">&#64;ConfigClass</span></tt>.</p>
+<div class="highlight-none"><div class="highlight"><pre>@ConfigurationClass
+public class LinkConfiguration {
+
+  @Config public LinkConfig linkConfig;
+
+  public LinkConfiguration() {
+    linkConfig = new LinkConfig();
+  }
+}
+</pre></div>
+</div>
+<p>Each <tt class="docutils literal"><span class="pre">ConfigClass</span></tt> defines the different inputs it exposes for the link and job configs. These inputs are annotated with <tt class="docutils literal"><span class="pre">&#64;Input</span></tt>, and the user is asked to fill them in when creating a Sqoop job that uses this instance of the connector for either the <tt class="docutils literal"><span class="pre">From</span></tt> or <tt class="docutils literal"><span class="pre">To</span></tt> part of the job.</p>
+<div class="highlight-none"><div class="highlight"><pre>@ConfigClass(validators = {@Validator(LinkConfig.ConfigValidator.class)})
+public class LinkConfig {
+  @Input(size = 128, validators = {@Validator(NotEmpty.class), @Validator(ClassAvailable.class)})
+  public String jdbcDriver;
+  @Input(size = 128) public String connectionString;
+  @Input(size = 40)  public String username;
+  @Input(size = 40, sensitive = true) public String password;
+  @Input public Map&lt;String, String&gt; jdbcProperties;
+}
+</pre></div>
+</div>
+<p>Each <tt class="docutils literal"><span class="pre">ConfigClass</span></tt> and the inputs within the configs annotated with <tt class="docutils literal"><span class="pre">Input</span></tt> can specify validators via the <tt class="docutils literal"><span class="pre">&#64;Validator</span></tt> annotation described below.</p>
+<div class="section" id="empty-configuration">
+<h4><a class="toc-backref" href="#id16">Empty Configuration</a><a class="headerlink" href="#empty-configuration" title="Permalink to this headline">¶</a></h4>
+<p>If a connector does not have any configuration inputs to specify for the <tt class="docutils literal"><span class="pre">ConfigType.LINK</span></tt> or <tt class="docutils literal"><span class="pre">ConfigType.JOB</span></tt>, it is recommended to return the <tt class="docutils literal"><span class="pre">EmptyConfiguration</span></tt> class from the <tt class="docutils literal"><span class="pre">getLinkConfigurationClass()</span></tt> or <tt class="docutils literal"><span class="pre">getJobConfigurationClass(..)</span></tt> methods.</p>
+<div class="highlight-none"><div class="highlight"><pre>@ConfigurationClass
+public class EmptyConfiguration { }
+</pre></div>
+</div>
+</div>
+</div>
+<div class="section" id="configuration-resourcebundle">
+<h3><a class="toc-backref" href="#id17">Configuration ResourceBundle</a><a class="headerlink" href="#configuration-resourcebundle" title="Permalink to this headline">¶</a></h3>
+<p>The names of each config and its inputs, along with the descriptions of the input fields, are defined in a config resource bundle provided by each connector.</p>
+<div class="highlight-none"><div class="highlight"><pre># jdbc driver
+connection.jdbcDriver.label = JDBC Driver Class
+connection.jdbcDriver.help = Enter the fully qualified class name of the JDBC \
+                   driver that will be used for establishing this connection.
+
+# connect string
+connection.connectionString.label = JDBC Connection String
+connection.connectionString.help = Enter the value of JDBC connection string to be \
+                   used by this connector for creating connections.
+
+...
+</pre></div>
+</div>
+<p>These resources are loaded by the <tt class="docutils literal"><span class="pre">getBundle</span></tt> method of the <tt class="docutils literal"><span class="pre">SqoopConnector</span></tt>.</p>
+<div class="highlight-none"><div class="highlight"><pre>@Override
+public ResourceBundle getBundle(Locale locale) {
+  return ResourceBundle.getBundle(
+      GenericJdbcConnectorConstants.RESOURCE_BUNDLE_NAME, locale);
+}
+</pre></div>
+</div>
+</div>
+<div class="section" id="validations-for-configs-and-inputs">
+<h3><a class="toc-backref" href="#id18">Validations for Configs and Inputs</a><a class="headerlink" href="#validations-for-configs-and-inputs" title="Permalink to this headline">¶</a></h3>
+<p>Validators validate the config objects and the inputs associated with the config objects. For the config objects themselves, we encourage developers to write custom validators for both the link and job config types.</p>
+<div class="highlight-none"><div class="highlight"><pre>@Input(size = 128, validators = {@Validator(value = StartsWith.class, strArg = &quot;jdbc:&quot;)} )
+
+@Input(size = 255, validators = { @Validator(NotEmpty.class) })
+</pre></div>
+</div>
+<p>Sqoop 2 provides a list of standard input validators that can be used by different connectors for the link and job type configuration inputs.</p>
+<div class="highlight-none"><div class="highlight"><pre>public class NotEmpty extends AbstractValidator&lt;String&gt; {
+  @Override
+  public void validate(String instance) {
+    if (instance == null || instance.isEmpty()) {
+      addMessage(Status.ERROR, &quot;Can&#39;t be null nor empty&quot;);
+    }
+  }
+}
+</pre></div>
+</div>
+<p>The validation logic is executed when users creating Sqoop jobs enter values for the link and job configs associated with the <tt class="docutils literal"><span class="pre">From</span></tt> and <tt class="docutils literal"><span class="pre">To</span></tt> instances of the connectors associated with the job.</p>
+</div>
+</div>
+<div class="section" id="sqoop-2-mapreduce-job-execution-lifecycle-with-connector-api">
+<h2><a class="toc-backref" href="#id19">Sqoop 2 MapReduce Job Execution Lifecycle with Connector API</a><a class="headerlink" href="#sqoop-2-mapreduce-job-execution-lifecycle-with-connector-api" title="Permalink to this headline">¶</a></h2>
+<p>Sqoop 2 provides MapReduce utilities such as <tt class="docutils literal"><span class="pre">SqoopMapper</span></tt> and <tt class="docutils literal"><span class="pre">SqoopReducer</span></tt> that aid sqoop job execution.</p>
+<p>Note: Any class prefixed with Sqoop is an internal Sqoop class provided for MapReduce and is not part of the connector API. These internal classes work with the custom implementations of <tt class="docutils literal"><span class="pre">Extractor</span></tt> and <tt class="docutils literal"><span class="pre">Partitioner</span></tt> in the <tt class="docutils literal"><span class="pre">From</span></tt> instance and <tt class="docutils literal"><span class="pre">Loader</span></tt> in the <tt class="docutils literal"><span class="pre">To</span></tt> instance of the connector.</p>
+<p>When reading from a data source, the <tt class="docutils literal"><span class="pre">Extractor</span></tt> provided by the <tt class="docutils literal"><span class="pre">From</span></tt> instance of the connector extracts data from the data source it represents, and the <tt class="docutils literal"><span class="pre">Loader</span></tt> provided by the <tt class="docutils literal"><span class="pre">To</span></tt> instance of the connector loads data into the data source it represents.</p>
+<p>The diagram below describes the initialization phase of a job.
+<tt class="docutils literal"><span class="pre">SqoopInputFormat</span></tt> creates splits using <tt class="docutils literal"><span class="pre">Partitioner</span></tt>.</p>
+<div class="highlight-none"><div class="highlight"><pre>    ,----------------.          ,-----------.
+    |SqoopInputFormat|          |Partitioner|
+    `-------+--------&#39;          `-----+-----&#39;
+ getSplits  |                         |
+-----------&gt;|                         |
+            |      getPartitions      |
+            |------------------------&gt;|
+            |                         |         ,---------.
+            |                         |-------&gt; |Partition|
+            |                         |         `----+----&#39;
+            |&lt;- - - - - - - - - - - - |              |
+            |                         |              |          ,----------.
+            |--------------------------------------------------&gt;|SqoopSplit|
+            |                         |              |          `----+-----&#39;
+</pre></div>
+</div>
+<p>The diagram below describes the map phase of a job.
+<tt class="docutils literal"><span class="pre">SqoopMapper</span></tt> invokes the <tt class="docutils literal"><span class="pre">extract</span></tt> method of the <tt class="docutils literal"><span class="pre">From</span></tt> connector&#8217;s extractor.</p>
+<div class="highlight-none"><div class="highlight"><pre>    ,-----------.
+    |SqoopMapper|
+    `-----+-----&#39;
+   run    |
+---------&gt;|                                   ,------------------.
+          |----------------------------------&gt;|SqoopMapDataWriter|
+          |                                   `------+-----------&#39;
+          |                ,---------.               |
+          |--------------&gt; |Extractor|               |
+          |                `----+----&#39;               |
+          |      extract        |                    |
+          |--------------------&gt;|                    |
+          |                     |                    |
+         read from DB           |                    |
+&lt;-------------------------------|      write*        |
+          |                     |-------------------&gt;|
+          |                     |                    |           ,----.
+          |                     |                    |----------&gt;|Data|
+          |                     |                    |           `-+--&#39;
+          |                     |                    |
+          |                     |                    |      context.write
+          |                     |                    |--------------------------&gt;
+</pre></div>
+</div>
+<p>The diagram below describes the reduce phase of a job.
+<tt class="docutils literal"><span class="pre">OutputFormat</span></tt> invokes the <tt class="docutils literal"><span class="pre">load</span></tt> method of the <tt class="docutils literal"><span class="pre">To</span></tt> connector&#8217;s loader (via <tt class="docutils literal"><span class="pre">SqoopOutputFormatLoadExecutor</span></tt>).</p>
+<div class="highlight-none"><div class="highlight"><pre>  ,------------.  ,---------------------.
+  |SqoopReducer|  |SqoopNullOutputFormat|
+  `---+--------&#39;  `----------+----------&#39;
+      |                 |   ,-----------------------------.
+      |                 |-&gt; |SqoopOutputFormatLoadExecutor|
+      |                 |   `--------------+--------------&#39;        ,----.
+      |                 |                  |---------------------&gt; |Data|
+      |                 |                  |                       `-+--&#39;
+      |                 |                  |   ,-----------------.   |
+      |                 |                  |-&gt; |SqoopRecordWriter|   |
+    getRecordWriter     |                  |   `--------+--------&#39;   |
+-----------------------&gt;| getRecordWriter  |            |            |
+      |                 |-----------------&gt;|            |            |     ,--------------.
+      |                 |                  |-----------------------------&gt; |ConsumerThread|
+      |                 |                  |            |            |     `------+-------&#39;
+      |                 |&lt;- - - - - - - - -|            |            |            |    ,------.
+&lt;- - - - - - - - - - - -|                  |            |            |            |---&gt;|Loader|
+      |                 |                  |            |            |            |    `--+---&#39;
+      |                 |                  |            |            |            |       |
+      |                 |                  |            |            |            | load  |
+ run  |                 |                  |            |            |            |------&gt;|
+-----&gt;|                 |     write        |            |            |            |       |
+      |------------------------------------------------&gt;| setContent |            | read* |
+      |                 |                  |            |-----------&gt;| getContent |&lt;------|
+      |                 |                  |            |            |&lt;-----------|       |
+      |                 |                  |            |            |            | - - -&gt;|
+      |                 |                  |            |            |            |       | write into DB
+      |                 |                  |            |            |            |       |--------------&gt;
+</pre></div>
+</div>
+</div>
+</div>
+
+
+      </div>
+      <div class="bottomnav">
+      
+        <p>
+        <a class="uplink" href="index.html">Contents</a>
+        </p>
+
+      </div>
+
+    <div class="footer">
+        &copy; Copyright 2009-2013 The Apache Software Foundation.
+    </div>
+  </body>
+</html>
\ No newline at end of file
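
The custom validators encouraged in the "Validations for Configs and Inputs" section of ConnectorDevelopment.html above could be sketched roughly as follows. This is a minimal, self-contained sketch and not Sqoop's implementation: the `Status` enum and `AbstractValidator` class here are simplified stand-ins modeled only on the `NotEmpty` snippet in the documentation, and `ValidPort` and its port range are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class ValidatorSketch {

  // Hypothetical stand-in for the Status type used in the NotEmpty snippet.
  enum Status { OK, WARNING, ERROR }

  // Hypothetical stand-in for AbstractValidator: it only collects messages
  // and remembers the most severe status reported so far.
  abstract static class AbstractValidator<T> {
    private final List<String> messages = new ArrayList<>();
    private Status status = Status.OK;

    public abstract void validate(T instance);

    protected void addMessage(Status severity, String message) {
      messages.add(severity + ": " + message);
      if (severity.compareTo(status) > 0) {
        status = severity;
      }
    }

    public Status getStatus() { return status; }

    public List<String> getMessages() { return messages; }
  }

  // Illustrative custom validator: accepts TCP ports in the range 1-65535.
  static class ValidPort extends AbstractValidator<Integer> {
    @Override
    public void validate(Integer instance) {
      if (instance == null || instance < 1 || instance > 65535) {
        addMessage(Status.ERROR, "Port must be between 1 and 65535");
      }
    }
  }

  public static void main(String[] args) {
    ValidPort good = new ValidPort();
    good.validate(12000);
    System.out.println(good.getStatus());

    ValidPort bad = new ValidPort();
    bad.validate(70000);
    System.out.println(bad.getStatus() + " " + bad.getMessages());
  }
}
```

In a real connector the validator would extend the class shipped with Sqoop and be attached to an input through the `validators` attribute of `@Input`, as in the `StartsWith` and `NotEmpty` examples in the documentation.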

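The key layout of the config resource bundle shown in ConnectorDevelopment.html above (`<config>.<input>.label` and `<config>.<input>.help`) can be exercised with the standard `java.util.ResourceBundle` API alone. The sketch below is an illustration under assumptions, not Sqoop code: it loads the bundle from an in-memory string so that it runs without a properties file on the classpath, and `sampleBundle` and `labelFor` are hypothetical helper names.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

public class BundleSketch {

  // Builds a bundle from in-memory text; a connector would instead call
  // ResourceBundle.getBundle(name, locale) against a properties file.
  static ResourceBundle sampleBundle() {
    String props =
        "connection.jdbcDriver.label = JDBC Driver Class\n"
      + "connection.connectionString.label = JDBC Connection String\n";
    try {
      return new PropertyResourceBundle(new StringReader(props));
    } catch (IOException e) {
      // Reading from a StringReader is not expected to fail.
      throw new IllegalStateException(e);
    }
  }

  // Hypothetical helper: looks up the label key for a config and input name.
  static String labelFor(ResourceBundle bundle, String config, String input) {
    return bundle.getString(config + "." + input + ".label");
  }

  public static void main(String[] args) {
    ResourceBundle bundle = sampleBundle();
    System.out.println(labelFor(bundle, "connection", "jdbcDriver"));
    System.out.println(labelFor(bundle, "connection", "connectionString"));
  }
}
```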
Added: sqoop/site/trunk/content/resources/docs/1.99.4/DevEnv.html
URL: http://svn.apache.org/viewvc/sqoop/site/trunk/content/resources/docs/1.99.4/DevEnv.html?rev=1641708&view=auto
==============================================================================
--- sqoop/site/trunk/content/resources/docs/1.99.4/DevEnv.html (added)
+++ sqoop/site/trunk/content/resources/docs/1.99.4/DevEnv.html Tue Nov 25 21:56:40 2014
@@ -0,0 +1,94 @@
+
+
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
+  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
+
+
+<html xmlns="http://www.w3.org/1999/xhtml">
+  <head>
+    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
+    
+    <title>Sqoop 2 Development Environment Setup &mdash; Apache Sqoop  documentation</title>
+    
+    <link rel="stylesheet" href="_static/haiku.css" type="text/css" />
+    <link rel="stylesheet" href="_static/pygments.css" type="text/css" />
+    <link rel="stylesheet" href="_static/print.css" type="text/css" />
+    
+    <script type="text/javascript">
+      var DOCUMENTATION_OPTIONS = {
+        URL_ROOT:    '',
+        VERSION:     '',
+        COLLAPSE_INDEX: false,
+        FILE_SUFFIX: '.html',
+        HAS_SOURCE:  true
+      };
+    </script>
+    <script type="text/javascript" src="_static/jquery.js"></script>
+    <script type="text/javascript" src="_static/underscore.js"></script>
+    <script type="text/javascript" src="_static/doctools.js"></script>
+    <script type="text/javascript" src="_static/theme_extras.js"></script>
+    <link rel="top" title="Apache Sqoop  documentation" href="index.html" /> 
+  </head>
+  <body>
+      <div class="header"><img class="rightlogo" src="_static/sqoop-logo.png" alt="Logo"/><h1 class="heading"><a href="index.html">
+          <span>Apache Sqoop  documentation</span></a></h1>
+        <h2 class="heading"><span>Sqoop 2 Development Environment Setup</span></h2>
+      </div>
+      <div class="topnav">
+      
+        <p>
+        <a class="uplink" href="index.html">Contents</a>
+        </p>
+
+      </div>
+      <div class="content">
+        
+        
+  <div class="section" id="sqoop-2-development-environment-setup">
+<h1>Sqoop 2 Development Environment Setup<a class="headerlink" href="#sqoop-2-development-environment-setup" title="Permalink to this headline">¶</a></h1>
+<p>This document describes how to set up a development environment for Sqoop 2.</p>
+<div class="section" id="system-requirement">
+<h2>System Requirement<a class="headerlink" href="#system-requirement" title="Permalink to this headline">¶</a></h2>
+<div class="section" id="java">
+<h3>Java<a class="headerlink" href="#java" title="Permalink to this headline">¶</a></h3>
+<p>Sqoop is written in Java and uses version 1.6. You can <a class="reference external" href="http://www.oracle.com/technetwork/java/javase/downloads/index.html">download Java</a> and install it. Set JAVA_HOME to the installation directory, e.g. export JAVA_HOME=/usr/lib/jvm/jdk1.6.0_32.</p>
+</div>
+<div class="section" id="maven">
+<h3>Maven<a class="headerlink" href="#maven" title="Permalink to this headline">¶</a></h3>
+<p>Sqoop uses Maven 3 for building the project. Download <a class="reference external" href="http://maven.apache.org/download.cgi">Maven</a> and follow the installation instructions given in the <a class="reference external" href="http://maven.apache.org/download.cgi#Maven_Documentation">Maven documentation</a>.</p>
+</div>
+</div>
+<div class="section" id="eclipse-setup">
+<h2>Eclipse Setup<a class="headerlink" href="#eclipse-setup" title="Permalink to this headline">¶</a></h2>
+<p>Steps for downloading the source code are given in <a class="reference external" href="BuildingSqoop2.html">Building Sqoop2</a>.</p>
+<p>The Sqoop 2 project has multiple modules, and some modules depend on others; for example, the Sqoop 2 client module depends on the Sqoop 2 common module. Follow the steps below to create the Eclipse project and classpath for each module.</p>
+<div class="highlight-none"><div class="highlight"><pre># Install all packages into the local Maven repository
+mvn clean install -DskipTests
+
+# Add the M2_REPO variable to the Eclipse workspace
+mvn eclipse:configure-workspace -Declipse.workspace=&lt;path-to-eclipse-workspace-dir-for-sqoop-2&gt;
+
+# Create the Eclipse projects, with optional parameters
+mvn eclipse:eclipse -DdownloadSources=true -DdownloadJavadocs=true
+</pre></div>
+</div>
+<p>Alternatively, add the M2_REPO classpath variable manually as the Maven repository path: in Eclipse, go to Window -&gt; Java -&gt; Classpath Variables -&gt; click &#8220;New&#8221; -&gt; in the new dialog box, enter M2_REPO as the Name and $HOME/.m2/repository as the Path -&gt; click OK.</p>
+<p>After the above Maven commands complete successfully, import the Sqoop project modules into Eclipse: File -&gt; Import -&gt; General -&gt; Existing Projects into Workspace -&gt; click Next -&gt; browse to the Sqoop 2 directory ($HOME/git/sqoop2) -&gt; click OK -&gt; the Import dialog shows multiple projects (sqoop-client, sqoop-common, etc.) -&gt; select all modules -&gt; click Finish.</p>
+</div>
+</div>
+
+
+      </div>
+      <div class="bottomnav">
+      
+        <p>
+        <a class="uplink" href="index.html">Contents</a>
+        </p>
+
+      </div>
+
+    <div class="footer">
+        &copy; Copyright 2009-2013 The Apache Software Foundation.
+    </div>
+  </body>
+</html>
\ No newline at end of file

Added: sqoop/site/trunk/content/resources/docs/1.99.4/Installation.html
URL: http://svn.apache.org/viewvc/sqoop/site/trunk/content/resources/docs/1.99.4/Installation.html?rev=1641708&view=auto
==============================================================================
--- sqoop/site/trunk/content/resources/docs/1.99.4/Installation.html (added)
+++ sqoop/site/trunk/content/resources/docs/1.99.4/Installation.html Tue Nov 25 21:56:40 2014
@@ -0,0 +1,134 @@
+
+
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
+  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
+
+
+<html xmlns="http://www.w3.org/1999/xhtml">
+  <head>
+    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
+    
+    <title>Installation &mdash; Apache Sqoop  documentation</title>
+    
+    <link rel="stylesheet" href="_static/haiku.css" type="text/css" />
+    <link rel="stylesheet" href="_static/pygments.css" type="text/css" />
+    <link rel="stylesheet" href="_static/print.css" type="text/css" />
+    
+    <script type="text/javascript">
+      var DOCUMENTATION_OPTIONS = {
+        URL_ROOT:    '',
+        VERSION:     '',
+        COLLAPSE_INDEX: false,
+        FILE_SUFFIX: '.html',
+        HAS_SOURCE:  true
+      };
+    </script>
+    <script type="text/javascript" src="_static/jquery.js"></script>
+    <script type="text/javascript" src="_static/underscore.js"></script>
+    <script type="text/javascript" src="_static/doctools.js"></script>
+    <script type="text/javascript" src="_static/theme_extras.js"></script>
+    <link rel="top" title="Apache Sqoop  documentation" href="index.html" /> 
+  </head>
+  <body>
+      <div class="header"><img class="rightlogo" src="_static/sqoop-logo.png" alt="Logo"/><h1 class="heading"><a href="index.html">
+          <span>Apache Sqoop  documentation</span></a></h1>
+        <h2 class="heading"><span>Installation</span></h2>
+      </div>
+      <div class="topnav">
+      
+        <p>
+        <a class="uplink" href="index.html">Contents</a>
+        </p>
+
+      </div>
+      <div class="content">
+        
+        
+  <div class="section" id="installation">
+<h1>Installation<a class="headerlink" href="#installation" title="Permalink to this headline">¶</a></h1>
+<p>Sqoop ships as one binary package, but it is composed of two separate parts - client and server. You need to install the server on a single node in your cluster. This node will then serve as an entry point for all connecting Sqoop clients. The server acts as a MapReduce client, and therefore Hadoop must be installed and configured on the machine hosting the Sqoop server. Clients can be installed on any number of machines. The client does not act as a MapReduce client, and thus you do not need to install Hadoop on nodes that will act only as a Sqoop client.</p>
+<div class="section" id="server-installation">
+<h2>Server installation<a class="headerlink" href="#server-installation" title="Permalink to this headline">¶</a></h2>
+<p>Copy the Sqoop artifact to the machine where you want to run the Sqoop server. This machine must have Hadoop installed and configured. You don&#8217;t need to run any Hadoop-related services there, but the machine must be able to act as a Hadoop client. For example, you should be able to list the contents of HDFS:</p>
+<div class="highlight-none"><div class="highlight"><pre>hadoop dfs -ls
+</pre></div>
+</div>
+<p>The Sqoop server supports multiple Hadoop versions. However, as Hadoop major versions are not compatible with each other, Sqoop has multiple binary artifacts - one for each supported major version of Hadoop. You need to make sure that you&#8217;re using the appropriate binary artifact for your specific Hadoop version. To install the Sqoop server, decompress the appropriate distribution artifact in a location of your choice and change your working directory to this folder.</p>
+<div class="highlight-none"><div class="highlight"><pre># Decompress Sqoop distribution tarball
+tar -xvf sqoop-&lt;version&gt;-bin-hadoop&lt;hadoop-version&gt;.tar.gz
+
+# Move decompressed content to any location
+mv sqoop-&lt;version&gt;-bin-hadoop&lt;hadoop-version&gt; /usr/lib/sqoop
+
+# Change working directory
+cd /usr/lib/sqoop
+</pre></div>
+</div>
+<div class="section" id="installing-dependencies">
+<h3>Installing Dependencies<a class="headerlink" href="#installing-dependencies" title="Permalink to this headline">¶</a></h3>
+<p>Hadoop libraries must be available on the node where you are planning to run the Sqoop server, with proper configuration for the major services - <tt class="docutils literal"><span class="pre">NameNode</span></tt> and either <tt class="docutils literal"><span class="pre">JobTracker</span></tt> or <tt class="docutils literal"><span class="pre">ResourceManager</span></tt>, depending on whether you are running Hadoop 1 or 2. There is no need to run any Hadoop service on the same node as the Sqoop server; only the libraries and configuration files must be available.</p>
+<p>The path to the Hadoop libraries is stored in the file <tt class="docutils literal"><span class="pre">catalina.properties</span></tt> inside the directory <tt class="docutils literal"><span class="pre">server/conf</span></tt>. You need to change the property called <tt class="docutils literal"><span class="pre">common.loader</span></tt> to contain all directories with your Hadoop libraries. The default expected locations are <tt class="docutils literal"><span class="pre">/usr/lib/hadoop</span></tt> and <tt class="docutils literal"><span class="pre">/usr/lib/hadoop/lib/</span></tt>. Please check the comments in the file for a further description of how to configure different locations.</p>
+<p>Lastly, you might need to install JDBC drivers that are not bundled with Sqoop because of incompatible licenses. You can add any Java jar file to the Sqoop server by copying it into the <tt class="docutils literal"><span class="pre">lib/</span></tt> directory. You can create this directory if it does not already exist.</p>
+</div>
+<div class="section" id="configuring-path">
+<h3>Configuring PATH<a class="headerlink" href="#configuring-path" title="Permalink to this headline">¶</a></h3>
+<p>All user-facing and administrator-facing shell commands are stored in the <tt class="docutils literal"><span class="pre">bin/</span></tt> directory. It&#8217;s recommended to add this directory to your <tt class="docutils literal"><span class="pre">$PATH</span></tt> so that they are easier to execute, for example:</p>
+<div class="highlight-none"><div class="highlight"><pre>PATH=$PATH:`pwd`/bin/
+</pre></div>
+</div>
+<p>Further documentation pages will assume that you have the binaries on your <tt class="docutils literal"><span class="pre">$PATH</span></tt>. If you decide to skip this step, you will need to call them by specifying the full path.</p>
+</div>
+<div class="section" id="configuring-server">
+<h3>Configuring Server<a class="headerlink" href="#configuring-server" title="Permalink to this headline">¶</a></h3>
+<p>Before starting the server, you should revise the configuration to match your specific environment. Server configuration files are stored in the <tt class="docutils literal"><span class="pre">server/config</span></tt> directory of the distributed artifact, alongside the other configuration files of Tomcat.</p>
+<p>The file <tt class="docutils literal"><span class="pre">sqoop_bootstrap.properties</span></tt> specifies which configuration provider should be used for loading the configuration for the rest of the Sqoop server. The default value <tt class="docutils literal"><span class="pre">PropertiesConfigurationProvider</span></tt> should be sufficient.</p>
+<p>The second configuration file, <tt class="docutils literal"><span class="pre">sqoop.properties</span></tt>, contains the remaining configuration properties that can affect the Sqoop server. The file is well documented, so check whether all configuration properties fit your environment. The defaults, or very little tweaking, should be sufficient in most common cases.</p>
+<p>You can verify the Sqoop server configuration using the <a class="reference external" href="Tools.html#verify">Verify Tool</a>, for example:</p>
+<div class="highlight-none"><div class="highlight"><pre>sqoop2-tool verify
+</pre></div>
+</div>
+<p>Upon running the <tt class="docutils literal"><span class="pre">verify</span></tt> tool, you should see messages similar to the following:</p>
+<div class="highlight-none"><div class="highlight"><pre>Verification was successful.
+Tool class org.apache.sqoop.tools.tool.VerifyTool has finished correctly
+</pre></div>
+</div>
+<p>Consult the <a class="reference external" href="Tools.html#upgrade">Verify Tool</a> documentation page in case of any failure.</p>
+</div>
+<div class="section" id="server-life-cycle">
+<h3>Server Life Cycle<a class="headerlink" href="#server-life-cycle" title="Permalink to this headline">¶</a></h3>
+<p>After installation and configuration, you can start the Sqoop server with the following command:</p>
+<div class="highlight-none"><div class="highlight"><pre>sqoop2-server start
+</pre></div>
+</div>
+<p>Similarly, you can stop the server using the following command:</p>
+<div class="highlight-none"><div class="highlight"><pre>sqoop2-server stop
+</pre></div>
+</div>
+<p>By default, the Sqoop server daemons use ports 12000 and 12001. You can set <tt class="docutils literal"><span class="pre">SQOOP_HTTP_PORT</span></tt> and <tt class="docutils literal"><span class="pre">SQOOP_ADMIN_PORT</span></tt> in the configuration file <tt class="docutils literal"><span class="pre">server/bin/setenv.sh</span></tt> to use different ports.</p>
+</div>
+</div>
+<div class="section" id="client-installation">
+<h2>Client installation<a class="headerlink" href="#client-installation" title="Permalink to this headline">¶</a></h2>
+<p>The client does not need extra installation or configuration steps. Just copy the Sqoop distribution artifact to the target machine and unzip it in the desired location. You can start the client with the following command:</p>
+<div class="highlight-none"><div class="highlight"><pre>sqoop2-shell
+</pre></div>
+</div>
+<p>You can find more documentation on the Sqoop client in the <a class="reference external" href="CommandLineClient.html">Command Line Client</a> section.</p>
+</div>
+</div>
+
+
+      </div>
+      <div class="bottomnav">
+      
+        <p>
+        <a class="uplink" href="index.html">Contents</a>
+        </p>
+
+      </div>
+
+    <div class="footer">
+        &copy; Copyright 2009-2013 The Apache Software Foundation.
+    </div>
+  </body>
+</html>
\ No newline at end of file


