sqoop-commits mailing list archives

From jar...@apache.org
Subject git commit: SQOOP-1655: SQOOP2 DOC: Document getSchema() and its use in the connector dev guide
Date Sun, 02 Nov 2014 23:17:43 GMT
Repository: sqoop
Updated Branches:
  refs/heads/sqoop2 8b51236c2 -> 85d5476f7


SQOOP-1655: SQOOP2 DOC: Document getSchema() and its use in the connector dev guide

(Gwen Shapira via Jarek Jarcec Cecho)


Project: http://git-wip-us.apache.org/repos/asf/sqoop/repo
Commit: http://git-wip-us.apache.org/repos/asf/sqoop/commit/85d5476f
Tree: http://git-wip-us.apache.org/repos/asf/sqoop/tree/85d5476f
Diff: http://git-wip-us.apache.org/repos/asf/sqoop/diff/85d5476f

Branch: refs/heads/sqoop2
Commit: 85d5476f7ebb5c2514969aaa0100a5ec8440a012
Parents: 8b51236
Author: Jarek Jarcec Cecho <jarcec@apache.org>
Authored: Sun Nov 2 15:17:05 2014 -0800
Committer: Jarek Jarcec Cecho <jarcec@apache.org>
Committed: Sun Nov 2 15:17:05 2014 -0800

----------------------------------------------------------------------
 docs/src/site/sphinx/ConnectorDevelopment.rst | 23 +++++++++++++++-------
 docs/src/site/sphinx/index.rst                |  2 +-
 2 files changed, 17 insertions(+), 8 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/sqoop/blob/85d5476f/docs/src/site/sphinx/ConnectorDevelopment.rst
----------------------------------------------------------------------
diff --git a/docs/src/site/sphinx/ConnectorDevelopment.rst b/docs/src/site/sphinx/ConnectorDevelopment.rst
index d700e4c..e4b5402 100644
--- a/docs/src/site/sphinx/ConnectorDevelopment.rst
+++ b/docs/src/site/sphinx/ConnectorDevelopment.rst
@@ -70,7 +70,7 @@ Connectors can optionally override the following methods:
 The ``getFrom`` method returns From_ instance
 which is a placeholder for the modules needed to read from a data source.
 
-The ``getTo`` method returns Exporter_ instance
+The ``getTo`` method returns Extractor_ instance
 which is a placeholder for the modules needed to write to a data source.
 
 Methods such as ``getBundle`` , ``getConnectionConfigurationClass`` ,
@@ -170,11 +170,22 @@ Connectors can define the design of ``Partition`` on their own.
 
 Initializer and Destroyer
 -------------------------
+.. _Initializer:
+.. _Destroyer:
 
 Initializer is instantiated before the submission of MapReduce job
-for doing preparation such as adding dependent jar files.
+for doing preparation such as connecting to the data source, creating temporary tables or adding dependent jar files.
 
-Destroyer is instantiated after MapReduce job is finished for clean up.
+In addition to the Initialize() method where the preparation activities occur, the Initializer must implement a getSchema() method.
+This method is used by the framework to match the data extracted by the ``From`` connector with the data as the ``To`` connector expects it.
+In case of a relational database or columnar database, the returned Schema object will include collection of columns with their data types.
+If the data source is schema-less, such as a file, an empty Schema object can be returned (i.e a Schema object without any columns).
+
+Note that Sqoop2 currently does not support ETL between two schema-less sources. We expect for each job that either the connector providing
+the ``From`` instance or the connector providing the ``To`` instance will have a schema. If both instances have a schema, Sqoop2 will load data by column name.
+I.e, data in column "A" in data source will be loaded to column "A" in target.
+
+Destroyer is instantiated after MapReduce job is finished for clean up, for example dropping temporary tables and closing connections.
 
 
 To
@@ -226,10 +237,8 @@ Loader must iterate in the ``load`` method until the data from ``DataReader`` is
 Initializer and Destroyer
 -------------------------
 
-Initializer is instantiated before the submission of MapReduce job
-for doing preparation such as adding dependent jar files.
-
-Destroyer is instantiated after MapReduce job is finished for clean up.
+Initializer_ and Destroyer_ of a ``To`` instance are used in a similar way to those of a ``From`` instance.
+Refer to the previous section for more details.
 
 
 Connector Configurations

http://git-wip-us.apache.org/repos/asf/sqoop/blob/85d5476f/docs/src/site/sphinx/index.rst
----------------------------------------------------------------------
diff --git a/docs/src/site/sphinx/index.rst b/docs/src/site/sphinx/index.rst
index e9bfd51..1bea5c3 100644
--- a/docs/src/site/sphinx/index.rst
+++ b/docs/src/site/sphinx/index.rst
@@ -59,7 +59,7 @@ Developer Guide
 - `Building Sqoop2 <BuildingSqoop2.html>`_
 - `Development Environment Setup <DevEnv.html>`_
 - `Java Client API Guide <ClientAPI.html>`_
-- `Developping Connector <ConnectorDevelopment.html>`_
+- `Developing a Connector <ConnectorDevelopment.html>`_
 - `REST API Guide <RESTAPI.html>`_
 
 Overview
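
Illustration (not part of the commit): the by-name matching described in the new getSchema() text can be pictured with a small, self-contained Java sketch. It does not use the Sqoop2 API. SimpleSchema and matchByName are hypothetical names invented for this note, column data types are left out, and the handling of the case where only one side has a schema (pass the row through unchanged) is an assumption, since the documentation above does not say what the framework does there.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SchemaMatchSketch {

    /** Hypothetical stand-in for a schema: an ordered list of column names only. */
    static final class SimpleSchema {
        private final List<String> columns = new ArrayList<>();

        SimpleSchema addColumn(String name) {
            columns.add(name);
            return this;
        }

        /** An empty schema models a schema-less source such as a file. */
        boolean isEmpty() {
            return columns.isEmpty();
        }

        List<String> columns() {
            return columns;
        }
    }

    /**
     * Reorders a row laid out in "from" column order into "to" column order,
     * matching columns by name.
     */
    static Object[] matchByName(SimpleSchema from, SimpleSchema to, Object[] row) {
        if (from.isEmpty() && to.isEmpty()) {
            // Mirrors the documented restriction: two schema-less sides are not supported.
            throw new IllegalArgumentException("at least one side must provide a schema");
        }
        if (from.isEmpty() || to.isEmpty()) {
            // Assumption for this sketch: with only one schema, keep the row as it is.
            return row.clone();
        }
        if (row.length != from.columns().size()) {
            throw new IllegalArgumentException("row does not match the From schema");
        }
        Map<String, Object> byName = new HashMap<>();
        for (int i = 0; i < row.length; i++) {
            byName.put(from.columns().get(i), row[i]);
        }
        Object[] out = new Object[to.columns().size()];
        for (int i = 0; i < out.length; i++) {
            // Assumption for this sketch: a target column missing from the source stays null.
            out[i] = byName.get(to.columns().get(i));
        }
        return out;
    }

    public static void main(String[] args) {
        SimpleSchema from = new SimpleSchema().addColumn("A").addColumn("B");
        SimpleSchema to = new SimpleSchema().addColumn("B").addColumn("A");
        Object[] out = matchByName(from, to, new Object[] {1, "x"});
        System.out.println(Arrays.toString(out)); // prints [x, 1]
    }
}
```

Here the value read from column "A" ends up in the target's column "A" even though the two sides list their columns in a different order, which is the behaviour the added paragraph describes for the case where both instances have a schema.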

