sqoop-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Hari Shreedharan" <hshreedha...@cloudera.com>
Subject Re: Review Request 12936: SQOOP-777. Sqoop2: Pluggable Intermediate Data Format
Date Thu, 01 Aug 2013 03:40:45 GMT


> On July 31, 2013, 11:25 p.m., Jarek Cecho wrote:
> > execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopOutputFormatLoadExecutor.java,
line 78
> > <https://reviews.apache.org/r/12936/diff/4/?file=329175#file329175line78>
> >
> >     Can we throw more meaning full exception here? :-)

I added this to figure out which tests were failing because of lack of schema. This check
is not required, since the schema would exist in the conf anyway.


> On July 31, 2013, 11:25 p.m., Jarek Cecho wrote:
> > connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcExportInitializer.java,
lines 71-74
> > <https://reviews.apache.org/r/12936/diff/4/?file=329160#file329160line71>
> >
> >     I'm afraid that the schema generation for query based export will fail and I'm
not sure how to implement that, so I we might need to remove that functionality.

For now, I am going to leave it around. We should work on removing that in a separate jira.


> On July 31, 2013, 11:25 p.m., Jarek Cecho wrote:
> > connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcExportInitializer.java,
lines 80-100
> > <https://reviews.apache.org/r/12936/diff/4/?file=329160#file329160line80>
> >
> >     Similar code is also in importer, so maybe we could refactore that into some
Util/Helper class?

Done.


> On July 31, 2013, 11:25 p.m., Jarek Cecho wrote:
> > core/src/main/java/org/apache/sqoop/framework/FrameworkConstants.java, lines 59-61
> > <https://reviews.apache.org/r/12936/diff/4/?file=329163#file329163line59>
> >
> >     Nit: This property seems to be used mainly within the Execute/Submission engine,
so it's better placed in JobConstants instead.

Done


> On July 31, 2013, 11:25 p.m., Jarek Cecho wrote:
> > pom.xml, lines 326-330
> > <https://reviews.apache.org/r/12936/diff/4/?file=329183#file329183line326>
> >
> >     We already have guava dependency in Hadoop 100 profile. If the goal here is
to make it explicit, I would suggest to move it out from there and use the property guava.version
to specify the version.

Done


> On July 31, 2013, 11:25 p.m., Jarek Cecho wrote:
> > common/src/main/java/org/apache/sqoop/etl/io/CSVIntermediateDataFormat.java, line
19
> > <https://reviews.apache.org/r/12936/diff/4/?file=329153#file329153line19>
> >
> >     I would much rather see such class(es) in connector sdk than in common module
(that is for example shared with client).

Done


> On July 31, 2013, 11:25 p.m., Jarek Cecho wrote:
> > common/src/main/java/org/apache/sqoop/etl/io/DataReader.java, lines 24-34
> > <https://reviews.apache.org/r/12936/diff/4/?file=329154#file329154line24>
> >
> >     Nit: Can we add java docs describing purpose of each method?

Done


> On July 31, 2013, 11:25 p.m., Jarek Cecho wrote:
> > common/src/main/java/org/apache/sqoop/etl/io/DataWriter.java, lines 24-34
> > <https://reviews.apache.org/r/12936/diff/4/?file=329155#file329155line24>
> >
> >     Nit: Can we add java docs describing purpose of each method?

DOne


> On July 31, 2013, 11:25 p.m., Jarek Cecho wrote:
> > common/src/main/java/org/apache/sqoop/etl/io/IntermediateDataFormat.java, lines
119-125
> > <https://reviews.apache.org/r/12936/diff/4/?file=329156#file329156line119>
> >
> >     Nit: Please add the java docs here as well.

Done


- Hari


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/12936/#review24352
-----------------------------------------------------------


On July 27, 2013, 4:11 a.m., Hari Shreedharan wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/12936/
> -----------------------------------------------------------
> 
> (Updated July 27, 2013, 4:11 a.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Bugs: SQOOP-777
>     https://issues.apache.org/jira/browse/SQOOP-777
> 
> 
> Repository: sqoop-sqoop2
> 
> 
> Description
> -------
> 
> Implemented a pluggable intermediate data format that decouples the internal representation
of the data from the connector and the output formats. Connectors can choose to implement
and support a format that is more efficient for them. Also separated the SqoopWritable so
that we can use the intermediate data format independent of (current) Hadoop. 
> 
> I ran a full build - all tests including integration tests pass. I have not added any
new tests, yet. I will add unit tests for the new classes. Also, I have not tried running
this on an actual cluster - so things may be broken. I'd like some initial feedback based
on the current patch. 
> 
> I also implemented escaping of characters. There is some work remaining to support binary
format, but it is mostly integration, the basic implementation is in place.
> 
> 
> Diffs
> -----
> 
>   common/pom.xml db11b5b 
>   common/src/main/java/org/apache/sqoop/etl/io/CSVIntermediateDataFormat.java PRE-CREATION

>   common/src/main/java/org/apache/sqoop/etl/io/DataReader.java 3e1adc7 
>   common/src/main/java/org/apache/sqoop/etl/io/DataWriter.java d81364e 
>   common/src/main/java/org/apache/sqoop/etl/io/IntermediateDataFormat.java PRE-CREATION

>   common/src/main/java/org/apache/sqoop/schema/type/Column.java 8b630b2 
>   common/src/test/java/org/apache/sqoop/etl/io/CSVIntermediateDataFormatTest.java PRE-CREATION

>   connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcConnector.java
e0da80f 
>   connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcExportInitializer.java
7212843 
>   connector/connector-generic-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/TestExportLoader.java
aa1c4ff 
>   connector/connector-generic-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/TestImportExtractor.java
a7ed6ba 
>   core/src/main/java/org/apache/sqoop/framework/FrameworkConstants.java 4293dce 
>   core/src/main/java/org/apache/sqoop/framework/JobManager.java 9f09982 
>   core/src/main/java/org/apache/sqoop/framework/SubmissionRequest.java 53d0039 
>   execution/mapreduce/pom.xml f9a2a0e 
>   execution/mapreduce/src/main/java/org/apache/sqoop/execution/mapreduce/MapreduceExecutionEngine.java
767080c 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsExportExtractor.java
1978ec6 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsSequenceImportLoader.java
a07c511 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsTextImportLoader.java
4621942 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/io/SqoopWritable.java PRE-CREATION

>   execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopFileOutputFormat.java
356ae8a 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopMapper.java 59cf391

>   execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopNullOutputFormat.java
90de6ef 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopOutputFormatLoadExecutor.java
739eb17 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopReducer.java b31161c

>   execution/mapreduce/src/test/java/org/apache/sqoop/job/JobUtils.java e21f15b 
>   execution/mapreduce/src/test/java/org/apache/sqoop/job/TestHdfsExtract.java b7079dd

>   execution/mapreduce/src/test/java/org/apache/sqoop/job/TestHdfsLoad.java f849aae 
>   execution/mapreduce/src/test/java/org/apache/sqoop/job/TestMapReduce.java 7b264c6 
>   execution/mapreduce/src/test/java/org/apache/sqoop/job/io/SqoopWritableTest.java PRE-CREATION

>   execution/mapreduce/src/test/java/org/apache/sqoop/job/mr/TestSqoopOutputFormatLoadExecutor.java
bee8ab7 
>   pom.xml 520c107 
>   spi/src/main/java/org/apache/sqoop/connector/spi/SqoopConnector.java 2becc56 
>   submission/mapreduce/src/main/java/org/apache/sqoop/submission/mapreduce/MapreduceSubmissionEngine.java
6fc485b 
> 
> Diff: https://reviews.apache.org/r/12936/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Hari Shreedharan
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message