sqoop-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Chris Teoh <chris.t...@gmail.com>
Subject Re: Review Request 62492: SQOOP-3224: Mainframe FTP transfer should have an option to use binary mode for transfer
Date Fri, 17 Nov 2017 04:59:59 GMT


> On Nov. 10, 2017, 4:05 p.m., Szabolcs Vasas wrote:
> > src/java/org/apache/sqoop/SqoopOptions.java
> > Lines 2421 (patched)
> > <https://reviews.apache.org/r/62492/diff/4/?file=1868647#file1868647line2421>
> >
> >     Please set the default value in the initDefaults() method.

Done.


> On Nov. 10, 2017, 4:05 p.m., Szabolcs Vasas wrote:
> > src/java/org/apache/sqoop/mapreduce/mainframe/MainframeDatasetBinaryRecord.java
> > Lines 24 (patched)
> > <https://reviews.apache.org/r/62492/diff/4/?file=1868652#file1868652line24>
> >
> >     Can you please explain what is the purpose of this new class? The name of its
field is not too descriptive, is it supposed to contain the content of one file on the mainframe?
> >     I am not too familiar with how the Mainframe connector works, why does the binary
import need a SqoopRecord implementation and text import doesn't?

This is required as the code generator defaults the data type to String, so any attempt to
write byte arrays will fail with a cast. See below for an example:-
```java
java.lang.Exception: java.lang.ClassCastException: [B cannot be cast to java.lang.String
	at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
	at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.lang.ClassCastException: [B cannot be cast to java.lang.String
	at mydataset$1.setField(msisdn.java:46)
	at mydataset.setField(msisdn.java:260)
	at org.apache.sqoop.mapreduce.mainframe.MainframeDatasetFTPRecordReader.convertToSqoopRecord(MainframeDatasetFTPRecordReader.java:186)
	at org.apache.sqoop.mapreduce.mainframe.MainframeDatasetFTPRecordReader.getNextBinaryRecord(MainframeDatasetFTPRecordReader.java:157)
	at org.apache.sqoop.mapreduce.mainframe.MainframeDatasetFTPRecordReader.getNextRecord(MainframeDatasetFTPRecordReader.java:89)
	at org.apache.sqoop.mapreduce.mainframe.MainframeDatasetFTPRecordReader.getNextRecord(MainframeDatasetFTPRecordReader.java:40)
	at org.apache.sqoop.mapreduce.mainframe.MainframeDatasetRecordReader.nextKeyValue(MainframeDatasetRecordReader.java:77)
	at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:553)
	at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
	at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
	at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
	at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
```

MainframeDatasetRecordReader.initialize() method calls:-

```java
    inputClass = (Class<T>) (conf.getClass(
                DBConfiguration.INPUT_CLASS_PROPERTY, null));
```

Usually the value returned is the code generated class, setting this in DataDrivenImportJob.configureMapper()
alters the code generation to use MainframeDatasetBinaryRecord instead which fixes the class
cast problem.
```java
      Configuration conf = job.getConfiguration();
      conf.setClass(DBConfiguration.INPUT_CLASS_PROPERTY, MainframeDatasetBinaryRecord.class,
        DBWritable.class);
```


> On Nov. 10, 2017, 4:05 p.m., Szabolcs Vasas wrote:
> > src/java/org/apache/sqoop/mapreduce/mainframe/MainframeDatasetFTPRecordReader.java
> > Lines 24 (patched)
> > <https://reviews.apache.org/r/62492/diff/4/?file=1868653#file1868653line24>
> >
> >     Please remove the star import and import the necessary classes only.

Done.


> On Nov. 10, 2017, 4:05 p.m., Szabolcs Vasas wrote:
> > src/java/org/apache/sqoop/mapreduce/mainframe/MainframeDatasetFTPRecordReader.java
> > Lines 125 (patched)
> > <https://reviews.apache.org/r/62492/diff/4/?file=1868653#file1868653line125>
> >
> >     This method is quite complex and it is not covered with test cases. Can you
please fill this gap?

Done!


> On Nov. 10, 2017, 4:05 p.m., Szabolcs Vasas wrote:
> > src/java/org/apache/sqoop/mapreduce/mainframe/MainframeDatasetFTPRecordReader.java
> > Lines 128 (patched)
> > <https://reviews.apache.org/r/62492/diff/4/?file=1868653#file1868653line128>
> >
> >     What is the purpose of this array, is it supposed to contain 1 record?

Yes it is supposed to contain 1 record. In the context of binary transfer, it's an arbitrary
number to read the bytes from the stream. So it may not contain 1 logical record as per the
dataset as there's no information that Sqoop knows about to tell it what constitutes a record.


> On Nov. 10, 2017, 4:05 p.m., Szabolcs Vasas wrote:
> > src/java/org/apache/sqoop/mapreduce/mainframe/MainframeDatasetFTPRecordReader.java
> > Lines 138 (patched)
> > <https://reviews.apache.org/r/62492/diff/4/?file=1868653#file1868653line138>
> >
> >     I think we should use some kind of buffered input stream.

Done!


> On Nov. 10, 2017, 4:05 p.m., Szabolcs Vasas wrote:
> > src/java/org/apache/sqoop/mapreduce/mainframe/MainframeDatasetFTPRecordReader.java
> > Lines 145 (patched)
> > <https://reviews.apache.org/r/62492/diff/4/?file=1868653#file1868653line145>
> >
> >     Why do we need to specify the cumulativeBytesRead as the offset in the buffer
array?

The inputStream may not always return the requested number of bytes, so it may take more than
one read operation to fill the buffer, this method means we can resume filling the buffer
from the offset from the last read operation.


> On Nov. 10, 2017, 4:05 p.m., Szabolcs Vasas wrote:
> > src/test/org/apache/sqoop/tool/TestMainframeImportTool.java
> > Lines 199 (patched)
> > <https://reviews.apache.org/r/62492/diff/4/?file=1868657#file1868657line199>
> >
> >     Please use a method of org.junit.Assert here

Done and also adjusted my other unit tests in the same class.


> On Nov. 10, 2017, 4:05 p.m., Szabolcs Vasas wrote:
> > src/test/org/apache/sqoop/tool/TestMainframeImportTool.java
> > Lines 211 (patched)
> > <https://reviews.apache.org/r/62492/diff/4/?file=1868657#file1868657line211>
> >
> >     Please use a method of org.junit.Assert here

Done!


> On Nov. 10, 2017, 4:05 p.m., Szabolcs Vasas wrote:
> > src/test/org/apache/sqoop/tool/TestMainframeImportTool.java
> > Lines 224 (patched)
> > <https://reviews.apache.org/r/62492/diff/4/?file=1868657#file1868657line224>
> >
> >     Please use a method of org.junit.Assert here

Done!


> On Nov. 10, 2017, 4:05 p.m., Szabolcs Vasas wrote:
> > src/test/org/apache/sqoop/tool/TestMainframeImportTool.java
> > Lines 237 (patched)
> > <https://reviews.apache.org/r/62492/diff/4/?file=1868657#file1868657line237>
> >
> >     Please use a method of org.junit.Assert here.

Done!


> On Nov. 10, 2017, 4:05 p.m., Szabolcs Vasas wrote:
> > src/test/org/apache/sqoop/tool/TestMainframeImportTool.java
> > Lines 251 (patched)
> > <https://reviews.apache.org/r/62492/diff/4/?file=1868657#file1868657line251>
> >
> >     Please use org.junit.rules.ExpectedException here.

Done.


> On Nov. 10, 2017, 4:05 p.m., Szabolcs Vasas wrote:
> > src/test/org/apache/sqoop/util/TestMainframeFTPClientUtils.java
> > Lines 387 (patched)
> > <https://reviews.apache.org/r/62492/diff/4/?file=1868658#file1868658line387>
> >
> >     Please use a method of org.junit.Assert here.

Done.


- Chris


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62492/#review190724
-----------------------------------------------------------


On Oct. 25, 2017, 10:25 a.m., Chris Teoh wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/62492/
> -----------------------------------------------------------
> 
> (Updated Oct. 25, 2017, 10:25 a.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Bugs: SQOOP-3224
>     https://issues.apache.org/jira/browse/SQOOP-3224
> 
> 
> Repository: sqoop-trunk
> 
> 
> Description
> -------
> 
> Added --transfermode {ascii|binary} to support FTP transfer mode switching.
> 
> 
> Diffs
> -----
> 
>   src/java/org/apache/sqoop/SqoopOptions.java 587d4e1d 
>   src/java/org/apache/sqoop/mapreduce/BinaryKeyOutputFormat.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java dc49282e 
>   src/java/org/apache/sqoop/mapreduce/mainframe/MainframeConfiguration.java ea54b07f

>   src/java/org/apache/sqoop/mapreduce/mainframe/MainframeDatasetBinaryImportMapper.java
PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/mainframe/MainframeDatasetBinaryRecord.java PRE-CREATION

>   src/java/org/apache/sqoop/mapreduce/mainframe/MainframeDatasetFTPRecordReader.java
1f78384b 
>   src/java/org/apache/sqoop/mapreduce/mainframe/MainframeImportJob.java f222dc8f 
>   src/java/org/apache/sqoop/tool/MainframeImportTool.java 0cb91dbc 
>   src/java/org/apache/sqoop/util/MainframeFTPClientUtils.java 95bc0ecb 
>   src/test/org/apache/sqoop/tool/TestMainframeImportTool.java d51e33e5 
>   src/test/org/apache/sqoop/util/TestMainframeFTPClientUtils.java 90a85194 
> 
> 
> Diff: https://reviews.apache.org/r/62492/diff/4/
> 
> 
> Testing
> -------
> 
> Unit tests.
> 
> Functional testing on mainframe.
> 
> 
> Thanks,
> 
> Chris Teoh
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message