sqoop-dev mailing list archives

From Szabolcs Vasas <vasas.szabo...@gmail.com>
Subject Re: Review Request 62492: SQOOP-3224: Mainframe FTP transfer should have an option to use binary mode for transfer
Date Fri, 01 Dec 2017 15:25:07 GMT


> On Nov. 10, 2017, 4:05 p.m., Szabolcs Vasas wrote:
> > src/java/org/apache/sqoop/mapreduce/mainframe/MainframeDatasetBinaryRecord.java
> > Lines 24 (patched)
> > <https://reviews.apache.org/r/62492/diff/4/?file=1868652#file1868652line24>
> >
> >     Can you please explain what is the purpose of this new class? The name of its field is not too descriptive, is it supposed to contain the content of one file on the mainframe?
> >     I am not too familiar with how the Mainframe connector works, why does the binary import need a SqoopRecord implementation and text import doesn't?
> 
> Chris Teoh wrote:
>     This is required as the code generator defaults the data type to String, so any attempt to write byte arrays will fail with a ClassCastException. See below for an example:
>     ```text
>     java.lang.Exception: java.lang.ClassCastException: [B cannot be cast to java.lang.String
>     	at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
>     	at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
>     Caused by: java.lang.ClassCastException: [B cannot be cast to java.lang.String
>     	at mydataset$1.setField(msisdn.java:46)
>     	at mydataset.setField(msisdn.java:260)
>     	at org.apache.sqoop.mapreduce.mainframe.MainframeDatasetFTPRecordReader.convertToSqoopRecord(MainframeDatasetFTPRecordReader.java:186)
>     	at org.apache.sqoop.mapreduce.mainframe.MainframeDatasetFTPRecordReader.getNextBinaryRecord(MainframeDatasetFTPRecordReader.java:157)
>     	at org.apache.sqoop.mapreduce.mainframe.MainframeDatasetFTPRecordReader.getNextRecord(MainframeDatasetFTPRecordReader.java:89)
>     	at org.apache.sqoop.mapreduce.mainframe.MainframeDatasetFTPRecordReader.getNextRecord(MainframeDatasetFTPRecordReader.java:40)
>     	at org.apache.sqoop.mapreduce.mainframe.MainframeDatasetRecordReader.nextKeyValue(MainframeDatasetRecordReader.java:77)
>     	at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:553)
>     	at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
>     	at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
>     	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>     	at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
>     	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
>     	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>     	at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
>     	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     	at java.lang.Thread.run(Thread.java:745)
>     ```
>     
>     The MainframeDatasetRecordReader.initialize() method calls:
>     
>     ```java
>         inputClass = (Class<T>) (conf.getClass(
>                     DBConfiguration.INPUT_CLASS_PROPERTY, null));
>     ```
>     
>     Usually the value returned is the code-generated class. Setting this in DataDrivenImportJob.configureMapper() makes the job use MainframeDatasetBinaryRecord instead of the generated class, which fixes the class cast problem.
>     ```java
>           Configuration conf = job.getConfiguration();
>           conf.setClass(DBConfiguration.INPUT_CLASS_PROPERTY, MainframeDatasetBinaryRecord.class,
>             DBWritable.class);
>     ```

I think this happens because org.apache.sqoop.manager.MainframeManager#getColumnTypes always
defines the column type as VARCHAR, so the generated class has a String field. If we changed
that method to report a binary type for binary transfers, we would probably not need this
class anymore. What do you think?
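
To make the suspected mechanism concrete, here is a minimal, self-contained sketch. It does not use the Sqoop code generator or MainframeManager: the class ColumnTypeCastSketch, the nested TextRecord, and the javaTypeFor mapping are hypothetical stand-ins that only illustrate how a String-typed field rejects a byte[] and how a binary column type would avoid the cast.

```java
import java.sql.Types;

public class ColumnTypeCastSketch {

  /** Hypothetical stand-in for a generated record whose single column is typed String. */
  static class TextRecord {
    private String field;

    void setField(Object value) {
      // A generated setter casts to the column's Java type; a byte[] fails here.
      this.field = (String) value;
    }
  }

  /** Simplified, assumed mapping from a JDBC type code to a Java field type. */
  public static Class<?> javaTypeFor(int sqlType) {
    switch (sqlType) {
      case Types.VARCHAR:
        return String.class;
      case Types.VARBINARY:
        return byte[].class;
      default:
        throw new IllegalArgumentException("Unhandled SQL type: " + sqlType);
    }
  }

  /** Returns true if the String-typed record accepts the value, false if the cast fails. */
  public static boolean textRecordAccepts(Object value) {
    try {
      new TextRecord().setField(value);
      return true;
    } catch (ClassCastException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    byte[] payload = {(byte) 0xC1, (byte) 0xC2};
    System.out.println("String accepted: " + textRecordAccepts("abc"));
    System.out.println("byte[] accepted: " + textRecordAccepts(payload));
    System.out.println("VARBINARY maps to byte[]: "
        + (javaTypeFor(Types.VARBINARY) == byte[].class));
  }
}
```

If the column type were reported as a binary type in binary mode, the generated field would no longer be a String and the cast in setField should not fail. Whether the real type mapping behaves exactly like this would need to be checked against the code.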


- Szabolcs


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62492/#review190724
-----------------------------------------------------------


On Nov. 29, 2017, 11:48 a.m., Chris Teoh wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/62492/
> -----------------------------------------------------------
> 
> (Updated Nov. 29, 2017, 11:48 a.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Bugs: SQOOP-3224
>     https://issues.apache.org/jira/browse/SQOOP-3224
> 
> 
> Repository: sqoop-trunk
> 
> 
> Description
> -------
> 
> Added --transfermode {ascii|binary} to support FTP transfer mode switching.
> 
> 
> Diffs
> -----
> 
>   src/java/org/apache/sqoop/SqoopOptions.java 587d4e1d 
>   src/java/org/apache/sqoop/mapreduce/BinaryKeyOutputFormat.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java dc49282e 
>   src/java/org/apache/sqoop/mapreduce/mainframe/MainframeConfiguration.java ea54b07f 
>   src/java/org/apache/sqoop/mapreduce/mainframe/MainframeDatasetBinaryImportMapper.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/mainframe/MainframeDatasetBinaryRecord.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/mainframe/MainframeDatasetFTPRecordReader.java 1f78384b 
>   src/java/org/apache/sqoop/mapreduce/mainframe/MainframeImportJob.java f222dc8f 
>   src/java/org/apache/sqoop/tool/MainframeImportTool.java 0cb91dbc 
>   src/java/org/apache/sqoop/util/MainframeFTPClientUtils.java 95bc0ecb 
>   src/test/org/apache/sqoop/mapreduce/mainframe/TestMainframeDatasetBinaryRecord.java PRE-CREATION 
>   src/test/org/apache/sqoop/mapreduce/mainframe/TestMainframeDatasetFTPRecordReader.java 06141549 
>   src/test/org/apache/sqoop/tool/TestMainframeImportTool.java d51e33e5 
>   src/test/org/apache/sqoop/util/TestMainframeFTPClientUtils.java 90a85194 
> 
> 
> Diff: https://reviews.apache.org/r/62492/diff/7/
> 
> 
> Testing
> -------
> 
> Unit tests.
> 
> Functional testing on mainframe.
> 
> 
> Thanks,
> 
> Chris Teoh
> 
>

