sqoop-user mailing list archives

From Jarek Jarcec Cecho <jar...@apache.org>
Subject Re: Does sqoop2 support Teradata?
Date Tue, 18 Mar 2014 15:46:12 GMT
You seem to be using a VARCHAR column for creating splits, which is generally a very bad idea.
I would strongly advise you to use a number column instead.
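
To illustrate why a number column is easier to split on, here is a minimal sketch of range partitioning over integers. It is not Sqoop's actual partitioner code; the function name and the column name `id` are hypothetical, and the table in this thread would need such a numeric column added.

```python
def numeric_splits(column, lo, hi, parts):
    """Return WHERE clauses that cover the integer range [lo, hi] in contiguous pieces.

    Illustrative only: this is not Sqoop's partitioner, just the general idea.
    """
    if parts < 1 or hi < lo:
        raise ValueError("need parts >= 1 and lo <= hi")
    span = hi - lo + 1
    parts = min(parts, span)  # never create more splits than distinct values
    clauses = []
    start = lo
    for i in range(parts):
        # Spread any remainder over the first few splits.
        size = span // parts + (1 if i < span % parts else 0)
        end = start + size
        if i == parts - 1:
            # The last split is closed on both ends so the maximum value is included.
            clauses.append(f"{start} <= {column} AND {column} <= {hi}")
        else:
            clauses.append(f"{start} <= {column} AND {column} < {end}")
        start = end
    return clauses


if __name__ == "__main__":
    for clause in numeric_splits("id", 1, 20, 4):
        print(clause)
```

Every boundary here is a plain integer literal, so no character set translation is involved when the query is sent to the database.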

There are many things that can go wrong with using VARCHAR as a partition column. Looking at
the exception, my wild guess would be charset - are you explicitly stating that the connection
should be in UTF? If not, then the Teradata default charset is ASCII, which clearly won't work
for the non-ASCII split boundary in your query.
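
A rough illustration of how a text split boundary can end up outside ASCII, assuming the boundary is computed by interpolating numerically between the minimum and maximum string values. This is a simplified sketch, not the actual partitioner; the function name and the fixed width are made up for the example.

```python
def text_midpoint(lo, hi, width=3):
    """Interpolate halfway between two strings, treating each character
    as one digit of a base-65536 number (simplified illustration)."""
    base = 65536

    def to_int(s):
        # Pad or truncate to a fixed width so both strings have equal length.
        s = s.ljust(width, "\0")[:width]
        n = 0
        for ch in s:
            n = n * base + ord(ch)
        return n

    def to_str(n):
        chars = []
        for _ in range(width):
            n, digit = divmod(n, base)
            chars.append(chr(digit))
        return "".join(reversed(chars))

    return to_str((to_int(lo) + to_int(hi)) // 2)


if __name__ == "__main__":
    mid = text_midpoint("HPQ", "IBM")
    # The middle character falls far outside the ASCII range.
    print([hex(ord(c)) for c in mid])
    try:
        mid.encode("ascii")
    except UnicodeEncodeError as err:
        print("cannot be sent over an ASCII session:", err)
```

Both inputs are pure ASCII, yet the midpoint is not. A boundary like this would be consistent with the 'Hó €' literal in the logged query and with the translation error from Teradata.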

http://developer.teradata.com/doc/connectivity/jdbc/reference/current/jdbcug_chapter_2.html#URL_CHARSET
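
For reference, a small sketch of what a connection string with an explicit charset could look like, following the `jdbc:teradata://host/PARAM=value,PARAM=value` form described on the page above. The helper name, host name and database name are placeholders, not values from this thread.

```python
def teradata_jdbc_url(host, database=None, charset="UTF8"):
    """Build a Teradata JDBC URL with comma-separated parameters after the host.

    The helper itself is hypothetical; DATABASE and CHARSET are connection
    parameters documented in the Teradata JDBC driver reference.
    """
    params = []
    if database:
        params.append(f"DATABASE={database}")
    if charset:
        params.append(f"CHARSET={charset}")
    url = f"jdbc:teradata://{host}"
    if params:
        url += "/" + ",".join(params)
    return url


if __name__ == "__main__":
    # Placeholder host and database names.
    print(teradata_jdbc_url("td.example.com", "stocks"))
```

The resulting string is what would go into the JDBC connection string field of the Sqoop2 connection.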

Jarcec

On Tue, Mar 18, 2014 at 01:26:39PM +0800, John Ho wrote:
> Many thanks Jarcec for the reply.
> 
> I merged the 2 jar files as you suggested and sqoop2 is now able to connect to
> Teradata.
> 
> This time it hit an issue when extracting data from Teradata using a small data
> set (20 rows).
> Below are details and logs collected from Hue GUI for the failed task.
> Sqoop1 has no issues so far.
> 
> 
> Thanks and regards,
> John
> 
> =======
> 
> Schema:
> CREATE TABLE dailyprices (
>   stockname varchar(256) NOT NULL default '',
>   tradedate varchar(256) NOT NULL default '',
>   openprice float default NULL,
>   highprice float default NULL,
>   lowprice float default NULL,
>   closeprice float default NULL,
>   tradevolume bigint default NULL,
>   adjcloseprice float default NULL,
>   PRIMARY KEY  (stockname,tradedate)
> );
> 
> ====
> Data for testing:
> 
> HPQ|2014-02-07|28.7|29.16|28.69|29.07|7537300|29.07
> HPQ|2014-02-06|28.23|28.66|28.2|28.49|6693700|28.49
> HPQ|2014-02-05|28.17|28.44|27.9|28.01|10278000|28.01
> HPQ|2014-02-04|28.14|28.41|27.89|28.33|9581300|28.33
> HPQ|2014-02-03|29.06|29.29|27.96|28.04|14655400|28.04
> HPQ|2014-01-31|28.94|29.19|28.74|29|12934300|29
> HPQ|2014-01-30|29.15|29.42|29.08|29.25|9095700|29.25
> HPQ|2014-01-29|28.92|29.15|28.75|29.02|13915700|29.02
> HPQ|2014-01-28|28.57|29.08|28.49|29|12407000|29
> HPQ|2014-01-27|28.53|29.09|28.38|28.6|15924800|28.6
> IBM|2014-02-07|175.64|177.56|175.07|177.25|4692900|177.25
> IBM|2014-02-06|173.97|174.85|173.79|174.67|4292200|174.67
> IBM|2014-02-05|172.19|174.97|172.19|174.24|4712300|173.29
> IBM|2014-02-04|173.53|173.75|172.36|172.84|4349800|171.9
> IBM|2014-02-03|176.02|176.02|172.72|172.9|7186800|171.96
> IBM|2014-01-31|176.11|177.84|175.34|176.68|5193400|175.72
> IBM|2014-01-30|177.17|177.86|176.36|177.36|4853700|176.39
> IBM|2014-01-29|175.98|178.53|175.89|176.4|4970900|175.44
> IBM|2014-01-28|178.05|178.45|176.16|176.85|5333300|175.89
> IBM|2014-01-27|179.61|179.65|177.66|177.9|5208600|176.93
> 
> ====
> Sqoop2's task diagnostic log (from hue UI):
> org.apache.sqoop.common.SqoopException: MAPRED_EXEC_0017:Error occurs
> during extractor run
> at org.apache.sqoop.job.mr.SqoopMapper.run(SqoopMapper.java:101)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:672)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330)
> at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
> at org.apache.hadoop.mapred.Child.main(Child.java:262)
> Caused by: org.apache.sqoop.common.SqoopException:
> GENERIC_JDBC_CONNECTOR_0002:Unable to execute the SQL statement
> at
> org.apache.sqoop.connector.jdbc.GenericJdbcExecutor.executeQuery(GenericJdbcExecutor.java:59)
> at
> org.apache.sqoop.connector.jdbc.GenericJdbcImportExtractor.extract(GenericJdbcImportExtractor.java:50)
> at
> org.apache.sqoop.connector.jdbc.GenericJdbcImportExtractor.extract(GenericJdbcImportExtractor.ja
> 
> ====
> stderr (from hue UI):
> 2014-03-18 13:10:33,969 [main] INFO  org.apache.sqoop.job.mr.SqoopMapper  -
> Starting progress service
> 2014-03-18 13:10:33,970 [main] INFO  org.apache.sqoop.job.mr.SqoopMapper  -
> Running extractor class
> org.apache.sqoop.connector.jdbc.GenericJdbcImportExtractor
> 2014-03-18 13:10:33,978 [pool-2-thread-1] DEBUG
> org.apache.sqoop.job.mr.ProgressRunnable  - Auto-progress thread reporting
> progress
> 2014-03-18 13:10:35,454 [main] INFO
>  org.apache.sqoop.connector.jdbc.GenericJdbcImportExtractor  - Using query:
> SELECT stockname,tradedate,closeprice,tradevolume FROM dailyprices WHERE
> 'HPQ' <= stockname AND stockname < 'Hó €'
> 2014-03-18 13:10:35,492 [main] INFO  org.apache.sqoop.job.mr.SqoopMapper  -
> Stopping progress service
> 
> 
> ====
> syslog (from hue UI):
> 2014-03-18 13:10:31,622 WARN mapreduce.Counters: Group
> org.apache.hadoop.mapred.Task$Counter is deprecated. Use
> org.apache.hadoop.mapreduce.TaskCounter instead
> 2014-03-18 13:10:33,076 WARN org.apache.hadoop.conf.Configuration:
> session.id is deprecated. Instead, use dfs.metrics.session-id
> 2014-03-18 13:10:33,077 INFO org.apache.hadoop.metrics.jvm.JvmMetrics:
> Initializing JVM Metrics with processName=MAP, sessionId=
> 2014-03-18 13:10:33,458 INFO org.apache.hadoop.util.ProcessTree: setsid
> exited with exit code 0
> 2014-03-18 13:10:33,482 INFO org.apache.hadoop.mapred.Task:  Using
> ResourceCalculatorPlugin :
> org.apache.hadoop.util.LinuxResourceCalculatorPlugin@1f28706
> 2014-03-18 13:10:33,774 INFO org.apache.hadoop.mapred.MapTask: Processing
> split: org.apache.sqoop.job.mr.SqoopSplit@72c1d428
> 2014-03-18 13:10:33,781 INFO org.apache.hadoop.mapred.MapTask: Map output
> collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
> 2014-03-18 13:10:33,785 INFO org.apache.hadoop.mapred.MapTask: io.sort.mb =
> 136
> 2014-03-18 13:10:33,925 INFO org.apache.hadoop.mapred.MapTask: data buffer
> = 108380824/135476032
> 2014-03-18 13:10:33,925 INFO org.apache.hadoop.mapred.MapTask: record
> buffer = 356515/445644
> 2014-03-18 13:10:33,969 INFO org.apache.sqoop.job.mr.SqoopMapper: Starting
> progress service
> 2014-03-18 13:10:33,970 INFO org.apache.sqoop.job.mr.SqoopMapper: Running
> extractor class org.apache.sqoop.connector.jdbc.GenericJdbcImportExtractor
> 2014-03-18 13:10:33,978 DEBUG org.apache.sqoop.job.mr.ProgressRunnable:
> Auto-progress thread reporting progress
> 2014-03-18 13:10:35,454 INFO
> org.apache.sqoop.connector.jdbc.GenericJdbcImportExtractor: Using query:
> SELECT stockname,tradedate,closeprice,tradevolume FROM dailyprices WHERE
> 'HPQ' <= stockname AND stockname < 'Hó €'
> 2014-03-18 13:10:35,492 INFO org.apache.sqoop.job.mr.SqoopMapper: Stopping
> progress service
> 2014-03-18 13:10:35,495 INFO org.apache.hadoop.mapred.TaskLogsTruncater:
> Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
> 2014-03-18 13:10:35,501 WARN org.apache.hadoop.mapred.Child: Error running
> child
> org.apache.sqoop.common.SqoopException: MAPRED_EXEC_0017:Error occurs
> during extractor run
> at org.apache.sqoop.job.mr.SqoopMapper.run(SqoopMapper.java:101)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:672)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330)
> at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
> at org.apache.hadoop.mapred.Child.main(Child.java:262)
> Caused by: org.apache.sqoop.common.SqoopException:
> GENERIC_JDBC_CONNECTOR_0002:Unable to execute the SQL statement
> at
> org.apache.sqoop.connector.jdbc.GenericJdbcExecutor.executeQuery(GenericJdbcExecutor.java:59)
> at
> org.apache.sqoop.connector.jdbc.GenericJdbcImportExtractor.extract(GenericJdbcImportExtractor.java:50)
> at
> org.apache.sqoop.connector.jdbc.GenericJdbcImportExtractor.extract(GenericJdbcImportExtractor.java:31)
> at org.apache.sqoop.job.mr.SqoopMapper.run(SqoopMapper.java:96)
> ... 7 more
> Caused by: com.teradata.jdbc.jdbc_4.util.JDBCException: [Teradata Database]
> [TeraJDBC 14.10.00.26] [Error 6705] [SQLState HY000] An illegally formed
> character string was encountered during translation.
> at
> com.teradata.jdbc.jdbc_4.util.ErrorFactory.makeDatabaseSQLException(ErrorFactory.java:307)
> at
> com.teradata.jdbc.jdbc_4.statemachine.ReceiveInitSubState.action(ReceiveInitSubState.java:109)
> at
> com.teradata.jdbc.jdbc_4.statemachine.StatementReceiveState.subStateMachine(StatementReceiveState.java:314)
> at
> com.teradata.jdbc.jdbc_4.statemachine.StatementReceiveState.action(StatementReceiveState.java:202)
> at
> com.teradata.jdbc.jdbc_4.statemachine.StatementController.runBody(StatementController.java:123)
> at
> com.teradata.jdbc.jdbc_4.statemachine.StatementController.run(StatementController.java:114)
> at
> com.teradata.jdbc.jdbc_4.TDStatement.executeStatement(TDStatement.java:384)
> at
> com.teradata.jdbc.jdbc_4.TDStatement.executeStatement(TDStatement.java:326)
> at
> com.teradata.jdbc.jdbc_4.TDStatement.doNonPrepExecuteQuery(TDStatement.java:314)
> at com.teradata.jdbc.jdbc_4.TDStatement.executeQuery(TDStatement.java:1091)
> at
> org.apache.sqoop.connector.jdbc.GenericJdbcExecutor.executeQuery(GenericJdbcExecutor.java:56)
> ... 10 more
> 2014-03-18 13:10:35,506 INFO org.apache.hadoop.mapred.Task: Runnning
> cleanup for the task
> 
> 
> On Sat, Mar 15, 2014 at 1:17 AM, Jarek Jarcec Cecho <jarcec@apache.org> wrote:
> 
> > Hi John,
> > Unlike Sqoop 1, Sqoop2 does not pollute the DistributedCache by putting the
> > entire classpath into the MapReduce job. Instead we propagate only the jars
> > that are required. Sadly this mechanism does not work correctly in the
> > Teradata case, as the JDBC driver consists of two jars that do not
> > reference each other - the dependency is something that the user has to
> > know, not something that the jar file exposes. As a result Sqoop will propagate
> > only one of them to the MapReduce job, which will in turn fail. This will
> > be fixed later when we introduce a special Teradata connector. For now, you
> > could merge both Teradata JDBC driver jars together and put the resulting
> > jar into /var/lib/sqoop2/.
> >
> > Jarcec
> >
> > On Fri, Mar 14, 2014 at 11:56:41AM +0800, John Ho wrote:
> > > Hi Sqoop Team,
> > >
> > > Does sqoop2 support importing data from Teradata?
> > >
> > > I'm using CDH4.5 and not sure where to put Teradata's JDBC driver files.
> > >
> > > I uploaded tdgssconfig.jar and terajdbc4.jar to /var/lib/sqoop2 and added
> > > "-classpath /var/lib/sqoop2/tdgssconfig.jar" to the Java Configuration
> > > Options for TaskTracker, but it does not work as expected.
> > >
> > > I also uploaded the jar files to
> > > /opt/cloudera/parcels/CDH-4.5.0-1.cdh4.5.0.p0.30/lib/sqoop2/webapps/sqoop/WEB-INF/lib/
> > > but it does not work.
> > >
> > > There are no issues if I use sqoop 1.
> > >
> > >
> > > Thanks and regards,
> > > John Ho
> > >
> > > ===
> > > Task Log (CDH4.5.0 and Teradata Express 14.10 on SLES11):
> > >
> > > org.apache.sqoop.common.SqoopException: MAPRED_EXEC_0017:Error occurs during extractor run
> > > at org.apache.sqoop.job.mr.SqoopMapper.run(SqoopMapper.java:101)
> > > at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:672)
> > > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330)
> > > at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
> > > at java.security.AccessController.doPrivileged(Native Method)
> > > at javax.security.auth.Subject.doAs(Subject.java:415)
> > > at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
> > > at org.apache.hadoop.mapred.Child.main(Child.java:262)
> > > Caused by: org.apache.sqoop.common.SqoopException: GENERIC_JDBC_CONNECTOR_0002:Unable to execute the SQL statement
> > > at org.apache.sqoop.connector.jdbc.GenericJdbcExecutor.executeQuery(GenericJdbcExecutor.java:59)
> > > at org.apache.sqoop.connector.jdbc.GenericJdbcImportExtractor.extract(GenericJdbcImportExtractor.java:50)
> > > at org.apache.sqoop.connector.jdbc.GenericJdbcImportExtractor.extract(GenericJdbcImportExtractor.ja
> >
