sqoop-user mailing list archives

From: Subhash Sriram <subhash.sri...@gmail.com>
Subject: Sqoop 1.4.7 import into Hive as Parquet fails for decimal type
Date: Fri, 29 Dec 2017 15:34:38 GMT
Hello,

I am trying to import a table from MS SQL Server into Hive as Parquet, and
one of the columns is a decimal type. By default, Sqoop maps the decimal
column to a double, but unfortunately that causes precision issues in some
of our calculations.
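
To make the precision problem concrete, here is a toy Java example of the
kind of drift we are trying to avoid (nothing Sqoop-specific):

    import java.math.BigDecimal;

    // Toy illustration: accumulating a decimal quantity as double drifts,
    // while BigDecimal (a true decimal representation) stays exact.
    public class DoublePrecisionDemo {
        public static void main(String[] args) {
            double d = 0.0;
            for (int i = 0; i < 10; i++) {
                d += 0.1;
            }
            System.out.println(d); // prints 0.9999999999999999

            BigDecimal b = BigDecimal.ZERO;
            for (int i = 0; i < 10; i++) {
                b = b.add(new BigDecimal("0.1"));
            }
            System.out.println(b); // prints 1.0
        }
    }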

Right now, I am getting the following error running in an HDP 2.4 sandbox:

Import command:

[root@sandbox sqoop]# sqoop import \
  -Dsqoop.avro.logical_types.decimal.enable=true \
  --hive-import \
  --num-mappers 1 \
  --connect "jdbc:sqlserver://<conn_string>" \
  --username uname \
  --password pass \
  --hive-overwrite \
  --hive-database default \
  --table SqoopDecimalTest \
  --driver com.microsoft.sqlserver.jdbc.SQLServerDriver \
  --null-string '\\N' \
  --as-parquetfile

Error: org.kitesdk.data.DatasetOperationException: Failed to append {"id": 1, "price": 19.123450} to ParquetAppender{path=hdfs://sandbox.hortonworks.com:8020/tmp/default/.temp/job_1514513583437_0001/mr/attempt_1514513583437_0001_m_000000_0/.6b8d110f-6d1a-450c-93e4-c3db1a421476.parquet.tmp, schema={"type":"record","name":"SqoopDecimalTest","doc":"Sqoop import of SqoopDecimalTest","fields":[{"name":"id","type":["null","int"],"default":null,"columnName":"id","sqlType":"4"},{"name":"price","type":["null",{"type":"bytes","logicalType":"decimal","precision":19,"scale":6}],"default":null,"columnName":"price","sqlType":"3"}],"tableName":"SqoopDecimalTest"}, fileSystem=DFS[DFSClient[clientName=DFSClient_attempt_1514513583437_0001_m_000000_0_1859161154_1, ugi=root (auth:SIMPLE)]], avroParquetWriter=org.apache.parquet.avro.AvroParquetWriter@f60f96b}
    at org.kitesdk.data.spi.filesystem.FileSystemWriter.write(FileSystemWriter.java:194)
    at org.kitesdk.data.mapreduce.DatasetKeyOutputFormat$DatasetRecordWriter.write(DatasetKeyOutputFormat.java:326)
    at org.kitesdk.data.mapreduce.DatasetKeyOutputFormat$DatasetRecordWriter.write(DatasetKeyOutputFormat.java:305)
    at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:658)
    at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
    at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
    at org.apache.sqoop.mapreduce.ParquetImportMapper.map(ParquetImportMapper.java:70)
    at org.apache.sqoop.mapreduce.ParquetImportMapper.map(ParquetImportMapper.java:39)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
    at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)

Caused by: java.lang.ClassCastException: java.math.BigDecimal cannot be cast to java.nio.ByteBuffer
    at org.apache.parquet.avro.AvroWriteSupport.writeValue(AvroWriteSupport.java:257)
    at org.apache.parquet.avro.AvroWriteSupport.writeRecordFields(AvroWriteSupport.java:167)
    at org.apache.parquet.avro.AvroWriteSupport.write(AvroWriteSupport.java:142)
    at org.apache.parquet.hadoop.InternalParquetRecordWriter.write(InternalParquetRecordWriter.java:121)
    at org.apache.parquet.hadoop.ParquetWriter.write(ParquetWriter.java:288)
    at org.kitesdk.data.spi.filesystem.ParquetAppender.append(ParquetAppender.java:74)
    at org.kitesdk.data.spi.filesystem.ParquetAppender.append(ParquetAppender.java:35)
    at org.kitesdk.data.spi.filesystem.FileSystemWriter.write(FileSystemWriter.java:188)
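
If I read the caused-by correctly, AvroWriteSupport is handed the raw
java.math.BigDecimal and tries to cast it to a ByteBuffer, i.e. nothing on
the write path applied Avro's decimal conversion first. For reference, here
is a minimal sketch (assuming Avro 1.8.x on the classpath; an illustration,
not Sqoop code) of the encoding I believe the writer expects:

    import java.math.BigDecimal;
    import java.nio.ByteBuffer;

    import org.apache.avro.Conversions;
    import org.apache.avro.LogicalType;
    import org.apache.avro.LogicalTypes;
    import org.apache.avro.Schema;

    // Sketch: encode a BigDecimal the way the Avro decimal logical type
    // defines it (two's-complement unscaled value wrapped in a ByteBuffer),
    // which is the shape the failing cast in AvroWriteSupport expects.
    public class DecimalEncodingSketch {
        public static void main(String[] args) {
            // Same shape as the failing "price" field:
            // bytes + logicalType=decimal(19,6).
            Schema bytesSchema = Schema.create(Schema.Type.BYTES);
            LogicalType decimalType = LogicalTypes.decimal(19, 6);
            decimalType.addToSchema(bytesSchema);

            BigDecimal price = new BigDecimal("19.123450"); // scale 6 matches

            // Avro's stock conversion does the BigDecimal -> ByteBuffer step.
            ByteBuffer encoded = new Conversions.DecimalConversion()
                    .toBytes(price, bytesSchema, decimalType);

            System.out.println("encoded unscaled bytes: " + encoded.remaining());
        }
    }

In plain Avro, registering Conversions.DecimalConversion on the GenericData
model makes this conversion happen automatically; from the trace it looks
like the Kite write path never does that, so the raw BigDecimal reaches the
cast. I may be misreading it, though.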

I am running Sqoop v1.4.7 built against Kite v1.1.1-SNAPSHOT (the master
branch), because the current Kite release (1.0.0) uses parquet-avro 1.6.0
and I thought parquet-avro 1.8.1 might help. I get the same error with both
versions.

Does anyone know what might be wrong? Or is this simply not supported in
Sqoop? Any ideas would be greatly appreciated!

Thank you,

Subhash
