sqoop-dev mailing list archives

From "Benes, Pavel" <pavel.be...@merck.com>
Subject Sqoop import fails with NullPointerException
Date Mon, 01 Jun 2015 14:01:55 GMT
Hi guys,

I am using Sqoop 1.4.5 to import some data from MySQL into Hive with this command:

sqoop import \
  --connect jdbc:mysql://some.merck.com:1234/eqtl_gtex_raw \
  --username XXX --password YYY \
  --table adipose_subcutaneous \
  --hcatalog-database mg_user_middlegate_benesp_mysql1 \
  --hcatalog-table adipose_subcutaneous \
  --hive-partition-key mg_version --hive-partition-value 2015-05-28-13-18 \
  -m 1 --verbose --fetch-size -2147483648

and it fails with this error:

2015-06-01 13:20:39,209 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.lang.NullPointerException
	at org.apache.hive.hcatalog.data.schema.HCatSchema.get(HCatSchema.java:105)
	at org.apache.sqoop.mapreduce.hcat.SqoopHCatImportHelper.convertToHCatRecord(SqoopHCatImportHelper.java:194)
	at org.apache.sqoop.mapreduce.hcat.SqoopHCatImportMapper.map(SqoopHCatImportMapper.java:52)
	at org.apache.sqoop.mapreduce.hcat.SqoopHCatImportMapper.map(SqoopHCatImportMapper.java:34)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)


After some investigation, it seems to be caused by hyphens in column names. I have patched
the Sqoop jar to write more information to the log:


2015-06-01 13:15:49,337 INFO [main] org.apache.sqoop.mapreduce.hcat.SqoopHCatImportHelper: Processing schema fields...
2015-06-01 13:15:49,337 INFO [main] org.apache.sqoop.mapreduce.hcat.SqoopHCatImportHelper: Adding field 'mg_version'
2015-06-01 13:15:49,337 INFO [main] org.apache.sqoop.mapreduce.hcat.SqoopHCatImportHelper: Field count: 6
2015-06-01 13:15:49,347 INFO [main] org.apache.sqoop.mapreduce.db.DBRecordReader: Working on split: 1=1 AND 1=1
2015-06-01 13:15:49,360 INFO [main] org.apache.sqoop.mapreduce.db.DBRecordReader: Executing query: SELECT `SNP`, `gene`, `beta`, ` t-stat`, `p-value` FROM `adipose_subcutaneous` AS `adipose_subcutaneous` WHERE ( 1=1 ) AND ( 1=1 )
2015-06-01 13:15:49,657 INFO [main] org.apache.sqoop.mapreduce.hcat.SqoopHCatImportHelper: Processing  HCatRecord, listing schema fields ...
2015-06-01 13:15:49,657 INFO [main] org.apache.sqoop.mapreduce.hcat.SqoopHCatImportHelper:   Field: snp
2015-06-01 13:15:49,663 INFO [main] org.apache.sqoop.mapreduce.hcat.SqoopHCatImportHelper:   Field: gene
2015-06-01 13:15:49,663 INFO [main] org.apache.sqoop.mapreduce.hcat.SqoopHCatImportHelper:   Field: beta
2015-06-01 13:15:49,663 INFO [main] org.apache.sqoop.mapreduce.hcat.SqoopHCatImportHelper:   Field:  t-stat
2015-06-01 13:15:49,663 INFO [main] org.apache.sqoop.mapreduce.hcat.SqoopHCatImportHelper:   Field: p-value
2015-06-01 13:15:49,663 INFO [main] org.apache.sqoop.mapreduce.hcat.SqoopHCatImportHelper:   Field: mg_version
2015-06-01 13:15:49,664 INFO [main] org.apache.sqoop.mapreduce.hcat.SqoopHCatImportHelper: Processing key: 'SNP'
2015-06-01 13:15:49,664 INFO [main] org.apache.sqoop.mapreduce.hcat.SqoopHCatImportHelper: Processing key: 'beta'
2015-06-01 13:15:49,664 INFO [main] org.apache.sqoop.mapreduce.hcat.SqoopHCatImportHelper: Processing key: 'gene'
2015-06-01 13:15:49,664 INFO [main] org.apache.sqoop.mapreduce.hcat.SqoopHCatImportHelper: Processing key: 'p_value'
2015-06-01 13:20:39,209 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.lang.NullPointerException

According to this log, Sqoop converts the original DB column names to lowercase for the
schema lookup and replaces '-' characters with '_'. Columns without hyphens are resolved
correctly (e.g. 'SNP' -> 'snp'), but a column with a hyphen (i.e. 'p-value' -> 'p_value')
is not found in the schema, which still lists it as 'p-value'.
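
To make the mismatch concrete, here is a minimal, self-contained sketch of the lookup behaviour the log above suggests. It does not use the real Sqoop or HCatalog classes: `buildSchemaIndex`, `toRecordKey` and `lookup` are hypothetical stand-ins, and the assumption that the failing call ends up unboxing a null position is inferred from the stack trace, not confirmed from the source code.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class ColumnNameMismatch {

    // Hypothetical stand-in for the HCatalog schema: field name -> position.
    // Field names are stored lower-cased, as in the "Field: ..." log lines.
    static Map<String, Integer> buildSchemaIndex(List<String> fields) {
        Map<String, Integer> index = new LinkedHashMap<>();
        for (int i = 0; i < fields.size(); i++) {
            index.put(fields.get(i).toLowerCase(Locale.ROOT), i);
        }
        return index;
    }

    // Hypothetical stand-in for the key derived from a DB column name:
    // hyphens become underscores, as in the "Processing key: 'p_value'" line.
    static String toRecordKey(String column) {
        return column.replace('-', '_');
    }

    // Case-insensitive lookup; returns null when the key is not in the schema.
    static Integer lookup(Map<String, Integer> index, String key) {
        return index.get(key.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        Map<String, Integer> index = buildSchemaIndex(Arrays.asList(
                "snp", "gene", "beta", " t-stat", "p-value", "mg_version"));

        for (String column : new String[] {"SNP", "beta", "gene", "p-value"}) {
            String key = toRecordKey(column);
            Integer position = lookup(index, key);
            System.out.println("'" + key + "' -> " + position);
        }
        // Assigning a null result to a primitive, e.g.
        //   int pos = lookup(index, "p_value");
        // throws a NullPointerException, which would match the failure above.
    }
}
```

In this sketch 'SNP', 'beta' and 'gene' resolve to positions, while 'p_value' resolves to null because the schema only knows 'p-value'.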

I am also attaching the Sqoop log and the job log.

Is this a known issue, and is there a workaround for it? This is meant to be a general-purpose
import/ingest, so unfortunately I have no control over the table and column names being ingested.

Thanks,

Pavel

