sqoop-dev mailing list archives

From "Qian Xu" <sx.a...@googlemail.com>
Subject Re: Review Request 25222: SQOOP-1395: Use random generated class name for SqoopRecord
Date Tue, 02 Sep 2014 09:09:39 GMT

This is an automatically generated e-mail. To reply, visit:

(Updated Sept. 2, 2014, 5:09 p.m.)

Review request for Sqoop.


Changes the name of the Avro record in the schema (no need to change the entity class name).

Bugs: SQOOP-1395

Repository: sqoop-trunk

Description (updated)

If you import a table "users", Sqoop will generate an entity class "users" (in "users.java"). The
class will be compiled, submitted, and used by a MapReduce job. If the target file format is
Avro or Parquet, an Avro schema will be generated as well. In Avro terms, the entity class
is described as a "record", and the name of that record is "users".

For Parquet file format handling, we use the Kite SDK to manage Parquet file reading and writing
with minimal effort. Kite requires an Avro schema and all data records to be packed into
GenericRecord instances. This is where the problem arises: Kite reads the schema first and
tries to instantiate a record based on its name. In this case, Kite will try to instantiate
a "users" class. Unfortunately, a generated "users" class already exists (from "users.java"),
which causes the MapReduce job to fail.
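
The collision can be illustrated without Kite or Avro. The sketch below assumes that the reader resolves a record name with a plain class lookup; this message does not say how Kite actually does it, so `resolve` and `describe` are hypothetical helpers, and `java.lang.Object` merely stands in for a generated class that happens to be loadable under the record's name.

```java
import java.util.Optional;

// Illustration only: a name-based lookup finds any loadable class with that name.
final class NameLookupSketch {

    // Assumption: the record name is treated as a class name and looked up directly.
    static Optional<Class<?>> resolve(String recordName) {
        try {
            return Optional.of(Class.forName(recordName));
        } catch (ClassNotFoundException e) {
            return Optional.empty();
        }
    }

    // Describes what a name-based reader would do for the given record name.
    static String describe(String recordName) {
        return resolve(recordName)
            .map(c -> "instantiate " + c.getName())
            .orElse("fall back to a generic record");
    }

    public static void main(String[] args) {
        // Stand-in for a record name that matches a class on the classpath.
        System.out.println(describe("java.lang.Object"));
        // Stand-in for a record name that matches no class.
        System.out.println(describe("no_such_generated_class"));
    }
}
```

Under that assumption, a record named "users" takes the first branch whenever the generated "users" class is loadable, which is the situation described above.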

In order to solve this problem, I intend to keep the name of the entity class and the Avro
record different.

The patch will:
1. Change the record name in Avro schema.
2. Remove SqoopAvroRecord, as it is no longer required. (ClassWriter.java is reverted
to its previous state.)
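
As a rough sketch of step 1, the record name written into the schema can be derived from the table name so that it never equals the entity class name. The `sqoop_import_` prefix and the helper names below are assumptions made for illustration; the naming scheme the patch actually uses is in the diff, not in this message.

```java
// Illustration only: keep the Avro record name distinct from the entity class name.
final class RecordNameSketch {

    // Hypothetical prefix, chosen for this example.
    static final String PREFIX = "sqoop_import_";

    // The entity class keeps the table name; the Avro record gets a different one.
    static String avroRecordName(String tableName) {
        return PREFIX + tableName;
    }

    // Minimal Avro record schema as JSON; fields are omitted for brevity.
    // Assumes tableName contains only characters that are valid in an Avro name.
    static String schemaJson(String tableName) {
        return "{\"type\":\"record\",\"name\":\"" + avroRecordName(tableName)
            + "\",\"fields\":[]}";
    }

    public static void main(String[] args) {
        System.out.println("entity class: users");
        System.out.println("avro schema:  " + schemaJson("users"));
    }
}
```

With distinct names, a lookup by record name no longer finds the generated entity class, so the records can stay generic.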

Diffs (updated)

  src/java/org/apache/sqoop/avro/AvroUtil.java 4b37d58 
  src/java/org/apache/sqoop/lib/SqoopAvroRecord.java 80875d2 
  src/java/org/apache/sqoop/mapreduce/AvroImportMapper.java 6adad79 
  src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java 905ba8c 
  src/java/org/apache/sqoop/mapreduce/ParquetImportMapper.java effbadd 
  src/java/org/apache/sqoop/mapreduce/ParquetJob.java a74432a 
  src/java/org/apache/sqoop/orm/AvroSchemaGenerator.java 806bace 
  src/java/org/apache/sqoop/orm/ClassWriter.java 4f9dedd 
  src/java/org/apache/sqoop/orm/TableClassName.java 88ab622 

Diff: https://reviews.apache.org/r/25222/diff/


Manually verified the unit tests for the Avro and Parquet file formats.
Manually tested local Parquet and Avro import and export.


Qian Xu
