sqoop-user mailing list archives

From Mark Grover <grover.markgro...@gmail.com>
Subject Re: How does sqoop export detect Avro schema?
Date Tue, 03 Mar 2015 06:07:19 GMT
Thanks, Stanley. Appreciate the feedback.

On Sun, Mar 1, 2015 at 6:23 PM, Xu, Qian A <qian.a.xu@intel.com> wrote:

>  Hi Mark,
>
>
>
> The Hive (Parquet) export has a limitation for Avro-format content. It
> works only with a Hive table created by the Kite CLI or with a Hive table
> previously imported by Sqoop (which uses Kite internally). Concretely, the
> metadata is stored in a hidden directory called “.metadata”, so when
> exporting a Hive table, Sqoop looks for that directory. This is the reason
> it fails.
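
One way to check this precondition before running an export is to test whether the export directory contains a `.metadata` subdirectory. The sketch below is a minimal Python illustration, not Sqoop's own logic: the helper names are hypothetical, and it assumes the `hdfs` command-line client is available on the PATH.

```python
import subprocess


def metadata_check_command(export_dir):
    """Build the `hdfs dfs -test -d` command for <export_dir>/.metadata."""
    # Hypothetical helper; ".metadata" is the directory name given in the thread.
    metadata_path = export_dir.rstrip("/") + "/.metadata"
    return ["hdfs", "dfs", "-test", "-d", metadata_path]


def has_kite_metadata(export_dir):
    """Return True if the directory exists (`hdfs dfs -test -d` exits with 0)."""
    result = subprocess.run(metadata_check_command(export_dir))
    return result.returncode == 0
```

Calling `has_kite_metadata("/data/movielens/aggregated_ratings")` on a cluster node would report whether the directory from the export command in this thread carries Kite metadata.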
>
>
>
> --Stanley (Qian) Xu
>
>
>
>
>
> *From:* grover.markgrover@gmail.com [mailto:grover.markgrover@gmail.com]
> *On Behalf Of* Mark Grover
> *Sent:* Monday, March 02, 2015 8:30 AM
> *To:* user@sqoop.apache.org
> *Subject:* Re: How does sqoop export detect Avro schema?
>
>
>
> Forgot to mention, here's the error I am getting:
>
> https://gist.github.com/markgrover/113196fecd1ec5bd0b38
>
>
>
> And, please include me on cc. I am not on the list. Thanks again!
>
>
>
> On Sun, Mar 1, 2015 at 4:29 PM, Mark Grover <mark@apache.org> wrote:
>
> Hi Sqoop folks,
>
> I am trying to better understand how sqoop export works.
>
>
>
> In the sqoop export command, we don't put any information about the
> metadata of the HDFS data being exported. So, how does sqoop figure out the
> avro schema of the data being exported?
>
>
>
> Does it use Kite's .metadata directory for this? If so, that'd mean you
> can't export data not populated by Kite. I don't think that's the case.
>
> Does it parse out the file header or look at file extensions? If so, that
> doesn't work: I just populated a Hive table which stores data in Avro, and
> its file extension is not .avro.
>
> Does it do something else that I am missing?
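
On the file-header question above: an Avro object container file begins with the four magic bytes `Obj` followed by the byte 1, and the writer schema is stored in the header metadata that follows, so the format can be recognized without relying on a file extension. The sketch below checks only the magic bytes. It is an illustration with a hypothetical helper name operating on a local file path, and it makes no claim about what Sqoop itself does.

```python
# Magic bytes that open every Avro object container file: "Obj" then 0x01.
AVRO_MAGIC = b"Obj\x01"


def looks_like_avro(path):
    """Return True if the local file at `path` starts with the Avro magic bytes."""
    # Hypothetical helper; the file name and extension are never consulted.
    with open(path, "rb") as f:
        return f.read(len(AVRO_MAGIC)) == AVRO_MAGIC
```

To try it on data stored in HDFS, a file would first need to be copied to the local filesystem, for example with `hdfs dfs -get`.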
>
>
>
> I created a Hive avro table using some new syntax supported in Hive 0.14+:
>
> CREATE EXTERNAL TABLE avg_movie_rating2(movie_id INT, rating DOUBLE)
> STORED AS AVRO
> LOCATION '/data/movielens/aggregated_ratings'
>
> And I just haven't been able to get Sqoop to export that data.
> Here's the sqoop export command that I ran:
>
>
> sqoop export --connect \
> jdbc:mysql://mgrover-haa-2.vpc.cloudera.com:3306/movie_dwh \
> --username root --table avg_movie_rating --export-dir \
> /data/movielens/aggregated_ratings -m 16 \
> --update-key movie_id --update-mode allowinsert
>
>
>
> Any thoughts/insights would be much appreciated!
>
>
>
> Thanks!
>
> Mark
>
>
>
