sqoop-user mailing list archives

From Marcus Truscello <marcus.trusce...@gmail.com>
Subject Re: --hive-import with --fields-terminated-by value over 127
Date Fri, 18 Dec 2015 14:52:44 GMT
I can absolutely try!  I was just hoping to get a read on whether this would
be considered a worthwhile change to pursue or "working as intended".
Regardless, I'll open an issue in JIRA and see where it goes from there.

On Fri, Dec 18, 2015 at 1:25 AM, Jarek Jarcec Cecho <jarcec@apache.org>
wrote:

> Can you create a JIRA Marcus?
>
> Jarcec
>
> > On Dec 17, 2015, at 6:49 PM, Marcus Truscello <marcus.truscello@gmail.com> wrote:
> >
> > This isn't so much a bug report as a feature request.
> >
> > With sqoop, one can specify a --fields-terminated-by value greater than
> > 127 using octal notation and it will work correctly.  The resulting file
> > will have the correct delimiter.
> >
> > However, if you include the --hive-import option, the delimiter will
> > result in an error when being imported into Hive, even though the file
> > retains the correct delimiter.  This is the region of code responsible
> > for the error:
> >
> > https://github.com/apache/sqoop/blob/f19e2a523579db8c28a96febfd3cf35a5d58adc6/src/java/org/apache/sqoop/hive/TableDefWriter.java#L278-L300
> >
> > However, Hive supports delimiters with ASCII values between 128 and 255,
> > just not in the octal escape form.  Instead, they must be specified as
> > negative values (two's complement, signed char).  For example, ASCII 254
> > in octal would normally be FIELDS TERMINATED BY '\0376', which is an
> > error in Hive, but FIELDS TERMINATED BY '-2' works correctly.
> >
> > I believe that sqoop's --hive-import function should convert the
> > --fields-terminated-by value into a form usable by Hive even if the
> > value is greater than 127.  Values greater than 255 should probably
> > still be an error.
> >
> >
> > Thanks for your time and consideration.
> > -Marcus
>
>
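
The conversion proposed in the quoted message can be sketched roughly as
follows.  This is only an illustration under stated assumptions, not the code
in TableDefWriter.java: the class and method names are hypothetical, the
octal form for values up to 127 is assumed to be a three-digit escape, and
the negative-number form for 128-255 relies on the Hive behaviour described
in the message above.

```java
// Hypothetical sketch; HiveDelimiterSketch and toHiveDelimiter are
// illustrative names and are not part of Sqoop.
class HiveDelimiterSketch {

    /**
     * Converts a delimiter character code into the text that would go inside
     * the quotes of a Hive FIELDS TERMINATED BY clause.
     */
    public static String toHiveDelimiter(int charCode) {
        // Values outside a single byte stay an error, as the message suggests.
        if (charCode < 0 || charCode > 255) {
            throw new IllegalArgumentException(
                "Delimiter code must be between 0 and 255: " + charCode);
        }
        if (charCode <= 127) {
            // Assumed form for 7-bit values: three-digit octal escape, e.g. \001.
            return String.format("\\%03o", charCode);
        }
        // 128-255: reinterpret the value as a signed byte (two's complement),
        // so 254 becomes -2 and 255 becomes -1.
        return Integer.toString((byte) charCode);
    }

    public static void main(String[] args) {
        System.out.println("FIELDS TERMINATED BY '" + toHiveDelimiter(1) + "'");
        System.out.println("FIELDS TERMINATED BY '" + toHiveDelimiter(254) + "'");
    }
}
```

The narrowing cast to byte does the two's complement reinterpretation
directly, so no manual subtraction of 256 is needed.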
