drill-user mailing list archives

From Edmon Begoli <ebeg...@gmail.com>
Subject Re: UTF conversion issue with gz files
Date Wed, 26 Aug 2015 04:43:50 GMT
Done.

On Tue, Aug 25, 2015 at 10:23 PM, Jacques Nadeau <jacques@dremio.com> wrote:

> Yes, please post an issue.  Right now, the text reader is based on utf8.
> It would need an enhancement to support alternative character sets.
>
> --
> Jacques Nadeau
> CTO and Co-Founder, Dremio
>
> On Mon, Aug 24, 2015 at 9:05 AM, Edmon Begoli <ebegoli@gmail.com> wrote:
>
> > We are unable to process files that OSX identifies as character set
> > UTF16LE.  After unzipping and converting to UTF8, we were able to process
> > one fine.  There are CONVERT_TO and CONVERT_FROM commands that appear to
> > address the issue, but we were unable to make them work on a gzipped or
> > unzipped version of the UTF16 file.  We were able to use CONVERT_FROM ok,
> > but when we tried to wrap the results of that to cast as a date, or
> > anything else, it failed.  Trying to work with it natively caused the
> > double-byte nature to appear (a substring 1,4 only returns the first two
> > characters).
> >
> > Is there a fix for this or should I file it as an issue?
> >
> > I cannot post the data because it is proprietary in nature, but I might be
> > able to try to re-create the data for release testing and
> > development purposes.
> >
>
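
The workaround described in the original message (unzip, then convert to UTF8 before handing the file to the UTF8-based text reader) can be sketched outside Drill as below. This is only an illustration and not something posted in the thread: the function name and the sample row are hypothetical, and the input is assumed to be UTF16LE with no byte-order mark. The second half shows why a four-position substring over the raw bytes yields only two visible characters.

```python
import gzip


def utf16le_gz_to_utf8(gz_bytes: bytes) -> bytes:
    """Decompress a gzipped UTF-16LE payload and re-encode it as UTF-8.

    Hypothetical helper: assumes the payload carries no byte-order mark.
    """
    text = gzip.decompress(gz_bytes).decode("utf-16-le")
    return text.encode("utf-8")


# Made-up sample row standing in for the proprietary data.
sample = "2015-08-24,abc\n"
raw_utf16 = sample.encode("utf-16-le")

# In UTF-16LE every ASCII character is followed by a zero byte, so the
# first four bytes cover only two characters once the NULs are dropped.
first_four = raw_utf16[:4]
visible = first_four.decode("utf-8").replace("\x00", "")

# Round trip: gzip the UTF-16LE bytes, then convert them to UTF-8.
converted = utf16le_gz_to_utf8(gzip.compress(raw_utf16))
```

Here `visible` holds just the first two characters of the row, which matches the substring behaviour reported above, while `converted` is plain UTF8 that a UTF8-only reader can parse.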
