drill-user mailing list archives

From Abhishek Girish <abhishek.gir...@gmail.com>
Subject Re: Last Column showing blank in csv file
Date Sat, 03 Dec 2016 21:39:26 GMT
Thanks Khurram, this was the attribute I thought didn't exist, as I couldn't
find it anywhere in our docs.

This can also be added to the format config of the storage plugin:

"csv": {
  "type": "text",
  "extensions": [
    "csv"
  ],
  "lineDelimiter": "\r\n",
  "extractHeader": true,
  "delimiter": ","
}

I tried this with Drill 1.9.0 on Windows (using the above format config) and
the queries work as expected.
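
A plausible explanation for the blank column, sketched in Python. This is an
illustration under assumptions, not Drill's actual reader code: the parse()
helper is hypothetical, and the sample data is two lines of the file from this
thread with CRLF line endings. If lines are split on "\n" only, the last header
name keeps a trailing "\r", so a lookup by the name "class" finds nothing.

```python
# Hypothetical illustration only; this is not how Drill is implemented.
def parse(raw: str, line_delimiter: str, field_delimiter: str = ","):
    """Split raw text into a header list and a list of {column: value} rows."""
    lines = [ln for ln in raw.split(line_delimiter) if ln]
    header = lines[0].split(field_delimiter)
    rows = [dict(zip(header, ln.split(field_delimiter))) for ln in lines[1:]]
    return header, rows

# Header and first row of the file from this thread, saved with CRLF endings.
raw = "sepalen,sepalwidth,patelen,patelwidth,class\r\n5.1,3.5,1.4,Iris-setosa,0.2\r\n"

# Splitting on "\n" alone leaves the carriage return attached to the last field.
header_lf, rows_lf = parse(raw, "\n")
print(repr(header_lf[-1]))        # 'class\r'
print(rows_lf[0].get("class"))    # None: no column is named exactly "class"

# Splitting on "\r\n" gives a clean header, so the lookup succeeds.
header_crlf, rows_crlf = parse(raw, "\r\n")
print(repr(header_crlf[-1]))      # 'class'
print(rows_crlf[0].get("class"))  # 0.2
```

If that is what happens, it would also be consistent with the earlier
observation that adding a trailing comma to the header line made the class
column show up.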

We should update [1] with this information.

[1] https://drill.apache.org/docs/plugin-configuration-basics/#list-of-attributes-and-definitions

P.S. So there is no need to file a new JIRA for this.

On Sat, Dec 3, 2016 at 4:12 AM, Khurram Faraaz <kfaraaz@maprtech.com> wrote:

> I am not sure this will work (I haven't tried it on Windows 10), but setting
> lineDelimiter to '\r\n' should help in your case. Can you please try it?
> For details, see DRILL-3149
> <https://issues.apache.org/jira/browse/DRILL-3149>
>
> Example:
>
> select * from table(dfs.`my_table`(type => 'text', lineDelimiter => '\r\n'))
>
>
> On Sat, Dec 3, 2016 at 2:28 AM, Kunal Khatua <kkhatua@maprtech.com> wrote:
>
> > There is a dos2unix utility for Linux that allows you to substitute the
> > multichar newline with the single char newline.
> >
> > For Windows, you can use either a similar util on SourceForge or the
> > CygUtils (part of the Cygwin shell, I believe) to achieve the same
> > conversion.
> >
> > In the meantime, like Abhishek suggested, please file a JIRA for this so
> > that the issue can be tracked and fixed in a future release.
> >
> > Thanks
> > Kunal
> >
> > On Wed 30-Nov-2016 8:24:00 PM, Abhishek Girish <abhishek.girish@gmail.com> wrote:
> > Well the only workaround I got to work is this: I saved the file in
> > UNIX/OS_X Format and executed the query successfully (with no blanks in the
> > class column).
> >
> > Some text editors such as Notepad++ let you do this directly on Windows. Or
> > you could use a utility like iconv or dos2unix for larger files like Leon
> > suggested.
> >
> > Also, please file an enhancement JIRA for supporting a new 'newline'
> > attribute (which takes multi-byte values) for text formats. This will be
> > helpful for files on Windows, which use CRLF (\r\n) as newline characters.
> >
> > On Wed, Nov 30, 2016 at 7:56 PM, Abhishek Girish
> > wrote:
> >
> > > Okay, I could reproduce the issue on Windows.
> > >
> > > 0: jdbc:drill:zk=local> select A.sepalen, A.sepalwidth, A.patelen,
> > > A.patelwidth,
> > > A.class from dfs.`/drill/tmp.csv` as A;
> > > +----------+-------------+----------+--------------+--------+
> > > | sepalen | sepalwidth | patelen | patelwidth | class |
> > > +----------+-------------+----------+--------------+--------+
> > > | 5.1 | 3.5 | 1.4 | Iris-setosa | |
> > > | 4.9 | 3 | 1.4 | Iris-setosa | |
> > > | 4.7 | 3.2 | 1.3 | Iris-setosa | |
> > > | 4.6 | 3.1 | 1.5 | Iris-setosa | |
> > > | 5 | 3.6 | 1.4 | Iris-setosa | |
> > > | 5.4 | 3.9 | 1.7 | Iris-setosa | |
> > > | 4.6 | 3.4 | 1.4 | Iris-setosa | |
> > > | 5 | 3.4 | 1.5 | Iris-setosa | |
> > > | 4.4 | 2.9 | 1.4 | Iris-setosa | |
> > > | 4.9 | 3.1 | 1.5 | Iris-setosa | |
> > > | 5.4 | 3.7 | 1.5 | Iris-setosa | |
> > > | 4.8 | 3.4 | 1.6 | Iris-setosa | |
> > > +----------+-------------+----------+--------------+--------+
> > > 12 rows selected (0.277 seconds)
> > >
> > > I'll get back if and once I have a workaround / solution.
> > >
> > > On Wed, Nov 30, 2016 at 7:36 PM, Abhishek Girish <abhishek.girish@gmail.com> wrote:
> > >
> > >> I forgot to mention that the other query works too.
> > >>
> > >> > select A.sepalen, A.sepalwidth, A.patelen, A.patelwidth, A.class
> > >> > from dfs.tmp.`tmp.csv` as A;
> > >> +----------+-------------+----------+--------------+--------+
> > >> | sepalen | sepalwidth | patelen | patelwidth | class |
> > >> +----------+-------------+----------+--------------+--------+
> > >> | 5.1 | 3.5 | 1.4 | Iris-setosa | 0.2 |
> > >> | 4.9 | 3 | 1.4 | Iris-setosa | 0.2 |
> > >> | 4.7 | 3.2 | 1.3 | Iris-setosa | 0.2 |
> > >> | 4.6 | 3.1 | 1.5 | Iris-setosa | 0.2 |
> > >> | 5 | 3.6 | 1.4 | Iris-setosa | 0.2 |
> > >> | 5.4 | 3.9 | 1.7 | Iris-setosa | 0.4 |
> > >> | 4.6 | 3.4 | 1.4 | Iris-setosa | 0.3 |
> > >> | 5 | 3.4 | 1.5 | Iris-setosa | 0.2 |
> > >> | 4.4 | 2.9 | 1.4 | Iris-setosa | 0.2 |
> > >> | 4.9 | 3.1 | 1.5 | Iris-setosa | 0.1 |
> > >> | 5.4 | 3.7 | 1.5 | Iris-setosa | 0.2 |
> > >> | 4.8 | 3.4 | 1.6 | Iris-setosa | 0.2 |
> > >> +----------+-------------+----------+--------------+--------+
> > >> 12 rows selected (0.31 seconds)
> > >>
> > >>
> > >> On Wed, Nov 30, 2016 at 8:15 AM, Leon Clayton
> > >> wrote:
> > >>
> > >>> Are we sure there are no hidden characters at the end of one or more
> > >>> lines? Just a thought.
> > >>>
> > >>> Try running dos2unix on the file to check this theory.
> > >>>
> > >>>
> > >>> > On 30 Nov 2016, at 10:45, Sanjiv Kumar wrote:
> > >>> >
> > >>> > Hello
> > >>> > Yes, you are right: select * from `tmp.csv`
> > >>> > works fine, but if I select individual columns, the last column's data
> > >>> > shows blank.
> > >>> > Run this query:
> > >>> > select A.`sepalen`, A.`sepalwidth`, A.`patelen`, A.`patelwidth`, A.class
> > >>> > from dfs.tmp.`copydata.csv` as A;
> > >>> >
> > >>> > If you run this query, the last column's data will be blank.
> > >>> >
> > >>> > On Wed, Nov 30, 2016 at 11:15 AM, Sanjiv Kumar
> > >>> wrote:
> > >>> >
> > >>> >> I am using the latest version, 1.8, on the Windows 10 operating system.
> > >>> >>
> > >>> >> On Tue, Nov 29, 2016 at 11:40 AM, Sanjiv Kumar
> > >>> >> wrote:
> > >>> >>
> > >>> >>> I already pasted the CSV file. Just copy it and save it as a .csv file.
> > >>> >>> I am also attaching the CSV file. The query is the same as I mentioned
> > >>> >>> above.
> > >>> >>>
> > >>> >>> select A.`sepalen`, A.`sepalwidth`, A.`patelen`, A.`patelwidth`, A.class
> > >>> >>> from dfs.tmp.`copydata.csv` as A;
> > >>> >>>
> > >>> >>> On Mon, Nov 28, 2016 at 6:30 PM, Sanjiv Kumar
> > >>> >>> wrote:
> > >>> >>>
> > >>> >>>> Yes, it's working, but what if I use:
> > >>> >>>> select A.`sepalen`, A.`sepalwidth`, A.`patelen`, A.`patelwidth`,
> > >>> >>>> A.class from dfs.tmp.`copydata.csv` as A;
> > >>> >>>>
> > >>> >>>> Why is my last column's data showing blank?
> > >>> >>>> One more thing, see my CSV file:
> > >>> >>>>
> > >>> >>>> sepalen,sepalwidth,patelen,patelwidth,class
> > >>> >>>> 5.1,3.5,1.4,Iris-setosa,0.2
> > >>> >>>> 4.9,3,1.4,Iris-setosa,0.2
> > >>> >>>> 4.7,3.2,1.3,Iris-setosa,0.2
> > >>> >>>> 4.6,3.1,1.5,Iris-setosa,0.2
> > >>> >>>> 5,3.6,1.4,Iris-setosa,0.2
> > >>> >>>> 5.4,3.9,1.7,Iris-setosa,0.4
> > >>> >>>> 4.6,3.4,1.4,Iris-setosa,0.3
> > >>> >>>> 5,3.4,1.5,Iris-setosa,0.2
> > >>> >>>> 4.4,2.9,1.4,Iris-setosa,0.2
> > >>> >>>> 4.9,3.1,1.5,Iris-setosa,0.1
> > >>> >>>> 5.4,3.7,1.5,Iris-setosa,0.2
> > >>> >>>> 4.8,3.4,1.6,Iris-setosa,0.2
> > >>> >>>>
> > >>> >>>> This is my previous file.
> > >>> >>>> Now, if I add a comma after class, check this new file:
> > >>> >>>>
> > >>> >>>> sepalen,sepalwidth,patelen,patelwidth,class,
> > >>> >>>> 5.1,3.5,1.4,Iris-setosa,0.2
> > >>> >>>> 4.9,3,1.4,Iris-setosa,0.2
> > >>> >>>> 4.7,3.2,1.3,Iris-setosa,0.2
> > >>> >>>> 4.6,3.1,1.5,Iris-setosa,0.2
> > >>> >>>> 5,3.6,1.4,Iris-setosa,0.2
> > >>> >>>> 5.4,3.9,1.7,Iris-setosa,0.4
> > >>> >>>> 4.6,3.4,1.4,Iris-setosa,0.3
> > >>> >>>> 5,3.4,1.5,Iris-setosa,0.2
> > >>> >>>> 4.4,2.9,1.4,Iris-setosa,0.2
> > >>> >>>> 4.9,3.1,1.5,Iris-setosa,0.1
> > >>> >>>> 5.4,3.7,1.5,Iris-setosa,0.2
> > >>> >>>> 4.8,3.4,1.6,Iris-setosa,0.2
> > >>> >>>>
> > >>> >>>>
> > >>> >>>> And run this query:
> > >>> >>>> select A.`sepalen`, A.`sepalwidth`, A.`patelen`, A.`patelwidth`, A.class
> > >>> >>>> from dfs.tmp.`copydata.csv` as A;
> > >>> >>>>
> > >>> >>>> Then the output shows fine, but if the comma is not there, the last
> > >>> >>>> column's data shows blank.
> > >>> >>>>
> > >>> >>>> Is this a bug?
> > >>> >>>> ......................
> > >>> >>>> Thanks & Regards
> > >>> >>>> *Sanjiv Kumar*
> > >>> >>>>
