drill-user mailing list archives

From Junjun Olympia <romeo.olym...@gmail.com>
Subject Re: CSV header issue
Date Thu, 02 Apr 2015 06:58:49 GMT
While waiting for DRILL-951
<https://issues.apache.org/jira/browse/DRILL-951>, maybe you can use
something like this:

select sum(cast(trim(columns[6]) as int)) from HDFS.`/test.csv` where
trim(columns[6]) similar to '^(\+|-)?[0-9]+(\.[0-9]+)?';
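
To see what that filter does outside Drill, here is a minimal Python sketch (not Drill code). It applies the same idea to a few of the rows from the original post: trim the field, keep it only if it looks like a number, then sum. The names `rows`, `INTEGER` and `sum_numeric_column` are made up for illustration, and the pattern is assumed to be integer-only because the value is converted to an int.

```python
import re

# A subset of the rows shown in the original post, header row included.
rows = [
    ["date1", "time1", "srcip", "dstip", "service", "sentbyte", "rcvdbyte"],
    ["2015-01-01", "00:00:00", "10.10.100.74", "192.168.0.12", "DNS", "0", "193"],
    ["2015-01-01", "00:00:00", "10.10.100.74", "192.168.0.12", "DNS", "0", "166"],
    ["2015-01-01", "00:00:00", "10.10.100.74", "192.168.0.12", "DNS", "60", "359"],
    ["2015-01-01", "00:00:00", "10.10.50.195", "106.10.193.45", "php", "717", "359", "0", "0"],
    ["2015-01-01", "00:00:00", "111.123.180.44", "117.239.67.36", "9064", "0", "0"],
]

# Optional sign followed by digits only (assumption: integers, since we call int()).
INTEGER = re.compile(r"[+-]?[0-9]+")


def sum_numeric_column(rows, index):
    """Sum the values at `index`, skipping rows whose value is not an integer."""
    total = 0
    for row in rows:
        if index >= len(row):
            continue  # row is too short to have this column
        value = row[index].strip()
        if INTEGER.fullmatch(value):  # the header "rcvdbyte" fails this check
            total += int(value)
    return total


print(sum_numeric_column(rows, 6))
```

With these sample rows the header value "rcvdbyte" is skipped and the script prints 1077.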

Cheers,

Junjun


On Thu, Apr 2, 2015 at 2:43 PM, Mahesh Sankaran <sankarmahesh37@gmail.com>
wrote:

> We are waiting for Apache Drill 1.0. Thanks for the information.
>
> On Thu, Apr 2, 2015 at 12:04 PM, Aman Sinha <asinha@maprtech.com> wrote:
>
> > The exact release date depends on a variety of factors - I will let folks
> > who manage the release timeline chime in.
> >
> > On Wed, Apr 1, 2015 at 11:19 PM, Mahesh Sankaran <sankarmahesh37@gmail.com> wrote:
> >
> > > Thank you, Aman. May I know the release date of Apache Drill 1.0?
> > >
> > > On Thu, Apr 2, 2015 at 11:40 AM, Aman Sinha <asinha@maprtech.com> wrote:
> > >
> > > > Hi Mahesh,
> > > > Please see https://issues.apache.org/jira/browse/DRILL-951 for the
> > > > issue of CSV headers. It is a feature that will be addressed in an
> > > > upcoming release (currently tagged for 1.0).
> > > >
> > > > Aman
> > > >
> > > > On Wed, Apr 1, 2015 at 10:52 PM, Mahesh Sankaran <sankarmahesh37@gmail.com> wrote:
> > > >
> > > > > Hi,
> > > > > I am currently working with Apache Drill to analyse CSV files. My
> > > > > problem is that if the CSV file has headers, we can't run any sum
> > > > > query. It shows the following errors.
> > > > >
> > > > > 0: jdbc:drill:zk=nn01:2181,dn02:2181,dn03:218> select sum(cast(columns[6] as int)) from HDFS.`/test.csv` limit 10;
> > > > > Query failed: RemoteRpcException: Failure while running fragment., rcvdbyte
> > > > > [ 584925d6-dab6-42ce-8eb3-fa7abfb0e0f2 on nn01:31010 ]
> > > > > [ 584925d6-dab6-42ce-8eb3-fa7abfb0e0f2 on nn01:31010 ]
> > > > >
> > > > >
> > > > > Error: exception while executing query: Failure while executing query. (state=,code=0)
> > > > >
> > > > > *But the above query works well without headers. Is there any way
> > > > > to sum the columns in CSV files with headers in Apache Drill?*
> > > > >
> > > > > *This is our example file:*
> > > > > 0: jdbc:drill:zk=nn01:2181,dn02:2181,dn03:218> select * from HDFS.`/test.csv` limit 10;
> > > > > +------------+------------+
> > > > > |  columns   |    dir0    |
> > > > > +------------+------------+
> > > > > | ["date1","time1","srcip","dstip","service","sentbyte","rcvdbyte"] | nn01:9000  |
> > > > > | ["2015-01-01","00:00:00","10.10.100.74","192.168.0.12","DNS","0","193"] | nn01:9000  |
> > > > > | ["2015-01-01","00:00:00","10.10.100.74","192.168.0.12","DNS","0","166"] | nn01:9000  |
> > > > > | ["2015-01-01","00:00:00","10.10.100.74","192.168.0.12","DNS","60","359"] | nn01:9000  |
> > > > > | ["2015-01-01","00:00:00","10.10.50.195","106.10.193.45","php","717","359","0","0"] | nn01:9000  |
> > > > > | ["2015-01-01","00:00:00","111.123.180.44","117.239.67.36","9064","0","0"] | nn01:9000  |
> > > > > | ["2015-01-01","00:00:00","111.123.180.44","117.239.67.37","9064","0","0"] | nn01:9000  |
> > > > > | ["2015-01-01","00:00:00","111.123.180.44","117.239.67.38","9064","0","0"] | nn01:9000  |
> > > > > | ["2015-01-01","00:00:00","111.123.180.44","117.239.67.34","9064","0","0"] | nn01:9000  |
> > > > > | ["2015-01-01","00:00:00","111.123.180.44","117.239.67.44","9064","0","0"] | nn01:9000  |
> > > > >
> > > > >
> > > > > Thanks and Regards,
> > > > >
> > > > > Mahesh Sankaran
> > > > >
> > > >
> > >
> >
>
