drill-user mailing list archives

From George Lu <luwenbin...@gmail.com>
Subject Re: How to deploy Drill to achieve optimal performance
Date Wed, 06 May 2015 03:55:10 GMT
Hi Ted and Steven,

This is a test cluster with 9 nodes in total, including two HDFS DataNodes
and two HBase RegionServers. I deployed Drill 0.9.0 to 7 of the nodes:

| test2      | 31010      | 31011        | 31012      | false      |
| test3      | 31010      | 31011        | 31012      | false      |
| test8      | 31010      | 31011        | 31012      | true       |
| test4      | 31010      | 31011        | 31012      | false      |
| test6      | 31010      | 31011        | 31012      | false      |
| test9      | 31010      | 31011        | 31012      | false      |
| test5      | 31010      | 31011        | 31012      | false      |

test5 and test6 are the DataNodes and RegionServers.

If I query a small table with 18664 records (select count(*) from table),
it takes "1 row selected (5.786 seconds)".
If I query a table with 40397300 records, it takes "1 row selected (579.322
seconds)".
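
As a rough sanity check on the two count(*) timings above, the average rate at which rows were counted can be worked out directly. This is only a sketch: the function name is made up for illustration, and it assumes the reported wall-clock time covers the whole query, including planning and connection overhead.

```python
def rows_per_second(rows: int, seconds: float) -> float:
    """Average rows counted per second of wall-clock time (hypothetical helper)."""
    return rows / seconds


# Figures taken from the two sqlline results quoted above.
small = rows_per_second(18_664, 5.786)
large = rows_per_second(40_397_300, 579.322)

print(f"small table: {small:,.0f} rows/s")
print(f"large table: {large:,.0f} rows/s")
```

The small query is probably dominated by fixed overhead rather than scanning, so the two rates are not directly comparable.
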
If I run select count(*), convert_from(activities_perf.log.rt,'utf8')
from activities_perf group by activities_perf.log.rt, it always fails with "
Query failed: SYSTEM ERROR: Command failed while establishing connection.
Failure type CONNECTION.

Fragment 2:4

[7540323b-1db0-4220-8016-b3a7c950979c on test3:31010]
java.lang.RuntimeException: java.sql.SQLException: Failure while executing
query.
at sqlline.SqlLine$IncrementalRows.hasNext(SqlLine.java:2514)
at sqlline.SqlLine$TableOutputFormat.print(SqlLine.java:2148)
at sqlline.SqlLine.print(SqlLine.java:1809)
at sqlline.SqlLine$Commands.execute(SqlLine.java:3766)
at sqlline.SqlLine$Commands.sql(SqlLine.java:3663)
at sqlline.SqlLine.dispatch(SqlLine.java:889)
at sqlline.SqlLine.begin(SqlLine.java:763)
at sqlline.SqlLine.start(SqlLine.java:498)
at sqlline.SqlLine.main(SqlLine.java:460)"

It seems some nodes fail from time to time. Will Drill reschedule the query
on another node when that happens, or can it be configured to do so?

I have attached the log files from some of the nodes (as I cannot log into
some of the nodes in the cluster) for your reference.

Many thanks!

George Lu

On Wed, May 6, 2015 at 1:06 AM, Steven Phillips <sphillips@maprtech.com>
wrote:

> It would be helpful if you could post the profile for the query somewhere,
> or send it directly to me as an attachment (since attachments won't post to
> the mailing list).
>
> To get the profile, go to the profile page in the Web UI:
>
>
> http://drill.apache.org/docs/monitoring-and-canceling-queries-in-the-drill-web-ui/
>
> When you find the profile for the query in question, if you add ".json" to
> the URL, this will display the raw text for the profile. You can download
> this and save it somewhere.
>
> On Tue, May 5, 2015 at 3:38 AM, Ted Dunning <ted.dunning@gmail.com> wrote:
>
> > George,
> >
> > That sounds much too slow.
> >
> > Can you provide some samples of the data and queries?  How about actual
> > data counts?  Millions?  Hundreds of millions?
> >
> >
> >
> >
> >
> > On Tue, May 5, 2015 at 8:54 AM, George Lu <luwenbin888@gmail.com> wrote:
> >
> > > Hi all,
> > >
> > > These days, I am trying Drill to see whether it fits the requirement
> > > for realtime/near-realtime interactive queries.
> > > I have a HBase server, underlying HDFS contains three data nodes, and I
> > > deployed 7 Drill nodes within the cluster.
> > > I have several million records in the HBase table. I issue queries
> > > like SUM, MAX, COUNT against the table and found that Drill takes about
> > > 5 to 6 minutes on average to return the result.
> > >
> > > Such latency is not ideal for interactive use.
> > >
> > > I know Drill is used for low-latency queries, so I would like to ask
> > > for help on how to achieve that. How can I make Drill run queries with
> > > low latency (in seconds, not minutes)?
> > >
> > > Any suggestions are welcome!
> > >
> > > Thanks!
> > >
> > > George
> > >
> >
>
>
>
> --
>  Steven Phillips
>  Software Engineer
>
>  mapr.com
>
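
The ".json" step Steven describes in the quoted message can be sketched as a small URL helper. This is only an illustration: the function name is hypothetical, and the host, port 8047, the "/profiles/" path, and the query id in the usage line are assumptions about a typical Drill Web UI address, not values taken from this thread.

```python
from urllib.parse import urlsplit, urlunsplit


def profile_json_url(profile_url: str) -> str:
    """Return the profile page URL with ".json" appended to its path.

    Hypothetical helper: it only rewrites the URL and does not contact Drill.
    """
    parts = urlsplit(profile_url)
    path = parts.path.rstrip("/")
    if not path.endswith(".json"):
        path += ".json"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


# Assumed example address; substitute the URL shown in your own Web UI.
print(profile_json_url("http://test8:8047/profiles/example-query-id"))
```

The resulting URL can then be opened in a browser or fetched with any HTTP client and saved to a file for sharing.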
