In your drillbit.log file, can you look for the entries for the foreman node to see where the
time is being spent ?
e.g entries of the following type:
[275dec51-fcc1-f1bf-cb2f-57a838805a82:foreman] INFO o.a.d.exec.store.parquet.Metadata -
Took 64 ms to read metadata from cache file
Each entry is timestamped, so if you see a long gap between two of them, that will give an
idea about what’s going on.
I am not familiar with how the S3 is set up. My guess is there is some latency issues there
that could be causing it but let’s first get the log output.
-Aman
On 2/13/17, 12:05 PM, "David Kincaid" <kincaid.dave@gmail.com> wrote:
We've just recently started working with Drill and I'm seeing something
that doesn't seem right and I'm not sure how to troubleshoot. We have 100
Parquet files which are each about 400MB each using Snappy compression.
While trying to query this data I am seeing extraordinary planning time for
certain queries. The planning time is about 12 minutes and the actual
execution of the query is less than 2 minutes.
A few details of our setup. We are running Drill on an AWS EMR cluster on
m4.16xlarge nodes (64 cores, 256GB each). We've given Drill an 8GB Java
heap and 100GB Java direct memory. We have verified that the metadata cache
file is being created and used. We have tried a cluster of 10 nodes and a
cluster of 2 nodes with no difference in planning time or execution time.
Does anyone have some pointers on troubleshooting excessive planning time?
It seems like we must have something misconfigured or are missing
something. We're very new to Drill and I think I've exhausted all my
troubleshooting ideas so far. Any tips anyone can provide?
Here is the main query I've been experimenting with, so you can get a feel
for the query complexity:
select fltb1.sapId, yearmo,
COUNT(*) as totalcnt,
count(distinct(CASE
WHEN
(REPEATED_CONTAINS(fltb1.classLabels,
'Thing:Service:MedicalService:Diagnostic:Radiology:Ultrasound.*'))
THEN fltb1.invoiceId
END)) as ultracount,
count(distinct (CASE
WHEN
(REPEATED_CONTAINS(fltb1.classLabels,
'Thing:Service:MedicalService:Diagnostic:LaboratoryTest.*'))
THEN fltb1.invoiceId
END)) as labcount
from (select sapid, invoiceId, TO_CHAR(TO_TIMESTAMP(transactionDate,
'YYYY-MM-dd HH:mm:ss.SSSSSS'), 'yyyy-MM') yearmo, classLabels
from s3.cisexport.transactionView) fltb1
group by fltb1.sapId, yearmo;
Thanks,
Dave
|