Thanks Paul.
To add more detail: we are comparing Drill performance across the following
two storage options:
1. dfs plugin pointing to a single-node HDFS cluster
2. S3 plugin pointing to an ECS bucket, no HDFS
In both cases the data is stored in Parquet files. For example, in this query
we are querying a directory with 19 Parquet files, close to 2 GB in total;
the same set of files is on S3 and HDFS.
Drillbits are running on 2 Unix machines (6 cores, 32 GB each).
One machine runs the single-node HDFS cluster, ZooKeeper, and a Drillbit.
The other machine runs a Drillbit.
On both HDFS and S3 storage we have created the Parquet metadata file;
additionally, we have statistics created for dfs.
Based on our analysis so far, dfs performs better than S3: the same query
that completes in 2.121s on dfs times out on S3.
Looking at the plan, "parquet row group scan" takes most of the time (99%).
The stack trace shows the error "Unable to execute HTTP request: Timeout
waiting for connection from pool":
(org.apache.drill.common.exceptions.ExecutionSetupException)
java.io.InterruptedIOException: getFileStatus on
s3a://test-bucket/TestDir/Test_1.parquet:
com.amazonaws.SdkClientException: Unable to execute HTTP request:
Timeout waiting for connection from pool
org.apache.drill.exec.store.parquet.AbstractParquetScanBatchCreator.getBatch():261
org.apache.drill.exec.store.parquet.ParquetScanBatchCreator.getBatch():42
org.apache.drill.exec.store.parquet.ParquetScanBatchCreator.getBatch():36
org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch():163
org.apache.drill.exec.physical.impl.ImplCreator.getChildren():186
org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch():141
org.apache.drill.exec.physical.impl.ImplCreator.getChildren():186
org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch():141
org.apache.drill.exec.physical.impl.ImplCreator.getChildren():186
org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch():141
org.apache.drill.exec.physical.impl.ImplCreator.getChildren():186
org.apache.drill.exec.physical.impl.ImplCreator.getRootExec():114
org.apache.drill.exec.physical.impl.ImplCreator.getExec():90
org.apache.drill.exec.work.fragment.FragmentExecutor.run():292
org.apache.drill.common.SelfCleaningRunnable.run():38
.......():0
Caused By (java.lang.Exception) getFileStatus on
s3a://test-bucket/TestDir/Test_1.parquet:
com.amazonaws.SdkClientException: Unable to execute HTTP request:
Timeout waiting for connection from pool
org.apache.hadoop.fs.s3a.S3AUtils.translateInterruptedException():352
org.apache.hadoop.fs.s3a.S3AUtils.translateException():177
org.apache.hadoop.fs.s3a.S3AUtils.translateException():151
org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus():2242
org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus():2204
org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus():2143
org.apache.parquet.hadoop.util.HadoopInputFile.fromPath():39
org.apache.drill.exec.store.parquet.AbstractParquetScanBatchCreator.readFooter():353
org.apache.drill.exec.store.parquet.AbstractParquetScanBatchCreator.getBatch():149
org.apache.drill.exec.store.parquet.ParquetScanBatchCreator.getBatch():42
org.apache.drill.exec.store.parquet.ParquetScanBatchCreator.getBatch():36
org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch():163
org.apache.drill.exec.physical.impl.ImplCreator.getChildren():186
org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch():141
org.apache.drill.exec.physical.impl.ImplCreator.getChildren():186
org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch():141
org.apache.drill.exec.physical.impl.ImplCreator.getChildren():186
org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch():141
org.apache.drill.exec.physical.impl.ImplCreator.getChildren():186
org.apache.drill.exec.physical.impl.ImplCreator.getRootExec():114
org.apache.drill.exec.physical.impl.ImplCreator.getExec():90
org.apache.drill.exec.work.fragment.FragmentExecutor.run():292
org.apache.drill.common.SelfCleaningRunnable.run():38
.......():0
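Since the failure is a timeout waiting for a connection from the S3A client's
HTTP pool, one setting that may be worth experimenting with is the Hadoop S3A
property fs.s3a.connection.maximum, which caps the pool size. Below is a
minimal Python sketch that only builds and prints a storage plugin definition
carrying that property. The endpoint ecs.example.com, the value 100, and the
helper name s3_plugin_config are placeholders for illustration, not values
from our setup, and we have not verified that this resolves the timeout.

```python
import json


def s3_plugin_config(bucket, endpoint, max_connections):
    """Build a Drill file-system storage plugin definition for an S3A bucket.

    All arguments are caller-supplied; nothing here is read from a live cluster.
    """
    return {
        "type": "file",
        "connection": f"s3a://{bucket}",
        "config": {
            # Endpoint of the S3-compatible store (hypothetical value in the demo).
            "fs.s3a.endpoint": endpoint,
            # Upper bound on simultaneous HTTP connections in the S3A client pool.
            "fs.s3a.connection.maximum": str(max_connections),
        },
        "workspaces": {
            "root": {"location": "/", "writable": False, "defaultInputFormat": None}
        },
        "formats": {"parquet": {"type": "parquet"}},
    }


if __name__ == "__main__":
    # Placeholder endpoint and pool size, chosen only for illustration.
    plugin = s3_plugin_config("test-bucket", "ecs.example.com", 100)
    print(json.dumps(plugin, indent=2))
```

The printed JSON could be pasted into the storage plugin editor in the Drill
Web Console; the right pool size would have to be found by experiment.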
Thanks & Regards,
Navin
On Sat, 28 Mar 2020, 09:27 Paul Rogers, <par0328@yahoo.com> wrote:
> Hi Navin,
>
>
> You had mentioned your ECS solution in an earlier note. What are you using
> to access data in your container? Is your ECS container running HDFS? Or,
> do you have some other API?
>
>
> Do you have Drill running in a container on ECS, or is that where your data
> is located? It would be helpful if you could perhaps describe your setup in
> a bit more detail so we can offer suggestions about where to look for an
> issue.
>
>
> By the way: the query profile is often a good place to start. You'll find
> them in the Drill Web Console. Looking at each operator you can see how
> much memory was used and how long things took. Specifically, look at the
> time taken by the scan: is the slowness due to reading the data, or is some
> other part of the query taking the time?
>
>
> When you get the error, what is the stack trace? Is the error coming from
> some particular HDFS client? In some particular operation?
>
>
> Thanks,
>
> - Paul
>
>
>
> On Friday, March 27, 2020, 6:59:42 AM PDT, Navin Bhawsar <
> navin.bhawsar@gmail.com> wrote:
>
>
> Hi,
>
> We are facing a performance issue where an Apache Drill query on ECS times
> out with the error below: "ConnectionPoolTimeoutException: Timeout waiting
> for connection from pool"
>
> However, the same query works fine on single-node HDFS, with an execution
> time of 2.1 sec (planning = 0.483s).
>
> Parquet file size <1.5 GB
> Total parquet files scanned = 8( total 19 in directory)
> Apache drill version 1.17
> JDK 1.8.0_74
> Total rows returned from query =71000
>
> There are 2 Drillbits running in distributed mode.
> 13 GB default allocated per Drillbit.
>
> Any ideas why ECS performance is so bad compared with HDFS for Drill?
> Please advise if Drill provides options to optimize ECS querying.
>
> Please let me know if you need more details.
>
> Thanks & Regards,
> Navin
>