drill-user mailing list archives

From Kwizera hugues Teddy <nbted2...@gmail.com>
Subject Re: Multiple fragments in apache drill
Date Fri, 15 Feb 2019 15:15:45 GMT
Hi Kunal Khatua,

Thanks for your helpful response.

Kind regards,
Hugues

On Wed, Feb 13, 2019 at 8:22 PM Kunal Khatua <kunal@apache.org> wrote:

> Hi Hugues
>
> The number of fragments is determined by the number of sources (i.e.
> whether the data can be read in parallel) and the number of estimated rows.
> CSV and Parquet files are easy to read in parallel, but JSON files are
> not, because Drill does not know how many JSON documents exist in the file
> and where their offsets are.
>
> The estimated row count tells Drill whether to parallelize a major
> fragment of operators. You can try lowering the planner.slice_target
> option at the session or system level, for example from the /options
> page of the web UI.
>
> ~ Kunal
>
> On 2/13/2019 7:14:34 AM, Kwizera hugues Teddy <nbted2017@gmail.com> wrote:
> Hello Drill team,
>
> I'm executing a query on an Apache Drill cluster, but it produces only 1
> minor fragment. I have tried various queries, such as a union of 2 queries
> and aggregations, running them over millions of records, but each still
> produces only 1 fragment. Is there a configuration change I can make to
> get multiple fragments, so that they can be executed on each drillbit
> individually? How can I confirm whether the query is being executed on
> 1 drillbit instance or on multiple instances?
>
> - We are trying to compare Impala and Drill, but for the moment Impala is
> faster than Drill.
>
> - Environment:
>
> Drill on YARN, with 6 drillbits.
>
>
> Regards Hugues Teddy
>
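
As a rough sketch of the suggestion quoted above, planner.slice_target can
also be changed from any SQL client instead of the /options page. The value
10000 and the table path dfs.tmp.`events` are placeholders chosen for
illustration, not values from this thread; a useful threshold would be one
below the estimated row count of the query being tested.

```sql
-- Inspect the current setting before changing anything.
SELECT * FROM sys.options WHERE name = 'planner.slice_target';

-- Lower the threshold for the current session only.
-- 10000 is an illustrative value, not a recommendation.
ALTER SESSION SET `planner.slice_target` = 10000;

-- To apply the change cluster-wide instead, use ALTER SYSTEM:
-- ALTER SYSTEM SET `planner.slice_target` = 10000;

-- Re-check the plan afterwards. The table path is hypothetical.
EXPLAIN PLAN FOR
SELECT COUNT(*) FROM dfs.tmp.`events`;
```

In the EXPLAIN output, each operator is prefixed with its major fragment
number (00-, 01-, and so on). If exchange operators and more than one major
fragment number appear, the planner has split the work; the query profile
in the web UI should then list the minor fragments and the host each one
ran on.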
