drill-user mailing list archives

From Ted Dunning <tdunn...@apache.org>
Subject Re: Very slow parquet write performance due to single threaded write
Date Tue, 28 Sep 2021 07:53:12 GMT

Did you send this same message to a different list (possibly dev@drill)?

I remember answering it with some timing information, but I see that you don't have an answer here.

On 2021/09/23 15:08:08, Ashu Pachauri <ashu210890@gmail.com> wrote: 
> Hi,
> I have been trying to load a medium-sized CSV file (22 million rows and 20
> columns) into a Parquet table using Drill's CTAS statement.
> However, no matter what I try, the Parquet writer in the query plan has
> only one associated minor fragment and thus runs in a single thread. I
> have tried a simple query with/without ORDER BY and with/without PARTITION
> BY clauses without much success.
> Is this a limitation of Drill that, even in the presence of a PARTITION BY
> clause (and absence of any ORDER BY), the writes in CTAS are single
> threaded, or am I missing something?
> Thanks and Regards,
> Ashu Pachauri
