drill-user mailing list archives

From Ashu Pachauri <ashu210...@gmail.com>
Subject Very slow parquet write performance due to single threaded write
Date Thu, 23 Sep 2021 15:08:08 GMT

I have been trying to load a medium-sized CSV file (22 million rows and 20
columns) into a Parquet table using Drill's CTAS statement.

However, no matter what I try, the Parquet writer in the query plan has
only one associated minor fragment and thus runs in a single thread. I
have tried a simple query with and without an ORDER BY clause, and with and
without a PARTITION BY clause, without much success.

Is it a limitation of Drill that, even with a PARTITION BY clause (and no
ORDER BY), the writes in CTAS are single-threaded, or am I missing
something?

Thanks and Regards,
Ashu Pachauri
