drill-user mailing list archives

From Adam Gilmore <dragoncu...@gmail.com>
Subject Custom UDFS slow
Date Wed, 27 May 2015 02:26:15 GMT
Hi guys,

I have written a couple of custom UDFs (specifically WEEK() and WEEKYEAR(),
to extract the week number and week-year from timestamps).
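
For readers unfamiliar with what such functions compute, here is a minimal sketch of the underlying calculation in plain Java. It is not the UDF source from this message and it does not use Drill's UDF interfaces, so it says nothing about the slowdown itself. The class name WeekSketch, the method names, the use of ISO-8601 week numbering, and the UTC interpretation of the timestamp are all assumptions made for illustration.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.temporal.IsoFields;

// Hypothetical helper, not part of Drill: shows the date arithmetic that a
// WEEK()/WEEKYEAR() function would plausibly perform on an epoch-millis value.
final class WeekSketch {

    // Assumption: timestamps are epoch milliseconds interpreted in UTC.
    private static ZonedDateTime toUtc(long epochMillis) {
        return Instant.ofEpochMilli(epochMillis).atZone(ZoneOffset.UTC);
    }

    // ISO-8601 week number (1..53). Assumption: ISO weeks are what WEEK() means here.
    static int week(long epochMillis) {
        return toUtc(epochMillis).get(IsoFields.WEEK_OF_WEEK_BASED_YEAR);
    }

    // ISO-8601 week-based year, which can differ from the calendar year
    // for dates in the first or last few days of a year.
    static int weekYear(long epochMillis) {
        return toUtc(epochMillis).get(IsoFields.WEEK_BASED_YEAR);
    }

    public static void main(String[] args) {
        long ts = Instant.parse("2015-05-27T02:26:15Z").toEpochMilli();
        System.out.println(week(ts) + " " + weekYear(ts));
    }
}
```

Note that 2016-01-01 falls in ISO week 53 of week-year 2015, which is why a separate WEEKYEAR() function is useful alongside WEEK().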

I timed two queries (on approx. 11 million records in Parquet files):

select count(*) from `table` group by extract(day from `timestamp`)

750ms

select count(*) from `table` group by week(`timestamp`)

2100ms

The code for the WEEK() function is close to the code in the Drill source
for the EXTRACT(DAY) function.  Furthermore, even if I copy the exact code
for the EXTRACT(DAY) function into my UDF, it shows the same performance
penalty.

My question is, why would a UDF be so much slower?  Is this by design or is
there something I'm missing?

Happy to attach the source code of the function if that helps.
