calcite-dev mailing list archives

From Julian Hyde <jh...@apache.org>
Subject Re: Enumerable groupBy() take advantage of input collation?
Date Sat, 22 Aug 2015 04:10:07 GMT
Thanks!

> On Aug 21, 2015, at 4:01 PM, Li Yang <liyang@apache.org> wrote:
> 
> https://issues.apache.org/jira/browse/CALCITE-853
> 
> On Fri, Aug 21, 2015 at 2:20 PM, Julian Hyde <jhyde@apache.org> wrote:
> 
>> Yes, that would be useful. Please log a jira.
>> 
>> Enumerable.groupBy doesn't know its input's collation so can't make that
>> decision, but EnumerableAggregate does. I think that EnumerableAggregate
>> should have a "trigger key", a subset of its group key, and if the trigger
>> key changes it will emit and flush its hash table.
>> 
>> As well as for your use case, it will be useful for streaming queries.
>> 
>> Julian
>> 
>>> On Aug 20, 2015, at 2:35 AM, Li Yang <liyang@apache.org> wrote:
>>> 
>>> I hit an out-of-memory exception when a huge result set was passed into
>>> EnumerableAggregate and aggregated in memory. I'm thinking that if the
>>> input is sorted by the group-by key, then groupBy() doesn't have to hold
>>> all the data in memory any more.
>>> 
>>> So does the Enumerable groupBy() take advantage of input collation
>>> currently?  Should I open a JIRA for it?
>>> 
>>> 
>>> Cheers
>>> Yang
>> 
>> 
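The "trigger key" idea described in the thread can be illustrated with a minimal sketch in plain Java. This is not Calcite's EnumerableAggregate or Enumerable.groupBy; the class name, method name, and parameters below are hypothetical and exist only for illustration. The sketch assumes the input is already sorted (or at least clustered) by the trigger key, and that the trigger key is derived from a subset of the group key, so a group can never span two trigger-key runs. Under those assumptions, the hash table can be emitted and cleared each time the trigger key changes, so it only ever holds the groups of one run.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;
import java.util.function.BiConsumer;
import java.util.function.Function;
import java.util.function.ToLongFunction;

/** Hypothetical sketch of trigger-key aggregation; not part of Calcite. */
public class TriggerKeyAggregateSketch {

  /**
   * Sums a value per group key, flushing the hash table whenever the
   * trigger key changes.
   *
   * Assumptions: the input is sorted or clustered by the trigger key, and
   * rows with equal group keys always have equal trigger keys.
   *
   * @return the peak number of groups held in memory at once
   */
  public static <R, T, K> int sumByGroup(
      Iterable<R> input,
      Function<R, T> triggerKey,
      Function<R, K> groupKey,
      ToLongFunction<R> value,
      BiConsumer<K, Long> sink) {
    Map<K, Long> table = new LinkedHashMap<>();
    T currentTrigger = null;
    boolean first = true;
    int peak = 0;
    for (R row : input) {
      T trigger = triggerKey.apply(row);
      if (!first && !Objects.equals(trigger, currentTrigger)) {
        // Trigger key changed: no later row can belong to these groups,
        // so emit them and release the memory.
        table.forEach(sink);
        table.clear();
      }
      currentTrigger = trigger;
      first = false;
      // Accumulate the running sum for this row's group.
      table.merge(groupKey.apply(row), value.applyAsLong(row), Long::sum);
      peak = Math.max(peak, table.size());
    }
    // Emit the groups of the final trigger-key run.
    table.forEach(sink);
    return peak;
  }
}
```

For example, with rows sorted by region and grouped by (region, city), passing the region as the trigger key means the table holds only one region's cities at a time. If the input is not actually clustered by the trigger key, this sketch would emit the same group more than once, so a real implementation would need the planner to guarantee the collation.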

