asterixdb-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Chen Luo <cl...@uci.edu>
Subject Re: Adapting TimSort into AsterixDB/Hyracks
Date Sat, 28 Oct 2017 20:50:13 GMT
That can be done in Hyracks layer, but enabling Quicksort in AsterixDB has
other issues. Right now AsterixDB only uses merge sort because of its
stableness. Once switching to Quicksort, we have to make sure all the other
operators are happy with this change (some operators are OK with it, but
some are not).

On Sat, Oct 28, 2017 at 12:05 PM, Mike Carey <dtabass@gmail.com> wrote:

> So the word on the web seems maybe to be that Quicksort is generally
> superior and cache-friendly (cache oblivious). Wondering if we should just
> get our Quicksort code under control?
>
>
>
> On 10/28/17 11:36 AM, Chen Luo wrote:
>
>> Not exactly sure about this right now... But since TimSort is essentially
>> a
>> combination of insertion sort and merge sort, it's cache-friendliness
>> won't
>> be worse than our merge sort.
>>
>> This TimSort could be served as a short-term plugin-and-play improvement
>> of
>> our sorting algorithm. It is still stable, the same as our current merge
>> sort, but faster, especially on partially ordered dataset.
>>
>> Best regards,
>> Chen Luo
>>
>> On Sat, Oct 28, 2017 at 10:58 AM, Mike Carey <dtabass@gmail.com> wrote:
>>
>> How is it on cache-friendliness?
>>>
>>>
>>>
>>> On 10/27/17 11:38 PM, abdullah alamoudi wrote:
>>>
>>> While I have no answer to the question of legality, this sounds great.
>>>>
>>>> ~Abdullah.
>>>>
>>>> On Oct 27, 2017, at 9:20 PM, Chen Luo <cluo8@uci.edu> wrote:
>>>>
>>>>> Hi devs,
>>>>>
>>>>> I have adapted the TimSort algorithm used in JDK (java.util.TimSort)
>>>>> into
>>>>> Hyracks, which gives 10-20% performance improvements on random data.
It
>>>>> will be more useful if the input data is partially sorted, e.g.,
>>>>> primary
>>>>> keys fetched from secondary index scan, which I haven't got time to
>>>>> experiment with.
>>>>>
>>>>> *Before going any further, is it legal to adapt some algorithm
>>>>> implementation from JDK into our codebase? *I saw the JDK
>>>>> implementation
>>>>> itself is adopted from
>>>>> http://svn.python.org/projects/python/trunk/Objects/listsort.txt as
>>>>> well.
>>>>>
>>>>> Best regards,
>>>>> Chen Luo
>>>>>
>>>>>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message