asterixdb-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Mike Carey <dtab...@gmail.com>
Subject Re: Adapting TimSort into AsterixDB/Hyracks
Date Sat, 28 Oct 2017 19:05:39 GMT
So the word on the web seems maybe to be that Quicksort is generally 
superior and cache-friendly (cache oblivious). Wondering if we should 
just get our Quicksort code under control?


On 10/28/17 11:36 AM, Chen Luo wrote:
> Not exactly sure about this right now... But since TimSort is essentially a
> combination of insertion sort and merge sort, it's cache-friendliness won't
> be worse than our merge sort.
>
> This TimSort could be served as a short-term plugin-and-play improvement of
> our sorting algorithm. It is still stable, the same as our current merge
> sort, but faster, especially on partially ordered dataset.
>
> Best regards,
> Chen Luo
>
> On Sat, Oct 28, 2017 at 10:58 AM, Mike Carey <dtabass@gmail.com> wrote:
>
>> How is it on cache-friendliness?
>>
>>
>>
>> On 10/27/17 11:38 PM, abdullah alamoudi wrote:
>>
>>> While I have no answer to the question of legality, this sounds great.
>>>
>>> ~Abdullah.
>>>
>>> On Oct 27, 2017, at 9:20 PM, Chen Luo <cluo8@uci.edu> wrote:
>>>> Hi devs,
>>>>
>>>> I have adapted the TimSort algorithm used in JDK (java.util.TimSort) into
>>>> Hyracks, which gives 10-20% performance improvements on random data. It
>>>> will be more useful if the input data is partially sorted, e.g., primary
>>>> keys fetched from secondary index scan, which I haven't got time to
>>>> experiment with.
>>>>
>>>> *Before going any further, is it legal to adapt some algorithm
>>>> implementation from JDK into our codebase? *I saw the JDK implementation
>>>> itself is adopted from
>>>> http://svn.python.org/projects/python/trunk/Objects/listsort.txt as
>>>> well.
>>>>
>>>> Best regards,
>>>> Chen Luo
>>>>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message