lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "David Smiley (JIRA)" <>
Subject [jira] [Commented] (LUCENE-7299) BytesRefHash.sort() should use radix sort?
Date Wed, 25 May 2016 17:31:13 GMT


David Smiley commented on LUCENE-7299:

In the patch you committed, I noticed the inner subclass implementation of IntroSorter has
a swap() that simply calls StringMSBRadixSorter.this.swap(). That got me thinking, couldn't
compare() do likewise?  Just a minor point of course.

An observation here:  StringMSBRadixSorter is a radix sort that falls back on IntroSorter
which is in turn a quick sort that falls back on either insertionSort or heapSort.  Wow...
maybe we need to work in some more  :-)  Or maybe reconsider if fewer is better or when we
switch.  For example StringMSBRadixSorter switches to IntroSort to cap the recursion levels
(currently at 8), and for other reason too granted, but IntroSort uses recursion _too_.  The
Sorter.THRESHOLD constant confused me temporarily; perhaps "INSERTION_SORT_THRESHOLD" would
be less ambiguous in a class hierarchy of sorters with various thresholds.

> BytesRefHash.sort() should use radix sort?
> ------------------------------------------
>                 Key: LUCENE-7299
>                 URL:
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Assignee: Adrien Grand
>            Priority: Minor
>             Fix For: 6.1, master (7.0)
>         Attachments:, LUCENE-7299.patch, LUCENE-7299.patch
> Switching DocIdSetBuilder to radix sort helped make things significantly faster. We should
be able to do the same with BytesRefHash.sort()?

This message was sent by Atlassian JIRA

To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message