commons-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Gilles <>
Subject Re: [Math] MATH-1129
Date Wed, 18 Jun 2014 12:32:31 GMT
Hello Luc.

>> See
>> The problem reported was due to the sorting procedure not behaving
>> correctly in the presence of NaN.
>> This procedure could be replaced by an equivalent method from the 
>> JDK:
>>   java.util.Arrays.sort(double[],int,int)
>> Any objection?
> If you imply removing the select, medianOf3 and partition methods, 
> yes I
> would object. If you imply replacing only the insertionSort method 
> used
> for small sub-arrays, then I almost agree.

Issue 1129 concerns the latter: See the comment in "insertionSort" in 
current code.

However, for the former, you should really have a look at MATH-1120
The proposed patch indeed touches those methods.

> However, I'm not sure this would be sufficient as the sort is done
> partially in all the methods above. The former ones are used to split
> the big array into smaller ones and sorting only the sub-arrays that 
> are
> needed to compute the percentile (it is really a selection algorithm,
> not a full-sort algorithm). So I guess we should look at all the 
> methods
> at once to ensure proper handling of NaNs. The trick is that all
> comparisons involving NaN return false (<, <=, >, >=, ==, !=),
> regardless of the NaN being at the left hand side or right hand side 
> of
> the comparison (so we get the funny consequence that if a is a NaN, 
> the
> test a == a evaluates to false ...).

As of r1603217, "insertionSort" seems to behave correctly (at least on 
example reported in MATH-1129).

> The fast an insertion sort is used at the end is because it is very
> simply and efficient for small arrays, which is exactly what happens
> here: the array part has MIN_SELECT_SIZE elements or less, i.e. 15
> elements or less.

That was my question, implicitly: is it worth not using the JDK, or 
is "insertionSort" so much faster than "java.util.Arrays.sort" that it 
worth maintaining a custom implementation?

> As I wrote this selection algorithm, I could do the check if you 
> want.

What check?


To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message