[Numpy-discussion] suggested change of behavior for interp
Julian Taylor
jtaylor.debian at googlemail.com
Wed Jun 5 13:08:41 EDT 2013
On 05.06.2013 16:33, Nathaniel Smith wrote:
> On Wed, Jun 5, 2013 at 3:16 PM, Slavin, Jonathan
> <jslavin at cfa.harvard.edu> wrote:
>> The simplest monotonicity test that I've seen is:
>>
>> dx = np.diff(x)
>> monotonic = np.all(dx < 0.) or np.all(dx > 0.)
>>
>> I expect that this is pretty fast, though I haven't tested it yet. If we
>> want to make checking optional, then I think the default should be to check
>> with the option to skip the check.
>
> The slow down people are worried about is, suppose that 'xp' has
> 1,000,000 entries, and the user wants to interpolate 1 point. If we
> can assume the array is sorted, then we can find which bin the 1 point
> falls into by using binary search (np.searchsorted), which will
> require examining ~log2(1,000,000) = 20 entries in the array. Checking
> that it's sorted, though, will require examining all 1,000,000 points
> -- it turns an O(log n) operation into an O(n) operation. And if this
> is being called as part of some larger algorithm, this effect can
> cascade -- if it gets called m times from a loop, now that's O(mn)
> instead of (m log n), etc. That's often the difference between
> tractable and intractable.
>
> If you submit a PR that gives interp the option to check for
> monotonicity, then we'll almost certainly merge it :-). If you also
> make it the default then there might be some debate, though I'd argue
> for it...
I would not like it when the performance goes from log to linear by default.
It is documented that the coordinates must be increasing after all.
How about as a compromise the function checks one or two closest
neighbors around the interpolation point and warns/errors if those are
not monotonic.
Its not fool proof but should at least catch the very wrong cases with
an acceptable performance loss.
More information about the NumPy-Discussion
mailing list