lucene-dev mailing list archives

From "Nicholas Knize (JIRA)" <>
Subject [jira] [Commented] (LUCENE-7179) GeoPoint and LatLonPoint test data should quantize once
Date Tue, 05 Apr 2016 17:21:25 GMT


Nicholas Knize commented on LUCENE-7179:

I don't disagree with the "pain" points. But you have to remember that {{GeoPointField}} works
by way of a quad tree represented in unsigned long space. This isn't "quantization" for memory/disk
purposes; it's a dimensionality reduction technique. {{GeoPointTermsEnum}} relations simply
reduce to a bunch of prefix masking and bit operations. The fact that the space-filling curve
is represented as a 64-bit long is only for bit operation simplicity. I could change it to
a bigger bit space and make it closer to lossless; it just makes the enum code hairier.
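
A rough sketch of the idea, not the actual {{GeoPointField}} code: each coordinate is scaled onto a 32-bit unsigned range, the two values are bit-interleaved into one 64-bit long (a Morton/Z-order key), and a quad-tree cell is then just a bit prefix of that key. The class and method names ({{MortonSketch}}, {{quantize}}, {{spread}}, {{cellPrefix}}) and the choice of putting longitude on the even bits are assumptions made for illustration.

```java
class MortonSketch {
    // Hypothetical helper: map value in [min, max] onto the 32-bit unsigned range, rounding down.
    static long quantize(double value, double min, double max) {
        long q = (long) Math.floor((value - min) / (max - min) * 4294967296.0); // 2^32 cells
        return Math.min(q, 0xFFFFFFFFL); // clamp value == max into the last cell
    }

    // Spread the low 32 bits of v onto the even bit positions of a long.
    static long spread(long v) {
        v &= 0xFFFFFFFFL;
        v = (v | (v << 16)) & 0x0000FFFF0000FFFFL;
        v = (v | (v << 8))  & 0x00FF00FF00FF00FFL;
        v = (v | (v << 4))  & 0x0F0F0F0F0F0F0F0FL;
        v = (v | (v << 2))  & 0x3333333333333333L;
        v = (v | (v << 1))  & 0x5555555555555555L;
        return v;
    }

    // Interleave: longitude on even bits, latitude on odd bits (assumed layout).
    static long encode(double lat, double lon) {
        return spread(quantize(lon, -180.0, 180.0))
             | (spread(quantize(lat, -90.0, 90.0)) << 1);
    }

    // Keep only the top `bits` bits of the key: the enclosing quad-tree cell at depth bits / 2.
    static long cellPrefix(long morton, int bits) {
        return bits == 0 ? 0L : morton & (-1L << (64 - bits));
    }

    // Two points fall in the same cell at a given depth when their masked prefixes match.
    static boolean sameCell(long a, long b, int bits) {
        return cellPrefix(a, bits) == cellPrefix(b, bits);
    }

    public static void main(String[] args) {
        long a = encode(10.0, 10.0);
        long b = encode(20.0, 20.0);
        System.out.println(Long.toHexString(a) + " " + Long.toHexString(b));
        System.out.println("same top-level quadrant: " + sameCell(a, b, 2));
    }
}
```

Under these assumptions, widening the key beyond 64 bits would shrink the cells, but every mask and shift above would need a multi-word equivalent.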

bq. I tried to port LatLonPoints "rounds down" test and it fails

If you're referring to {{TestEncodingUtils.testEncodeDecodeRoundsDown}}, it passes fine with
the LUCENE-7164 64-bit space change. It won't pass if you change the GeoPoint encoding to
use {{Math.round}}. But again... all of these inconsistencies are occurring within the expected,
accepted TOLERANCE, so they shouldn't be a surprise. It's the same as casting a double to a
float and back and expecting numerical stability.
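
To make the two points concrete, here is a minimal sketch that does not reproduce the Lucene test or encoding. It uses a hypothetical grid with a caller-supplied cell size (the names {{quantizeFloor}}, {{quantizeRound}} and {{floatRoundTrip}} are made up for illustration) to show why a "rounds down" check holds with {{Math.floor}} but can fail with {{Math.round}}, and that a double-to-float-to-double round trip does not return the original value.

```java
class RoundTripSketch {
    // Snap v down to the nearest multiple of cell: the result never exceeds v.
    static double quantizeFloor(double v, double cell) {
        return Math.floor(v / cell) * cell;
    }

    // Snap v to the nearest multiple of cell: the result may land above v.
    static double quantizeRound(double v, double cell) {
        return Math.round(v / cell) * cell;
    }

    // Narrow to float and widen back; precision lost in the narrowing is not recovered.
    static double floatRoundTrip(double v) {
        return (double) (float) v;
    }

    public static void main(String[] args) {
        double v = 1.2;
        double cell = 0.25; // power of two, so the grid arithmetic itself is exact
        System.out.println("floor: " + quantizeFloor(v, cell));
        System.out.println("round: " + quantizeRound(v, cell));
        System.out.println("float round trip differs: " + (floatRoundTrip(0.1) != 0.1));
    }
}
```

In this toy grid the floored value sits at or below the input while the rounded one can overshoot it, and both stay within one cell width of the input, which plays the role of the tolerance here.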

bq. its buggy for some double values

?? Not sure I follow. It's not lossless, if that's what you mean? But that's also a known limitation
of using 32-bit unsigned space.
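
For a sense of scale, the following sketch computes the cell width under the assumption that each dimension spreads its full range evenly over 2^32 cells; the actual bit budget and scaling depend on the encoding in use, and {{ResolutionSketch}} is a made-up name. Under that assumption a longitude cell is roughly 8.4e-8 degrees wide and a latitude cell roughly 4.2e-8 degrees, so any two doubles closer together than that can collapse to the same encoded value.

```java
class ResolutionSketch {
    static final double CELLS = 4294967296.0; // 2^32 cells per dimension (assumed)

    // Width of one cell, in degrees, for a dimension spanning rangeDegrees.
    static double cellSizeDegrees(double rangeDegrees) {
        return rangeDegrees / CELLS;
    }

    public static void main(String[] args) {
        System.out.println("lon cell (deg): " + cellSizeDegrees(360.0));
        System.out.println("lat cell (deg): " + cellSizeDegrees(180.0));
    }
}
```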

bq. Otherwise I don't think we should do quantization!

It's needed for GeoPointField. But if we don't want to handle the dimensional reduction limitations,
we can remove this approach altogether, noting that we haven't even begun to tap into its
optimization potential.

> GeoPoint and LatLonPoint test data should quantize once
> -------------------------------------------------------
>                 Key: LUCENE-7179
>                 URL:
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Nicholas Knize
>         Attachments: LUCENE-7179.patch
> {{LatLonPoint}} and {{GeoPointField}} tests pre-quantize test data to ensure consistency
with indexed (encoded) data. The pre-quantized data then becomes indexed, undergoing another
quantization. To guarantee numerical stability, this should be changed such that the test data
is quantized after indexing.
