lucene-dev mailing list archives

From "Nicholas Knize (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (LUCENE-6712) GeoPointField should cut over to DocValues for boundary filtering
Date Sat, 01 Aug 2015 20:03:04 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-6712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14650516#comment-14650516 ]

Nicholas Knize edited comment on LUCENE-6712 at 8/1/15 8:02 PM:
----------------------------------------------------------------

Initial patch with the following changes:

* adds a GeoPointTermQueryConstantScoreWrapper for cutting over to doc values on boundary terms
* adds a GeoPointQueryPostFilter interface for Query specific filtering
* increases precision_step from 6 to 8 (8 terms per point instead of 11)
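
As a rough check of the term counts quoted in the last bullet, the arithmetic can be sketched as follows. This assumes the point is encoded as a single 64-bit value and that one term is indexed per precision_step bits of shift; terms_per_point is a hypothetical helper for illustration, not part of Lucene.

```python
def terms_per_point(precision_step: int, value_bits: int = 64) -> int:
    """Terms indexed for one encoded value (illustrative assumption:
    one term for every precision_step bits, rounded up)."""
    if precision_step <= 0:
        raise ValueError("precision_step must be positive")
    # Ceiling division using integers only.
    return -(-value_bits // precision_step)


for step in (6, 8):
    print(f"precision_step={step}: {terms_per_point(step)} terms per point")
```

Under these assumptions a step of 6 gives 11 terms and a step of 8 gives 8 terms, which matches the figures in the bullet above.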

Initial *rough* performance benchmarks indicate a ~20% boost in query performance and a ~42% smaller index (these numbers still need some due diligence).


was (Author: nknize):
Initial patch with the following changes:

* adds a GeoPointTermQueryConstantScoreWrapper for cutting over to doc values on boundary terms
* adds a GeoPointQueryPostFilter interface for Query specific filtering
* increases precision_step from 6 to 8 (8 terms per point instead of 11)

Initial performance benchmarks indicate ~20% boost in query performance and ~42% smaller index.


> GeoPointField should cut over to DocValues for boundary filtering
> -----------------------------------------------------------------
>
>                 Key: LUCENE-6712
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6712
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Nicholas Knize
>         Attachments: LUCENE-6712.patch
>
>
> Currently GeoPointField queries only use the Terms Dictionary for ranges that fall within
> and on the boundary of the query shape. For boundary ranges the full-precision terms are
> iterated; for within ranges the postings list is used.
> Instead of iterating full-precision terms for boundary ranges, this enhancement cuts
> over to DocValues for post-filtering boundary terms. This allows us to increase precisionStep
> for GeoPointField, thereby reducing the number of terms and the size of the index. This enhancement
> should also provide a boost in query performance, since visiting more docs and fewer terms
> should be more efficient than visiting fewer docs and more terms.
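
The approach described in the issue can be sketched as a toy two-phase match. This is only an illustration under stated assumptions, not the patch's implementation: postings are plain lists of doc IDs, doc values are a dict mapping doc ID to (lon, lat), the query shape is a bounding box, and all names (collect_hits, in_bbox) are hypothetical.

```python
from typing import Callable, Dict, Iterable, Set, Tuple

Point = Tuple[float, float]  # (lon, lat)


def in_bbox(min_lon: float, min_lat: float,
            max_lon: float, max_lat: float) -> Callable[[Point], bool]:
    """Build a post-filter that tests a point against a bounding box."""
    def check(point: Point) -> bool:
        lon, lat = point
        return min_lon <= lon <= max_lon and min_lat <= lat <= max_lat
    return check


def collect_hits(within_postings: Iterable[Iterable[int]],
                 boundary_postings: Iterable[Iterable[int]],
                 doc_values: Dict[int, Point],
                 in_shape: Callable[[Point], bool]) -> Set[int]:
    """Accept docs from 'within' ranges directly; post-filter docs from
    'boundary' ranges using their full-precision stored value."""
    hits: Set[int] = set()
    for docs in within_postings:
        hits.update(docs)  # range lies fully inside the shape: no per-doc check
    for docs in boundary_postings:
        for doc in docs:
            if in_shape(doc_values[doc]):  # per-doc check against the shape
                hits.add(doc)
    return hits


doc_values = {0: (0.0, 0.0), 1: (5.0, 5.0), 2: (9.5, 9.9), 3: (10.5, 9.0)}
hits = collect_hits([[0, 1]], [[2, 3]], doc_values,
                    in_bbox(-10.0, -10.0, 10.0, 10.0))
print(sorted(hits))  # [0, 1, 2]
```

In this sketch the coarser boundary range pulls in doc 3 as a candidate, and the per-document check rejects it. That is the trade described in the issue: more docs visited, fewer terms iterated.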



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

