lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "David Smiley (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-8126) Spatial prefix tree based on S2 geometry
Date Wed, 10 Jan 2018 20:03:00 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-8126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16320976#comment-16320976
] 

David Smiley commented on LUCENE-8126:
--------------------------------------

_(an aside: I wish the JIRA GitHub integration didn't put so much code context around the
feedback text!)_

It's nice to see a new RPT SpatialPrefixTree implementation :-)  The API is a little crusty;
perhaps sometime we could kick around some ideas to make it nicer.

It'll be interesting to see how well this performs.  This appears to be a 6-ary tree, as opposed
to 4-ary (quad) or 32-ary (geohash).  One could build a variable arity prefixTree by the way
(i.e. first level has 256, next 128, etc.), and I recently tweaked one of ours to do that
(not contributed back yet).

For point data, the higher the arity, the smaller the index but slower search as it must scan
more.

For non-point data, it's not clear since distErrPct caps the resolution of a shape relative
to its size, and I believe (though not 100% sure) that it yields a roughly normal distribution
around a certain number of cells (given fixed distErrPct, random shape type & size, near
equator, random tree arity).  It'd be neat to empirically validate my theory.  If I'm right,
then the optimal arity is probably 4 for non-point shapes, and we have two of those implementations.
RE "near equator" above, see LUCENE-5056 though it has an easy fix in my last comment.

Given the way S2 divides a world into 6 sides recursively, it seems it would place shapes
at a balanced depth in the tree no matter where in the world the data is.  That's a nice benefit...
making the cell depth for a shape a bit more shallow than the probable depth in the other
tree implementations (assuming a target precision for a given shape).  That's a bonus.

CC [~nknize] you may find this issue interesting

> Spatial prefix tree based on S2 geometry
> ----------------------------------------
>
>                 Key: LUCENE-8126
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8126
>             Project: Lucene - Core
>          Issue Type: New Feature
>          Components: modules/spatial-extras
>            Reporter: Ignacio Vera
>
> Hi [~dsmiley],
> I have been working on a prefix tree based on goggle S2 geometry (https://s2geometry.io/)
to be used mainly with Geo3d shapes with very promising results, in particular for complex
shapes (e.g polygons). Using this pixelization scheme reduces the size of the index, improves
the performance of the queries and reduces the loading time for non-point shapes. 
> If you are ok with this contribution and before providing any code I would like to understand
what is the correct/prefered approach:
> 1) Add new depency to the S2 library (https://mvnrepository.com/artifact/io.sgr/s2-geometry-library-java).
It has Apache 2.0 license so it should be ok.
> 2) Create a utility class with all methods necessary to navigate the S2 tree and create
shapes from S2 cells (basically port what we need from the library into Lucene).
> What do you think?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Mime
View raw message