lucene-java-user mailing list archives

From <Ankit.Mura...@ril.com>
Subject RE: Lucene Spatial Implementation for Points within Polygon.
Date Wed, 24 Dec 2014 10:46:11 GMT
Thanks for the suggestions, David.
However, I am in a fix. Although I am using JTS for both indexing and searching, I am still
getting very few hits. I am quite sure that the indexed points fall inside a lot of the polygons,
but the hits do not reflect that.

Out of approx. 8 lakh (800,000) polygons, I get hits for 4.5 lakh (450,000). For the remaining
3.5 lakh (350,000) I do not get any hits. A small snippet of the code follows. Please suggest.

I am indexing points as WKT Shape using the following Code.

JtsSpatialContext spatialContext=JtsSpatialContext.GEO;
SpatialPrefixTree grid=new GeohashPrefixTree(spatialContext,22);
spatialStrategy=new RecursivePrefixTreeStrategy(grid,"position");

Shape point = spatialContext.readShape("POINT("+lat+" "+lon+")");
doc.add(new StoredField("FieldName",value));
for(IndexableField f: spatialStrategy.createIndexableFields(point))
{
doc.add(f);
}

doc.add(new StoredField(spatialStrategy.getFieldName(),lat+";"+lon+";"+value));

indexWriter.addDocument(doc);
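
One thing worth checking in the snippet above: WKT lists x before y, so "POINT(a b)" is read as longitude a, latitude b. Below is a minimal sketch of a helper that writes the coordinates in that order. The class and method names are made up for illustration, and whether the axis order explains the missing hits in this case is not established.

```java
public final class WktPoint {
    private WktPoint() {}

    /** Builds "POINT(x y)" text with longitude first (x) and latitude second (y). */
    public static String pointWkt(double lat, double lon) {
        // Reject values that cannot be a valid geographic coordinate.
        if (lat < -90 || lat > 90) {
            throw new IllegalArgumentException("latitude out of range: " + lat);
        }
        if (lon < -180 || lon > 180) {
            throw new IllegalArgumentException("longitude out of range: " + lon);
        }
        return "POINT(" + lon + " " + lat + ")";
    }

    public static void main(String[] args) {
        // Latitude 19.076, longitude 72.8777: the longitude is written first.
        System.out.println(pointWkt(19.076, 72.8777));
    }
}
```

The resulting string could then be passed to spatialContext.readShape(...) in place of the hand-built one.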


For Searching, since I have polygons, I am using the following code:

JtsSpatialContext spatialContext=JtsSpatialContext.GEO;
SpatialPrefixTree grid=new GeohashPrefixTree(spatialContext,22);
spatialStrategy=new RecursivePrefixTreeStrategy(grid,"position");


I use a StringBuffer (called polygonWkt below) to create polygons like this:

POLYGON((Lat Long,Lat Long pairs))

SpatialArgs args=new SpatialArgs(SpatialOperation.Intersects,spatialContext.readShape(polygonWkt.toString()));
ConstantScoreQuery csq=new ConstantScoreQuery(spatialStrategy.makeQuery(args));


TopDocs docs=indexSearcher.search(csq,100000);

if(docs.totalHits>0)
{
// process the matching documents
}
else
{
System.out.println("NO DATA FOUND");
}
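
The polygon text can be assembled along the following lines. This is only a sketch: the class name, method name and the lats/lons arrays are hypothetical, it uses a StringBuilder rather than a StringBuffer, it writes each vertex as "lon lat" (x before y, as WKT expects), and it repeats the first vertex at the end because a WKT polygon ring has to be closed.

```java
public final class WktPolygon {
    private WktPolygon() {}

    /** Builds a single-ring "POLYGON((x y, ...))" string from parallel lat/lon arrays. */
    public static String polygonWkt(double[] lats, double[] lons) {
        if (lats.length != lons.length || lats.length < 3) {
            throw new IllegalArgumentException("need at least 3 lat/lon pairs of equal count");
        }
        StringBuilder sb = new StringBuilder("POLYGON((");
        for (int i = 0; i < lats.length; i++) {
            if (i > 0) {
                sb.append(", ");
            }
            // WKT order is x y, i.e. longitude then latitude.
            sb.append(lons[i]).append(' ').append(lats[i]);
        }
        // Close the ring by repeating the first vertex if the caller did not.
        int last = lats.length - 1;
        boolean closed = lats[0] == lats[last] && lons[0] == lons[last];
        if (!closed) {
            sb.append(", ").append(lons[0]).append(' ').append(lats[0]);
        }
        return sb.append("))").toString();
    }
}
```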

The problem is that for a large share of the polygons (approx. 50%) I am getting NO DATA FOUND,
indicating no hits. I am pretty sure that there are indexed Lat/Long pairs which fall within
the supplied polygon, but I am unable to get all the hits.

Please help me identify where I am going wrong. For every invalid polygon (boundaries
intersecting, incomplete) I print the exception and exclude that polygon.
This is not the worry.

The worry is that I am getting very few polygons with hits, when many more actually have points
inside them.

Please correct me where I am going wrong.


-----Original Message-----
From: david.w.smiley@gmail.com [mailto:david.w.smiley@gmail.com] 
Sent: 22 December 2014 19:19
To: java-user@lucene.apache.org
Subject: Re: Lucene Spatial Implementation for Points within Polygon.

Hello.

You have stated the use-case so generically that it’s not clear if you should index the
polygon set and query by the point set, or the reverse.
Generally, you should index the set that is known in advance and then query by the other,
the set that is generally not known.  Assuming this is the case, index the stable set with
RecursivePrefixTreeStrategy, *and*, for accuracy, if that set is also the polygon set, either
use SerializedDVStrategy *or* simply keep them all in memory keyed by an identifier (call
JtsGeometry.index() on each as well) that you check against at runtime.  If you don’t have
enough RAM then you’ll do the former.  If neither set seems to be “stable”, you could
really index either, but I would definitely choose to index the points.  The predicate you should use
is INTERSECTS; the others are intended for polygon against polygons (basically any non-point
shape against another non-point shape).

If your scenario is quite simple (you have a bunch of points and polygons that you get all at
once to make this computation, and then that’s it, with no long-term need to query again by the
same polygons or points in the future), I suggest using JTS directly in memory, with its
PreparedGeometry to optimize each polygon, then iterating through your points to see which
polygons they are in.  You might even use JTS's STRtree to index polygon bounding boxes to
avoid looping over all polygons.
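
A rough sketch of the in-memory approach is below. To keep it runnable with only the JDK it swaps in java.awt.geom.Path2D for JTS's PreparedGeometry and a plain bounding-box check for the STRtree, so it is planar (no geodesic handling) and still loops over every polygon. All class and method names are hypothetical.

```java
import java.awt.geom.Path2D;
import java.awt.geom.Rectangle2D;

public final class PointsInPolygons {
    private PointsInPolygons() {}

    /** Builds a closed planar path from a ring given as {x, y} pairs (x = lon, y = lat). */
    public static Path2D.Double toPath(double[][] ring) {
        Path2D.Double path = new Path2D.Double();
        path.moveTo(ring[0][0], ring[0][1]);
        for (int i = 1; i < ring.length; i++) {
            path.lineTo(ring[i][0], ring[i][1]);
        }
        path.closePath();
        return path;
    }

    /** Returns, for each polygon, how many of the given points fall inside it. */
    public static int[] countPointsPerPolygon(double[][][] polygons, double[][] points) {
        int[] counts = new int[polygons.length];
        for (int p = 0; p < polygons.length; p++) {
            Path2D.Double path = toPath(polygons[p]);
            // Cheap bounding-box rejection before the exact containment test.
            Rectangle2D box = path.getBounds2D();
            for (double[] pt : points) {
                if (box.contains(pt[0], pt[1]) && path.contains(pt[0], pt[1])) {
                    counts[p]++;
                }
            }
        }
        return counts;
    }
}
```

With JTS, the exact test would go through a PreparedGeometry per polygon, and an STRtree over the polygon envelopes would replace the outer loop.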

~ David Smiley
Freelance Apache Lucene/Solr Search Consultant/Developer http://www.linkedin.com/in/davidwsmiley

On Mon, Dec 22, 2014 at 12:30 AM, <Ankit.Murarka@ril.com> wrote:
>
> Hello Team,
>
> We are starting off with Lucene Spatial implementation for some of the
> use cases:
>
> A . Given "N" polygons and "M" points, find how many points lie inside 
> each of the polygon.
>
> 1st Approach :
>
> For A, we indexed Polygons using WKT and using JtsSpatial strategy. I 
> set the Level at 22 . This has resulted in huge number of terms. This 
> was needed as I need the search to be near perfect.
>
> For Indexing, I used Point(Supplied as WKT) using Jts again with Level
> at 22 (Although I think specifying level at query time does not make
> much difference).
>
> For this, we used "CONTAINS". Output is coming but I am not sure if
> I am doing it the right way. Need suggestion.
>
> I am having following confusion:
>
> a.       Will CONTAINS and IS WITHIN both work in the same way for the
> given scenario. I am ruling OUT INTERSECTS as that scenario is not 
> appropriate.
>
> b.      Second, are we missing something  in getting the correct output.
>
>
> 2nd Approach : (Reversed)
>
> Indexed POINTS in WKT format.
> Passed Polygons in WKT using JTs as query and fired as INTERSECTS and 
> WITHIN.
>
> In second approach, we are getting more output than the 1st approach.
>
> However, we are still not sure which is the best way to tackle this 
> problem. Please suggest.
>
> "Confidentiality Warning: This message and any attachments are 
> intended only for the use of the intended recipient(s).
> are confidential and may be privileged. If you are not the intended 
> recipient. you are hereby notified that any review. re-transmission. 
> conversion to hard copy. copying. circulation or other use of this 
> message and any attachments is strictly prohibited. If you are not the 
> intended recipient. please notify the sender immediately by return 
> email.
> and delete this message and any attachments from your system.
>
> Virus Warning: Although the company has taken reasonable precautions 
> to ensure no viruses are present in this email.
> The company cannot accept responsibility for any loss or damage 
> arising from the use of this email or attachment."
>