lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Grant Ingersoll <>
Subject Re: get term neighbours
Date Tue, 05 May 2009 14:20:16 GMT
There isn't a very clean way to do this just yet, but it is doable.   
Index with positions (you might find offsets useful too) and then use  
the TermVectorMapper and TermVector API call on the IndexReader (not  
the termPositions).  Then, you will need to implement a  
TermVectorMapper that takes in your position and then reads in the  
term vector and gets just those positions around the interested  
position.  Once you are outside of your window, you can then short  
circuit out of the TermVM (I think).


On May 3, 2009, at 2:39 PM, Adrian Dimulescu wrote:

> Hello,
> I am post-processing a positional index -- with a field like the  
> following:
> doc.add(new Field(Constants.FIELD_TEXT, txt, Store.NO,  
> At post-processing, I want to retrieve the neighbours of a given  
> term within a given range. That is, if document x contains the  
> sequence :
> "Alabama experienced significant /recovery as the economy of the  
> state/ transitioned from agriculture to diversified interests in  
> heavy manufacturing"
> for range = 3 and term = "economy", I want to retrieve "recovery as  
> the *economy* of the state".
> I see there is an API call :
> IndexReader.termPositions(term)
> which retrieves the actual positions of the given term. Is there a  
> quick way to retrieve its neighbours too, instead of browsing all  
> terms for all document and see if their position is close to the  
> position of the central term ?
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:
> For additional commands, e-mail:

Grant Ingersoll

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)  
using Solr/Lucene:

To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message