lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Dawid Weiss <>
Subject Re: Clustering lucene's results
Date Thu, 23 Sep 2004 17:38:35 GMT
Hi Andrzej :)

Yep, ok, I'll take a look at it. After I come back from abroad (next 
week). I just wanted to save myself some time and have an already 
written code that fetches the information we need for clustering; you 
know what I mean, I'm sure. But I'll start from scratch when I get back.


Andrzej Bialecki wrote:

> Dawid Weiss wrote:
>> Hi William,
>> No, I don't have examples because I never used Lucene directly. If you 
>> provide me with a sample index and an API that executes a query on 
>> this index (I need document titles, summaries, or snippets and an 
>> anchor (identifier), can be an URL).
> Hi Dawid :-)
> I believe the approach to this component should be that you first 
> initialize it by reading a mapping of Lucene index field names to 
> "logical" names (metadata) like title, url, body, etc. The reason is 
> that each index uses its own metadata schema, i.e. in Lucene-speak, the 
> field names.
> Moreover, when you execute a query you get just a document id plus its 
> score. It's up to you to build a snippet. There is a code in the 
> jakarta-lucene-sandbox CVS repo. (highlighter) to create snippets from 
> the query and the hit list, take a look at this...
>> Send me such a snippet and I'll try to write the integration code with 
>> Lucene. It is only a matter of writing a simple InputComponent 
>> instance and this is really trivial (see Nutch's plugin code).
> The basic usage scenario is that you open the IndexReader (either using 
> directory name as a String or a Directory instance), and then create a 
> Query instance, usually using QueryParser, and finally you search using 
> IndexSearcher. You get a list of Hits, which you can use to get scores, 
> and the contents of the documents. Take a look at the IndexFiles and 
> SearchFiles classes in org.apache.lucene.demo package (under /src/demo).

To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message