lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Andrzej Bialecki>
Subject Re: SpanQueries in Luke
Date Fri, 05 Mar 2010 11:11:12 GMT
On 2010-03-05 11:22, mark harwood wrote:
>>> I'll commit the current mostly-working state today, you can take a look
> OK. However I think this XMLQueryParser addition will only resurface a long-standing
issue with Luke and Lucene in general.
> This query parser works best on multiple fields (e.g. free-text<UserQuery>  tags
and<TermsFilter>  on structured fields). Each field typically requires different analyzers
and there is currently no way of recording this information as metadata alongside an index.
> Without this metadata each user's Luke session starts with a game of "guess-which-analyzer-to-use?"

I guess ;) that generally speaking there is no good answer to this - the 
same token stream could have been produced by varying analysis chains, 
even across indexing sessions that append to the same index.

> I use my own proprietary system for storing such index metadata and this is through an
XML file that contains a BeanEncoder-serialized PerFieldAnalyserWrapper among other things.
> It would be nice to see some standardisation in how this information can be made available
in *any* Lucene index but I guess this overlaps with things like Solr's config.


Theoretically one could store such information in 
IndexCommit.getUserData(). The lack of standardized metadata is an 
issue, of course - we could start experimenting with this in Luke, to 
see whether we can squeeze a subset of Solr schema there.

Best regards,
Andrzej Bialecki     <><
  ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration  Contact: info at sigram dot com

To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message