From Andrzej Bialecki
Subject Re: Analyzer and Fieldable, different stored and indexed values
Date Wed, 27 Aug 2008 16:55:27 GMT
Grant Ingersoll wrote:
> If I'm understanding correctly...
> What about a SinkTokenizer that is backed by a Reader/Field instead of 
> the current one that stores it all in a List?  This is more or less the 
> use case for the Tee/Sink implementations, w/ the exception that we 
> didn't plan for the Sink being too large, but that is easily overcome, IMO.
> That is, you use a TeeTokenFilter that adds to your Sink, which 
> serializes to some storage, and then your SinkTokenizer just 
> unserializes.  No need to change Fieldable at all or anything else.
> Or maybe just a Tokenizer that is backed by a Field would work and uses 
> a TermEnum on the Field to serve up next() for the TokenStream.
> Just thinking out loud...
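
For reference, the Tee/Sink idea quoted above can be sketched roughly as
follows. This is plain Java over string iterators, not the Lucene
TeeTokenFilter/SinkTokenizer API; the class and method names
(TeeSinkSketch, tee, replay) are hypothetical and only illustrate a sink
that serializes to storage instead of holding everything in a List.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-in for the Tee/Sink pattern; not the Lucene classes.
final class TeeSinkSketch {

    // "Tee": hands each token to the caller unchanged and, as a side effect,
    // serializes it to the sink storage.
    static Iterator<String> tee(Iterator<String> source, DataOutputStream sink) {
        return new Iterator<String>() {
            @Override public boolean hasNext() {
                return source.hasNext();
            }
            @Override public String next() {
                String token = source.next();
                try {
                    sink.writeUTF(token);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
                return token;
            }
        };
    }

    // "Sink tokenizer": replays tokens lazily by deserializing them from
    // storage, so the full token list never has to sit in memory.
    static Iterator<String> replay(InputStream storage) {
        DataInputStream in = new DataInputStream(storage);
        return new Iterator<String>() {
            private String lookahead = read();

            // Returns the next stored token, or null at end of storage.
            private String read() {
                try {
                    return in.readUTF();
                } catch (EOFException e) {
                    return null;
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            }
            @Override public boolean hasNext() {
                return lookahead != null;
            }
            @Override public String next() {
                if (lookahead == null) {
                    throw new NoSuchElementException();
                }
                String token = lookahead;
                lookahead = read();
                return token;
            }
        };
    }
}
```

In this sketch the first consumer drains the iterator returned by tee(),
and a second consumer later calls replay() on whatever storage the sink
wrote to (a byte array here, but a file would work the same way).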

Actually, the scenario is more complicated, because I need to implement 
this as a Solr FieldType. Besides, wouldn't this mean that I can't 
store the original value, since setting the tokenStream on a Field 
automatically makes it un-stored?

Anyway, thanks for the hint, I'll check if I can do it this way. As for 
the other points about the new Analyzer API: I still think it would offer 
more flexibility than the current API, for a minimal cost in compatibility 
and likely no cost in performance.

Best regards,
Andrzej Bialecki     <><
  ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration  Contact: info at sigram dot com
