lucene-solr-user mailing list archives

From Otis Gospodnetic <otis_gospodne...@yahoo.com>
Subject Re: Token filter on multivalue field
Date Wed, 03 Jun 2009 20:22:20 GMT

Hello,

It's ugly, but the first thing that came to mind was ThreadLocal.
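
A minimal sketch of what that could look like, not tied to any Lucene or Solr API. The class name SeenTerms and its methods are hypothetical. It assumes every value of the multivalue field for one document is analyzed on the same thread, and that something calls reset() between documents; where that hook would live in Solr is left open here.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper: a per-thread record of terms already emitted. */
public final class SeenTerms {

    // Each thread gets its own set, so separate filter instances created
    // on the same thread share state without any locking.
    private static final ThreadLocal<Set<String>> SEEN =
            ThreadLocal.withInitial(HashSet::new);

    private SeenTerms() {}

    /** Returns true only the first time a term is seen on this thread since the last reset. */
    public static boolean firstTime(String term) {
        return SEEN.get().add(term);
    }

    /** Clears this thread's set; without it, terms would leak from one document into the next. */
    public static void reset() {
        SEEN.get().clear();
    }
}
```

Inside the filter, the words.contains(...) / words.add(...) pair would collapse to a single SeenTerms.firstTime(token.term()) check. The ugly part is the reset: the set outlives every individual filter, so forgetting to clear it silently drops terms from later documents.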

 Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



----- Original Message ----
> From: David Giffin <david@giffin.org>
> To: solr-user@lucene.apache.org
> Sent: Wednesday, June 3, 2009 1:57:42 PM
> Subject: Token filter on multivalue field
> 
> Hi There,
> 
> I'm working on a unique token filter, to eliminate duplicates on a
> multivalue field. My filter works properly for a single value field.
> It seems that a new TokenFilter is created for each value in the
> multivalue field. I need to maintain an array of used tokens across
> all of the values in the multivalue field. Is there a good way to do
> this? Here is my current code:
> 
> public class UniqueTokenFilter extends TokenFilter {
> 
>     private ArrayList words;
>     public UniqueTokenFilter(TokenStream input) {
>         super(input);
>         this.words = new ArrayList();
>     }
> 
>     @Override
>     public final Token next(Token in) throws IOException {
>         for (Token token=input.next(in); token!=null; token=input.next(in)) {
>             if ( !words.contains(token.term()) ) {
>                 words.add(token.term());
>                 return token;
>             }
>         }
>         return null;
>     }
> }
> 
> Thanks,
> David

