lucene-solr-user mailing list archives

From Chantal Ackermann <chantal.ackerm...@btelligent.de>
Subject Re: concatenating tokens
Date Fri, 09 Oct 2009 08:34:07 GMT
Hi Joe,

WordDelimiterFilter splits the input on various delimiters and produces
several tokens from it. It can also concatenate those parts and add the
result to the stream as an additional token, although it joins them
without a space. Maybe you can tweak it to fit your needs?
You could also use two different fields: one that builds the concatenated
version with spaces, and another that produces the catenated tokens (both
using WordDelimiter and/or RegexPattern filters, etc.).

Cheers,
Chantal

Joe Calderon wrote:
> Hello *, I'm using a combination of tokenizers and filters that gives me
> the desired tokens. However, for a particular field I want to
> concatenate these tokens back into a single string. Is there a filter
> that does that? If not, what steps are needed to write my own filter to
> concatenate tokens?
> 
> For example, I start with "Sprocket (widget) - Blue" and the analyzers
> churn out the tokens [sprocket,widget,blue]. I want to end up with the
> string "sprocket widget blue". This is a simple example; in the
> general case lowercasing and punctuation removal do not work, which is
> why I'm looking to concatenate tokens.
> 
> --joe
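
The difference between the two results discussed in this thread can be
illustrated with a minimal sketch in plain Python. This is not Solr or
Lucene code: `analyze` is a hypothetical stand-in for an analyzer chain
(it only lowercases and splits on non-alphanumeric characters), and
`join_with_spaces` and `catenate` are names made up for illustration.

```python
import re


def analyze(text):
    """Hypothetical stand-in for an analyzer chain (NOT Solr's
    WordDelimiterFilter): lowercase, then split on runs of characters
    that are not ASCII letters or digits, dropping empty pieces."""
    return [tok for tok in re.split(r"[^0-9a-z]+", text.lower()) if tok]


def join_with_spaces(tokens):
    """The result asked for in the question: one string, space-separated."""
    return " ".join(tokens)


def catenate(tokens):
    """The result described in the reply: parts joined without a space."""
    return "".join(tokens)


tokens = analyze("Sprocket (widget) - Blue")
spaced = join_with_spaces(tokens)
catenated = catenate(tokens)

print(tokens)
print(spaced)
print(catenated)
```

In an actual Solr setup the joining would have to happen inside the
analysis chain (for example in a custom token filter), so this sketch only
shows the intended input and output, not how to wire it into a field type.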
