lucene-solr-user mailing list archives

From Markus Jelsma <markus.jel...@openindex.io>
Subject Re: Split analysis
Date Wed, 02 Mar 2011 12:33:38 GMT
There is an updateRequestProcessorChain you can use to execute some 
processors. Check the page on deduplication; it already has methods for 
creating signatures, but you can easily create your own if you have to.
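
As a rough sketch, a chain along the lines of the one on the Deduplication 
wiki page could be declared in solrconfig.xml like this. The chain name, the 
field names (content, signature) and the choice of Lookup3Signature are 
assumptions for illustration only; adjust them to your schema. Keep in mind 
that update processors see the raw field value, before the field's analyzer 
runs.

```xml
<!-- solrconfig.xml: hypothetical chain name and field names -->
<updateRequestProcessorChain name="signature">
  <processor class="solr.processor.SignatureUpdateProcessorFactory">
    <bool name="enabled">true</bool>
    <!-- field that receives the computed signature (assumed name) -->
    <str name="signatureField">signature</str>
    <!-- do not delete documents that share a signature -->
    <bool name="overwriteDupes">false</bool>
    <!-- source field(s) the signature is computed from (assumed name) -->
    <str name="fields">content</str>
    <!-- swap in your own Signature subclass here if needed -->
    <str name="signatureClass">solr.processor.Lookup3Signature</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
```

The chain still has to be selected on the update request handler; the wiki 
page describes how.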

Use copyField to copy the value to a non-analyzed (string) field; that field 
keeps the original input as it was before tokenizing and filtering.
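
For example, something like the following in schema.xml, assuming a 
hypothetical analyzed field called content and a field type named text 
(both names are placeholders, not taken from your setup):

```xml
<!-- schema.xml: field and type names are assumptions for illustration -->
<!-- analyzed field that goes through the tokenizer and filters -->
<field name="content" type="text" indexed="true" stored="true"/>
<!-- string field: not analyzed, holds the input verbatim -->
<field name="content_raw" type="string" indexed="true" stored="true"/>
<!-- copyField copies the raw incoming value, before any analysis -->
<copyField source="content" dest="content_raw"/>
```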

http://wiki.apache.org/solr/Deduplication


On Wednesday 02 March 2011 13:21:58 dan sutton wrote:
> Hi All,
> 
> I have a requirement to analyze a field with a series of filters,
> calculate a 'signature' then concatenate with the original input
> 
> e.g.
> 
> input     =>     'this is the input'
> 
> tokenized and filtered,  input becomes say 'this input'     =>
> 12ef5e (signature)
> 
> so the final output indexed is:
> 
> 12ef5ethis is the input
> 
> I can calculate the signature easily, but how can I get access to the
> original (now tokenized and filtered) input
> 
> Many thanks in advance,
> Dan

-- 
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350
