lucene-solr-user mailing list archives

From Markus Jelsma <markus.jel...@buyways.nl>
Subject RE: Config issue for deduplication
Date Thu, 13 May 2010 16:15:16 GMT
What does your solrconfig look like? No deduplication takes place if overwriteDupes = false and
the signature field is something other than the (unique) doc ID field.
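
To illustrate the two cases, here is a minimal Python sketch of signature-based deduplication. It is not Solr code: the helper names `signature` and `add_documents` are made up for this example, the index is a plain list, and MD5 stands in for whichever signature class is configured. It only models the idea that a signature is computed from the configured fields and that earlier documents are replaced either when overwriting duplicates is enabled or when the signature is written into the unique ID field.

```python
import hashlib


def signature(doc, fields):
    """Hash the chosen fields into one hex string (MD5 used as a stand-in)."""
    h = hashlib.md5()
    for name in fields:
        value = doc.get(name)
        if value is not None:
            h.update(name.encode("utf-8"))
            h.update(str(value).encode("utf-8"))
    return h.hexdigest()


def add_documents(docs, fields, signature_field, unique_key="id",
                  overwrite_dupes=True):
    """Toy in-memory index that mimics the two ways duplicates get replaced."""
    index = []
    for doc in docs:
        doc = dict(doc)  # work on a copy, leave the input untouched
        doc[signature_field] = signature(doc, fields)
        # A document with the same unique key always replaces the older one.
        index = [d for d in index if d[unique_key] != doc[unique_key]]
        # With overwrite_dupes, a matching signature also replaces older docs.
        if overwrite_dupes:
            index = [d for d in index
                     if d[signature_field] != doc[signature_field]]
        index.append(doc)
    return index


# Two records with identical content, differing only in their ID.
docs = [
    {"id": "1", "reference": "R1", "issn": "1234-5678"},
    {"id": "1001", "reference": "R1", "issn": "1234-5678"},
]
fields = ["reference", "issn"]

deduped = add_documents(docs, fields, "dedupeHash", overwrite_dupes=True)
kept_both = add_documents(docs, fields, "dedupeHash", overwrite_dupes=False)
via_unique_key = add_documents(docs, fields, "id", overwrite_dupes=False)
```

In this sketch `deduped` and `via_unique_key` each end up with one document, while `kept_both` keeps both copies: the signature is stored, but nothing is overwritten.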
 
-----Original message-----
From: Markus Fischer <info@flyingfischer.ch>
Sent: Thu 13-05-2010 17:01
To: solr-user@lucene.apache.org; 
Subject: Config issue for deduplication

I am trying to configure automatic deduplication for SOLR 1.4 in Vufind. 
I followed:

http://wiki.apache.org/solr/Deduplication

Actually nothing happens. All records are being imported without any 
deduplication.

What am I missing?

Thanks
Markus

I did:

- created a duplicate set of records, with only their IDs shifted by a 
fixed number

---
solrconfig.xml

<requestHandler name="/update" class="solr.XmlUpdateRequestHandler" >
 <lst name="defaults">
     <str name="update.processor">dedupe</str>
 </lst>
</requestHandler>

<updateRequestProcessorChain name="dedupe">
  <processor class="org.apache.solr.update.processor.SignatureUpdateProcessorFactory">
    <bool name="enabled">true</bool>
    <bool name="overwriteDupes">true</bool>
    <str name="signatureField">dedupeHash</str>
    <str name="fields">reference,issn</str>
    <str name="signatureClass">org.apache.solr.update.processor.Lookup3Signature</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>

---
In schema.xml I added the field

<field name="dedupeHash" type="string" stored="true" indexed="true" multiValued="false" />

--

If I look at the created field "dedupeHash" it seems to be empty...!?
