manifoldcf-user mailing list archives

From Karl Wright <>
Subject Re: Running 2 jobs to update same document Index but different fields
Date Tue, 27 Mar 2012 17:25:44 GMT
The document key in Solr is the URL of the document, as constructed by
the connector you are using.  If you are using the same document to
construct two different Solr documents, ManifoldCF by definition
cannot be aware of this.  But if these are different files from the
point of view of ManifoldCF they will have different URLs and be
treated differently.  The jobs can overlap in this case with no
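
As a rough illustration of the keying behaviour described above, here is a toy Python model of an index keyed by a unique id. It is only a sketch: `ToyIndex`, the example URLs, and the field values are hypothetical, and it does not use the Solr or ManifoldCF APIs. It assumes that adding a document under an existing key replaces the stored document as a whole.

```python
# Toy model of an index keyed by a unique document key.
# This is NOT the Solr or ManifoldCF API; all names here are hypothetical.


class ToyIndex:
    def __init__(self):
        # Maps document key -> dict of field name -> value.
        self.docs = {}

    def add(self, key, fields):
        # Assumption: adding under an existing key replaces the whole
        # stored document rather than merging fields into it.
        self.docs[key] = dict(fields)


# Case 1: two files with different URLs (hypothetical) are two documents.
by_url = ToyIndex()
by_url.add("file:///repo/report.xml", {"field1": "a", "field2": "b", "field3": "c"})
by_url.add("file:///repo/report.pdf", {"field4": "d", "field5": "e", "field6": "f"})
distinct_count = len(by_url.docs)

# Case 2: two jobs writing under the same key; the second add replaces the first.
by_shared_id = ToyIndex()
by_shared_id.add("doc-1", {"field1": "a", "field2": "b", "field3": "c"})
by_shared_id.add("doc-1", {"field4": "d", "field5": "e", "field6": "f"})
shared_fields = sorted(by_shared_id.docs["doc-1"])
```

In this model the first case leaves two separate entries, while the second leaves a single entry holding only the fields from the last add.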


On Tue, Mar 27, 2012 at 1:08 PM, Anupam Bhattacharya
<> wrote:
> I want to configure two jobs to index into Solr using ManifoldCF with the
> /extract/update requestHandler.
> The 1st synchronizes only the XML files and the 2nd synchronizes the PDF files.
> If both of these documents share a unique id, can I combine the indexes for both
> in one Solr schema without overwriting the details added by the previous job?
> Suppose,
>       xmldoc indexes field0(id), field1, field2, field3
> &    pdfdoc indexes field0(id), field4, field5, field6.
> Output docindex ==> (xml+pdf doc), field0(id), field1, field2, field3,
> field4, field5, field6
> Regards
> Anupam
