nutch-dev mailing list archives

From Lajos <>
Subject Re: How do I customize Nutch to cater to existing SOLR schema
Date Wed, 12 Mar 2014 11:30:41 GMT
Hi tripiy,

I had raised this issue a few months ago, as I had the exact same 
problem. It cannot be solved by configuration, because of the way that 
the MappingReader puts fields in Maps.

I solved this by implementing a custom plugin that also supports 
some basic transformations, à la Solr DIH.
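
To give a rough idea of the kind of renaming step I mean, here is a minimal 
sketch. It is not the plugin code itself and it does not use any Nutch or 
Solr API: the class name FieldRenamer and the plain Map standing in for an 
index document are hypothetical, chosen only to illustrate moving "id" to 
"_uniqueid" before the document is sent to Solr.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: renames document fields before they are indexed. */
final class FieldRenamer {
    // Maps a source field name to the name the target schema expects.
    private final Map<String, String> renames;

    FieldRenamer(Map<String, String> renames) {
        this.renames = new LinkedHashMap<>(renames);
    }

    /** Returns a copy of doc with each configured field moved to its new name. */
    Map<String, Object> apply(Map<String, Object> doc) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : doc.entrySet()) {
            // Fields without a configured rename keep their original name.
            String target = renames.getOrDefault(e.getKey(), e.getKey());
            out.put(target, e.getValue());
        }
        return out;
    }
}
```

In a real plugin the same logic would have to be hooked into the indexing 
path and driven by configuration; how to wire that in is left open here.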

I proposed this as a contribution, but the only feedback I received 
missed the use case. Given that you have the exact same issue, I'd say 
it is a useful contribution.

I'll see what I can do to make the code available. If a committer can 
comment or provide guidelines, I'd appreciate it.



On 12/03/2014 12:17, tripiy wrote:
> Hi,
> I am working with a SOLR instance whose schema is defined by the CMS that
> we are using. Additionally, I want to crawl and index content from some
> external sites using Nutch. The biggest issue I am facing is that my
> uniqueKey is different from the one provided by Nutch (I want to use "_uniqueid"
> instead of the default "id"). There are other custom fields, which I can
> manage using the Solr copyField; however, I am not able to get around the
> uniqueKey, which cannot be set as the destination of a copyField.
> Is there a way to do this customization in Nutch through config, or does the
> Nutch code need to be modified to accommodate a custom SOLR schema?
> Thanks
