manifoldcf-user mailing list archives

From Marisol Redondo <marisol.redondo.gar...@gmail.com>
Subject Re: Metadata adjuster
Date Wed, 22 Feb 2017 13:57:22 GMT
Hi Karl, and thank you for the quick answer.

I was reading the documentation for MCF 1.10, but I'm actually using MCF 2.5;
sorry for the confusion. I think this version is compatible with Solr 6.
The PDF doesn't have any metadata or field called facetContentType, which is
why I have been trying to use the Metadata Adjuster: to add a new
metadata/property to the document so that Solr can index by this field when
I'm injecting the document.
Should I use another transformation, or is there another way of doing it?
I am migrating from Nutch to ManifoldCF. In Nutch we can do this with
plugins, and I was assuming that Nutch plugins are the equivalent of the
transformation connectors in MCF.

The complete error in Solr is:

2017-02-21 13:19:32.108 INFO  (qtp1854778591-18) [   x:sites] o.a.s.c.PluginBag Going to create a new requestHandler with {type = requestHandler,name = /update/extract,class = solr.extraction.ExtractingRequestHandler,args = {defaults={lowernames=true,fmap.meta=ignored_,fmap.content=_text_,update.chain=add-unknown-fields-to-the-schema,df=_text_}}}

2017-02-21 13:19:32.454 INFO  (qtp1854778591-18) [   x:sites] o.a.s.u.p.LogUpdateProcessorFactory [sites]  webapp=/solr path=/update/extract params={resource.name=introduction.pdf&literal.id=https://...../introduction.pdf&wt=xml&version=2.2}{} 0 347

2017-02-21 13:19:32.455 ERROR (qtp1854778591-18) [   x:sites] o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: [doc=https://....../introduction.pdf] missing required field: facetContentType
        at org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:197)
        at org.apache.solr.update.AddUpdateCommand.getLuceneDocument(AddUpdateCommand.java:82)
        at org.apache.solr.update.DirectUpdateHandler2.doNormalUpdate(DirectUpdateHandler2.java:277)
        at org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:211)
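
The params logged for the /update/extract request contain literal.id but no
literal.facetContentType, so the adjusted field does not seem to reach Solr as
a request parameter. This reading assumes the output connector passes metadata
as literal.<name> parameters, the same way it passes literal.id. A minimal
Python sketch for checking which literal fields a logged request carried; the
helper name literal_fields is only illustrative, and the params string is
copied from the INFO line with the URL left elided as in the log:

```python
from urllib.parse import parse_qs


def literal_fields(params: str) -> dict:
    """Return the literal.* fields of a Solr /update/extract params string.

    Keys are the field names with the "literal." prefix removed; values are
    the lists of values sent for that field.
    """
    parsed = parse_qs(params, keep_blank_values=True)
    prefix = "literal."
    return {
        key[len(prefix):]: values
        for key, values in parsed.items()
        if key.startswith(prefix)
    }


# Params string taken from the LogUpdateProcessorFactory INFO entry.
params = (
    "resource.name=introduction.pdf"
    "&literal.id=https://...../introduction.pdf"
    "&wt=xml&version=2.2"
)

fields = literal_fields(params)
print(sorted(fields))                # field names sent as literals
print("facetContentType" in fields)  # was the required field sent?
```

For the request above this lists only the id field, which matches the
"missing required field" error that follows it in the log.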



Thanks


On 21 February 2017 at 14:52, Karl Wright <daddywri@gmail.com> wrote:

> Hi Marisol,
>
> Can you find the [INFO] entry in the Solr log for this document?  That
> should help clear up any confusion.
>
> Also, for what it is worth, MCF 1.10 is not using a SolrJ that is up to
> date with Solr 6.x.  That could be the source of the problem.  Is there any
> reason you are using a 1.x version of MCF?
>
> Karl
>
>
> On Tue, Feb 21, 2017 at 8:42 AM, Marisol Redondo <
> marisol.redondo.garcia@gmail.com> wrote:
>
>> Hi.
>>
>> I'm trying to use the Metadata Adjuster to add one field to the Solr index,
>> but it doesn't inject the field into a Solr field.
>> Maybe I'm misunderstanding the use of the Metadata Adjuster, but I have read
>> in the documentation
>> (https://manifoldcf.apache.org/release/release-1.10/en_US/end-user-documentation.html)
>> that I can add metadata to the document that is going to be indexed into
>> Solr, but the Solr instance gave me the error "missing required field:
>> facetContentType".
>>
>> ManifoldCF Job pipeline:
>> 1. Repository (type web repository)
>> 2. Transformation (Tika Metadata Extractor)
>> 3. Transformation (type Metadata Adjuster)
>> 4. Output (Solr 6)
>>
>> ManifoldCF Job Metadata Expressions tab:
>>   Parameter name: "facetContentType"
>>   Remove this parameter: false
>>   Expression: xxxx  (the literal text value I want in facetContentType)
>>
>> Solr schema:
>>   .....
>>   <field name="facetContentType" type="string" indexed="true"
>> stored="true" required="true"/>
>>  ....
>>
>> The error logged in ManifoldCF is:
>>       Error from server at http://solrServer:port/solr/core:
>>       [doc=https://....../index.aspx] missing required field:
>>       facetContentType.
>>
>> Thanks for your help
>>
>
>
