manifoldcf-user mailing list archives

From Phillip Rhodes <motley.crue....@gmail.com>
Subject MCF not indexing documents due to mime-type
Date Wed, 20 Dec 2017 09:25:33 GMT
MCF folks:

I'm about to tear my hair out over this one... I just realized that
I've been running MCF with the "Use the Extract Update Handler:"
option checked.  Suspecting this might be related to another issue I
was having (content was not being stored in the field named in the
"Content field name:" option in MCF), I turned this option off.

Now, MCF happily rejects nearly every document in my repository with this:

Result Code: EXCLUDEDMIMETYPE
Result Description: Excluding document because of mime type (application/pdf)
(and so on for many other mime types)

So... this is *not* what I would expect to happen, as I have nothing at
all listed in the "excluded mime types" setting for this output
connector.  With nothing explicitly excluded, I would (perhaps
naively) expect all mime types to be sent to Solr.

But what makes it even worse is this: even when I explicitly add types
(for example, application/pdf) to the "included mime types" setting
and re-index, I *still* get the same message and no PDF files are
indexed.
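
To make the expectation concrete, here is a rough sketch of the
include/exclude semantics I had assumed.  This is NOT ManifoldCF's
actual code; the function name and parameters are hypothetical and
only illustrate the behavior I expected from the two settings:

```python
def is_mime_type_accepted(mime_type, included=(), excluded=()):
    """Hypothetical filter illustrating the expected semantics.

    Assumption (not taken from ManifoldCF): an empty include list means
    "no restriction", and an explicit exclusion always wins.
    """
    # Drop parameters such as "; charset=..." and normalize case.
    mime = mime_type.split(";", 1)[0].strip().lower()
    included_set = {m.strip().lower() for m in included}
    excluded_set = {m.strip().lower() for m in excluded}

    # Anything explicitly excluded is rejected.
    if mime in excluded_set:
        return False
    # With no include list configured, everything else is accepted.
    if not included_set:
        return True
    # Otherwise only explicitly included types are accepted.
    return mime in included_set


if __name__ == "__main__":
    # Case 1: nothing included, nothing excluded.
    print(is_mime_type_accepted("application/pdf"))
    # Case 2: application/pdf explicitly included.
    print(is_mime_type_accepted("application/pdf",
                                included=["application/pdf"]))
```

Under these assumed semantics both cases accept application/pdf, which
is the opposite of the EXCLUDEDMIMETYPE result I am seeing.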

Any ideas?  Is this a bug, or is there something else I need to do?



Thanks,


Phil
~~~
This message optimized for indexing by NSA PRISM
