manifoldcf-user mailing list archives

From Timo Selvaraj <timo.selva...@gmail.com>
Subject Re: Metadata Adjuster transformer
Date Thu, 16 Apr 2015 10:48:40 GMT
Hi Rafa,

Let me check this document field.

Thanks,
Timo

> On Apr 16, 2015, at 3:44 AM, Rafa Haro <rharo@apache.org> wrote:
> 
> Hi Timo, 
> 
> If you are using the Tika transformer, it is probably also extracting the document type as a general metadata field, and you can manipulate that one in the metadata adjuster.
> 
> Cheers,
> Rafa
> 
> 
> On April 15, 2015 at 21:24:17, Karl Wright (daddywri@gmail.com) wrote:
> 
>> Hi Timo,
>> 
>> Yes, you can do that, but not with the current metadata adjuster.  It does not allow you to access the core fields.
>> 
>> Karl
>> 
>> 
>> On Wed, Apr 15, 2015 at 3:16 PM, Timo Selvaraj <timo.selvaraj@gmail.com> wrote:
>> Thanks Karl.
>> 
>> Can I create a new meta field contenttype and add the value HTML based on the mime type value in the core field?
>> 
>> Timo
>> 
>>> On Apr 15, 2015, at 3:13 PM, Karl Wright <daddywri@gmail.com> wrote:
>>> 
>>> Hi Timo,
>>> 
>>> The metadata adjuster currently does not give you access to the core document fields, only to the document's general metadata.  Basically, anything that ManifoldCF uses to make crawling decisions is not accessible or modifiable by the adjuster, because it's not general metadata.
>>> 
>>> That includes the document's file name, content/mime type, length, creation date, and modification date.
>>> 
>>> Technically it is possible to build a document transformer which would copy internal fields like those described into general metadata fields that could then be manipulated with the metadata adjuster.  Some connectors already supply such general metadata fields, but it is by no means a consistent practice.
>>> 
>>> Karl
>>> 
>>> 
>>> On Wed, Apr 15, 2015 at 2:49 PM, Timo Selvaraj <timo.selvaraj@gmail.com> wrote:
>>> Hi,
>>> 
>>> I need to change the incoming metadata into a specified format.
>>> 
>>> I want to change 
>>> 
>>> "Content-Type":"text/html"
>>> to
>>> 
>>> "contenttype":"HTML"
>>> Has anyone done something similar with the metadata adjuster?
>>> 
>>> Thanks,
>>> Timo
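
For illustration only, the mapping asked about at the start of the thread ("Content-Type":"text/html" to "contenttype":"HTML") can be sketched in plain Java. This is not ManifoldCF code: the class name, the method names, the lookup table, and the use of a plain Map in place of a document's general metadata are all assumptions made for the example. It also presumes the MIME type has already been exposed as a general metadata field, for example by the Tika transformer as suggested in the thread.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of ManifoldCF.
class ContentTypeLabel {
    // Assumed lookup table from MIME type to a short label; extend as needed.
    private static final Map<String, String> LABELS = new LinkedHashMap<>();
    static {
        LABELS.put("text/html", "HTML");
        LABELS.put("application/pdf", "PDF");
        LABELS.put("text/plain", "TEXT");
    }

    // Returns the label for a MIME type, ignoring parameters such as
    // "; charset=UTF-8" and letter case. Returns null when there is no mapping.
    static String label(String mimeType) {
        if (mimeType == null) {
            return null;
        }
        int semi = mimeType.indexOf(';');
        String base = (semi >= 0 ? mimeType.substring(0, semi) : mimeType)
                .trim()
                .toLowerCase(Locale.ROOT);
        return LABELS.get(base);
    }

    // Returns a copy of the metadata with a "contenttype" field added when the
    // "Content-Type" field maps to a known label. The input map is not modified.
    static Map<String, String> withContentType(Map<String, String> metadata) {
        Map<String, String> out = new LinkedHashMap<>(metadata);
        String label = label(metadata.get("Content-Type"));
        if (label != null) {
            out.put("contenttype", label);
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> in = new LinkedHashMap<>();
        in.put("Content-Type", "text/html; charset=UTF-8");
        System.out.println(withContentType(in));
    }
}
```

Unknown MIME types are left without a contenttype field rather than guessed, so downstream consumers can tell "unmapped" apart from a real label.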

