manifoldcf-user mailing list archives

From Karl Wright <daddy...@gmail.com>
Subject Re: schedule information
Date Mon, 22 Dec 2014 10:42:18 GMT
Hi Jitu,

I'm sorry for the miscommunication.  What I meant is that without any
modifications, you can add the job's name as metadata for all documents
indexed with the job.

If you need to index hard-wired metadata for every job run, you will need
to modify WorkerThread.java.  The IJobDescription object is readily
available there, but you will also need to write a SQL query to obtain the
job's start time.
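
As a rough illustration of the per-run value discussed in this thread, here
is a minimal Java sketch that builds a string such as
"JobName 17-12-2014 11:23" from a job name and a start time. It assumes the
start time has already been obtained as epoch milliseconds (for example, from
the SQL query mentioned above). The class JobRunTag and its format method are
hypothetical names, not part of ManifoldCF, and the sketch does not show how
WorkerThread.java would attach the value to a document.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not a ManifoldCF class: builds a per-run tag
// from a job name and the job's start time.
final class JobRunTag {
    // Day-month-year and 24-hour time, matching the example in this thread.
    private static final DateTimeFormatter PATTERN =
        DateTimeFormatter.ofPattern("dd-MM-yyyy HH:mm");

    private JobRunTag() {}

    // startTimeMillis is assumed to be milliseconds since the Unix epoch.
    static String format(String jobName, long startTimeMillis, ZoneId zone) {
        String stamp = PATTERN.withZone(zone)
            .format(Instant.ofEpochMilli(startTimeMillis));
        return jobName + " " + stamp;
    }

    public static void main(String[] args) {
        // 1418815380000 ms is 17 Dec 2014 11:23:00 UTC.
        System.out.println(format("JobName", 1418815380000L, ZoneOffset.UTC));
        // prints "JobName 17-12-2014 11:23"
    }
}
```

Because the start time stays the same for the whole run, every document
indexed during that run would carry the same tag, which is what makes it
searchable as a run identifier.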

Karl


On Mon, Dec 22, 2014 at 4:33 AM, Jitu <abjitu@gmail.com> wrote:

> Hi Karl,
> Thanks for the quick reply and support. I have gone through the source
> code of "ForcedMetadataConnector.java" as well as the end-user
> documentation at
> http://manifoldcf.apache.org/release/trunk/en_US/end-user-documentation.html#metadataadjuster.
> It says we can add a string constant for every job run. However, my client
> wants to know which files were crawled in each run of the job. To search
> for that, I need to send a unique ID for every job run as part of the
> metadata. This unique ID changes on every job run, so I cannot use
> ForcedMetadataConnector. You advised, "It's certainly possible to add the
> current job's start time field as hard-wired metadata." Please let me know
> how to achieve it.
>
> Thanks,
> Jitu
>
> On Fri, Dec 19, 2014 at 1:09 PM, Karl Wright <daddywri@gmail.com> wrote:
>
>> Hi Jitu,
>>
>> You can certainly add a unique string associated with a job to every
>> document using the Metadata Adjuster transformation connector (which of
>> course can be the job name).  The time of indexing is already sent as a
>> metadata field (can't remember which one off the top of my head, but I'm
>> sure you can find it).  What you can't get, mainly because it basically has
>> little meaning in MCF, is the time the job was started.  It's certainly
>> possible to add the current job's start time field as hard-wired metadata,
>> but I bet your client would prefer the actual time of indexing of the
>> document anyhow.
>>
>> Thanks,
>> Karl
>>
>>
>> On Fri, Dec 19, 2014 at 2:30 AM, Jitu <abjitu@gmail.com> wrote:
>>>
>>> Hi Karl,
>>> Thanks for all your support. One of our customers needs the job
>>> schedule information to be sent through the output connector.
>>> Basically, my customer wants to find out, using a Solr search, which
>>> files were indexed in a given job run.
>>>
>>> For example, if my job ran on 17 Dec 2014 at 11:23 AM, I would send
>>> a unique string, say "JobName 17-12-2014 11:23", as part of the file
>>> metadata to the Solr output connector. A Solr search on this string
>>> would then return all the files indexed in that job run.
>>>
>>> Please correct me if I am wrong, or suggest how to achieve it.
>>>
>>> Thanks,
>>> Jitu
>>>
>>
>
