lucene-solr-user mailing list archives

From "Noble Paul നോബിള്‍ नोब्ळ्" <>
Subject Re: DIH: post-import action & independent SQL statements
Date Tue, 22 Jul 2008 04:18:31 GMT
Yes, it will be there in the next patch.
The EntityProcessor interface will have an extra destroy() method, so
you can extend SqlEntityProcessor and override the init/destroy
methods to perform pre/post actions. init() is already there.

Another addition is getSolrCore() in Context, which lets you invoke
methods on Solr directly.
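
A minimal sketch of the init/destroy hook pattern described above. It does not use the real Solr classes: SketchEntityProcessor, SketchSqlEntityProcessor, HookedProcessor and ImportDriver are hypothetical stand-ins, and the real method signatures in the patch may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for the entity processor contract; NOT the real Solr interface.
abstract class SketchEntityProcessor {
    // Called once before any row is read.
    void init() {}

    // Called once after the last row (the hook the message says is being added).
    void destroy() {}

    // Returns the next row, or null when no rows remain.
    abstract String nextRow();
}

// Stand-in for SqlEntityProcessor: serves rows from a fixed list.
class SketchSqlEntityProcessor extends SketchEntityProcessor {
    private final List<String> rows;
    private int pos = 0;

    SketchSqlEntityProcessor(List<String> rows) {
        this.rows = rows;
    }

    @Override
    String nextRow() {
        return pos < rows.size() ? rows.get(pos++) : null;
    }
}

// Subclass that adds a pre-import and a post-import action.
class HookedProcessor extends SketchSqlEntityProcessor {
    final List<String> log = new ArrayList<>();

    HookedProcessor(List<String> rows) {
        super(rows);
    }

    @Override
    void init() {
        super.init();
        log.add("pre: create temp tables");   // hypothetical pre-import action
    }

    @Override
    void destroy() {
        log.add("post: run snapshooter");     // hypothetical post-import action
        super.destroy();
    }
}

// Hypothetical driver showing the assumed call order: init, rows, destroy.
class ImportDriver {
    static List<String> run(HookedProcessor p) {
        p.init();
        try {
            for (String row = p.nextRow(); row != null; row = p.nextRow()) {
                p.log.add("row: " + row);
            }
        } finally {
            p.destroy();  // runs even if reading a row fails
        }
        return p.log;
    }
}
```

In this sketch the post action sits in a finally block so it runs even when a row fails; whether the real handler behaves that way is not stated in this thread.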

On Tue, Jul 22, 2008 at 12:09 AM, Jonathan Lee <> wrote:
> Thanks for the solutions to #2 & #3. I assume from your last comment that the
> callback hooks are not yet in the DIH and are features that will be released in
> the future as patches, correct?
>> From: Noble Paul നോബിള്‍ नोब्ळ् <>
>> Reply-To: <>
>> Date: Mon, 21 Jul 2008 23:25:15 +0530
>> To: <>
>> Subject: Re: DIH: post-import action & independent SQL statements
>> Over time we have recognized this as a common pattern of requirements.
>> Pre-import and post-import callback hooks are something I can think of.
>> On Mon, Jul 21, 2008 at 11:13 PM, Jonathan Lee <>
>> wrote:
>>> Hello,
>>> I have been using the DataImportHandler successfully to import documents
>>> from MySQL, and it has worked very well.  However I have three questions
>>> about the handler:
>>> 1. Is it possible to execute a command post-import? Specifically, I would
>>> like to run snapshooter after both full & delta imports. The postOptimize and
>>> postCommit listeners do not quite work here since I do not want to optimize
>>> after delta imports, and I do not want to run snapshooter for each auto
>>> commit during a full import.
>>> 2. Is it possible to execute independent SQL statements before, during, or
>>> after importing? I would like to create some intermediate temporary tables
>>> and also set specific settings relevant to the import (e.g. "SET
>>> @@group_concat_max_len=...").
>> During the import it is definitely possible. Any transformer can
>> obtain a DataSource via context.getDataSource(<name>) and invoke any
>> of its methods. getData(String query) can actually execute anything.
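
A rough sketch of the idea in the reply above: a transformer that fetches a named data source from the context and pushes one setup statement through getData before passing rows along unchanged. SketchDataSource, SketchContext, RecordingDataSource and SetupStatementTransformer are hypothetical stand-ins, not the real DIH types, and the return type of getData is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Stand-in for a DIH data source; NOT the real Solr interface.
interface SketchDataSource {
    // Executes the statement and returns whatever rows it yields.
    List<Map<String, Object>> getData(String query);
}

// Stand-in for the DIH Context, exposing data sources by name.
interface SketchContext {
    SketchDataSource getDataSource(String name);
}

// Fake data source that only records the statements it receives.
class RecordingDataSource implements SketchDataSource {
    final List<String> executed = new ArrayList<>();

    @Override
    public List<Map<String, Object>> getData(String query) {
        executed.add(query);
        return new ArrayList<>();
    }
}

// Transformer-style class: runs one setup statement on the first row only.
class SetupStatementTransformer {
    private final String dataSourceName;
    private final String setupSql;
    private boolean done = false;

    SetupStatementTransformer(String dataSourceName, String setupSql) {
        this.dataSourceName = dataSourceName;
        this.setupSql = setupSql;
    }

    Map<String, Object> transformRow(Map<String, Object> row, SketchContext context) {
        if (!done) {
            // Any statement can be sent this way, not only SELECTs.
            context.getDataSource(dataSourceName).getData(setupSql);
            done = true;
        }
        return row;  // the row itself is passed through unchanged
    }
}
```

Whether a session-level setting issued this way reaches the connection that the entity's own query uses depends on how connections are shared, which this sketch does not model.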
>>> 3. It would be great to have a way to chain together multiple Transformers.
>>> For instance, I'd like to perform regex operations, then template the output
>>> and finally add a custom document boost based on a column.  This could be
>>> done by chaining the RegexTransformer, TemplateTransformer and a custom
>>> Transformer.
>> I guess chaining is possible already.
>> transformer="RegexTransformer,TemplateTransformer,my.CustomTransformer"
>> can chain the 3 transformers
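
To illustrate what chaining means here, the following sketch applies three row transformations in list order, each receiving the previous one's output. It is a plain-Java model, not the DIH implementation: TransformerChain, the step methods, and the column names (id, title, rank, label, boost) are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

class TransformerChain {
    // Applies each step in list order; the output of one step feeds the next.
    static Map<String, Object> apply(List<UnaryOperator<Map<String, Object>>> chain,
                                     Map<String, Object> row) {
        Map<String, Object> current = row;
        for (UnaryOperator<Map<String, Object>> step : chain) {
            current = step.apply(current);
        }
        return current;
    }

    // Regex step: collapse runs of whitespace in "title" and trim it.
    static Map<String, Object> regexStep(Map<String, Object> row) {
        Map<String, Object> out = new HashMap<>(row);
        out.put("title", String.valueOf(row.get("title")).replaceAll("\\s+", " ").trim());
        return out;
    }

    // Template step: build "label" from "id" and the already cleaned "title".
    static Map<String, Object> templateStep(Map<String, Object> row) {
        Map<String, Object> out = new HashMap<>(row);
        out.put("label", row.get("id") + ": " + row.get("title"));
        return out;
    }

    // Custom step: derive a boost value from the numeric "rank" column.
    static Map<String, Object> boostStep(Map<String, Object> row) {
        Map<String, Object> out = new HashMap<>(row);
        out.put("boost", ((Number) row.get("rank")).doubleValue() * 0.5);
        return out;
    }

    // The three steps in the same order as the transformer attribute above.
    static List<UnaryOperator<Map<String, Object>>> defaultChain() {
        List<UnaryOperator<Map<String, Object>>> chain = new ArrayList<>();
        chain.add(TransformerChain::regexStep);
        chain.add(TransformerChain::templateStep);
        chain.add(TransformerChain::boostStep);
        return chain;
    }
}
```

Order matters in this model: the template step sees the title only after the regex step has cleaned it.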
>>> Thanks for your help!
>>> Jonathan Lee
>> There are a bunch of features we have in mind. We do not want to make
>> the patch bigger than it already is, and we are waiting for it to get
>> committed so that we can provide incremental patches for these.
>> --
>> --Noble Paul

--Noble Paul