lucene-solr-user mailing list archives

From Noble Paul നോബിള്‍ नोब्ळ् <noble.p...@corp.aol.com>
Subject Re: [DIH] blocking import operation
Date Thu, 12 Nov 2009 05:48:34 GMT
Yes, open an issue. This is a trivial change.

On Thu, Nov 12, 2009 at 5:08 AM, Sascha Szott <szott@zib.de> wrote:
> Noble,
>
> Noble Paul wrote:
>> DIH imports are really long running. There is a good chance that the
>> connection times out or breaks in between.
> Yes, you're right, I missed that point (in my case imports take no longer
> than a minute).
>
>> how about a callback?
> Thanks for the hint. There was a discussion on adding a callback URL to
> DIH a month ago, but it seems that no issue was raised. So, up to now it's
> only possible to implement an appropriate Solr EventListener. Should we
> open an issue for supporting callback URLs?
>
> Best,
> Sascha
>
>>
>> On Tue, Nov 10, 2009 at 12:12 AM, Sascha Szott <szott@zib.de> wrote:
>>> Hi all,
>>>
>>> Currently, DIH's import operation only works asynchronously. Therefore,
>>> after submitting an import request, DIH returns immediately, while the
>>> import process (in case a large amount of data needs to be indexed)
>>> continues asynchronously behind the scenes.
>>>
>>> So, what is the recommended way to check if the import process has
>>> already finished? Or still better, is there any method / workaround that
>>> will block the import operation's caller until the operation has finished?
>>>
>>> In my application, the DIH receives some URL parameters which are used for
>>> determining the database name that is used within data-config.xml, e.g.
>>>
>>> http://localhost:8983/solr/dataimport?command=full-import&dbname=foo
>>>
>>> Since only one DIH, /dataimport, is defined, but several databases need to
>>> be indexed, it is required to issue this command several times, e.g.
>>>
>>> http://localhost:8983/solr/dataimport?command=full-import&dbname=foo
>>>
>>> ... wait until /dataimport?command=status says "Indexing completed" (but
>>> without using a loop that checks it again and again) ...
>>>
>>> http://localhost:8983/solr/dataimport?command=full-import&dbname=bar&clean=false
>>>
>>>
>>> A suitable solution, at least IMHO, would be to have an additional DIH
>>> parameter which determines whether the import call is blocking or
>>> non-blocking (the default). As far as I can see, this could be accomplished,
>>> since Solr can execute more than one import operation at a time (it starts
>>> a new thread for each). Perhaps my question is somehow related to the
>>> discussion [1] on ParallelDataImportHandler.
>>>
>>> Best,
>>> Sascha
>>>
>>> [1] http://www.lucidimagination.com/search/document/a9b26ade46466ee
>>>
>
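
Until a blocking parameter or callback URL exists, the sequence described in the original question (import "foo", wait, then import "bar") can be approximated on the client side. The following is only a rough sketch, not part of DIH: it hides the status polling inside a helper so the caller sees a blocking call. The helper names (parse_status, blocking_import), the polling interval and the timeout are hypothetical, and the sketch assumes that command=status returns XML containing <str name="status"> with the value "busy" while an import is running and "idle" afterwards.

```python
import time
import urllib.request
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Handler URL taken from the example in the quoted message.
DIH_URL = "http://localhost:8983/solr/dataimport"


def parse_status(xml_text):
    """Return the text of <str name="status"> in a DIH response, or None."""
    root = ET.fromstring(xml_text)
    for node in root.iter("str"):
        if node.get("name") == "status":
            return (node.text or "").strip()
    return None


def http_get(url):
    """Fetch a URL and return the response body as text."""
    with urllib.request.urlopen(url) as resp:
        return resp.read().decode("utf-8")


def blocking_import(params, fetch=http_get, sleep=time.sleep,
                    interval=2.0, timeout=3600.0):
    """Start a full-import, then poll command=status until DIH reports idle.

    Returns True once the status is "idle" (assumed value), or False if the
    timeout is exceeded. Assumption: DIH already reports "busy" by the time
    the first status request arrives; otherwise this returns too early.
    """
    fetch(DIH_URL + "?" + urlencode({"command": "full-import", **params}))
    waited = 0.0
    while waited <= timeout:
        if parse_status(fetch(DIH_URL + "?command=status")) == "idle":
            return True
        sleep(interval)
        waited += interval
    return False
```

With this helper, the two imports from the question would be issued as blocking_import({"dbname": "foo"}) followed by blocking_import({"dbname": "bar", "clean": "false"}). The loop is still there, but it lives in one place; a callback URL or an EventListener would avoid it entirely.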



-- 
-----------------------------------------------------
Noble Paul | Principal Engineer| AOL | http://aol.com
