lucene-solr-user mailing list archives

From Noble Paul നോബിള്‍ नोब्ळ् <noble.p...@corp.aol.com>
Subject Re: [DIH] concurrent requests to DIH
Date Fri, 13 Nov 2009 03:36:02 GMT
I guess SOLR-1352 should solve all the performance problems. I am
currently working on a patch and hope to submit it soon.

On Thu, Nov 12, 2009 at 8:05 PM, Sascha Szott <szott@zib.de> wrote:
> Hi Avlesh,
>
> Avlesh Singh wrote:
>>>
>>> 1. Is it considered as good practice to set up several DIH request
>>> handlers, one for each possible parameter value?
>>>
>> Nothing wrong with this. My assumption is that you want to do this to
>> speed up indexing. Each DIH instance would block all others once a
>> Lucene commit for the former is performed.
> Thanks for this clarification.
>
>>> 2. In case the range of parameter values is broad, it's not convenient to
>>> define separate request handlers for each value. But this entails a
>>> limitation (as far as I see): It is not possible to fire several requests
>>> to the same DIH handler (with different parameter values) at the same
>>> time.
>>>
>> Nope.
>>
>> I had done a similar exercise in my quest to write a
>> ParallelDataImportHandler. This thread might be of interest to you -
>> http://www.lucidimagination.com/search/document/a9b26ade46466ee/queries_regarding_a_paralleldataimporthandler.
>> Though there is a ticket in JIRA, I haven't been able to contribute this
>> back. If you think this is what you need, lemme know.
> Actually, I've already read this thread. In my opinion, both support for
> batch processing and multi-threading are important extensions of DIH's
> current capabilities, though issue SOLR-1352 mainly targets the latter. Is
> your PDIH implementation able to deal with batch processing right now?
>
> Best,
> Sascha
>
>> On Thu, Nov 12, 2009 at 6:35 AM, Sascha Szott <szott@zib.de> wrote:
>>
>>> Hi all,
>>>
>>> I'm using the DIH in a parameterized way by passing request parameters
>>> that are used inside of my data-config. All imports end up in the same
>>> index.
>>>
>>> 1. Is it considered as good practice to set up several DIH request
>>> handlers, one for each possible parameter value?
>>>
>>> 2. In case the range of parameter values is broad, it's not convenient
>>> to define separate request handlers for each value. But this entails a
>>> limitation (as far as I see): It is not possible to fire several requests
>>> to the same DIH handler (with different parameter values) at the same
>>> time. However, in case several request handlers would be used (as in
>>> 1.), concurrent requests (to the different handlers) are possible. So,
>>> how to overcome this limitation?
>>>
>>> Best,
>>> Sascha
>>>
>>
>
>
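
For reference, the approach from question 1 (one DIH request handler per
parameter value, with the imports fired concurrently) could be driven by a
small client script like the sketch below. Everything specific in it is an
assumption for illustration and does not come from this thread: the base URL,
the handler paths /dataimport-a and /dataimport-b, and the custom request
parameter "source", which a data-config could read as
${dataimporter.request.source}. Only command=full-import and clean=false are
standard DIH parameters.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical values for illustration; none of these appear in the thread.
SOLR_BASE = "http://localhost:8983/solr"
HANDLERS = {"a": "/dataimport-a", "b": "/dataimport-b"}


def build_import_url(handler, param_value, base=SOLR_BASE):
    """Build a full-import URL for one DIH request handler.

    clean=false keeps one import from deleting documents indexed by
    another, since all imports end up in the same index.
    """
    query = urlencode({
        "command": "full-import",
        "clean": "false",
        "source": param_value,  # hypothetical custom request parameter
    })
    return f"{base}{handler}?{query}"


def http_get(url):
    """Send a GET request and return the HTTP status code."""
    with urlopen(url) as response:
        return response.status


def run_imports(handlers, fetch=http_get):
    """Send one import request per handler, all at the same time.

    `handlers` maps a parameter value to the handler path configured for
    it. `fetch` is injectable so the function can be tried without Solr.
    """
    urls = {value: build_import_url(path, value)
            for value, path in handlers.items()}
    with ThreadPoolExecutor(max_workers=max(1, len(urls))) as pool:
        futures = {value: pool.submit(fetch, url)
                   for value, url in urls.items()}
        return {value: future.result() for value, future in futures.items()}
```

Calling run_imports(HANDLERS) against a running Solr instance would send one
request to each handler in parallel. This only works around the limitation
from question 2 on the client side; requests to a single handler still cannot
run concurrently.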



-- 
-----------------------------------------------------
Noble Paul | Principal Engineer| AOL | http://aol.com
