lucene-solr-user mailing list archives

From MitchK <mitc...@web.de>
Subject Re: a bug of solr distributed search
Date Mon, 26 Jul 2010 06:06:01 GMT

Good morning,

https://issues.apache.org/jira/browse/SOLR-1632

- Mitch


Li Li wrote:
> 
> Where is the link to this patch?
> 
> 2010/7/24 Yonik Seeley <yonik@lucidimagination.com>:
>> On Fri, Jul 23, 2010 at 2:23 PM, MitchK <mitch91@web.de> wrote:
>>> Why don't we send the TermsComponent output of every node in the
>>> cluster to a Hadoop instance?
>>> Since TermsComponent does the map part of the map-reduce concept, Hadoop
>>> only needs to do the reduce. Maybe we do not even need Hadoop for this.
>>> After the reduce, every node in the cluster gets the current values it
>>> needs to compute the idf.
>>> We can store this information in a HashMap-based SolrCache (or something
>>> like that) to provide constant-time access. To keep the values up to
>>> date, we can repeat this every x minutes.
>>
>> There's already a patch in JIRA that does distributed IDF.
>> Hadoop wouldn't be the right tool for that anyway... it's for
>> batch-oriented systems, not low-latency queries.
>>
>>> Once we have that, it does not matter whether we use doc_X from shard_A
>>> or shard_B, since both copies will have the same score.
>>
>> That only works if the docs are exactly the same - they may not be.
>>
>> -Yonik
>> http://www.lucidimagination.com
>>
> 
> 
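For readers following the quoted proposal, here is a minimal sketch of the reduce step it describes: each shard reports its document count and per-term document frequencies, the values are summed, and a global idf is computed and kept in a plain dictionary for constant-time lookup. This is not the SOLR-1632 patch and not Solr code. The function names and the input layout are hypothetical, and the idf formula 1 + ln(N / (df + 1)) is assumed here because it matches Lucene's classic default similarity.

```python
import math
from collections import Counter


def merge_shard_stats(shard_stats):
    """Reduce step: sum document counts and per-term document frequencies.

    shard_stats is a list of (num_docs, {term: doc_freq}) tuples, one per
    shard. This layout is an assumption made for illustration only.
    """
    total_docs = 0
    global_df = Counter()
    for num_docs, term_df in shard_stats:
        total_docs += num_docs
        global_df.update(term_df)  # adds the counts term by term
    return total_docs, dict(global_df)


def global_idf(total_docs, global_df):
    """Build a term -> idf lookup table from the merged statistics.

    Assumes the classic Lucene formula: idf = 1 + ln(N / (df + 1)).
    """
    return {
        term: 1.0 + math.log(total_docs / (df + 1))
        for term, df in global_df.items()
    }


if __name__ == "__main__":
    # Two hypothetical shards with made-up statistics.
    shard_a = (100, {"solr": 10, "lucene": 4})
    shard_b = (300, {"solr": 30, "hadoop": 3})

    total_docs, merged_df = merge_shard_stats([shard_a, shard_b])
    idf_cache = global_idf(total_docs, merged_df)
    for term in sorted(idf_cache):
        print(term, round(idf_cache[term], 4))
```

Re-running the merge every x minutes, as suggested in the quoted message, would refresh the table. Note that summing document frequencies counts a document once per shard that holds it, so the result is only exact when shards do not overlap.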
-- 
View this message in context: http://lucene.472066.n3.nabble.com/a-bug-of-solr-distributed-search-tp983533p995407.html
Sent from the Solr - User mailing list archive at Nabble.com.
