lucene-solr-user mailing list archives

From ku3ia <dem...@gmail.com>
Subject Re: Poor performance on distributed search
Date Mon, 19 Dec 2011 09:35:06 GMT
Hi, Erick. Thanks for your advice.
>>Here's another test. Add &debugQuery=on to your query and post the
>>results.
Here are the results for 2K rows:
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">53153</int>
<lst name="params">
<str name="debugQuery">on</str>
<str name="fl">*,score</str>
<str name="shards">
127.0.0.1:8080/solr/shard1,127.0.0.1:8080/solr/shard2,127.0.0.1:8080/solr/shard3,127.0.0.1:8080/solr/shard4
</str>
<str name="ident">true</str>
<str name="start">0</str>
<str name="q">(mainstreaming)</str>
<str name="rows">2000</str>
</lst>
</lst>
<result name="response" numFound="2305" start="0" maxScore="4.657284">
>>>Here 2K docs<<<
</result>
<lst name="debug">
<str name="rawquerystring">(mainstreaming)</str>
<str name="querystring">(mainstreaming)</str>
<str name="parsedquery">ArticleText:mainstream</str>
<str name="parsedquery_toString">ArticleText:mainstream</str>
<str name="QParser">LuceneQParser</str>
<lst name="timing">
<double name="time">67797.0</double>
<lst name="prepare">
<double name="time">73.0</double>
<lst name="org.apache.solr.handler.component.QueryComponent">
<double name="time">72.0</double>
</lst>
<lst name="org.apache.solr.handler.component.FacetComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.MoreLikeThisComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.HighlightComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.StatsComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.DebugComponent">
<double name="time">0.0</double>
</lst>
</lst>
<lst name="process">
<double name="time">67724.0</double>
<lst name="org.apache.solr.handler.component.QueryComponent">
<double name="time">66607.0</double>
</lst>
<lst name="org.apache.solr.handler.component.FacetComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.MoreLikeThisComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.HighlightComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.StatsComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.DebugComponent">
<double name="time">1115.0</double>
</lst>
</lst>
</lst>
<lst name="explain">
...
</lst>
</lst>
</response>

And here they are for 10 rows:
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">3626</int>
<lst name="params">
<str name="debugQuery">on</str>
<str name="fl">*,score</str>
<str name="shards">
127.0.0.1:8080/solr/shard1,127.0.0.1:8080/solr/shard2,127.0.0.1:8080/solr/shard3,127.0.0.1:8080/solr/shard4
</str>
<str name="ident">true</str>
<str name="start">0</str>
<str name="q">(mainstreaming)</str>
<str name="rows">10</str>
</lst>
</lst>
<result name="response" numFound="2305" start="0" maxScore="4.657284">
>>>Here 10 docs<<<
</result>
<lst name="debug">
<str name="rawquerystring">(mainstreaming)</str>
<str name="querystring">(mainstreaming)</str>
<str name="parsedquery">ArticleText:mainstream</str>
<str name="parsedquery_toString">ArticleText:mainstream</str>
<str name="QParser">LuceneQParser</str>
<lst name="timing">
<double name="time">566.0</double>
<lst name="prepare">
<double name="time">17.0</double>
<lst name="org.apache.solr.handler.component.QueryComponent">
<double name="time">17.0</double>
</lst>
<lst name="org.apache.solr.handler.component.FacetComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.MoreLikeThisComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.HighlightComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.StatsComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.DebugComponent">
<double name="time">0.0</double>
</lst>
</lst>
<lst name="process">
<double name="time">549.0</double>
<lst name="org.apache.solr.handler.component.QueryComponent">
<double name="time">353.0</double>
</lst>
<lst name="org.apache.solr.handler.component.FacetComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.MoreLikeThisComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.HighlightComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.StatsComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.DebugComponent">
<double name="time">196.0</double>
</lst>
</lst>
</lst>
<lst name="explain">
...
</lst>
</lst>
</response>
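
To compare the two timing sections component by component, the debug output can be parsed with a few lines of Python. This is only a sketch: the helper name `component_times` is made up for illustration, and the sample is a trimmed copy of the "process" timings from the 2K-row response above.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the "timing" section from the 2K-row response.
SAMPLE = """
<lst name="debug">
  <lst name="timing">
    <double name="time">67797.0</double>
    <lst name="process">
      <double name="time">67724.0</double>
      <lst name="org.apache.solr.handler.component.QueryComponent">
        <double name="time">66607.0</double>
      </lst>
      <lst name="org.apache.solr.handler.component.DebugComponent">
        <double name="time">1115.0</double>
      </lst>
    </lst>
  </lst>
</lst>
"""

def component_times(debug_xml, phase):
    """Return {component short name: time in ms} for 'prepare' or 'process'."""
    root = ET.fromstring(debug_xml)
    phase_node = root.find(f".//lst[@name='timing']/lst[@name='{phase}']")
    return {
        # Keep only the class name, e.g. "QueryComponent".
        comp.get("name").rsplit(".", 1)[-1]: float(comp.find("double").text)
        for comp in phase_node.findall("lst")
    }

times = component_times(SAMPLE, "process")
for name, ms in sorted(times.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {ms:.0f} ms")
```

Run against both responses, this makes it easy to see that QueryComponent accounts for almost all of the "process" time in the 2K-row case.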

>>Also, I really have a hard time seeing what advantage you get from 
>>putting all those shards on the same machine, you're just creating 
>>extra work.
Yes, in production I have 5 servers with 6 (big) shards on each.
I also tried using only one shard per server (five shards in total), but
the results weren't good.

>>Although there's one other possibility: By returning 2,000 rows, you 
>>require that each shard assemble a list of the top 2,000 documents 
>>and then they are collated into a single packet, so you're asking 
>>the system to do a lot of list processing.
So, as I understand it, my main problem is fetching 2000 rows from each shard?

P.S. Is there any mechanism to, for example, get the top 100 rows from each
shard, just merge them, sort by the field defined in the query or by score,
and return the result to the user?
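
To make the question concrete, here is a minimal sketch of the kind of merge meant above. It is not Solr's API: the function `merge_top_n` and the sample documents are hypothetical, and it assumes each shard returns its rows already sorted by score, highest first.

```python
import heapq
from itertools import islice

def merge_top_n(shard_results, n):
    """Merge per-shard lists (each sorted by score, descending)
    and return the n best documents overall."""
    merged = heapq.merge(*shard_results,
                         key=lambda doc: doc["score"],
                         reverse=True)
    return list(islice(merged, n))

# Hypothetical per-shard responses: each shard sends only its own top rows.
shard1 = [{"id": "a", "score": 4.6}, {"id": "b", "score": 2.1}]
shard2 = [{"id": "c", "score": 3.9}, {"id": "d", "score": 0.7}]
shard3 = [{"id": "e", "score": 4.1}]

top = merge_top_n([shard1, shard2, shard3], 3)
print([doc["id"] for doc in top])
```

One caveat with this approach: merging only the top 100 of each shard gives a correct global top 100, but not a correct top 2000, since all of the best 2000 documents could in principle sit on a single shard.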

--
View this message in context: http://lucene.472066.n3.nabble.com/Poor-performance-on-distributed-search-tp3590028p3597893.html
Sent from the Solr - User mailing list archive at Nabble.com.