lucene-solr-user mailing list archives

From marship <mars...@126.com>
Subject Re:Re: How to speed up solr search speed
Date Sat, 17 Jul 2010 09:28:31 GMT
Hi Peter and all,
I merged my indexes today. Each index now stores 10M documents, and I only have 10 solr cores.

And I used 

java -Xmx1g -jar -server start.jar
to start the jetty server.

At first I deployed them all on one server. A search took about 3s. Then I noticed from the
cmd output that, when a search starts, 4 of the 10 cores have a QTime of only about 10ms-500ms.
The other 5 take longer, up to 2-3s. Then I put 6 on the web server and 4 on another server
(the DB server, under high load most of the time). After that, most searches take about 1s.
That's great.

I watched the jetty output in the cmd window on the web server. Now, when each search starts,
I see 2 of the 6 cores take 60ms-80ms and the other 4 take 170ms-700ms. I do believe the
bottleneck is still the hard disk. But at least the search speed is acceptable at the moment.
Maybe I should try MemDisk to see if that helps.
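
Reading per-core timings off the jetty console is error-prone, so the QTime can instead be
pulled out of each core's XML response. The following is only a minimal sketch in Python,
assuming the response layout shown in the quoted message further down; the sample response
and the helper names extract_qtime and slowest_core are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Made-up sample in the same shape as the Solr responses quoted in this thread.
SAMPLE_RESPONSE = """<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">16</int>
  </lst>
  <result name="response" numFound="5981" start="0"/>
</response>"""


def extract_qtime(xml_text):
    """Return the QTime (in ms) from a Solr XML response body."""
    root = ET.fromstring(xml_text)
    node = root.find("./lst[@name='responseHeader']/int[@name='QTime']")
    if node is None:
        raise ValueError("no QTime in response header")
    return int(node.text)


def slowest_core(qtimes_by_core):
    """Return (core_name, qtime) for the core with the largest QTime."""
    return max(qtimes_by_core.items(), key=lambda item: item[1])
```

Since a distributed search waits for every core, the core returned by slowest_core is the
one worth looking at first.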


As for -Xmx1g, I actually only see jetty consume about 150M of memory, even though the index
is now 10x bigger, so I don't think it makes a difference. I googled that -Xmx enlarges the
heap size. I'm not sure whether that can help search. I still have 3.5G of memory free on the server.

Now the issue I found is that searching with the "fq" argument seems to slow down the search.
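
One way to narrow this down is to issue the same query with and without fq and compare the
QTime values. The following is only a minimal sketch in Python, assuming the select URL from
the quoted examples further down; the filter type:product is a placeholder, not taken from
the actual schema, and build_query_url is a hypothetical helper.

```python
from urllib.parse import urlencode

# Base URL as it appears in the single-core examples quoted in this thread.
BASE_URL = "http://localhost:7550/solr/select/"


def build_query_url(q, fq=None, rows=10):
    """Build a select URL, optionally adding a filter query (fq)."""
    params = [("q", q), ("version", "2.2"), ("start", 0),
              ("rows", rows), ("indent", "on")]
    if fq is not None:
        params.append(("fq", fq))
    return BASE_URL + "?" + urlencode(params)


# Same query twice: once plain, once with a placeholder filter.
plain_url = build_query_url("design")
filtered_url = build_query_url("design", fq="type:product")
```

Fetching both URLs several times and comparing the QTime of each response would show how
much of the slowdown comes from the filter itself.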

Thanks All for your help and suggestions.
Thanks.
Regards.
Scott


At 2010-07-17 03:36:19, "Peter Karich" <peathal@yahoo.de> wrote:
>> > Each solr (jetty) instance only consumes 40M-60M of memory.
>
>> java -Xmx1024M -jar start.jar
>
>That's a good suggestion!
>Please, double check that you are using the -server version of the jvm
>and the latest 1.6.0_20 or so.
>
>Additionally you can start jvisualvm (shipped with the jdk) and hook
>into jetty/tomcat easily to see the current CPU and memory load.
>
>> But I have 70 solr cores
>
>if you ask me: I would reduce them to 10-15 or even less and increase
>the RAM.
>try out tomcat too
>
>> solr distributed search's speed is decided by the slowest one.
>
>so, try to reduce the cores
>
>Regards,
>Peter.
>
>> you mentioned that you have a lot of mem free, but your jetty containers
>> are only using between 40 and 60 MB of memory.
>>
>> probably stating the obvious, but have you increased the -Xmx param like for
>> instance:
>> java -Xmx1024M -jar start.jar
>>
>> that way you're configuring the container to use a maximum of 1024 MB ram
>> instead of the standard which is much lower (I'm not sure what exactly but
>> it could well be 64MB for non -server, aligning with what you're seeing)
>>
>> Geert-Jan
>>
>> 2010/7/16 marship <marship@126.com>
>>
>>   
>>> Hi Tom Burton-West.
>>>
>>>  Sorry, it looks like my email ISP filtered out your replies. I checked the
>>> web version of the mailing list and saw your reply.
>>>
>>>  My query string is always simple, like "design", "principle of design",
>>> "tom"
>>>
>>>
>>>
>>> EG:
>>>
>>> URL:
>>> http://localhost:7550/solr/select/?q=design&version=2.2&start=0&rows=10&indent=on
>>>
>>> Response:
>>>
>>> <response>
>>>   <lst name="responseHeader">
>>>     <int name="status">0</int>
>>>     <int name="QTime">16</int>
>>>     <lst name="params">
>>>       <str name="indent">on</str>
>>>       <str name="start">0</str>
>>>       <str name="q">design</str>
>>>       <str name="version">2.2</str>
>>>       <str name="rows">10</str>
>>>     </lst>
>>>   </lst>
>>>   <result name="response" numFound="5981" start="0">
>>>     <doc>
>>>       <str name="id">product_208619</str>
>>>     </doc>
>>>
>>>
>>>
>>>
>>>
>>> EG:
>>> http://localhost:7550/solr/select/?q=Principle&version=2.2&start=0&rows=10&indent=on
>>>
>>> <response>
>>>   <lst name="responseHeader">
>>>     <int name="status">0</int>
>>>     <int name="QTime">94</int>
>>>     <lst name="params">
>>>       <str name="indent">on</str>
>>>       <str name="start">0</str>
>>>       <str name="q">Principle</str>
>>>       <str name="version">2.2</str>
>>>       <str name="rows">10</str>
>>>     </lst>
>>>   </lst>
>>>   <result name="response" numFound="104" start="0">
>>>     <doc>
>>>       <str name="id">product_56926</str>
>>>     </doc>
>>>
>>>
>>>
>>> Since I am querying a single core and the other cores are not being queried
>>> at the same time, the QTime looks good.
>>>
>>> But when I query the distributed node (in this case, 6422ms is still not a
>>> bad one; many take ~20s):
>>>
>>> URL:
>>> http://localhost:7499/solr/select/?q=the+first+world+war&version=2.2&start=0&rows=10&indent=on&debugQuery=true
>>>
>>> Response:
>>>
>>> <response>
>>>   <lst name="responseHeader">
>>>     <int name="status">0</int>
>>>     <int name="QTime">6422</int>
>>>     <lst name="params">
>>>       <str name="debugQuery">true</str>
>>>       <str name="indent">on</str>
>>>       <str name="start">0</str>
>>>       <str name="q">the first world war</str>
>>>       <str name="version">2.2</str>
>>>       <str name="rows">10</str>
>>>     </lst>
>>>   </lst>
>>>   <result name="response" numFound="4231" start="0">
>>>
>>>
>>>
>>> Actually I am thinking about and testing a solution. I believe the bottleneck
>>> is the hard disk, and all our indexes add up to about 10-15G. What if I just
>>> add another 16G of memory to my server, then use "MemDisk" to map a memory disk
>>> and put all my indexes on it? Then each time solr/jetty needs to load an
>>> index from the hard disk, it is loading from memory instead. This should give
>>> solr the most throughput and avoid the hard disk access delay. I am testing ....
>>>
>>> But if there is a way to make solr make better use of our limited resources
>>> and avoid adding new ones, that would be great.
>>>
>>>
>>>
>>>
>>>
>>>
>>>     
>>   
>
>
>-- 
>http://karussell.wordpress.com/
>