lucene-solr-user mailing list archives

From Lance Norskog <goks...@gmail.com>
Subject Re: Index search optimization for fulltext remote streaming
Date Fri, 11 Jun 2010 22:02:29 GMT
'mergeFactor' should be 5 or 10, not 40k. A value that high lets
thousands of small segment files pile up before any merge, and Solr
has to open all of them; this will not work well.

ramBufferSizeMB is 1G, but the entire Solr JVM has only 1G allocated
(-Xmx1024M), so there may be a lot of garbage collection. Try 50 to
100 megs for ramBufferSizeMB.

A 1G heap is a little small for indexing large numbers of fulltext documents.
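
As a rough sketch of the two changes above, the relevant parts of solrconfig.xml might look like the fragment below. The values 10 and 64 are only illustrative picks from the ranges mentioned (5 or 10, and 50 to 100 megs); they are assumptions, not tuned for this workload, and every other element is meant to stay as in the quoted configuration.

```xml
<!-- Sketch only: illustrative values, not measured on this setup -->
<indexDefaults>
  <!-- was 40000; a small value keeps the number of segment files low -->
  <mergeFactor>10</mergeFactor>
  <!-- was 1024; must fit comfortably inside the 1G JVM heap -->
  <ramBufferSizeMB>64</ramBufferSizeMB>
  <!-- other elements unchanged from the quoted configuration -->
</indexDefaults>

<mainIndex>
  <ramBufferSizeMB>64</ramBufferSizeMB>
  <mergeFactor>10</mergeFactor>
  <!-- other elements unchanged from the quoted configuration -->
</mainIndex>
```

Both sections are shown because the quoted configuration sets these values in both places.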

On Wed, Jun 9, 2010 at 2:59 AM, Danyal Mark <mark.danyal@gmail.com> wrote:
>
> We have the following Solr configuration:
>
> java -Xms512M -Xmx1024M -Dsolr.solr.home=<solr home directory> -jar
> start.jar
>
> in solrconfig.xml:
>
>  <indexDefaults>
>    <useCompoundFile>false</useCompoundFile>
>    <mergeFactor>40000</mergeFactor>
>    <maxBufferedDocs>200000</maxBufferedDocs>
>    <ramBufferSizeMB>1024</ramBufferSizeMB>
>    <maxFieldLength>10000</maxFieldLength>
>    <writeLockTimeout>1000</writeLockTimeout>
>    <commitLockTimeout>10000</commitLockTimeout>
>    <lockType>native</lockType>
>  </indexDefaults>
>
>
> <mainIndex>
>    <useCompoundFile>false</useCompoundFile>
>    <ramBufferSizeMB>1024</ramBufferSizeMB>
>    <mergeFactor>40000</mergeFactor>
>    <!-- Deprecated -->
>    <!--<maxBufferedDocs>10</maxBufferedDocs>-->
>    <!--<maxMergeDocs>2147483647</maxMergeDocs>-->
>    <unlockOnStartup>false</unlockOnStartup>
>    <reopenReaders>true</reopenReaders>
>    <deletionPolicy class="solr.SolrDeletionPolicy">
>      <str name="maxCommitsToKeep">1</str>
>      <str name="maxOptimizedCommitsToKeep">0</str>
>    </deletionPolicy>
>     <infoStream file="INFOSTREAM.txt">false</infoStream>
>  </mainIndex>
>
>
> Also, we have set autoCommit=false. Our PC spec is:
>
> Core2-Duo
> 2GB RAM
> Solr Server running in localhost
> Index Directory is also in local FileSystem
> Input Fulltext files using remoteStreaming from another PC
>
>
> When we indexed 100000 fulltext documents, the total time taken was
> 40 minutes. We want to reduce this time. We have been studying the
> UpdateRequestProcessorChain section:
>
> <requestHandler name="/update" class="solr.XmlUpdateRequestHandler">
>  <lst name="defaults">
>   <str name="update.processor">dedupe</str>
>  </lst>
>  </requestHandler>
>
> How can we use this UpdateRequestProcessorChain in /update/extract/ to run
> indexing in multiple chains (i.e. multiple threads)? Can you suggest whether
> we can speed up the process by changing any of these configurations?
>
> with regards,
> Danyal Mark
> --
> View this message in context: http://lucene.472066.n3.nabble.com/Index-search-optimization-for-fulltext-remote-streaming-tp828274p881809.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>



-- 
Lance Norskog
goksron@gmail.com
