lucene-solr-user mailing list archives

From Mike Klaas <mike.kl...@gmail.com>
Subject Re: Indexing very large files.
Date Thu, 06 Sep 2007 20:55:45 GMT
On 6-Sep-07, at 2:26 AM, Brian Carmalt wrote:

> Hello again,
>
> I checked out the Solr source and built the 1.3-dev version, then
> tried to index the same file on the new server. I get a different
> exception trace, but the result is the same.

Note that StringBuilder expands its capacity by allocating a new buffer
and copying the old one into it, so both buffers are live during that
operation. The new buffer is usually a good fraction bigger than the
old one (traditionally 2x, though growth of 1/8 or 1/4 is also
common), so simply storing the text for that one document could
require 600-700MB during the expansion. Then you have the overhead for
the doc and all of Solr's other memory requirements. The serialized
XML may also be in memory, which brings us close to a gig.
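
A rough sketch of that arithmetic, in case it helps. It assumes 2-byte
chars in the buffer, and the buffer size in main() is a made-up figure
for illustration, not a measurement of Brian's file. The class and
method names are hypothetical as well.

```java
public class ExpansionEstimate {
    // Assumption: the builder stores UTF-16 chars, 2 bytes each.
    static final int BYTES_PER_CHAR = 2;

    // Bytes live at the moment of expansion: the old buffer plus the
    // newly allocated, larger buffer that the old one is copied into.
    static long peakBytes(long oldCapacityChars, double growthFactor) {
        long newCapacityChars = (long) Math.ceil(oldCapacityChars * growthFactor);
        return (oldCapacityChars + newCapacityChars) * BYTES_PER_CHAR;
    }

    public static void main(String[] args) {
        // Hypothetical buffer that already holds 100M chars.
        long oldChars = 100L * 1024 * 1024;
        for (double growth : new double[] {1.125, 1.25, 2.0}) {
            long mb = peakBytes(oldChars, growth) / (1024 * 1024);
            System.out.println("growth x" + growth + " -> about " + mb + " MB at peak");
        }
    }
}
```

The point is only that the peak depends on old plus new buffer, so even
a modest growth factor more than doubles the footprint of the text.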

Until Solr grows special support for processing huge docs without
copying, just increase your JVM heap size while indexing documents
this large. (Note that other input methods, like CSV, might behave
better, but I haven't examined them to verify.)
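
If you raise the heap (for example with the standard -Xmx JVM option),
a quick way to confirm the setting took effect is to ask the runtime
for its limit. This is just a sketch; HeapCheck is a made-up class
name, and it would need to run with the same JVM options as your
servlet container to tell you anything useful.

```java
public class HeapCheck {
    // Maximum heap the JVM will attempt to use, in megabytes.
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    public static void main(String[] args) {
        System.out.println("max heap: " + maxHeapMb() + " MB");
    }
}
```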

-Mike

> java.lang.OutOfMemoryError: Java heap space
>    at java.util.Arrays.copyOf(Arrays.java:2882)
>    at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:100)
>    at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:390)
>    at java.lang.StringBuilder.append(StringBuilder.java:119)
>    at org.apache.solr.handler.XmlUpdateRequestHandler.readDoc(XmlUpdateRequestHandler.java:310)
>    at org.apache.solr.handler.XmlUpdateRequestHandler.processUpdate(XmlUpdateRequestHandler.java:181)
>    at org.apache.solr.handler.XmlUpdateRequestHandler.handleRequestBody(XmlUpdateRequestHandler.java:109)
>    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:78)
>    at org.apache.solr.core.SolrCore.execute(SolrCore.java:723)
>    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:193)
>    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:161)
>    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
>    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
>    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:230)
>    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:175)
>    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
>    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:104)
>    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
>    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:261)
>    at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:844)
>    at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:581)
>    at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
>    at java.lang.Thread.run(Thread.java:619)

