lucene-solr-user mailing list archives

From <Anand.Ni...@rbs.com>
Subject RE: OutOfMemoryError coming from TermVectorsReader
Date Fri, 23 Sep 2011 08:39:35 GMT
Thanks Otis,

I am able to show the results such that the last match (500 characters around the match) in
the log file is shown highlighted. I can try creating multiple documents from one log file
to see if it improves the performance.

Can anything else be done to reduce the heap size?

Anand Nigam
RBS Global Banking & Markets
Office: +91 124 492 5506   

-----Original Message-----
From: Otis Gospodnetic [mailto:otis_gospodnetic@yahoo.com] 
Sent: 23 September 2011 09:35
To: solr-user@lucene.apache.org
Subject: Re: OutOfMemoryError coming from TermVectorsReader

Anand,

But do you really want the whole log file to be a single Solr document? (From a cursory look
at the thread, that seems to be the case.) Why not break up a log file into multiple documents,
e.g. one Solr document per log message? Not only will that solve your memory issues,
but I think it also makes more sense if the intention is for a person to do a search and then
look at the matched log messages. It is much easier to point a person to a short log doc than
to a giant one through which the person then has to do a manual find.
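
The splitting described above could look roughly like the sketch below. It is only an
illustration, not something taken from the thread. It assumes each log message starts with
a timestamp such as "2011-09-16 09:38:45.763" and that following lines (stack-trace frames,
for example) belong to the same message. The names split_log, make_doc, "id" and "FileName"
are hypothetical; only the "Content" field comes from the schema quoted further down.

```python
import re

# Assumption: a new log message begins with a timestamp like
# "2011-09-16 09:38:45"; any other line continues the previous message.
MESSAGE_START = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}")


def make_doc(file_name, start_line, lines):
    """Build one small document (a plain dict) for a single log message."""
    return {
        "id": f"{file_name}#{start_line}",  # hypothetical unique key
        "FileName": file_name,              # hypothetical field name
        "Content": "\n".join(lines),        # field name from the quoted schema
    }


def split_log(file_name, text):
    """Split the text of one log file into one document per log message."""
    docs = []
    current, start_line = [], 1
    for line_no, line in enumerate(text.splitlines(), start=1):
        # A timestamp line closes the message collected so far.
        if MESSAGE_START.match(line) and current:
            docs.append(make_doc(file_name, start_line, current))
            current, start_line = [], line_no
        current.append(line)
    if current:
        docs.append(make_doc(file_name, start_line, current))
    return docs
```

Each resulting dict is small, so the term vector read for highlighting covers one log
message rather than a whole 100 MB file. How the dicts are posted to Solr (client library,
batch size) is left open here.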

Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem search :: http://search-lucene.com/


----- Original Message -----
> From: "Anand.Nigam@rbs.com" <Anand.Nigam@rbs.com>
> To: solr-user@lucene.apache.org
> Cc: 
> Sent: Thursday, September 22, 2011 11:56 PM
> Subject: RE: OutOfMemoryError coming from TermVectorsReader
> 
> Hi,
> 
> I am trying to index application log files and some database tables.
> The size of the log files ranges from 1 MB to 100 MB. The database tables
> also have a few thousand rows.
> 
> I have used the term vector highlighter for the content of the log files,
> as mentioned below:
> 
> Heap size : 10 GB
> OS: Linux, 64 bit
> Solr version : 3.4.0
> 
> Thanks & Regards
> Anand
> 
> 
> 
> Anand Nigam
> RBS Global Banking & Markets
> Office: +91 124 492 5506
> 
> -----Original Message-----
> From: Glen Newton [mailto:glen.newton@gmail.com]
> Sent: 19 September 2011 16:52
> To: solr-user@lucene.apache.org
> Subject: Re: OutOfMemoryError coming from TermVectorsReader
> 
> Please include information about your heap size (and other Java command
> line arguments), as well as platform OS (version, swap size, etc.), Java
> version, and underlying hardware (RAM, etc.) for us to better help you.
> 
> From the information you have given, increasing your heap size should help.
> 
> Thanks,
> Glen
> 
> http://zzzoot.blogspot.com/
> 
> 
> On Mon, Sep 19, 2011 at 1:34 AM,  <Anand.Nigam@rbs.com> wrote:
>>  Hi,
>> 
>>  I am new to Solr. I am trying to index large text documents. On
>>  searching the indexed documents I am getting the following
>>  OutOfMemoryError. Please help me resolve this issue.
>> 
>>  The field which stores file content is configured in schema.xml as below:
>> 
>> 
>>  <field name="Content" type="text_token" indexed="true" stored="true"
>>         omitNorms="true" termVectors="true" termPositions="true"
>>         termOffsets="true" />
>> 
>>  and Highlighting is configured as below:
>> 
>> 
>>  <str name="hl">on</str>
>> 
>>  <str name="hl.fl">${all.fields.list}</str>
>> 
>>  <str name="f.Content.hl.fragsize">500</str>
>> 
>>  <str name="f.Content.hl.useFastVectorHighlighter">true</str>
>> 
>> 
>> 
>>  2011-09-16 09:38:45.763 [http-thread-pool-9091(5)] ERROR -
>>  java.lang.OutOfMemoryError: Java heap space
>>         at org.apache.lucene.index.TermVectorsReader.readTermVector(TermVectorsReader.java:503)
>>         at org.apache.lucene.index.TermVectorsReader.get(TermVectorsReader.java:263)
>>         at org.apache.lucene.index.TermVectorsReader.get(TermVectorsReader.java:284)
>>         at org.apache.lucene.index.SegmentReader.getTermFreqVector(SegmentReader.java:759)
>>         at org.apache.lucene.index.DirectoryReader.getTermFreqVector(DirectoryReader.java:510)
>>         at org.apache.solr.search.SolrIndexReader.getTermFreqVector(SolrIndexReader.java:234)
>>         at org.apache.lucene.search.vectorhighlight.FieldTermStack.<init>(FieldTermStack.java:83)
>>         at org.apache.lucene.search.vectorhighlight.FastVectorHighlighter.getFieldFragList(FastVectorHighlighter.java:175)
>>         at org.apache.lucene.search.vectorhighlight.FastVectorHighlighter.getBestFragments(FastVectorHighlighter.java:166)
>>         at org.apache.solr.highlight.DefaultSolrHighlighter.doHighlightingByFastVectorHighlighter(DefaultSolrHighlighter.java:509)
>>         at org.apache.solr.highlight.DefaultSolrHighlighter.doHighlighting(DefaultSolrHighlighter.java:376)
>>         at org.apache.solr.handler.component.HighlightComponent.process(HighlightComponent.java:116)
>>         at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:194)
>>         at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
>>         at org.apache.solr.core.SolrCore.execute(SolrCore.java:1368)
>>         at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:356)
>>         at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:252)
>>         at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:256)
>>         at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:215)
>>         at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:279)
>>         at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:175)
>>         at org.apache.catalina.core.StandardPipeline.doInvoke(StandardPipeline.java:655)
>>         at org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:595)
>>         at com.sun.enterprise.web.WebPipeline.invoke(WebPipeline.java:98)
>>         at com.sun.enterprise.web.PESessionLockingStandardPipeline.invoke(PESessionLockingStandardPipeline.java:91)
>>         at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:162)
>>         at org.apache.catalina.connector.CoyoteAdapter.doService(CoyoteAdapter.java:326)
>>         at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:227)
>>         at com.sun.enterprise.v3.services.impl.ContainerMapper.service(ContainerMapper.java:170)
>>         at com.sun.grizzly.http.ProcessorTask.invokeAdapter(ProcessorTask.java:822)
>>         at com.sun.grizzly.http.ProcessorTask.doProcess(ProcessorTask.java:719)
>>         at com.sun.grizzly.http.ProcessorTask.process(ProcessorTask.java:1013)
>> 
>>  Thanks & Regards
>>  Anand Nigam
>>  Developer
>> 
