lucene-solr-user mailing list archives

From entdeveloper
Subject Re: Plugin Performance Issues
Date Thu, 29 Oct 2009 18:49:52 GMT

Here is where our custom class is referenced in the schema:

<fieldtype name="text_lc" class="solr.TextField" tokenized="false">
  <analyzer type="index">
    <tokenizer class="my.custom.TokenizerFactory"/>
    <filter class="my.custom.FilterFactory" words="stopwords.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldtype>

As you can see, we built our own field type, used at index time, that
essentially acts as a sort of KeywordTokenizer but removes stopwords.  We
share one schema.xml between the master and slave servers for convenience,
but we only index on the master server.  However, with this schema in place
on the slaves, as well as our custom.jar in the solrHome/lib directory, we
run into these issues where memory usage grows and grows without bound.
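
For readers who want a concrete picture of the intended analysis, here is a
minimal plain-Java sketch of "one token per field value, minus stopwords,
lowercased".  It does not use the Lucene/Solr APIs and is not the actual
my.custom.* code; the class name, the stopword set, and the case-insensitive
stopword matching are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class KeywordStopSketch {
    // Hypothetical stand-in for the words listed in stopwords.txt.
    static final Set<String> STOPWORDS =
            new HashSet<String>(Arrays.asList("the", "of", "a"));

    // Treat the whole field value as a single token: drop stopwords,
    // lowercase what is left, and rejoin it into one string.
    // Assumption: stopwords are matched case-insensitively.
    static String analyze(String fieldValue) {
        List<String> kept = new ArrayList<String>();
        for (String word : fieldValue.trim().split("\\s+")) {
            String lower = word.toLowerCase(Locale.ROOT);
            if (!lower.isEmpty() && !STOPWORDS.contains(lower)) {
                kept.add(lower);
            }
        }
        return String.join(" ", kept);
    }

    public static void main(String[] args) {
        // prints "lord rings"
        System.out.println(analyze("The Lord of the Rings"));
    }
}
```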

We did this before (earlier in this thread) with a custom spelling
implementation, and we ran into the same problem.  We have since given up
on that fix, but this is our very next attempt at deploying custom code
using Solr's plugin capability.  Unfortunately, we got the same results.  In
fact, in a previous try, we had simply dropped one of our custom plugin jars
into the lib directory but forgot to deploy the new solrconfig or schema
files that referenced its classes, and the issue still occurred.
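
One way to narrow down growth like this is to check whether it shows up in
the Java heap or in the JVM's non-heap pools.  The sketch below uses only the
standard java.lang.management API; the class and method names are
hypothetical, and it reports on whatever JVM it runs in, so to look at Solr
it would have to run inside the webapp (or the same beans would be read over
JMX).  It is a rough first check, not a substitute for a profiler.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

public class MemoryReportSketch {
    private static final long MB = 1024L * 1024L;

    // Format one usage snapshot as "label: used=NMB committed=NMB".
    static String describe(String label, MemoryUsage usage) {
        return label + ": used=" + (usage.getUsed() / MB) + "MB"
                + " committed=" + (usage.getCommitted() / MB) + "MB";
    }

    // Build a report of heap, non-heap, and per-pool usage for this JVM.
    static String report() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        StringBuilder out = new StringBuilder();
        out.append(describe("heap", memory.getHeapMemoryUsage())).append('\n');
        out.append(describe("non-heap", memory.getNonHeapMemoryUsage())).append('\n');
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage != null) { // getUsage() returns null for an invalid pool
                out.append(describe("  pool " + pool.getName(), usage)).append('\n');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(report());
    }
}
```

Logging this report periodically and comparing snapshots would show which
pool, if any, is the one that keeps growing.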

Anyway, for now we've been able to get around this by packaging the solr.war
with our custom jars in WEB-INF/lib.  Although this is arguably more proper
anyway, it's not nearly as convenient as being able to drop jars into an
external lib directory and let Solr pick up our classes that way.  I'm still
curious whether this is unique to our environment or whether there's a bug
in Solr's classloading for the plugin functionality.
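
To make the difference between the two deployment styles concrete, here is a
generic sketch of loading classes from an external lib directory through a
child class loader.  It uses only java.net.URLClassLoader and is an
illustration of the general mechanism, not Solr's actual plugin loader; the
class name, method name, and default "lib" path are assumptions.

```java
import java.io.File;
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.List;

public class LibDirLoaderSketch {
    // Collect every *.jar in libDir into a class loader that is a child of `parent`.
    static URLClassLoader loaderFor(File libDir, ClassLoader parent) throws IOException {
        List<URL> urls = new ArrayList<URL>();
        File[] files = libDir.listFiles(); // null if libDir is not a directory
        if (files != null) {
            for (File f : files) {
                if (f.isFile() && f.getName().endsWith(".jar")) {
                    urls.add(f.toURI().toURL());
                }
            }
        }
        return new URLClassLoader(urls.toArray(new URL[0]), parent);
    }

    public static void main(String[] args) throws Exception {
        File libDir = new File(args.length > 0 ? args[0] : "lib");
        URLClassLoader loader =
                loaderFor(libDir, LibDirLoaderSketch.class.getClassLoader());
        try {
            System.out.println("jars on loader: " + loader.getURLs().length);
            // Parent-first delegation: the parent is asked before the jars are searched.
            Class<?> c = loader.loadClass("java.lang.String");
            System.out.println(c.getName() + " resolved via the parent chain");
        } finally {
            loader.close(); // releases any jar files the loader opened
        }
    }
}
```

With jars in WEB-INF/lib, by contrast, the servlet container's own webapp
class loader serves the classes and no extra loader of this kind is involved.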

Grant Ingersoll-6 wrote:
> I would guess that your code is being used.  I'm not sure what you  
> mean by it "was only referenced in the schema".  That implies usage to  
> me.  Is it a new field type?  What is your plugin doing?
> Have you tried setting breakpoints at method entry points in your
> plugin and starting up Solr with a debugger attached?
> -Grant
> On Oct 28, 2009, at 4:54 PM, entdeveloper wrote:
>> This is an issue we experienced a while back.  We once again tried to
>> load a custom class as a plugin jar from the lib directory and began
>> experiencing severe memory problems again.  The code in our jar wasn't
>> being used at all...the class was only referenced in the schema.  I find
>> it strange that no one else has experienced this, but we're not doing
>> anything particularly complex, which is still leading me to believe that
>> there is something strange going on with Solr's class loading for this
>> lib directory.  Perhaps it is something specific with our environment
>> (specs below)?
>> java version "1.6.0_05"
>> Java(TM) SE Runtime Environment (build 1.6.0_05-b13)
>> Java HotSpot(TM) 64-Bit Server VM (build 10.0-b19, mixed mode)
>> Tomcat 6.0.16
>> Linux 2.6.9-35.ELsmp #1 SMP Thu Jun 1 14:31:29 PDT 2006 x86_64 x86_64 x86_64 GNU/Linux
>> Max heap set to 1GB.
>> With the jars in the plugin directory, RAM usage increases by 1.5 - 2GB,
>> increasing at about 200MB/hr.
>> hossman wrote:
>>> : I'm not entirely convinced that it's related to our code, but it could be.
>>> : Just trying to get a sense if other plugins have had similar problems, just
>>> : by the nature of using Solr's resource loading from the /lib directory.
>>> Plugins aren't something that every Solr user uses -- but enough people use
>>> them that if there was a fundamental memory leak just from loading plugin
>>> jars i'm guessing more people would be complaining.
>>> I use plugins in several solr instances, and i've never noticed any
>>> problems like you describe -- but i don't personally use tomcat.
>>> Otis is right on the money: you need to use profiling tools to really look
>>> at the heap and see what's taking up all that ram.
>>> Alternately: a quick way to rule out the special plugin class loader would
>>> be to embed your custom handler directly into the solr.war ("The Old Way"
>>> on the SolrPlugins wiki) ... if you still have problems, then the cause
>>> isn't the plugin classloader.
>>> -Hoss
> --------------------------
> Grant Ingersoll
