lucene-solr-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Boogie Shafer <Boogie.Sha...@proquest.com>
Subject Re: Recovering from Out of Mem
Date Tue, 14 Oct 2014 14:51:06 GMT
yago,

you can put more complex restart logic as shown in the examples below or just do something
similar to the java_oom.sh i posted earlier where you just spit out an email alert and deal
with service restarts and troubleshooting manually


e.g. something like the following for a java_error.sh will drop an email with a timestamp



echo `date` | mail -s "Java Error: General - $HOSTNAME" notify@domain.com


________________________________________
From: Tim Potter <tim.potter@lucidworks.com>
Sent: Tuesday, October 14, 2014 07:35
To: solr-user@lucene.apache.org
Subject: Re: Recovering from Out of Mem

jfyi - the bin/solr script does the following:

-XX:OnOutOfMemoryError="$SOLR_TIP/bin/oom_solr.sh $SOLR_PORT" where
$SOLR_PORT is the port Solr is bound to, e.g. 8983

The oom_solr.sh script looks like:

SOLR_PORT=$1

SOLR_PID=`ps waux | grep start.jar | grep $SOLR_PORT | grep -v grep | awk
'{print $2}' | sort -r`

if [ "$SOLR_PID" == "" ]; then

  echo "Couldn't find Solr process running on port $SOLR_PORT!"

  exit

fi

NOW=$(date +"%F%T")

(

echo "Running OOM killer script for process $SOLR_PID for Solr on port
$SOLR_PORT"

kill -9 $SOLR_PID

echo "Killed process $SOLR_PID"

) | tee solr_oom_killer-$SOLR_PORT-$NOW.log


I usually run Solr behind a supervisor type process (supervisord or
upstart) that will restart it if the process dies.


On Tue, Oct 14, 2014 at 8:09 AM, Markus Jelsma <markus@openindex.io> wrote:

> This will do:
> kill -9 `ps aux | grep -v grep | grep tomcat6 | awk '{print $2}'`
>
> pkill should also work
>
> On Tuesday 14 October 2014 07:02:03 Yago Riveiro wrote:
> > Boogie,
> >
> >
> >
> >
> > Any example for java_error.sh script?
> >
> >
> > —
> > /Yago Riveiro
> >
> > On Tue, Oct 14, 2014 at 2:48 PM, Boogie Shafer <
> Boogie.Shafer@proquest.com>
> >
> > wrote:
> > > a really simple approach is to have the OOM generate an email
> > > e.g.
> > > 1) create a simple script (call it java_oom.sh) and drop it in your
> tomcat
> > > bin dir echo `date` | mail -s "Java Error: OutOfMemory - $HOSTNAME"
> > > notify@domain.com 2) configure your java options (in setenv.sh or
> > > similar) to trigger heap dump and the email script when OOM occurs #
> > > config error behaviors
> > > CATALINA_OPTS="$CATALINA_OPTS -XX:+HeapDumpOnOutOfMemoryError
> > > -XX:HeapDumpPath=$TOMCAT_DIR/temp/tomcat-dump.hprof
> > > -XX:OnError=$TOMCAT_DIR/bin/java_error.sh
> > > -XX:OnOutOfMemoryError=$TOMCAT_DIR/bin/java_oom.sh
> > > -XX:ErrorFile=$TOMCAT_DIR/temp/java_error%p.log"
> > > ________________________________________
> > > From: Mark Miller <markrmiller@gmail.com>
> > > Sent: Tuesday, October 14, 2014 06:30
> > > To: solr-user@lucene.apache.org
> > > Subject: Re: Recovering from Out of Mem
> > > Best is to pass the Java cmd line option that kills the process on OOM
> and
> > > setup a supervisor on the process to restart it.  You need a somewhat
> > > recent release for this to work properly though. - Mark
> > >
> > >> On Oct 14, 2014, at 9:06 AM, Salman Akram
> > >> <salman.akram@northbaysolutions.net> wrote:
> > >>
> > >> I know there are some suggestions to avoid OOM issue e.g. setting
> > >> appropriate Max Heap size etc. However, what's the best way to recover
> > >> from
> > >> it as it goes into non-responding state? We are using Tomcat on back
> end.
> > >>
> > >> The scenario is that once we face OOM issue it keeps on taking queries
> > >> (doesn't give any error) but they just time out. So even though we
> have a
> > >> fail over system implemented but we don't have a way to distinguish if
> > >> these are real time out queries OR due to OOM.
> > >>
> > >> --
> > >> Regards,
> > >>
> > >> Salman Akram
>
>
Mime
View raw message