nutch-dev mailing list archives

From "Scott Ganyo (JIRA)" <>
Subject [jira] Commented: (NUTCH-258) Once Nutch logs a SEVERE log item, Nutch fails forevermore
Date Mon, 05 Jun 2006 14:00:30 GMT

Scott Ganyo commented on NUTCH-258:

For the record:  I strongly object to closing this issue for the following reasons:

1) Having the entire system stop processing as a *side-effect* of merely logging a message
at a certain level is poor practice.  In fact, I believe that this would make a fantastic
anti-pattern.  If this kind of behavior is *really* wanted (and I argue below that it should
not be), it should be done through an explicit mechanism, not as a side-effect.  For example,
did you realize that, since Hadoop hijacks and reassigns all log formatters (also a bad practice!)
in the org.apache.hadoop.util.LogFormatter static initializer, anyone who uses Nutch as a
library and logs a SEVERE error will suffer by having Nutch stop fetching?
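
To make the side-effect concrete, here is a minimal sketch of how such a coupling can arise with java.util.logging.  It is an illustration only, not the Hadoop source: the class SevereTrackingFormatter and its installOnRootHandlers() method are hypothetical names, and only the hasLoggedSevere() name is borrowed from the code quoted below.

```java
import java.util.logging.Formatter;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;
import java.util.logging.SimpleFormatter;

// Hypothetical stand-in for a formatter that remembers SEVERE events
// in a static flag. Not the actual Hadoop LogFormatter.
class SevereTrackingFormatter extends Formatter {
    // Static state: shared by every user of this class in the JVM.
    private static volatile boolean loggedSevere = false;

    private final Formatter delegate = new SimpleFormatter();

    static boolean hasLoggedSevere() {
        return loggedSevere;
    }

    @Override
    public String format(LogRecord record) {
        // Side-effect: merely formatting a SEVERE record flips the flag,
        // no matter which component logged it.
        if (record.getLevel().intValue() >= Level.SEVERE.intValue()) {
            loggedSevere = true;
        }
        return delegate.format(record);
    }

    // Assumed analogue of reassigning all log formatters: every handler on
    // the root logger gets this formatter, so unrelated code is affected too.
    static void installOnRootHandlers() {
        for (Handler handler : Logger.getLogger("").getHandlers()) {
            handler.setFormatter(new SevereTrackingFormatter());
        }
    }
}
```

With a formatter like this installed on the root handlers, a SEVERE message logged by any library in the same JVM would set the flag, and any loop that polls hasLoggedSevere() would stop.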

2) Moreover, having the system stop processing forevermore by means of a static(!) flag makes
the use of Nutch as a library within a server or service environment impossible.
 Once this logging is done, no more Fetcher processing can take place in this run *or any
other*.  This is inappropriate.  You might as well call System.exit() at this point!  In fact,
I could even argue that the current behavior is worse than a System.exit(), as it can actually
obscure why the system has ceased to be operational even though it is still ostensibly "running."
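
As a sketch of what an explicit, per-instance mechanism could look like (an assumption for illustration, not a proposed patch; StoppableWorker and its methods are hypothetical names, not Nutch API):

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical worker whose stop condition is explicit and scoped to one
// instance, instead of being inferred from a static logging flag.
class StoppableWorker {
    // Instance state: stopping this worker does not affect any other.
    private final AtomicBoolean stopRequested = new AtomicBoolean(false);

    // The explicit mechanism: a caller decides that this run must end.
    void requestStop() {
        stopRequested.set(true);
    }

    boolean isStopRequested() {
        return stopRequested.get();
    }

    // Processes up to 'items' units of work and returns how many were done
    // before a stop was requested.
    int run(int items) {
        int done = 0;
        for (int i = 0; i < items; i++) {
            if (stopRequested.get()) {
                break; // stop this run only
            }
            done++;
        }
        return done;
    }
}
```

Because the flag belongs to the instance, a later run can simply create a new worker, and the code that decides to stop is easy to find by searching for requestStop().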

Thus, while there definitely *are* instances of inappropriate logging levels being used and
I could document them, I believe that this issue lies more in the system and its architecture
than in the use of a particular logging level for a certain event.

> Once Nutch logs a SEVERE log item, Nutch fails forevermore
> ----------------------------------------------------------
>          Key: NUTCH-258
>          URL:
>      Project: Nutch
>         Type: Bug

>   Components: fetcher
>     Versions: 0.8-dev
>  Environment: All
>     Reporter: Scott Ganyo
>     Priority: Critical
>  Attachments: dumbfix.patch
> Once a SEVERE log item is written, Nutch shuts down any fetching forevermore.  This is
> from the run() method in
>     public void run() {
>       synchronized (Fetcher.this) {activeThreads++;} // count threads
>       try {
>         UTF8 key = new UTF8();
>         CrawlDatum datum = new CrawlDatum();
>         while (true) {
>           if (LogFormatter.hasLoggedSevere())     // something bad happened
>             break;                                // exit
> Notice the last 2 lines.  This will prevent Nutch from ever fetching again once this
> is hit, as LogFormatter is storing this data as a static.
> (Also note that "LogFormatter.hasLoggedSevere()" is also checked in
> and will disable this class as well.)
> This must be fixed or Nutch cannot be run as any kind of long-running service.  Furthermore,
> I believe it is a poor decision to rely on a logging event to determine the state of the
> application - this could have any number of side-effects that would be extremely difficult
> to track down.  (As it has already for me.)

This message is automatically generated by JIRA.