nutch-dev mailing list archives

From "Sebastian Nagel (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (NUTCH-1182) fetcher to log hung threads
Date Thu, 24 Apr 2014 22:15:16 GMT

    [ https://issues.apache.org/jira/browse/NUTCH-1182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13980389#comment-13980389 ]

Sebastian Nagel commented on NUTCH-1182:
----------------------------------------

Changed title: shutting down hung threads isn't addressed now.

> fetcher to log hung threads
> ---------------------------
>
>                 Key: NUTCH-1182
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1182
>             Project: Nutch
>          Issue Type: Bug
>          Components: fetcher
>    Affects Versions: 1.3, 1.4
>         Environment: Linux, local job runner
>            Reporter: Sebastian Nagel
>            Assignee: Sebastian Nagel
>            Priority: Minor
>             Fix For: 2.4, 1.9
>
>         Attachments: NUTCH-1182-2x.patch, NUTCH-1182-trunk-v1.patch
>
>
> While crawling a slow server with a couple of very large PDF documents (30 MB) on it,
> after some time and a batch of successfully fetched documents the fetcher stops
> with the message: ??Aborting with 10 hung threads.??
> From then on every cycle ends with hung threads, and almost no documents are fetched
> successfully. In addition, strange Hadoop errors are logged:
> {noformat}
>    fetch of http://.../xyz.pdf failed with: java.lang.NullPointerException
>     at java.lang.System.arraycopy(Native Method)
>     at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$Buffer.write(MapTask.java:1108)
>     ...
> {noformat}
> or
> {noformat}
>    Exception in thread "QueueFeeder" java.lang.NullPointerException
>          at org.apache.hadoop.fs.BufferedFSInputStream.getPos(BufferedFSInputStream.java:48)
>          at org.apache.hadoop.fs.FSDataInputStream.getPos(FSDataInputStream.java:41)
>          at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.readChunk(ChecksumFileSystem.java:214)
> {noformat}
> I've run the debugger and found:
> # after the "hung threads" are reported the fetcher stops, but the threads are still alive and continue fetching a document. As a consequence:
> #* the already small bandwidth of the network/server is limited even more
> #* after the document is fetched the thread tries to write the content via {{output.collect()}}, which must fail because the fetcher map job has already finished and the associated temporary mapred directory has been deleted. The error message may get mixed with the progress output of the next fetch cycle, causing additional confusion.
> # documents/URLs causing the hung threads are neither reported nor stored. That is, it's hard to track them down, and they will cause a hung thread again and again.
> The problem is reproducible when fetching bigger documents and setting {{mapred.task.timeout}} to a low value (this will definitely cause hung threads).
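
As an illustration of the second point in the description (URLs causing hung threads are never reported), the following is a minimal sketch of how the URL each thread is working on could be remembered and listed at abort time. It is not taken from the attached patches: the class HungThreadTracker, its methods and the log wording are hypothetical, and timestamps are passed in explicitly to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical helper (not Nutch code): remembers which URL each fetcher thread is busy with. */
class HungThreadTracker {

  /** URL and start time of the fetch a thread is currently working on. */
  private static final class Active {
    final String url;
    final long startMillis;

    Active(String url, long startMillis) {
      this.url = url;
      this.startMillis = startMillis;
    }
  }

  // Keyed by thread name; concurrent because every fetcher thread updates it.
  private final Map<String, Active> active = new ConcurrentHashMap<>();

  /** Called by a fetcher thread right before it starts fetching a URL. */
  void started(String threadName, String url, long nowMillis) {
    active.put(threadName, new Active(url, nowMillis));
  }

  /** Called by a fetcher thread once the fetch has succeeded or failed. */
  void finished(String threadName) {
    active.remove(threadName);
  }

  /** Returns one message per thread that has been busy for at least timeoutMillis. */
  List<String> report(long nowMillis, long timeoutMillis) {
    List<String> lines = new ArrayList<>();
    for (Map.Entry<String, Active> e : active.entrySet()) {
      long elapsed = nowMillis - e.getValue().startMillis;
      if (elapsed >= timeoutMillis) {
        lines.add("Thread " + e.getKey() + " hung while fetching "
            + e.getValue().url + " (" + elapsed + " ms)");
      }
    }
    return lines;
  }
}
```

In this sketch the fetcher would call report() just before aborting and write each returned line to its log, so the offending URLs can be tracked down afterwards. Stopping the threads themselves is a separate problem and is not covered here.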



--
This message was sent by Atlassian JIRA
(v6.2#6252)
