nutch-dev mailing list archives

From "Yossi Tamari (JIRA)" <>
Subject [jira] [Created] (NUTCH-2715) WARCExporter fails on large records
Date Mon, 06 May 2019 12:21:00 GMT
Yossi Tamari created NUTCH-2715:

             Summary: WARCExporter fails on large records
                 Key: NUTCH-2715
             Project: Nutch
          Issue Type: Bug
    Affects Versions: 1.15
            Reporter: Yossi Tamari

com.martinkl.warc.WARCRecord throws an IllegalStateException when a single line is over 10,000 bytes.
Since this exception is not caught in WARCExporter, it fails the whole export.

I doubt the validity of the limitation in WARCRecord, but regardless, I think WARCExporter
should catch the exception and skip to the next record.
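
A minimal sketch of the proposed catch-and-skip behaviour follows. It does not use the real WARCExporter or com.martinkl.warc.WARCRecord code: buildRecord, exportAll and MAX_LINE_BYTES are hypothetical stand-ins that only mimic the limit described above, and where such a catch would belong in WARCExporter is an assumption.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class SkipLargeRecords {
    // Hypothetical stand-in for the 10,000-byte line limit described in this report.
    static final int MAX_LINE_BYTES = 10_000;

    // Hypothetical stand-in for record construction; NOT the real WARCRecord API.
    // Throws IllegalStateException when any single line exceeds the limit.
    static String buildRecord(String raw) {
        for (String line : raw.split("\n", -1)) {
            if (line.getBytes(StandardCharsets.UTF_8).length > MAX_LINE_BYTES) {
                throw new IllegalStateException(
                    "line exceeds " + MAX_LINE_BYTES + " bytes");
            }
        }
        return raw;
    }

    // Exports every record it can; a record that triggers the exception is
    // logged and skipped instead of aborting the whole export.
    static List<String> exportAll(List<String> rawRecords) {
        List<String> exported = new ArrayList<>();
        int skipped = 0;
        for (String raw : rawRecords) {
            try {
                exported.add(buildRecord(raw));
            } catch (IllegalStateException e) {
                skipped++;
                System.err.println("Skipping record: " + e.getMessage());
            }
        }
        System.err.println("Exported " + exported.size() + ", skipped " + skipped);
        return exported;
    }
}
```

Catching only IllegalStateException, rather than every exception, keeps unrelated failures visible; counting the skipped records makes the data loss explicit in the job output.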

(See also [])

