nutch-dev mailing list archives

From "Giuseppe Totaro (JIRA)" <j...@apache.org>
Subject [jira] [Created] (NUTCH-1998) Add support for user-defined file extension to CommonCrawlDataDumper
Date Wed, 22 Apr 2015 18:40:59 GMT
Giuseppe Totaro created NUTCH-1998:
--------------------------------------

             Summary: Add support for user-defined file extension to CommonCrawlDataDumper
                 Key: NUTCH-1998
                 URL: https://issues.apache.org/jira/browse/NUTCH-1998
             Project: Nutch
          Issue Type: Improvement
          Components: tool
            Reporter: Giuseppe Totaro
            Priority: Minor


The {{CommonCrawlDataDumper}} tool generates CBOR-encoded files from data crawled by Nutch,
using the Common Crawl format. By default, {{CommonCrawlDataDumper}} keeps each file's
original extension.
We are going to add support for a command-line option (e.g., {{-extension}}) that lets the
user provide a file extension to use in place of the original one.
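
As a rough illustration of the intended behavior, here is a minimal Java sketch of swapping a file name's extension for a user-supplied one. The class {{ExtensionSketch}} and the method {{withExtension}} are hypothetical names made up for this example and are not part of Nutch; how {{CommonCrawlDataDumper}} would actually parse and apply the option is not specified here.

```java
public class ExtensionSketch {

    /**
     * Hypothetical helper (not Nutch code): returns fileName with its
     * extension replaced by the user-supplied one. Works on bare file
     * names only, not on paths that contain directory separators.
     */
    public static String withExtension(String fileName, String extension) {
        if (extension == null) {
            return fileName; // no override given: keep the original extension
        }
        // Accept both "cbor" and ".cbor" as input.
        String ext = extension.startsWith(".") ? extension.substring(1) : extension;
        if (ext.isEmpty()) {
            return fileName; // nothing usable was supplied
        }
        int dot = fileName.lastIndexOf('.');
        // dot <= 0 covers names without an extension and dot-files like ".hidden".
        String base = dot > 0 ? fileName.substring(0, dot) : fileName;
        return base + "." + ext;
    }

    public static void main(String[] args) {
        System.out.println(withExtension("index.html", "cbor"));     // index.cbor
        System.out.println(withExtension("archive.tar.gz", ".cbor")); // archive.tar.cbor
        System.out.println(withExtension("README", "cbor"));         // README.cbor
        System.out.println(withExtension("index.html", null));       // index.html
    }
}
```

In this sketch only the last extension is replaced, so {{archive.tar.gz}} becomes {{archive.tar.cbor}}; whether the real option should treat compound extensions differently is an open design choice.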



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
