nutch-dev mailing list archives

From "Alexis (JIRA)" <j...@apache.org>
Subject [jira] Resolved: (NUTCH-950) Content-Length limit, URL filter and few minor issues
Date Mon, 10 Jan 2011 10:30:47 GMT

     [ https://issues.apache.org/jira/browse/NUTCH-950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexis resolved NUTCH-950.
--------------------------

       Resolution: Fixed
    Fix Version/s: 2.0

Sorry, I missed the Ivy configuration file in the plugin directory.

See NUTCH-955 for the new Ivy issue.

> Content-Length limit, URL filter and few minor issues
> -----------------------------------------------------
>
>                 Key: NUTCH-950
>                 URL: https://issues.apache.org/jira/browse/NUTCH-950
>             Project: Nutch
>          Issue Type: Bug
>    Affects Versions: 2.0
>            Reporter: Alexis
>             Fix For: 2.0
>
>         Attachments: nutch1.patch, nutch2.patch, nutch3.patch, nutch4.patch
>
>
> 1. crawl command (nutch1.patch)
> The class was renamed to Crawler but the references to it were not updated.
> 2. URL filter (nutch2.patch)
> This avoids an NPE on bogus URLs whose host does not have a suffix.
> 3. Content-Length limit (nutch3.patch)
> This is related to NUTCH-899.
> The patch prevents the entire flush operation on the Gora datastore from crashing when the MySQL blob limit is exceeded by a few bytes. Both the protocol-http and protocol-httpclient plugins were affected.
> 4. Ivy configuration (nutch4.patch)
> - Change the xercesImpl and restlet versions. Both version changes are required: the current xercesImpl version makes a JUnit test crash, and the current restlet version is missing from the default Maven repository.
> - Add gora-hbase and zookeeper (an HBase dependency), plus the MySQL connector. These jars are necessary to run Gora with the HBase or MySQL datastores (more a suggestion than a requirement).
> - Add com.jcraft/jsch, which is a protocol-sftp plugin dependency. 
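
For readers without the attachments, the kind of null check that item 2 describes might look roughly like the sketch below. This is not the content of nutch2.patch: the class SuffixGuard, the KNOWN_SUFFIXES table and the choice to reject (rather than accept) a URL with no recognised suffix are all assumptions made for illustration.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration only; not the code from nutch2.patch.
final class SuffixGuard {
    // Assumed stand-in for a real domain-suffix table.
    static final Set<String> KNOWN_SUFFIXES =
            new HashSet<>(Arrays.asList("com", "org", "net"));

    // Returns the recognised suffix of the host, or null when there is none
    // (for example "localhost", which has no dot at all).
    static String suffixOf(String host) {
        int dot = host.lastIndexOf('.');
        if (dot < 0) {
            return null;
        }
        String candidate = host.substring(dot + 1).toLowerCase(Locale.ROOT);
        return KNOWN_SUFFIXES.contains(candidate) ? candidate : null;
    }

    // Filter entry point: true keeps the URL, false drops it.
    static boolean accept(String url) {
        String host;
        try {
            host = URI.create(url).getHost();
        } catch (IllegalArgumentException e) {
            return false; // malformed URL
        }
        if (host == null) {
            return false; // no host component at all
        }
        String suffix = suffixOf(host);
        if (suffix == null) {
            // The guard: without it, a later suffix.length() or
            // suffix.equals(...) call would throw a NullPointerException.
            return false;
        }
        return true;
    }
}
```

With this sketch, SuffixGuard.accept("http://localhost/") returns false instead of throwing.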

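The idea behind item 3, never handing the datastore more bytes than the configured limit, can be sketched as a bounded read. The helper below is a hypothetical illustration, not the code from nutch3.patch; the name readLimited and the convention that a negative limit means "no limit" are assumptions. In Nutch the limit would typically come from a property such as http.content.limit, but how the patch wires it in is not stated here.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical illustration only; not the code from nutch3.patch.
final class ContentLimit {
    // Reads at most maxBytes from the stream and returns them.
    // Assumption: a negative maxBytes means the content is not truncated.
    static byte[] readLimited(InputStream in, int maxBytes) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        while (maxBytes < 0 || out.size() < maxBytes) {
            // Never request more than the remaining budget, so the result
            // cannot exceed the limit even by a few bytes.
            int want = maxBytes < 0
                    ? buf.length
                    : Math.min(buf.length, maxBytes - out.size());
            int n = in.read(buf, 0, want);
            if (n < 0) {
                break; // end of stream
            }
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }
}
```

Capping each individual read, rather than checking the total only after a full buffer has been appended, is what keeps the stored content from overshooting the limit by part of a buffer.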
-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

