nutch-dev mailing list archives

From "Ken Krugler (JIRA)" <j...@apache.org>
Subject [jira] Commented: (NUTCH-786) Better list of suffix domains
Date Fri, 05 Feb 2010 14:06:27 GMT

    [ https://issues.apache.org/jira/browse/NUTCH-786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12830109#action_12830109 ]

Ken Krugler commented on NUTCH-786:
-----------------------------------

Is this something that should also be applied to crawler-commons? I believe Ian had added
support for finding "Effective TLDs" and that this support included an "effective_tld_names.dat"
file.


> Better list of suffix domains
> -----------------------------
>
>                 Key: NUTCH-786
>                 URL: https://issues.apache.org/jira/browse/NUTCH-786
>             Project: Nutch
>          Issue Type: Improvement
>    Affects Versions: 1.0.0
>            Reporter: Julien Nioche
>            Assignee: Julien Nioche
>             Fix For: 1.1
>
>         Attachments: NUTCH-786.patch
>
>
> Small improvement to the content of domain-suffixes.xml: added compound TLDs for .ar,
> .co, .id, .il, .mx, .nz and .za
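
To illustrate why compound entries matter, here is a minimal sketch of longest-suffix matching against a suffix list. It is not the Nutch or crawler-commons implementation; the function name registered_domain and the small SUFFIXES set are hypothetical, chosen only for illustration, and a real list such as domain-suffixes.xml or effective_tld_names.dat is far larger.

```python
# Hypothetical, hand-picked suffixes for illustration only; not taken from
# domain-suffixes.xml or effective_tld_names.dat.
SUFFIXES = {"com", "org", "nz", "co.nz", "za", "co.za", "mx", "com.mx"}


def registered_domain(host, suffixes=SUFFIXES):
    """Return the longest matching suffix plus one label to its left, or None."""
    labels = host.lower().rstrip(".").split(".")
    # Walk from the longest candidate to the shortest so "co.nz" wins over "nz".
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in suffixes:
            if i == 0:
                return None  # the host is itself a suffix
            return ".".join(labels[i - 1:])
    return None  # no known suffix


print(registered_domain("www.example.co.nz"))
print(registered_domain("www.example.co.nz", suffixes={"nz"}))
```

With the compound entry present, the host resolves to example.co.nz. With only the plain "nz" entry, the same host resolves to co.nz, which would group every site under that suffix into a single domain.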

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

