nutch-dev mailing list archives

From "Markus Jelsma (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (NUTCH-1660) Index filter for Page's latitude and longitude
Date Wed, 01 Oct 2014 12:41:34 GMT

    [ https://issues.apache.org/jira/browse/NUTCH-1660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14154755#comment-14154755 ]

Markus Jelsma commented on NUTCH-1660:
--------------------------------------

This may indeed be useful, but will it be reliable? In some cases we need the location of the
website or, better, the location of its intended audience, for timezone detection.
But GeoIP databases are far from reliable (although paid versions improve on that), and
the IP address itself is not a reliable indicator of location.

Do you guys have a workaround for that? Or do you just accept the error rate?

> Index filter for Page's latitude and longitude
> ----------------------------------------------
>
>                 Key: NUTCH-1660
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1660
>             Project: Nutch
>          Issue Type: New Feature
>    Affects Versions: 2.2.1
>            Reporter: Talat UYARER
>            Assignee: Talat UYARER
>            Priority: Minor
>             Fix For: 2.4
>
>         Attachments: index-more.patch
>
>
> I have seen some discussion about storing a page's IP. I think that if we have the page's
> IP, we can index the page's geographic position as latitude and longitude. That would be
> useful for location-based searches.
> [~icebergx5] I know you have a patch for this among your secret patches :) Can you share
> it with us?
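
As a rough illustration of the idea in the issue description (this is not the attached
index-more.patch and does not use the Nutch API), the lookup step might look like the sketch
below. Everything in it is an assumption made for the example: the names GEO_TABLE,
lookup_lat_lon and add_geo_fields are hypothetical, the in-memory table stands in for a real
GeoIP database, and the addresses and coordinates are placeholders. A page whose IP cannot
be resolved is simply indexed without the location fields.

```python
import ipaddress
from typing import Dict, Optional, Tuple

# Hypothetical stand-in for a GeoIP database: network -> (latitude, longitude).
# The networks are reserved documentation ranges and the coordinates are
# placeholders, not real lookup results.
GEO_TABLE = {
    ipaddress.ip_network("192.0.2.0/24"): (52.37, 4.89),
    ipaddress.ip_network("198.51.100.0/24"): (41.01, 28.97),
}


def lookup_lat_lon(ip: str) -> Optional[Tuple[float, float]]:
    """Return (latitude, longitude) for ip, or None if it cannot be resolved."""
    try:
        addr = ipaddress.ip_address(ip)
    except ValueError:
        # Not a valid IP address string.
        return None
    for network, coords in GEO_TABLE.items():
        if addr in network:
            return coords
    return None


def add_geo_fields(doc: Dict[str, object], ip: Optional[str]) -> Dict[str, object]:
    """Add latitude/longitude fields to an index document when the IP resolves."""
    if ip is None:
        return doc
    coords = lookup_lat_lon(ip)
    if coords is not None:
        doc["latitude"], doc["longitude"] = coords
    return doc
```

For example, add_geo_fields({"url": "http://example.org/"}, "192.0.2.10") adds the two
fields, while an address outside the table leaves the document unchanged. Whether the stored
coordinates are accurate depends entirely on the database behind the lookup, which is the
reliability concern raised in the comment above.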



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
