nutch-dev mailing list archives

From "Daniel Ciborowski (JIRA)" <j...@apache.org>
Subject [jira] [Issue Comment Deleted] (NUTCH-1517) CloudSearch indexer
Date Mon, 09 Sep 2013 18:11:52 GMT

     [ https://issues.apache.org/jira/browse/NUTCH-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Ciborowski updated NUTCH-1517:
-------------------------------------

    Comment: was deleted

(was: Does this process work with data stored in HDFS, or does it have to be stored on the
local file system? I am still not able to get Nutch to save segments, though. When I tried
to use the index on my previously crawled data, I still got the "matched 0 files" errors.)
    
> CloudSearch indexer
> -------------------
>
>                 Key: NUTCH-1517
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1517
>             Project: Nutch
>          Issue Type: New Feature
>          Components: indexer
>            Reporter: Julien Nioche
>             Fix For: 1.9
>
>         Attachments: 0023883254_1377197869_indexer-cloudsearch.patch
>
>
> Once we have made the indexers pluggable, we should add a plugin for Amazon CloudSearch.
> See http://aws.amazon.com/cloudsearch/. Apparently it uses a JSON-based representation, the
> Search Data Format (SDF), which we could reuse for a file-based indexer.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
