nutch-dev mailing list archives

From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (NUTCH-1741) Support of Sitemaps in Nutch 2.x
Date Tue, 26 Jan 2016 19:44:40 GMT

    [ https://issues.apache.org/jira/browse/NUTCH-1741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15117839#comment-15117839 ]

Hudson commented on NUTCH-1741:
-------------------------------

SUCCESS: Integrated in Nutch-nutchgora #1548 (See [https://builds.apache.org/job/Nutch-nutchgora/1548/])
NUTCH-1741 Support of Sitemaps in Nutch 2.x (lewismc: [http://svn.apache.org/viewvc/nutch/branches/2.x/?view=rev&rev=1726853])
* 2.x/.gitignore
* 2.x/conf/gora-accumulo-mapping.xml
* 2.x/conf/gora-cassandra-mapping.xml
* 2.x/conf/gora-hbase-mapping.xml
* 2.x/conf/gora-mongodb-mapping.xml
* 2.x/conf/gora-solr-mapping.xml
* 2.x/conf/nutch-default.xml
* 2.x/src/gora/webpage.avsc
* 2.x/src/java/org/apache/nutch/crawl/DbUpdateMapper.java
* 2.x/src/java/org/apache/nutch/crawl/DbUpdaterJob.java
* 2.x/src/java/org/apache/nutch/crawl/GeneratorJob.java
* 2.x/src/java/org/apache/nutch/crawl/GeneratorMapper.java
* 2.x/src/java/org/apache/nutch/crawl/InjectType.java
* 2.x/src/java/org/apache/nutch/crawl/InjectorJob.java
* 2.x/src/java/org/apache/nutch/fetcher/FetcherJob.java
* 2.x/src/java/org/apache/nutch/fetcher/FetcherReducer.java
* 2.x/src/java/org/apache/nutch/metadata/Metadata.java
* 2.x/src/java/org/apache/nutch/metadata/Nutch.java
* 2.x/src/java/org/apache/nutch/net/URLFilters.java
* 2.x/src/java/org/apache/nutch/parse/NutchSitemapParse.java
* 2.x/src/java/org/apache/nutch/parse/NutchSitemapParser.java
* 2.x/src/java/org/apache/nutch/parse/ParseUtil.java
* 2.x/src/java/org/apache/nutch/parse/ParserJob.java
* 2.x/src/java/org/apache/nutch/storage/Mark.java
* 2.x/src/java/org/apache/nutch/storage/WebPage.java
* 2.x/src/java/org/apache/nutch/tools/Benchmark.java
* 2.x/src/plugin/lib-http/src/java/org/apache/nutch/protocol/http/api/HttpBase.java
* 2.x/src/test/org/apache/nutch/crawl/TestGenerator.java
* 2.x/src/test/org/apache/nutch/crawl/TestInjector.java
* 2.x/src/test/org/apache/nutch/fetcher/TestFetcher.java
* 2.x/src/test/org/apache/nutch/parse/TestSitemapParser.java
* 2.x/src/test/org/apache/nutch/util/CrawlTestUtil.java
* 2.x/src/test/org/apache/nutch/util/HelloHandler.java
* 2.x/src/testresources/fetch-test-site/robots.txt
* 2.x/src/testresources/fetch-test-site/sitemap1.xml
* 2.x/src/testresources/fetch-test-site/sitemap2.xml
* 2.x/src/testresources/fetch-test-site/sitemapIndex.xml


> Support of Sitemaps in Nutch 2.x
> --------------------------------
>
>                 Key: NUTCH-1741
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1741
>             Project: Nutch
>          Issue Type: New Feature
>          Components: fetcher, generator
>            Reporter: Alparslan Avcı
>            Assignee: cihad güzel
>              Labels: gsoc2015
>             Fix For: 2.4
>
>         Attachments: NUTCH-1741-v2.patch, NUTCH-1741-v3.patch, NUTCH-1741-v4.patch, NUTCH-1741.patch, NUTCH-1741v5.patch, NUTCH-1741v6.patch, NUTCH-1741v7.patch, SitemapCrawlerLifeCycle.pdf, SitemapDevelopmentFor2x.pdf
>
>
> Sitemap support has to be implemented for 2.x branch. It is being discussed in NUTCH-1465 for trunk.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
