nutch-dev mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Nutch Wiki] Update of "GoogleSummerOfCode/SitemapCrawler" by CihadGuzel
Date Sun, 23 Aug 2015 11:36:44 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change notification.

The "GoogleSummerOfCode/SitemapCrawler" page has been changed by CihadGuzel:
https://wiki.apache.org/nutch/GoogleSummerOfCode/SitemapCrawler?action=diff&rev1=10&rev2=11

    * Week2 (1June-7June): Sitemap detection will be done. FetcherJob will be updated for
  sitemap.
    * Week3-4 (8June-21June): The parsing process will be updated to handle sitemap files. New
parser plugins may be developed.
    * Week5 (22June-28June): DbUpdaterJob will be updated for sitemap.
-   * Midterm(26June-3 July): By this point, the sitemap life cycle will have been implemented
in outline. What has been done and what remains to be done will be evaluated. The sitemap
crawler will have been brought to a basic working state.
+   * Midterm(26June-3 July): By this stage, the sitemap life cycle has been implemented in
outline and the sitemap crawler runs in a basic form. The work done so far and the work
remaining will be evaluated.
    * Week6-7 (29June-12July): Sitemap ranking mechanism will be developed.
    * Week8 (13July-19July): Sitemap blacklist, sitemap file detection, and error detection
will be done.
    * Week9 (20July-26July): Frequency mechanism will be developed.
