Apache Jenkins Server |
Jenkins build is back to normal : Nutch-nutchgora #270 |
Fri, 01 Jun, 04:26 |
Apache Jenkins Server |
Jenkins build is back to normal : Nutch-trunk #1858 |
Fri, 01 Jun, 04:40 |
Julien Nioche |
Re: [VOTE] Apache Nutch 1.5 release-1.5RC4 |
Fri, 01 Jun, 09:01 |
Lewis John Mcgibbney |
Re: [VOTE] Apache Nutch 1.5 release-1.5RC4 |
Fri, 01 Jun, 10:32 |
Mattmann, Chris A (388J) |
Re: [VOTE] Apache Nutch 1.5 release-1.5RC4 |
Sat, 02 Jun, 05:11 |
Ali Safdar Kureishy |
Questions about the "hostCount" and related variables in org.apache.nutch.crawl.Generator$Selector::reduce() |
Mon, 04 Jun, 11:52 |
Markus Jelsma (JIRA) |
[jira] [Created] (NUTCH-1380) Fetcher reducer not to configure filter/normalizers |
Mon, 04 Jun, 13:10 |
Markus Jelsma (JIRA) |
[jira] [Created] (NUTCH-1381) Allow to override default subcollection field name |
Mon, 04 Jun, 13:10 |
Markus Jelsma (JIRA) |
[jira] [Updated] (NUTCH-1381) Allow to override default subcollection field name |
Mon, 04 Jun, 13:12 |
Markus Jelsma (JIRA) |
[jira] [Updated] (NUTCH-1381) Allow to override default subcollection field name |
Mon, 04 Jun, 13:50 |
Lewis John Mcgibbney |
[RESULT] [VOTE] Apache Nutch 1.5 release-1.5RC4 |
Thu, 07 Jun, 11:56 |
Lewis John Mcgibbney |
Re: [VOTE] Apache Nutch 1.5 release-1.5RC4 |
Thu, 07 Jun, 11:58 |
Julien Nioche (JIRA) |
[jira] [Updated] (NUTCH-1370) Expose exact number of urls injected @runtime |
Thu, 07 Jun, 13:19 |
Mattmann, Chris A (388J) |
Re: [VOTE] Apache Nutch 1.5 release-1.5RC4 |
Thu, 07 Jun, 14:47 |
lewis john mcgibbney |
[ANNOUNCE] Apache Nutch 1.5 Released |
Thu, 07 Jun, 16:52 |
Markus Jelsma (JIRA) |
[jira] [Updated] (NUTCH-1381) Allow to override default subcollection field name |
Thu, 07 Jun, 17:19 |
Markus Jelsma (JIRA) |
[jira] [Resolved] (NUTCH-1381) Allow to override default subcollection field name |
Thu, 07 Jun, 18:20 |
Markus Jelsma (JIRA) |
[jira] [Resolved] (NUTCH-1351) DomainStatistics to aggregate by TLD |
Thu, 07 Jun, 18:22 |
Markus Jelsma (JIRA) |
[jira] [Resolved] (NUTCH-1320) IndexChecker and ParseChecker choke on IDN's |
Thu, 07 Jun, 18:50 |
Hudson (JIRA) |
[jira] [Commented] (NUTCH-1351) DomainStatistics to aggregate by TLD |
Thu, 07 Jun, 19:06 |
Hudson (JIRA) |
[jira] [Commented] (NUTCH-1320) IndexChecker and ParseChecker choke on IDN's |
Thu, 07 Jun, 19:06 |
Markus Jelsma (JIRA) |
[jira] [Assigned] (NUTCH-1342) Read time out protocol-http |
Thu, 07 Jun, 19:06 |
Markus Jelsma (JIRA) |
[jira] [Commented] (NUTCH-1342) Read time out protocol-http |
Thu, 07 Jun, 19:06 |
Hudson (JIRA) |
[jira] [Commented] (NUTCH-1381) Allow to override default subcollection field name |
Thu, 07 Jun, 19:06 |
Markus Jelsma (JIRA) |
[jira] [Commented] (NUTCH-1341) NotModified time set to now but page not modified |
Thu, 07 Jun, 19:12 |
Markus Jelsma (JIRA) |
[jira] [Resolved] (NUTCH-1346) Follow outlinks to ignore external |
Fri, 08 Jun, 07:05 |
Hudson (JIRA) |
[jira] [Commented] (NUTCH-1346) Follow outlinks to ignore external |
Fri, 08 Jun, 07:26 |
Markus Jelsma (JIRA) |
[jira] [Resolved] (NUTCH-1336) Optionally not index db_notmodified pages |
Fri, 08 Jun, 07:39 |
Markus Jelsma (JIRA) |
[jira] [Commented] (NUTCH-1024) Dynamically set fetchInterval by MIME-type |
Fri, 08 Jun, 07:41 |
Markus Jelsma (JIRA) |
[jira] [Commented] (NUTCH-1262) Map `duplicating` content-types to a single type |
Fri, 08 Jun, 07:43 |
Hudson (JIRA) |
[jira] [Commented] (NUTCH-1336) Optionally not index db_notmodified pages |
Fri, 08 Jun, 08:44 |
Hudson (JIRA) |
[jira] [Commented] (NUTCH-1320) IndexChecker and ParseChecker choke on IDN's |
Fri, 08 Jun, 10:59 |
Hudson (JIRA) |
[jira] [Commented] (NUTCH-1346) Follow outlinks to ignore external |
Fri, 08 Jun, 10:59 |
Hudson (JIRA) |
[jira] [Commented] (NUTCH-1351) DomainStatistics to aggregate by TLD |
Fri, 08 Jun, 10:59 |
Hudson (JIRA) |
[jira] [Commented] (NUTCH-1381) Allow to override default subcollection field name |
Fri, 08 Jun, 10:59 |
Hudson (JIRA) |
[jira] [Commented] (NUTCH-1336) Optionally not index db_notmodified pages |
Fri, 08 Jun, 10:59 |
Lewis John McGibbney (JIRA) |
[jira] [Commented] (NUTCH-1352) Improve regex urlfilters/normalizers synchronization |
Fri, 08 Jun, 14:07 |
Lewis John McGibbney (JIRA) |
[jira] [Closed] (NUTCH-1361) Fix mishandling of malformed urls in generator job |
Fri, 08 Jun, 14:07 |
Apache Wiki |
[Nutch Wiki] Trivial Update of "Release_HOWTO" by LewisJohnMcgibbney |
Fri, 08 Jun, 14:21 |
Apache Wiki |
[Nutch Wiki] Trivial Update of "Release_HOWTO" by LewisJohnMcgibbney |
Fri, 08 Jun, 14:28 |
lewis john mcgibbney |
VOTE Apache Nutch 2.0 RC1 |
Fri, 08 Jun, 14:49 |
Emre Çelikten (JIRA) |
[jira] [Created] (NUTCH-1382) Adding support for EmbeddedSolrServer to SolrIndexer |
Fri, 08 Jun, 15:38 |
Emre Çelikten (JIRA) |
[jira] [Updated] (NUTCH-1382) Adding support for EmbeddedSolrServer to SolrIndexer |
Fri, 08 Jun, 15:38 |
Sebastian Nagel (JIRA) |
[jira] [Created] (NUTCH-1383) IndexingFiltersChecker to show error message instead of null pointer exception |
Sat, 09 Jun, 21:43 |
Sebastian Nagel (JIRA) |
[jira] [Updated] (NUTCH-1383) IndexingFiltersChecker to show error message instead of null pointer exception |
Sat, 09 Jun, 21:51 |
Apache Wiki |
[Nutch Wiki] Update of "NutchTutorial" by SebastianNagel |
Sun, 10 Jun, 19:31 |
Sebastian Nagel |
bin/nutch -core |
Sun, 10 Jun, 21:24 |
Matthias Agethle (JIRA) |
[jira] [Created] (NUTCH-1384) Typo in ParseSegment's run-method |
Mon, 11 Jun, 06:26 |
Andy Xue (JIRA) |
[jira] [Created] (NUTCH-1385) More robust plug-in order properties in "nutch-site.xml" |
Mon, 11 Jun, 07:19 |
Andy Xue (JIRA) |
[jira] [Updated] (NUTCH-1385) More robust plug-in order properties in "nutch-site.xml" |
Mon, 11 Jun, 07:19 |
Markus Jelsma (JIRA) |
[jira] [Commented] (NUTCH-1383) IndexingFiltersChecker to show error message instead of null pointer exception |
Mon, 11 Jun, 07:25 |
Markus Jelsma (JIRA) |
[jira] [Updated] (NUTCH-1384) Typo in ParseSegment's run-method |
Mon, 11 Jun, 07:25 |
Markus Jelsma (JIRA) |
[jira] [Updated] (NUTCH-1385) More robust plug-in order properties in "nutch-site.xml" |
Mon, 11 Jun, 07:27 |
Andy Xue (JIRA) |
[jira] [Updated] (NUTCH-1385) More robust plug-in order properties in "nutch-site.xml" |
Mon, 11 Jun, 08:43 |
Andy Xue (JIRA) |
[jira] [Updated] (NUTCH-1385) More robust plug-in order properties in "nutch-site.xml" |
Mon, 11 Jun, 08:43 |
Andy Xue (JIRA) |
[jira] [Updated] (NUTCH-1385) More robust plug-in order properties in "nutch-site.xml" |
Mon, 11 Jun, 08:43 |
Markus Jelsma (JIRA) |
[jira] [Assigned] (NUTCH-1385) More robust plug-in order properties in "nutch-site.xml" |
Mon, 11 Jun, 09:25 |
Markus Jelsma (JIRA) |
[jira] [Assigned] (NUTCH-1384) Typo in ParseSegment's run-method |
Mon, 11 Jun, 09:27 |
Markus Jelsma (JIRA) |
[jira] [Resolved] (NUTCH-1385) More robust plug-in order properties in "nutch-site.xml" |
Mon, 11 Jun, 09:29 |
Markus Jelsma (JIRA) |
[jira] [Resolved] (NUTCH-1384) Typo in ParseSegment's run-method |
Mon, 11 Jun, 09:31 |
Markus Jelsma (JIRA) |
[jira] [Resolved] (NUTCH-1262) Map `duplicating` content-types to a single type |
Mon, 11 Jun, 10:28 |
Markus Jelsma (JIRA) |
[jira] [Updated] (NUTCH-1262) Map `duplicating` content-types to a single type |
Mon, 11 Jun, 10:32 |
Lewis John McGibbney (JIRA) |
[jira] [Commented] (NUTCH-1385) More robust plug-in order properties in "nutch-site.xml" |
Mon, 11 Jun, 14:15 |
Hudson (JIRA) |
[jira] [Commented] (NUTCH-1385) More robust plug-in order properties in "nutch-site.xml" |
Mon, 11 Jun, 20:23 |
Hudson (JIRA) |
[jira] [Commented] (NUTCH-1384) Typo in ParseSegment's run-method |
Mon, 11 Jun, 20:23 |
Hudson (JIRA) |
[jira] [Commented] (NUTCH-1262) Map `duplicating` content-types to a single type |
Mon, 11 Jun, 20:23 |
Lewis John McGibbney (JIRA) |
[jira] [Resolved] (NUTCH-1360) Suport the storing of IP address connected to when web crawling |
Mon, 11 Jun, 20:29 |
Hudson (JIRA) |
[jira] [Commented] (NUTCH-1360) Suport the storing of IP address connected to when web crawling |
Mon, 11 Jun, 21:09 |
Lewis John McGibbney (JIRA) |
[jira] [Updated] (NUTCH-1364) Add a counter for malformed urls |
Mon, 11 Jun, 22:10 |
Markus Jelsma (JIRA) |
[jira] [Commented] (NUTCH-1364) Add a counter for malformed urls |
Mon, 11 Jun, 22:24 |
Lewis John McGibbney (JIRA) |
[jira] [Updated] (NUTCH-1364) Add a counter in Generator for malformed urls |
Tue, 12 Jun, 00:12 |
Lewis John McGibbney (JIRA) |
[jira] [Resolved] (NUTCH-1364) Add a counter in Generator for malformed urls |
Tue, 12 Jun, 00:14 |
Hudson (JIRA) |
[jira] [Commented] (NUTCH-1364) Add a counter in Generator for malformed urls |
Tue, 12 Jun, 01:14 |
Hudson (JIRA) |
[jira] [Commented] (NUTCH-1364) Add a counter in Generator for malformed urls |
Tue, 12 Jun, 04:31 |
Hudson (JIRA) |
[jira] [Commented] (NUTCH-1262) Map `duplicating` content-types to a single type |
Tue, 12 Jun, 04:31 |
Hudson (JIRA) |
[jira] [Commented] (NUTCH-1360) Suport the storing of IP address connected to when web crawling |
Tue, 12 Jun, 04:31 |
Hudson (JIRA) |
[jira] [Commented] (NUTCH-1384) Typo in ParseSegment's run-method |
Tue, 12 Jun, 04:31 |
Hudson (JIRA) |
[jira] [Commented] (NUTCH-1385) More robust plug-in order properties in "nutch-site.xml" |
Tue, 12 Jun, 04:31 |
Markus Jelsma (JIRA) |
[jira] [Resolved] (NUTCH-1024) Dynamically set fetchInterval by MIME-type |
Tue, 12 Jun, 10:12 |
Markus Jelsma (JIRA) |
[jira] [Resolved] (NUTCH-1352) Improve regex urlfilters/normalizers synchronization |
Tue, 12 Jun, 10:16 |
Markus Jelsma (JIRA) |
[jira] [Resolved] (NUTCH-1356) ParseUtil use ExecutorService instead of manually thread handling. |
Tue, 12 Jun, 10:18 |
Markus Jelsma (JIRA) |
[jira] [Created] (NUTCH-1386) Headings filter not to add empty values |
Tue, 12 Jun, 10:22 |
Markus Jelsma (JIRA) |
[jira] [Resolved] (NUTCH-1386) Headings filter not to add empty values |
Tue, 12 Jun, 10:22 |
Markus Jelsma (JIRA) |
[jira] [Resolved] (NUTCH-1319) HostNormalizer |
Tue, 12 Jun, 10:34 |
Markus Jelsma (JIRA) |
[jira] [Commented] (NUTCH-1330) OutlinkDB to preserve back up |
Tue, 12 Jun, 10:43 |
Markus Jelsma (JIRA) |
[jira] [Resolved] (NUTCH-1330) OutlinkDB to preserve back up |
Tue, 12 Jun, 10:43 |
Apache Wiki |
[Nutch Wiki] Update of "bin/nutch solrindex" by MarkusJelsma |
Tue, 12 Jun, 10:57 |
Ferdy Galema (JIRA) |
[jira] [Commented] (NUTCH-1356) ParseUtil use ExecutorService instead of manually thread handling. |
Tue, 12 Jun, 11:08 |
Ferdy Galema (JIRA) |
[jira] [Created] (NUTCH-1387) All parsers should respond to cancellation. |
Tue, 12 Jun, 11:12 |
Ferdy Galema (JIRA) |
[jira] [Updated] (NUTCH-1387) All parsers should respond to cancellation / interrupts. |
Tue, 12 Jun, 11:14 |
Markus Jelsma (JIRA) |
[jira] [Resolved] (NUTCH-1300) Indexer to normalize URL's |
Tue, 12 Jun, 11:28 |
Markus Jelsma (JIRA) |
[jira] [Resolved] (NUTCH-1318) Parse time outs crash parsing fetcher |
Tue, 12 Jun, 11:30 |
Hudson (JIRA) |
[jira] [Commented] (NUTCH-1352) Improve regex urlfilters/normalizers synchronization |
Tue, 12 Jun, 11:46 |
Hudson (JIRA) |
[jira] [Commented] (NUTCH-1330) OutlinkDB to preserve back up |
Tue, 12 Jun, 11:46 |
Hudson (JIRA) |
[jira] [Commented] (NUTCH-1386) Headings filter not to add empty values |
Tue, 12 Jun, 11:46 |
Hudson (JIRA) |
[jira] [Commented] (NUTCH-1024) Dynamically set fetchInterval by MIME-type |
Tue, 12 Jun, 11:46 |
Hudson (JIRA) |
[jira] [Commented] (NUTCH-1300) Indexer to normalize URL's |
Tue, 12 Jun, 11:46 |
Hudson (JIRA) |
[jira] [Commented] (NUTCH-1356) ParseUtil use ExecutorService instead of manually thread handling. |
Tue, 12 Jun, 11:46 |
Hudson (JIRA) |
[jira] [Commented] (NUTCH-1319) HostNormalizer |
Tue, 12 Jun, 11:46 |
Markus Jelsma (JIRA) |
[jira] [Created] (NUTCH-1388) Optionally maintain custom fetch interval despite AdaptiveFetchSchedule |
Tue, 12 Jun, 13:07 |