ASF GitHub Bot (JIRA) | [jira] [Commented] (NUTCH-2399) indexer-elastic does not index multi-value fields (only the first value is indexed) | Tue, 01 Aug, 10:24
Michael Chen | parse-zip Nutch 2.x compatibility? | Wed, 02 Aug, 00:21
Michael Chen | Re: Question on 2.x sitemap functionality | Wed, 02 Aug, 00:28
kenneth mcfarland | Re: Question on 2.x sitemap functionality | Wed, 02 Aug, 00:30
Michael Chen | Re: Question on 2.x sitemap functionality | Wed, 02 Aug, 00:45
ASF GitHub Bot (JIRA) | [jira] [Commented] (NUTCH-2375) Upgrade the code base from org.apache.hadoop.mapred to org.apache.hadoop.mapreduce | Wed, 02 Aug, 03:14
kenneth mcfarland | Re: Question on 2.x sitemap functionality | Wed, 02 Aug, 04:29
Michael Chen | Re: parse-zip Nutch 2.x compatibility? | Wed, 02 Aug, 17:14
Michael Chen | HTML Support for jsoup-extractor in Nutch 2.x? | Wed, 02 Aug, 21:42
Michael Chen | Re: HTML Support for jsoup-extractor in Nutch 2.x? | Wed, 02 Aug, 23:59
Michael Chen | Parse-zip porting? | Fri, 04 Aug, 23:52
ASF GitHub Bot (JIRA) | [jira] [Commented] (NUTCH-2375) Upgrade the code base from org.apache.hadoop.mapred to org.apache.hadoop.mapreduce | Sat, 05 Aug, 18:39
Kaidul Islam (JIRA) | [jira] [Created] (NUTCH-2405) jsoup-extractor structure correction, typo fixed | Sun, 06 Aug, 09:09
Kaidul Islam (JIRA) | [jira] [Updated] (NUTCH-2405) jsoup-extractor structure correction, typo fixed | Sun, 06 Aug, 09:13
ASF GitHub Bot (JIRA) | [jira] [Commented] (NUTCH-2405) jsoup-extractor structure correction, typo fixed | Sun, 06 Aug, 09:23
kenneth mcfarland (JIRA) | [jira] [Created] (NUTCH-2406) Sum up constants, make minor changes | Tue, 08 Aug, 08:28
ASF GitHub Bot (JIRA) | [jira] [Commented] (NUTCH-2406) Sum up constants, make minor changes | Tue, 08 Aug, 08:29
d.ku...@technisat.de | fetching pdfs from our website | Tue, 08 Aug, 13:00
Omkar Reddy | Regarding checksum error in hadoop in my latest PR. | Wed, 09 Aug, 09:56
Mattmann, Chris A (3010) | Release of TREC Dynamic Domain: Polar Dataset | Wed, 09 Aug, 16:55
ASF GitHub Bot (JIRA) | [jira] [Commented] (NUTCH-2406) Sum up constants, make minor changes | Wed, 09 Aug, 17:18
ASF GitHub Bot (JIRA) | [jira] [Commented] (NUTCH-2406) Sum up constants, make minor changes | Wed, 09 Aug, 17:19
Lewis John McGibbney (JIRA) | [jira] [Resolved] (NUTCH-2406) Sum up constants, make minor changes | Wed, 09 Aug, 17:20
Lewis John McGibbney (JIRA) | [jira] [Assigned] (NUTCH-2406) Sum up constants, make minor changes | Wed, 09 Aug, 17:20
Lewis John McGibbney (JIRA) | [jira] [Updated] (NUTCH-2406) Sum up constants, make minor changes | Wed, 09 Aug, 17:20
Lewis John McGibbney (JIRA) | [jira] [Resolved] (NUTCH-2405) jsoup-extractor structure correction, typo fixed | Wed, 09 Aug, 17:26
ASF GitHub Bot (JIRA) | [jira] [Commented] (NUTCH-2405) jsoup-extractor structure correction, typo fixed | Wed, 09 Aug, 17:26
Hudson (JIRA) | [jira] [Commented] (NUTCH-2405) jsoup-extractor structure correction, typo fixed | Wed, 09 Aug, 17:44
Sebastian Nagel | Re: Regarding checksum error in hadoop in my latest PR. | Fri, 11 Aug, 13:36
Vyacheslav Pascarel (JIRA) | [jira] [Created] (NUTCH-2407) Memory leak causing Nutch Server to run out of memory | Fri, 11 Aug, 21:20
ASF GitHub Bot (JIRA) | [jira] [Commented] (NUTCH-1932) Automatically remove orphaned pages | Sat, 12 Aug, 13:37
Sebastian Nagel (JIRA) | [jira] [Created] (NUTCH-2408) CrawlDb: allow update from unparsed segments | Sat, 12 Aug, 14:16
ASF GitHub Bot (JIRA) | [jira] [Commented] (NUTCH-2408) CrawlDb: allow update from unparsed segments | Sat, 12 Aug, 14:24
ASF GitHub Bot (JIRA) | [jira] [Commented] (NUTCH-2375) Upgrade the code base from org.apache.hadoop.mapred to org.apache.hadoop.mapreduce | Sun, 13 Aug, 11:14
Sebastian Nagel (JIRA) | [jira] [Commented] (NUTCH-2407) Memory leak causing Nutch Server to run out of memory | Mon, 14 Aug, 14:18
ASF GitHub Bot (JIRA) | [jira] [Commented] (NUTCH-1932) Automatically remove orphaned pages | Mon, 14 Aug, 18:51
ASF GitHub Bot (JIRA) | [jira] [Commented] (NUTCH-2408) CrawlDb: allow update from unparsed segments | Mon, 14 Aug, 18:52
ASF GitHub Bot (JIRA) | [jira] [Commented] (NUTCH-2408) CrawlDb: allow update from unparsed segments | Tue, 15 Aug, 11:36
Hudson (JIRA) | [jira] [Commented] (NUTCH-2408) CrawlDb: allow update from unparsed segments | Tue, 15 Aug, 11:55
Sebastian Nagel (JIRA) | [jira] [Resolved] (NUTCH-2408) CrawlDb: allow update from unparsed segments | Tue, 15 Aug, 12:21
ASF GitHub Bot (JIRA) | [jira] [Commented] (NUTCH-2400) Solr 6.6.0 compatibility | Tue, 15 Aug, 15:26
Apache Wiki | [Nutch Wiki] Update of "NutchTutorial" by SebastianNagel | Tue, 15 Aug, 15:29
ASF GitHub Bot (JIRA) | [jira] [Commented] (NUTCH-2400) Solr 6.6.0 compatibility | Tue, 15 Aug, 15:52
Lewis John McGibbney (JIRA) | [jira] [Resolved] (NUTCH-2400) Solr 6.6.0 compatibility | Tue, 15 Aug, 15:52
ASF GitHub Bot (JIRA) | [jira] [Commented] (NUTCH-1465) Support sitemaps in Nutch | Tue, 15 Aug, 16:32
Hudson (JIRA) | [jira] [Commented] (NUTCH-2400) Solr 6.6.0 compatibility | Tue, 15 Aug, 16:53
ASF GitHub Bot (JIRA) | [jira] [Commented] (NUTCH-1465) Support sitemaps in Nutch | Tue, 15 Aug, 17:15
ASF GitHub Bot (JIRA) | [jira] [Commented] (NUTCH-1465) Support sitemaps in Nutch | Tue, 15 Aug, 17:15
ASF GitHub Bot (JIRA) | [jira] [Commented] (NUTCH-1932) Automatically remove orphaned pages | Tue, 15 Aug, 17:58
Sebastian Nagel (JIRA) | [jira] [Commented] (NUTCH-2298) TestCrawlDbStates.testCrawlDbStatTransitionInject broken | Tue, 15 Aug, 19:57
ASF GitHub Bot (JIRA) | [jira] [Commented] (NUTCH-2378) ChildFirst plugin classloader | Tue, 15 Aug, 20:28
Sebastian Nagel (JIRA) | [jira] [Commented] (NUTCH-2378) ChildFirst plugin classloader | Tue, 15 Aug, 20:28
Sebastian Nagel (JIRA) | [jira] [Comment Edited] (NUTCH-2378) ChildFirst plugin classloader | Tue, 15 Aug, 20:29
Sebastian Nagel (JIRA) | [jira] [Updated] (NUTCH-2378) ChildFirst plugin classloader | Tue, 15 Aug, 20:29
ASF GitHub Bot (JIRA) | [jira] [Commented] (NUTCH-2378) ChildFirst plugin classloader | Tue, 15 Aug, 20:30
ASF GitHub Bot (JIRA) | [jira] [Commented] (NUTCH-2378) ChildFirst plugin classloader | Tue, 15 Aug, 20:35
Markus Jelsma (JIRA) | [jira] [Commented] (NUTCH-2378) ChildFirst plugin classloader | Tue, 15 Aug, 21:01
ASF GitHub Bot (JIRA) | [jira] [Commented] (NUTCH-2378) ChildFirst plugin classloader | Wed, 16 Aug, 12:38
ASF GitHub Bot (JIRA) | [jira] [Commented] (NUTCH-2378) ChildFirst plugin classloader | Wed, 16 Aug, 12:41
Sebastian Nagel (JIRA) | [jira] [Updated] (NUTCH-2378) ChildFirst plugin classloader | Wed, 16 Aug, 12:48
Sebastian Nagel (JIRA) | [jira] [Commented] (NUTCH-2378) ChildFirst plugin classloader | Wed, 16 Aug, 12:48
Sebastian Nagel (JIRA) | [jira] [Assigned] (NUTCH-2378) ChildFirst plugin classloader | Wed, 16 Aug, 12:49
Hudson (JIRA) | [jira] [Commented] (NUTCH-2378) ChildFirst plugin classloader | Wed, 16 Aug, 12:53
Vyacheslav Pascarel (JIRA) | [jira] [Updated] (NUTCH-2407) Memory leak causing Nutch Server to run out of memory | Wed, 16 Aug, 21:55
Markus Jelsma (JIRA) | [jira] [Updated] (NUTCH-2335) Injector not to filter and normalize existing URLs in CrawlDb | Thu, 17 Aug, 08:46
Markus Jelsma (JIRA) | [jira] [Commented] (NUTCH-2335) Injector not to filter and normalize existing URLs in CrawlDb | Thu, 17 Aug, 08:46
Markus Jelsma (JIRA) | [jira] [Commented] (NUTCH-2335) Injector not to filter and normalize existing URLs in CrawlDb | Thu, 17 Aug, 08:49
Sebastian Nagel (JIRA) | [jira] [Commented] (NUTCH-2335) Injector not to filter and normalize existing URLs in CrawlDb | Thu, 17 Aug, 10:03
Markus Jelsma (JIRA) | [jira] [Commented] (NUTCH-2335) Injector not to filter and normalize existing URLs in CrawlDb | Thu, 17 Aug, 10:14
Markus Jelsma (JIRA) | [jira] [Commented] (NUTCH-2335) Injector not to filter and normalize existing URLs in CrawlDb | Thu, 17 Aug, 10:20
Markus Jelsma (JIRA) | [jira] [Commented] (NUTCH-2335) Injector not to filter and normalize existing URLs in CrawlDb | Thu, 17 Aug, 10:22
Sebastian Nagel (JIRA) | [jira] [Created] (NUTCH-2409) Injector: complete command-line help and counters | Thu, 17 Aug, 10:53
ASF GitHub Bot (JIRA) | [jira] [Commented] (NUTCH-2409) Injector: complete command-line help and counters | Thu, 17 Aug, 10:55
Sebastian Nagel (JIRA) | [jira] [Created] (NUTCH-2410) Unit test for jsoup-extractor not to depend on external resources | Thu, 17 Aug, 11:07
kenneth mcfarland | NutchServer | Thu, 17 Aug, 18:46
ASF GitHub Bot (JIRA) | [jira] [Commented] (NUTCH-2378) ChildFirst plugin classloader | Fri, 18 Aug, 13:20
Hudson (JIRA) | [jira] [Commented] (NUTCH-2378) ChildFirst plugin classloader | Fri, 18 Aug, 13:43
hussein Al_Ahmad (JIRA) | [jira] [Commented] (NUTCH-1690) IndexClean: mark url as unindexed after clean to not delete again | Fri, 18 Aug, 13:58
Sebastian Nagel (JIRA) | [jira] [Updated] (NUTCH-2378) ChildFirst plugin classloader | Fri, 18 Aug, 14:35
Sebastian Nagel (JIRA) | [jira] [Resolved] (NUTCH-2378) ChildFirst plugin classloader | Fri, 18 Aug, 14:35
Sebastian Nagel (JIRA) | [jira] [Resolved] (NUTCH-2071) A parser failure on a single document may fail crawling job | Fri, 18 Aug, 14:36
Sebastian Nagel (JIRA) | [jira] [Assigned] (NUTCH-2071) A parser failure on a single document may fail crawling job | Fri, 18 Aug, 14:36
Sebastian Nagel (JIRA) | [jira] [Work started] (NUTCH-2316) Library conflict with Parser-Tika Plugin and Lib Folder | Fri, 18 Aug, 14:37
Sebastian Nagel (JIRA) | [jira] [Work stopped] (NUTCH-2316) Library conflict with Parser-Tika Plugin and Lib Folder | Fri, 18 Aug, 14:37
Sebastian Nagel (JIRA) | [jira] [Resolved] (NUTCH-2316) Library conflict with Parser-Tika Plugin and Lib Folder | Fri, 18 Aug, 14:37
Sebastian Nagel (JIRA) | [jira] [Assigned] (NUTCH-2316) Library conflict with Parser-Tika Plugin and Lib Folder | Fri, 18 Aug, 14:37
kenneth mcfarland | Styles | Fri, 18 Aug, 22:08
Markus Jelsma | RE: Styles | Fri, 18 Aug, 22:43
hussein Al_Ahmad (JIRA) | [jira] [Comment Edited] (NUTCH-1690) IndexClean: mark url as unindexed after clean to not delete again | Sat, 19 Aug, 15:16
ASF GitHub Bot (JIRA) | [jira] [Commented] (NUTCH-2399) indexer-elastic does not index multi-value fields (only the first value is indexed) | Mon, 21 Aug, 11:52
ASF GitHub Bot (JIRA) | [jira] [Commented] (NUTCH-2399) indexer-elastic does not index multi-value fields (only the first value is indexed) | Mon, 21 Aug, 16:50
Lewis John McGibbney (JIRA) | [jira] [Resolved] (NUTCH-2399) indexer-elastic does not index multi-value fields (only the first value is indexed) | Mon, 21 Aug, 16:51
Lewis John McGibbney (JIRA) | [jira] [Updated] (NUTCH-2399) indexer-elastic does not index multi-value fields (only the first value is indexed) | Mon, 21 Aug, 16:51
ASF GitHub Bot (JIRA) | [jira] [Commented] (NUTCH-1129) Any23 Nutch plugin | Mon, 21 Aug, 17:41
ASF GitHub Bot (JIRA) | [jira] [Commented] (NUTCH-1129) Any23 Nutch plugin | Mon, 21 Aug, 18:01
ASF GitHub Bot (JIRA) | [jira] [Commented] (NUTCH-1129) Any23 Nutch plugin | Mon, 21 Aug, 18:01
Markus Jelsma (JIRA) | [jira] [Created] (NUTCH-2411) Index-metadata to support indexing multiple values for a field | Tue, 22 Aug, 14:27
Markus Jelsma (JIRA) | [jira] [Updated] (NUTCH-2411) Index-metadata to support indexing multiple values for a field | Tue, 22 Aug, 14:30
Markus Jelsma (JIRA) | [jira] [Updated] (NUTCH-2411) Index-metadata to support indexing multiple values for a field | Tue, 22 Aug, 14:31
Markus Jelsma (JIRA) | [jira] [Updated] (NUTCH-2411) Index-metadata to support indexing multiple values for a field | Tue, 22 Aug, 14:40