Chris A. Mattmann (JIRA) |
[jira] [Work started] (NUTCH-2213) CommonCrawlDataDumper saves gzipped body in extracted form |
Tue, 01 Mar, 03:35 |
Chris A. Mattmann (JIRA) |
[jira] [Assigned] (NUTCH-2213) CommonCrawlDataDumper saves gzipped body in extracted form |
Tue, 01 Mar, 03:35 |
asfgit |
[GitHub] nutch pull request: NUTCH-2213 : do not store the headers verbatim... |
Tue, 01 Mar, 03:36 |
ASF GitHub Bot (JIRA) |
[jira] [Commented] (NUTCH-2213) CommonCrawlDataDumper saves gzipped body in extracted form |
Tue, 01 Mar, 03:36 |
Chris A. Mattmann (JIRA) |
[jira] [Resolved] (NUTCH-2213) CommonCrawlDataDumper saves gzipped body in extracted form |
Tue, 01 Mar, 03:44 |
Chris A. Mattmann (JIRA) |
[jira] [Commented] (NUTCH-2213) CommonCrawlDataDumper saves gzipped body in extracted form |
Tue, 01 Mar, 03:44 |
asfgit |
[GitHub] nutch pull request: Fix the issue of the bad tstamp |
Tue, 01 Mar, 03:59 |
Markus Jelsma (JIRA) |
[jira] [Commented] (NUTCH-2236) Upgrade to Hadoop 2.7.1 |
Tue, 01 Mar, 15:47 |
Rupanshu Satsangi (JIRA) |
[jira] [Commented] (NUTCH-2060) dedup is removing entries with status db_gone |
Tue, 01 Mar, 23:20 |
Sebastian Nagel (JIRA) |
[jira] [Commented] (NUTCH-2060) dedup is removing entries with status db_gone |
Tue, 01 Mar, 23:36 |
Rupanshu Satsangi (JIRA) |
[jira] [Commented] (NUTCH-2060) dedup is removing entries with status db_gone |
Wed, 02 Mar, 02:16 |
lewismc |
[GitHub] nutch pull request: NUTCH-2184 Enable IndexingJob to function with... |
Wed, 02 Mar, 04:23 |
ASF GitHub Bot (JIRA) |
[jira] [Commented] (NUTCH-2184) Enable IndexingJob to function with no crawldb |
Wed, 02 Mar, 04:24 |
Chris A. Mattmann (JIRA) |
[jira] [Commented] (NUTCH-2184) Enable IndexingJob to function with no crawldb |
Wed, 02 Mar, 06:45 |
Arun Kumar (JIRA) |
[jira] [Commented] (NUTCH-2197) Add solr5 solrcloud indexer support |
Wed, 02 Mar, 08:08 |
Markus Jelsma (JIRA) |
[jira] [Commented] (NUTCH-2197) Add solr5 solrcloud indexer support |
Wed, 02 Mar, 10:17 |
Markus Jelsma (JIRA) |
[jira] [Commented] (NUTCH-2184) Enable IndexingJob to function with no crawldb |
Wed, 02 Mar, 10:29 |
Steven Hayles (JIRA) |
[jira] [Commented] (NUTCH-2060) dedup is removing entries with status db_gone |
Wed, 02 Mar, 11:26 |
Adnane B. (JIRA) |
[jira] [Commented] (NUTCH-2222) re-fetch deletes all metadata except _csh_ and _rs_ |
Wed, 02 Mar, 12:48 |
Ron van der Vegt (JIRA) |
[jira] [Created] (NUTCH-2237) DeduplicationJob: Add extra order criteria based on slug |
Wed, 02 Mar, 13:51 |
Ron van der Vegt (JIRA) |
[jira] [Updated] (NUTCH-2237) DeduplicationJob: Add extra order criteria based on slug |
Wed, 02 Mar, 14:20 |
Pablo Torres (JIRA) |
[jira] [Closed] (NUTCH-2233) Index-basic incorrect assignment of next fetch time when using Mongodb as storage backend |
Wed, 02 Mar, 15:53 |
Lewis John McGibbney (JIRA) |
[jira] [Commented] (NUTCH-2222) re-fetch deletes all metadata except _csh_ and _rs_ |
Wed, 02 Mar, 17:34 |
Lewis John McGibbney (JIRA) |
[jira] [Comment Edited] (NUTCH-2222) re-fetch deletes all metadata except _csh_ and _rs_ |
Wed, 02 Mar, 17:35 |
Lewis John McGibbney (JIRA) |
[jira] [Commented] (NUTCH-2184) Enable IndexingJob to function with no crawldb |
Wed, 02 Mar, 18:10 |
sebastian-nagel |
[GitHub] nutch pull request: NUTCH-2184 Enable IndexingJob to function with... |
Wed, 02 Mar, 19:44 |
ASF GitHub Bot (JIRA) |
[jira] [Commented] (NUTCH-2184) Enable IndexingJob to function with no crawldb |
Wed, 02 Mar, 19:44 |
sebastian-nagel |
[GitHub] nutch pull request: NUTCH-2184 Enable IndexingJob to function with... |
Wed, 02 Mar, 19:58 |
ASF GitHub Bot (JIRA) |
[jira] [Commented] (NUTCH-2184) Enable IndexingJob to function with no crawldb |
Wed, 02 Mar, 19:58 |
sebastian-nagel |
[GitHub] nutch pull request: NUTCH-2184 Enable IndexingJob to function with... |
Wed, 02 Mar, 20:03 |
ASF GitHub Bot (JIRA) |
[jira] [Commented] (NUTCH-2184) Enable IndexingJob to function with no crawldb |
Wed, 02 Mar, 20:04 |
lewismc |
[GitHub] nutch pull request: NUTCH-2184 Enable IndexingJob to function with... |
Wed, 02 Mar, 20:51 |
ASF GitHub Bot (JIRA) |
[jira] [Commented] (NUTCH-2184) Enable IndexingJob to function with no crawldb |
Wed, 02 Mar, 20:51 |
sebastian-nagel |
[GitHub] nutch pull request: NUTCH-2184 Enable IndexingJob to function with... |
Wed, 02 Mar, 21:13 |
ASF GitHub Bot (JIRA) |
[jira] [Commented] (NUTCH-2184) Enable IndexingJob to function with no crawldb |
Wed, 02 Mar, 21:14 |
Rupanshu Satsangi (JIRA) |
[jira] [Commented] (NUTCH-2060) dedup is removing entries with status db_gone |
Wed, 02 Mar, 22:37 |
Adnane B. (JIRA) |
[jira] [Comment Edited] (NUTCH-2222) re-fetch deletes all metadata except _csh_ and _rs_ |
Thu, 03 Mar, 03:24 |
Adnane B. (JIRA) |
[jira] [Commented] (NUTCH-2222) re-fetch deletes all metadata except _csh_ and _rs_ |
Thu, 03 Mar, 03:24 |
Pablo Torres (JIRA) |
[jira] [Created] (NUTCH-2238) Indexer for Elasticsearch 2.x |
Thu, 03 Mar, 13:05 |
ptorrestr |
[GitHub] nutch pull request: fix for NUTCH-2238 contributed by ptorrestr |
Thu, 03 Mar, 14:37 |
ASF GitHub Bot (JIRA) |
[jira] [Commented] (NUTCH-2238) Indexer for Elasticsearch 2.x |
Thu, 03 Mar, 14:38 |
ptorrestr |
[GitHub] nutch pull request: fix for NUTCH-2238 contributed by ptorrestr |
Thu, 03 Mar, 17:59 |
lewismc |
[GitHub] nutch pull request: fix for NUTCH-2238 contributed by ptorrestr |
Thu, 03 Mar, 18:00 |
ASF GitHub Bot (JIRA) |
[jira] [Commented] (NUTCH-2238) Indexer for Elasticsearch 2.x |
Thu, 03 Mar, 18:00 |
ASF GitHub Bot (JIRA) |
[jira] [Commented] (NUTCH-2238) Indexer for Elasticsearch 2.x |
Thu, 03 Mar, 18:00 |
lewismc |
[GitHub] nutch pull request: fix for NUTCH-2238 contributed by ptorrestr |
Thu, 03 Mar, 18:00 |
ASF GitHub Bot (JIRA) |
[jira] [Commented] (NUTCH-2238) Indexer for Elasticsearch 2.x |
Thu, 03 Mar, 18:01 |
ptorrestr |
[GitHub] nutch pull request: fix for NUTCH-2238 contributed by ptorrestr |
Thu, 03 Mar, 19:39 |
ASF GitHub Bot (JIRA) |
[jira] [Commented] (NUTCH-2238) Indexer for Elasticsearch 2.x |
Thu, 03 Mar, 19:40 |
Lewis John McGibbney (JIRA) |
[jira] [Commented] (NUTCH-2222) re-fetch deletes all metadata except _csh_ and _rs_ |
Thu, 03 Mar, 20:28 |
Sebastian Nagel (JIRA) |
[jira] [Commented] (NUTCH-2237) DeduplicationJob: Add extra order criteria based on slug |
Thu, 03 Mar, 20:46 |
Sebastian Nagel (JIRA) |
[jira] [Updated] (NUTCH-2237) DeduplicationJob: Add extra order criteria based on slug |
Thu, 03 Mar, 20:46 |
Adnane B. (JIRA) |
[jira] [Updated] (NUTCH-2222) re-fetch deletes all metadata except _csh_ and _rs_ |
Thu, 03 Mar, 22:41 |
Adnane B. (JIRA) |
[jira] [Comment Edited] (NUTCH-2222) re-fetch deletes all metadata except _csh_ and _rs_ |
Thu, 03 Mar, 22:42 |
Adnane B. (JIRA) |
[jira] [Comment Edited] (NUTCH-2222) re-fetch deletes all metadata except _csh_ and _rs_ |
Thu, 03 Mar, 22:43 |
Adnane B. (JIRA) |
[jira] [Comment Edited] (NUTCH-2222) re-fetch deletes all metadata except _csh_ and _rs_ |
Thu, 03 Mar, 22:44 |
Ron van der Vegt (JIRA) |
[jira] [Commented] (NUTCH-2237) DeduplicationJob: Add extra order criteria based on slug |
Mon, 07 Mar, 15:07 |
Ron van der Vegt (JIRA) |
[jira] [Updated] (NUTCH-2237) DeduplicationJob: Add extra order criteria based on slug |
Mon, 07 Mar, 15:08 |
Cihad Guzel |
GSOC 2016 |
Mon, 07 Mar, 22:38 |
Chris A. Mattmann (JIRA) |
[jira] [Work started] (NUTCH-2132) Publisher/Subscriber model for Nutch to emit events |
Tue, 08 Mar, 04:09 |
Chris A. Mattmann (JIRA) |
[jira] [Assigned] (NUTCH-2132) Publisher/Subscriber model for Nutch to emit events |
Tue, 08 Mar, 04:09 |
Chris A. Mattmann (JIRA) |
[jira] [Commented] (NUTCH-2132) Publisher/Subscriber model for Nutch to emit events |
Tue, 08 Mar, 04:10 |
Markus Jelsma (JIRA) |
[jira] [Commented] (NUTCH-2132) Publisher/Subscriber model for Nutch to emit events |
Tue, 08 Mar, 09:09 |
Lewis John McGibbney (JIRA) |
[jira] [Updated] (NUTCH-2202) Integration of Anthelion (Focused Crawling Module) into Nutch |
Tue, 08 Mar, 13:01 |
Robert Meusel (JIRA) |
[jira] [Commented] (NUTCH-2202) Integration of Anthelion (Focused Crawling Module) into Nutch |
Tue, 08 Mar, 15:05 |
Lewis John McGibbney (JIRA) |
[jira] [Assigned] (NUTCH-2202) Integration of Anthelion (Focused Crawling Module) into Nutch |
Tue, 08 Mar, 18:41 |
Farasath Ahamed (JIRA) |
[jira] [Commented] (NUTCH-2005) Implement HTrace'ing in Nutch |
Tue, 08 Mar, 19:30 |
Chris A. Mattmann (JIRA) |
[jira] [Commented] (NUTCH-2132) Publisher/Subscriber model for Nutch to emit events |
Tue, 08 Mar, 23:42 |
lewismc |
[GitHub] nutch pull request: NUTCH-2202 Integration of Anthelion (Focused C... |
Wed, 09 Mar, 09:21 |
ASF GitHub Bot (JIRA) |
[jira] [Commented] (NUTCH-2202) Integration of Anthelion (Focused Crawling Module) into Nutch |
Wed, 09 Mar, 09:21 |
Lewis John McGibbney (JIRA) |
[jira] [Commented] (NUTCH-2202) Integration of Anthelion (Focused Crawling Module) into Nutch |
Wed, 09 Mar, 15:24 |
Lewis John McGibbney (JIRA) |
[jira] [Updated] (NUTCH-2185) protocol-soda-consumer plugin |
Wed, 09 Mar, 22:38 |
Chris A. Mattmann (JIRA) |
[jira] [Commented] (NUTCH-2132) Publisher/Subscriber model for Nutch to emit events |
Sun, 13 Mar, 19:14 |
Raghav Bharadwaj Jayasimha Rao (JIRA) |
[jira] [Created] (NUTCH-2239) Selenium Handlers for Ajax Patterns from Student submissions |
Mon, 14 Mar, 01:58 |
Chris A. Mattmann (JIRA) |
[jira] [Updated] (NUTCH-2239) Selenium Handlers for Ajax Patterns from Student submissions |
Mon, 14 Mar, 06:44 |
Chris A. Mattmann (JIRA) |
[jira] [Assigned] (NUTCH-2239) Selenium Handlers for Ajax Patterns from Student submissions |
Mon, 14 Mar, 06:44 |
Chris A. Mattmann (JIRA) |
[jira] [Work started] (NUTCH-2239) Selenium Handlers for Ajax Patterns from Student submissions |
Mon, 14 Mar, 06:45 |
Chris A. Mattmann (JIRA) |
[jira] [Updated] (NUTCH-2239) Selenium Handlers for Ajax Patterns from Student submissions |
Mon, 14 Mar, 06:45 |
Chris A. Mattmann (JIRA) |
[jira] [Updated] (NUTCH-2239) Selenium Handlers for Ajax Patterns from Student submissions |
Mon, 14 Mar, 06:45 |
Chris A. Mattmann (JIRA) |
[jira] [Work started] (NUTCH-2191) Add protocol-htmlunit |
Mon, 14 Mar, 07:04 |
Chris A. Mattmann (JIRA) |
[jira] [Commented] (NUTCH-2191) Add protocol-htmlunit |
Mon, 14 Mar, 07:04 |
Chris A. Mattmann (JIRA) |
[jira] [Assigned] (NUTCH-2191) Add protocol-htmlunit |
Mon, 14 Mar, 07:04 |
Markus Jelsma (JIRA) |
[jira] [Commented] (NUTCH-2191) Add protocol-htmlunit |
Mon, 14 Mar, 15:50 |
Adnane B. (JIRA) |
[jira] [Commented] (NUTCH-2222) re-fetch deletes all metadata except _csh_ and _rs_ |
Mon, 14 Mar, 16:22 |
Longuemare (JIRA) |
[jira] [Commented] (NUTCH-2138) Tika cannot OCR embedded images from PDF |
Tue, 15 Mar, 15:04 |
eldk (JIRA) |
[jira] [Comment Edited] (NUTCH-2138) Tika cannot OCR embedded images from PDF |
Tue, 15 Mar, 15:12 |
eldk (JIRA) |
[jira] [Comment Edited] (NUTCH-2138) Tika cannot OCR embedded images from PDF |
Tue, 15 Mar, 15:19 |
Karanjeet Singh (JIRA) |
[jira] [Commented] (NUTCH-2191) Add protocol-htmlunit |
Tue, 15 Mar, 19:57 |
Chris A. Mattmann (JIRA) |
[jira] [Commented] (NUTCH-2191) Add protocol-htmlunit |
Tue, 15 Mar, 19:58 |
songwanging (JIRA) |
[jira] [Commented] (NUTCH-2076) exceptions are not handled when using method waitForCompletion in a try block |
Wed, 16 Mar, 05:01 |
Lewis John McGibbney (JIRA) |
[jira] [Commented] (NUTCH-1492) Support gora-dynamodb in Nutch 2.x |
Thu, 17 Mar, 07:15 |
Lewis John McGibbney (JIRA) |
[jira] [Resolved] (NUTCH-2206) Provide example scoring.similarity.stopword.file |
Thu, 17 Mar, 07:15 |
Karanjeet Singh (JIRA) |
[jira] [Comment Edited] (NUTCH-2191) Add protocol-htmlunit |
Thu, 17 Mar, 07:58 |
Karanjeet Singh (JIRA) |
[jira] [Commented] (NUTCH-2191) Add protocol-htmlunit |
Thu, 17 Mar, 07:58 |
Markus Jelsma |
1.11 branch/tag |
Thu, 17 Mar, 09:43 |
Mattmann, Chris A (3980) |
Re: 1.11 branch/tag |
Thu, 17 Mar, 14:13 |
Markus Jelsma |
RE: 1.11 branch/tag |
Thu, 17 Mar, 15:39 |
lq (JIRA) |
[jira] [Created] (NUTCH-2240) java.lang.NoSuchFieldError: INSTANCE selenium nutch |
Thu, 17 Mar, 17:29 |
eldk (JIRA) |
[jira] [Commented] (NUTCH-2138) Tika cannot OCR embedded images from PDF |
Thu, 17 Mar, 17:52 |
eldk (JIRA) |
[jira] [Comment Edited] (NUTCH-2138) Tika cannot OCR embedded images from PDF |
Thu, 17 Mar, 18:40 |