Stefan Neufeind (JIRA) |
[jira] Created: (NUTCH-377) Add possibility to search for multiple values |
Sun, 01 Oct, 14:02 |
Otis Gospodnetic (JIRA) |
[jira] Commented: (NUTCH-377) Add possibility to search for multiple values |
Sun, 01 Oct, 21:27 |
Stefan Neufeind (JIRA) |
[jira] Commented: (NUTCH-377) Add possibility to search for multiple values |
Sun, 01 Oct, 22:05 |
Doug Cook (JIRA) |
[jira] Commented: (NUTCH-353) pages that serverside forwards will be refetched every time |
Mon, 02 Oct, 18:12 |
JP Nutch |
Re: svn commit: r451649 - /lucene/nutch/trunk/CHANGES.txt |
Mon, 02 Oct, 18:23 |
Ken Krugler (JIRA) |
[jira] Commented: (NUTCH-353) pages that serverside forwards will be refetched every time |
Mon, 02 Oct, 20:26 |
Uroš Gruber |
Re: [jira] Commented: (NUTCH-353) pages that serverside forwards will be refetched every time |
Tue, 03 Oct, 06:28 |
Doug Cook |
Re: [jira] Commented: (NUTCH-353) pages that serverside forwards will be refetched every time |
Tue, 03 Oct, 15:07 |
Chris Mattmann |
Nutch requires JDK 1.5 now? |
Tue, 03 Oct, 15:34 |
Andrzej Bialecki |
Re: Nutch requires JDK 1.5 now? |
Tue, 03 Oct, 15:38 |
Sami Siren |
Re: Nutch requires JDK 1.5 now? |
Tue, 03 Oct, 16:04 |
Chris Mattmann |
Re: Nutch requires JDK 1.5 now? |
Tue, 03 Oct, 16:08 |
Chris Mattmann |
Re: Nutch requires JDK 1.5 now? |
Tue, 03 Oct, 16:10 |
Piotr Kosiorowski |
Re: Nutch requires JDK 1.5 now? |
Tue, 03 Oct, 17:14 |
Andrzej Bialecki (JIRA) |
[jira] Created: (NUTCH-378) MetaWrapper decorator |
Tue, 03 Oct, 19:48 |
Andrzej Bialecki (JIRA) |
[jira] Updated: (NUTCH-378) MetaWrapper decorator |
Tue, 03 Oct, 19:48 |
Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-371) DeleteDuplicates should remove documents with duplicate URLs |
Tue, 03 Oct, 19:50 |
Doug Cutting (JIRA) |
[jira] Resolved: (NUTCH-304) Change JIRA email address for nutch issues from apache incubator |
Tue, 03 Oct, 22:24 |
Doug Cutting (JIRA) |
[jira] Commented: (NUTCH-353) pages that serverside forwards will be refetched every time |
Tue, 03 Oct, 22:45 |
Uroš Gruber |
Re: [jira] Commented: (NUTCH-353) pages that serverside forwards will be refetched every time |
Wed, 04 Oct, 05:56 |
Chris A. Mattmann (JIRA) |
[jira] Created: (NUTCH-379) ParseUtil does not pass through the content's URL to the ParserFactory |
Wed, 04 Oct, 16:03 |
Chris A. Mattmann (JIRA) |
[jira] Work started: (NUTCH-379) ParseUtil does not pass through the content's URL to the ParserFactory |
Wed, 04 Oct, 16:03 |
Chris A. Mattmann (JIRA) |
[jira] Updated: (NUTCH-379) ParseUtil does not pass through the content's URL to the ParserFactory |
Wed, 04 Oct, 17:51 |
Jim Kellerman (JIRA) |
[jira] Created: (NUTCH-380) Nutch does not run/build against Hadoop 0.6 |
Wed, 04 Oct, 19:14 |
Andrzej Bialecki (JIRA) |
[jira] Updated: (NUTCH-371) DeleteDuplicates should remove documents with duplicate URLs |
Thu, 05 Oct, 00:11 |
nutch-...@lucene.apache.org |
Nutch nightly build failure |
Thu, 05 Oct, 00:22 |
Jim Kellerman (JIRA) |
[jira] Commented: (NUTCH-371) DeleteDuplicates should remove documents with duplicate URLs |
Thu, 05 Oct, 02:37 |
Jim Kellerman (JIRA) |
[jira] Updated: (NUTCH-371) DeleteDuplicates should remove documents with duplicate URLs |
Thu, 05 Oct, 08:40 |
Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-371) DeleteDuplicates should remove documents with duplicate URLs |
Thu, 05 Oct, 09:55 |
Uros Gruber (JIRA) |
[jira] Commented: (NUTCH-353) pages that serverside forwards will be refetched every time |
Thu, 05 Oct, 19:19 |
Uros Gruber (JIRA) |
[jira] Created: (NUTCH-381) Ignore external link not work as expected |
Thu, 05 Oct, 19:35 |
Jim Kellerman (JIRA) |
[jira] Updated: (NUTCH-380) Nutch does not run/build against Hadoop 0.6 |
Thu, 05 Oct, 22:35 |
Jim Kellerman (JIRA) |
[jira] Updated: (NUTCH-371) DeleteDuplicates should remove documents with duplicate URLs |
Thu, 05 Oct, 23:22 |
nutch.newbie (JIRA) |
[jira] Commented: (NUTCH-381) Ignore external link not work as expected |
Fri, 06 Oct, 02:06 |
Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-381) Ignore external link not work as expected |
Fri, 06 Oct, 06:58 |
Uros Gruber (JIRA) |
[jira] Commented: (NUTCH-381) Ignore external link not work as expected |
Fri, 06 Oct, 13:54 |
Shay Lawless |
NutchWax |
Fri, 06 Oct, 16:22 |
Gordon Mohr |
Re: NutchWax |
Fri, 06 Oct, 17:00 |
jaison |
Setting Datanodes and Task Trackers |
Sat, 07 Oct, 11:22 |
Jin Yang |
First Time Run Nutch0.8.1 in Eclipse 3.2.1 Problem! |
Sat, 07 Oct, 22:23 |
nutch-...@lucene.apache.org |
Nutch nightly build failure |
Sun, 08 Oct, 00:21 |
tryma |
Problem parsing some MS Excel & other formats (Office 2003) |
Mon, 09 Oct, 07:02 |
Andrzej Bialecki |
Re: Problem parsing some MS Excel & other formats (Office 2003) |
Mon, 09 Oct, 07:21 |
tryma |
Re: Problem parsing some MS Excel & other formats (Office 2003) |
Mon, 09 Oct, 07:42 |
Andrzej Bialecki |
Re: Problem parsing some MS Excel & other formats (Office 2003) |
Mon, 09 Oct, 07:49 |
tryma |
Re: Problem parsing some MS Excel & other formats (Office 2003) |
Mon, 09 Oct, 08:10 |
Andrzej Bialecki |
Re: Problem parsing some MS Excel & other formats (Office 2003) |
Mon, 09 Oct, 09:03 |
nutch.newbie (JIRA) |
[jira] Commented: (NUTCH-185) XMLParser is configurable xml parser plugin. |
Mon, 09 Oct, 13:13 |
Jim Kellerman (JIRA) |
[jira] Created: (NUTCH-382) Fix for NUTCH-365 introduced a bug if generate.max.per.host.by.ip is enabled |
Tue, 10 Oct, 02:14 |
Jim Kellerman (JIRA) |
[jira] Updated: (NUTCH-382) Fix for NUTCH-365 introduced a bug if generate.max.per.host.by.ip is enabled |
Tue, 10 Oct, 02:20 |
Andrzej Bialecki (JIRA) |
[jira] Created: (NUTCH-383) Upgrade Nutch to Hadoop 0.7 |
Tue, 10 Oct, 19:26 |
Andrzej Bialecki (JIRA) |
[jira] Updated: (NUTCH-383) Upgrade Nutch to Hadoop 0.7 |
Tue, 10 Oct, 21:20 |
xu nutch |
[Nutch-dev] Re: Which extension point should I extend? |
Wed, 11 Oct, 01:12 |
Paul Ramirez (JIRA) |
[jira] Created: (NUTCH-384) When using the file protocol one can not map a parse plugin to a content type. The only way to get the plugin called is through the default plugin. The issue is that the content type never gets mapped. |
Wed, 11 Oct, 16:54 |
Chris A. Mattmann (JIRA) |
[jira] Updated: (NUTCH-384) Protocol-file plugin does not allow the parse plugins framework to operate properly |
Wed, 11 Oct, 16:56 |
Chris Schneider (JIRA) |
[jira] Created: (NUTCH-385) Server delay feature conflicts with maxThreadsPerHost |
Wed, 11 Oct, 18:40 |
Chris Schneider (JIRA) |
[jira] Commented: (NUTCH-385) Server delay feature conflicts with maxThreadsPerHost |
Wed, 11 Oct, 18:44 |
Chris Schneider (JIRA) |
[jira] Commented: (NUTCH-385) Server delay feature conflicts with maxThreadsPerHost |
Wed, 11 Oct, 18:47 |
Chris A. Mattmann (JIRA) |
[jira] Assigned: (NUTCH-384) Protocol-file plugin does not allow the parse plugins framework to operate properly |
Wed, 11 Oct, 18:51 |
Doug Cutting (JIRA) |
[jira] Commented: (NUTCH-385) Server delay feature conflicts with maxThreadsPerHost |
Wed, 11 Oct, 20:51 |
Andrzej Bialecki (JIRA) |
[jira] Updated: (NUTCH-383) Upgrade Nutch to Hadoop 0.7 |
Fri, 13 Oct, 11:50 |
Andrzej Bialecki |
Re: [jira] Updated: (NUTCH-383) Upgrade Nutch to Hadoop 0.7 |
Fri, 13 Oct, 12:15 |
Sami Siren |
Re: [jira] Updated: (NUTCH-383) Upgrade Nutch to Hadoop 0.7 |
Fri, 13 Oct, 16:25 |
Sami Siren (JIRA) |
[jira] Updated: (NUTCH-379) ParseUtil does not pass through the content's URL to the ParserFactory |
Fri, 13 Oct, 16:32 |
Greg Kim (JIRA) |
[jira] Commented: (NUTCH-357) crawling simulation |
Fri, 13 Oct, 17:59 |
Sami Siren (JIRA) |
[jira] Commented: (NUTCH-357) crawling simulation |
Fri, 13 Oct, 18:09 |
Greg Kim (JIRA) |
[jira] Commented: (NUTCH-357) crawling simulation |
Fri, 13 Oct, 18:40 |
Andrzej Bialecki |
Re: [jira] Updated: (NUTCH-383) Upgrade Nutch to Hadoop 0.7 |
Fri, 13 Oct, 21:11 |
KuroSaka TeruHiko (JIRA) |
[jira] Commented: (NUTCH-224) Nutch doesn't handle Korean text at all |
Sat, 14 Oct, 02:48 |
Sami Siren (JIRA) |
[jira] Commented: (NUTCH-339) Refactor nutch to allow fetcher improvements |
Sat, 14 Oct, 02:49 |
Sami Siren |
email to jira comments (WAS Re: [jira] Commented: (NUTCH-339) Refactor nutch to allow fetcher improvements) |
Sat, 14 Oct, 05:49 |
Chris Mattmann |
Re: [jira] Updated: (NUTCH-379) ParseUtil does not pass through the content's URL to the ParserFactory |
Sat, 14 Oct, 06:48 |
Andrzej Bialecki (JIRA) |
[jira] Updated: (NUTCH-383) Upgrade Nutch to Hadoop 0.7 |
Sat, 14 Oct, 19:12 |
Ernesto De Santis (JIRA) |
[jira] Created: (NUTCH-386) Plugin to index categories by url rules |
Sat, 14 Oct, 20:49 |
Ernesto De Santis (JIRA) |
[jira] Updated: (NUTCH-386) Plugin to index categories by url rules |
Sat, 14 Oct, 21:04 |
Rida Benjelloun |
Re: [jira] Commented: (NUTCH-185) XMLParser is configurable xml parser plugin. |
Mon, 16 Oct, 00:42 |
Doug Cutting |
Re: email to jira comments (WAS Re: [jira] Commented: (NUTCH-339) Refactor nutch to allow fetcher improvements) |
Mon, 16 Oct, 20:53 |
Andrzej Bialecki |
HEADS UP: rev. 464654 - upgrade to Hadoop 0.7.1 breaks data compatibility |
Mon, 16 Oct, 21:11 |
Andrzej Bialecki (JIRA) |
[jira] Closed: (NUTCH-371) DeleteDuplicates should remove documents with duplicate URLs |
Mon, 16 Oct, 21:16 |
Andrzej Bialecki (JIRA) |
[jira] Closed: (NUTCH-383) Upgrade Nutch to Hadoop 0.7 |
Mon, 16 Oct, 21:18 |
Piotr Kosiorowski |
Re: email to jira comments (WAS Re: [jira] Commented: (NUTCH-339) Refactor nutch to allow fetcher improvements) |
Tue, 17 Oct, 05:39 |
Paul Ramirez |
Issue with Boosting Fields |
Tue, 17 Oct, 20:31 |
Johannes Zillmann (JIRA) |
[jira] Created: (NUTCH-387) host normalization in Generator$Selector |
Wed, 18 Oct, 09:41 |
ian.mcna...@thomson.com |
RE: Issue with Boosting Fields |
Wed, 18 Oct, 18:05 |
Gal Nitzan |
RE: Issue with Boosting Fields |
Wed, 18 Oct, 18:12 |
Jared Dunne (JIRA) |
[jira] Created: (NUTCH-388) nutch-default.xml has outdated example for urlfilter.order |
Wed, 18 Oct, 19:51 |
ian.mcna...@thomson.com |
RE: Issue with Boosting Fields |
Wed, 18 Oct, 21:14 |
Teruhiko Kurosaka |
What javacc options should I use to compile NutchAnalysis.jj? |
Thu, 19 Oct, 00:41 |
Aisha |
Re: Problem parsing some MS Excel & other formats (Office 2003) |
Thu, 19 Oct, 14:46 |
Andrzej Bialecki |
Re: Problem parsing some MS Excel & other formats (Office 2003) |
Thu, 19 Oct, 15:39 |
Teruhiko Kurosaka |
RE: I modify NutchAnalysis.jj and NutchDocumentTokenizer.java to let nutch support chinese word. |
Thu, 19 Oct, 23:54 |
Teruhiko Kurosaka |
RE: What javacc options should I use to compile NutchAnalysis.jj? |
Thu, 19 Oct, 23:57 |
Aisha |
Re: Problem parsing some MS Excel & other formats (Office 2003) |
Fri, 20 Oct, 08:11 |
Otis Gospodnetic (JIRA) |
[jira] Commented: (NUTCH-387) host normalization in Generator$Selector |
Fri, 20 Oct, 08:20 |
Enis Soztutar (JIRA) |
[jira] Created: (NUTCH-389) a url tokenizer implementation for tokenizing index fields : url and host |
Fri, 20 Oct, 08:45 |
Enis Soztutar (JIRA) |
[jira] Updated: (NUTCH-389) a url tokenizer implementation for tokenizing index fields : url and host |
Fri, 20 Oct, 08:47 |
Enis Soztutar (JIRA) |
[jira] Updated: (NUTCH-389) a url tokenizer implementation for tokenizing index fields : url and host |
Fri, 20 Oct, 08:49 |
AJ Chen |
outlink extractor finds lots of junk |
Tue, 24 Oct, 05:32 |
nutch.newbie (JIRA) |
[jira] Commented: (NUTCH-185) XMLParser is configurable xml parser plugin. |
Tue, 24 Oct, 05:54 |
nutch.newbie (JIRA) |
[jira] Updated: (NUTCH-185) XMLParser is configurable xml parser plugin. |
Tue, 24 Oct, 06:30 |