Siddharth Jha (JIRA) |
[jira] Commented: (NUTCH-585) [PARSE-HTML plugin] Block certain parts of HTML code from being indexed |
Mon, 03 Mar, 17:14 |
Siddharth Jha (JIRA) |
[jira] Created: (NUTCH-617) Cached Text Only |
Tue, 04 Mar, 08:45 |
Andrzej Bialecki (JIRA) |
[jira] Closed: (NUTCH-617) Cached Text Only |
Tue, 04 Mar, 19:23 |
Frederic Wenzel |
Nightly builds unavailable |
Wed, 05 Mar, 10:11 |
Sami Siren |
Re: Nightly builds unavailable |
Wed, 05 Mar, 18:27 |
Andrzej Bialecki (JIRA) |
[jira] Created: (NUTCH-618) Tika error "Media type alias already exists" |
Thu, 06 Mar, 07:17 |
Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-618) Tika error "Media type alias already exists" |
Fri, 07 Mar, 01:30 |
Chris A. Mattmann (JIRA) |
[jira] Assigned: (NUTCH-618) Tika error "Media type alias already exists" |
Fri, 07 Mar, 06:32 |
Chris A. Mattmann (JIRA) |
[jira] Work started: (NUTCH-618) Tika error "Media type alias already exists" |
Fri, 07 Mar, 06:34 |
Chris A. Mattmann (JIRA) |
[jira] Commented: (NUTCH-618) Tika error "Media type alias already exists" |
Fri, 07 Mar, 06:34 |
Euan Clark |
Confine nutch to one NIC? |
Sun, 09 Mar, 20:24 |
dong chen |
I have some problem with nutch result |
Tue, 11 Mar, 05:34 |
ogjunk-nu...@yahoo.com |
Re: Confine nutch to one NIC? |
Tue, 11 Mar, 20:21 |
Otis Gospodnetic (JIRA) |
[jira] Commented: (NUTCH-296) Image Search |
Wed, 12 Mar, 01:48 |
naveen.gosw...@wipro.com |
Problem in running Nutch where proxy authentication is required. |
Wed, 12 Mar, 16:09 |
naveen.gosw...@wipro.com |
Problem in running Nutch where proxy authentication is required. |
Wed, 12 Mar, 16:20 |
Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-616) Reset Fetch Retry counter when fetch is successful |
Fri, 14 Mar, 12:13 |
Andrzej Bialecki (JIRA) |
[jira] Updated: (NUTCH-616) Reset Fetch Retry counter when fetch is successful |
Fri, 14 Mar, 13:27 |
Andrzej Bialecki (JIRA) |
[jira] Assigned: (NUTCH-616) Reset Fetch Retry counter when fetch is successful |
Fri, 14 Mar, 13:29 |
Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-615) Redirected URL are fetched wihtout setting any FetchInterval |
Fri, 14 Mar, 14:02 |
Andrzej Bialecki (JIRA) |
[jira] Closed: (NUTCH-613) Empty Summaries and Cached Pages |
Fri, 14 Mar, 14:24 |
Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-613) Empty Summaries and Cached Pages |
Fri, 14 Mar, 14:24 |
Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-612) URL filtering is always disabled in Generator when invoked by Crawl |
Fri, 14 Mar, 14:38 |
Andrzej Bialecki (JIRA) |
[jira] Closed: (NUTCH-612) URL filtering is always disabled in Generator when invoked by Crawl |
Fri, 14 Mar, 14:38 |
Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-610) Can't Update or modify an index while web gui is running |
Fri, 14 Mar, 14:44 |
Andrzej Bialecki (JIRA) |
[jira] Closed: (NUTCH-601) Recrawling on existing crawl directory using force option |
Fri, 14 Mar, 14:54 |
Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-601) Recrawling on existing crawl directory using force option |
Fri, 14 Mar, 14:54 |
Andrzej Bialecki (JIRA) |
[jira] Closed: (NUTCH-590) Index multiple docs per call using IndexingFilter extension point |
Fri, 14 Mar, 15:00 |
Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-590) Index multiple docs per call using IndexingFilter extension point |
Fri, 14 Mar, 15:00 |
Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-592) Fetcher2 : NPE for page with status ProtocolStatus.TEMP_MOVED |
Fri, 14 Mar, 15:00 |
Andrzej Bialecki (JIRA) |
[jira] Closed: (NUTCH-592) Fetcher2 : NPE for page with status ProtocolStatus.TEMP_MOVED |
Fri, 14 Mar, 15:00 |
Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-575) NPE in OpenSearchServlet when summary is null |
Fri, 14 Mar, 15:10 |
Andrzej Bialecki (JIRA) |
[jira] Closed: (NUTCH-575) NPE in OpenSearchServlet when summary is null |
Fri, 14 Mar, 15:10 |
Jesiel Trevisan |
Re: [jira] Commented: (NUTCH-575) NPE in OpenSearchServlet when summary is null |
Fri, 14 Mar, 16:16 |
Andrzej Bialecki |
Re: [jira] Commented: (NUTCH-575) NPE in OpenSearchServlet when summary is null |
Fri, 14 Mar, 17:24 |
Susam Pal |
Re: Problem in running Nutch where proxy authentication is required. |
Fri, 14 Mar, 17:41 |
Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-566) Sun's URL class has bug in creation of relative query URLs |
Fri, 14 Mar, 23:34 |
Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-556) automatic adjust the CrawlDatum.fetchInterval according to the number of newly outlinks |
Fri, 14 Mar, 23:38 |
Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-530) Add a combiner to improve performance on updatedb |
Fri, 14 Mar, 23:42 |
Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-70) duplicate pages - virtual hosts in db. |
Fri, 14 Mar, 23:58 |
Andrzej Bialecki (JIRA) |
[jira] Closed: (NUTCH-70) duplicate pages - virtual hosts in db. |
Fri, 14 Mar, 23:58 |
Andrzej Bialecki (JIRA) |
[jira] Closed: (NUTCH-126) Fetching via https does not work with a proxy (patch) |
Sat, 15 Mar, 00:18 |
Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-126) Fetching via https does not work with a proxy (patch) |
Sat, 15 Mar, 00:18 |
Andrzej Bialecki (JIRA) |
[jira] Closed: (NUTCH-157) Problem during parsing msword document . It fetching properly but parsing is not working. Please show me the way how can i parse it |
Sat, 15 Mar, 00:20 |
Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-157) Problem during parsing msword document . It fetching properly but parsing is not working. Please show me the way how can i parse it |
Sat, 15 Mar, 00:20 |
Andrzej Bialecki (JIRA) |
[jira] Closed: (NUTCH-168) setting http.content.limit to -1 seems to break text parsing on some files |
Sat, 15 Mar, 00:22 |
Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-168) setting http.content.limit to -1 seems to break text parsing on some files |
Sat, 15 Mar, 00:22 |
Andrzej Bialecki (JIRA) |
[jira] Closed: (NUTCH-189) Injection infinite loop |
Sat, 15 Mar, 00:24 |
Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-189) Injection infinite loop |
Sat, 15 Mar, 00:24 |
Hudson (JIRA) |
[jira] Commented: (NUTCH-575) NPE in OpenSearchServlet when summary is null |
Sat, 15 Mar, 04:15 |
Hudson (JIRA) |
[jira] Commented: (NUTCH-126) Fetching via https does not work with a proxy (patch) |
Sat, 15 Mar, 04:15 |
Hudson (JIRA) |
[jira] Commented: (NUTCH-601) Recrawling on existing crawl directory using force option |
Sat, 15 Mar, 04:15 |
Hudson (JIRA) |
[jira] Commented: (NUTCH-613) Empty Summaries and Cached Pages |
Sat, 15 Mar, 04:15 |
Hudson (JIRA) |
[jira] Commented: (NUTCH-612) URL filtering is always disabled in Generator when invoked by Crawl |
Sat, 15 Mar, 04:15 |
naveen.gosw...@wipro.com |
FW: Problem in running Nutch where proxy authentication is required. |
Sat, 15 Mar, 11:57 |
naveen.gosw...@wipro.com |
Thread behaviour in Nutch Crawl |
Sat, 15 Mar, 11:58 |
Vinci (JIRA) |
[jira] Created: (NUTCH-619) Another Language Identifier Plugin using Unicode code point range |
Sat, 15 Mar, 15:40 |
Vinci |
zh.ngp |
Sat, 15 Mar, 16:17 |
Vinci |
How can I change the analyzer of nutch query by plugin? |
Sat, 15 Mar, 16:26 |
Mark DeSpain (JIRA) |
[jira] Created: (NUTCH-620) BasicURLNormalizer should collapse runs of slashes with a single slash |
Sun, 16 Mar, 07:22 |
Mark DeSpain (JIRA) |
[jira] Updated: (NUTCH-620) BasicURLNormalizer should collapse runs of slashes with a single slash |
Sun, 16 Mar, 07:36 |
Vinci |
Chnage the Analyzer by plugin - how to dealing with the query? |
Sun, 16 Mar, 09:30 |
Vinci |
Write back to the segment? |
Sun, 16 Mar, 11:10 |
Vinci |
Re: Chnage the Analyzer by plugin - how to dealing with the query? Query always use the default analyzer! |
Sun, 16 Mar, 11:43 |
Vinci |
Cached page - can it be changed? |
Sun, 16 Mar, 12:12 |
Vinci |
(nutch 1.0) Query processing problem: NutchBeans and webapps search fail, but Luke sucess |
Sun, 16 Mar, 12:28 |
Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-620) BasicURLNormalizer should collapse runs of slashes with a single slash |
Sun, 16 Mar, 20:06 |
Emmanuel Joke (JIRA) |
[jira] Commented: (NUTCH-615) Redirected URL are fetched wihtout setting any FetchInterval |
Mon, 17 Mar, 02:45 |
Emmanuel Joke (JIRA) |
[jira] Commented: (NUTCH-530) Add a combiner to improve performance on updatedb |
Mon, 17 Mar, 02:59 |
Mark DeSpain (JIRA) |
[jira] Commented: (NUTCH-620) BasicURLNormalizer should collapse runs of slashes with a single slash |
Mon, 17 Mar, 06:11 |
Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-615) Redirected URL are fetched wihtout setting any FetchInterval |
Mon, 17 Mar, 10:01 |
Andrzej Bialecki (JIRA) |
[jira] Closed: (NUTCH-615) Redirected URL are fetched wihtout setting any FetchInterval |
Mon, 17 Mar, 12:35 |
Andrzej Bialecki (JIRA) |
[jira] Closed: (NUTCH-616) Reset Fetch Retry counter when fetch is successful |
Mon, 17 Mar, 12:43 |
Andrzej Bialecki |
Retire the original Fetcher before the release? |
Mon, 17 Mar, 13:05 |
Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-620) BasicURLNormalizer should collapse runs of slashes with a single slash |
Mon, 17 Mar, 13:23 |
Dennis Kubes |
Re: Retire the original Fetcher before the release? |
Mon, 17 Mar, 14:01 |
Andrzej Bialecki |
Re: Retire the original Fetcher before the release? |
Mon, 17 Mar, 14:20 |
Dennis Kubes |
Re: Retire the original Fetcher before the release? |
Mon, 17 Mar, 14:36 |
Andrzej Bialecki |
Re: Retire the original Fetcher before the release? |
Mon, 17 Mar, 15:17 |
Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-220) PDF Box can't parse document: java.lang.NullPointerException |
Mon, 17 Mar, 16:23 |
Andrzej Bialecki (JIRA) |
[jira] Closed: (NUTCH-220) PDF Box can't parse document: java.lang.NullPointerException |
Mon, 17 Mar, 16:23 |
Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-223) Crawl.java uses Integer.MAX_VALUE for -topN where Generator.java uses Long.MAX_VALUE for -topN |
Mon, 17 Mar, 16:44 |
Andrzej Bialecki (JIRA) |
[jira] Closed: (NUTCH-223) Crawl.java uses Integer.MAX_VALUE for -topN where Generator.java uses Long.MAX_VALUE for -topN |
Mon, 17 Mar, 16:44 |
Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-243) Some meta-refresh urls get ignored due to matching regular expression |
Mon, 17 Mar, 16:50 |
Andrzej Bialecki (JIRA) |
[jira] Closed: (NUTCH-243) Some meta-refresh urls get ignored due to matching regular expression |
Mon, 17 Mar, 16:50 |
Andrzej Bialecki (JIRA) |
[jira] Closed: (NUTCH-610) Can't Update or modify an index while web gui is running |
Mon, 17 Mar, 16:52 |
Mark DeSpain (JIRA) |
[jira] Commented: (NUTCH-620) BasicURLNormalizer should collapse runs of slashes with a single slash |
Tue, 18 Mar, 02:24 |
Mark DeSpain (JIRA) |
[jira] Commented: (NUTCH-620) BasicURLNormalizer should collapse runs of slashes with a single slash |
Tue, 18 Mar, 04:43 |
Siddhartha Reddy |
Current OPIC implementation |
Tue, 18 Mar, 05:16 |
Hudson (JIRA) |
[jira] Commented: (NUTCH-616) Reset Fetch Retry counter when fetch is successful |
Tue, 18 Mar, 05:33 |
Hudson (JIRA) |
[jira] Commented: (NUTCH-220) PDF Box can't parse document: java.lang.NullPointerException |
Tue, 18 Mar, 05:33 |
Hudson (JIRA) |
[jira] Commented: (NUTCH-223) Crawl.java uses Integer.MAX_VALUE for -topN where Generator.java uses Long.MAX_VALUE for -topN |
Tue, 18 Mar, 05:33 |
Hudson (JIRA) |
[jira] Commented: (NUTCH-615) Redirected URL are fetched wihtout setting any FetchInterval |
Tue, 18 Mar, 05:33 |
Apache Hudson Server |
Build failed in Hudson: Nutch-trunk #393 |
Tue, 18 Mar, 05:34 |
Andrzej Bialecki |
Re: Current OPIC implementation |
Tue, 18 Mar, 09:18 |
Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-598) Remove deprecated use of ToolBase, Migration to the new implementation |
Tue, 18 Mar, 10:05 |
Andrzej Bialecki (JIRA) |
[jira] Updated: (NUTCH-598) Remove deprecated use of ToolBase, Migration to the new implementation |
Tue, 18 Mar, 10:05 |
Andrzej Bialecki (JIRA) |
[jira] Updated: (NUTCH-598) Remove deprecated use of ToolBase, Migration to the new implementation |
Tue, 18 Mar, 10:05 |
Grant Ingersoll (JIRA) |
[jira] Created: (NUTCH-621) Nutch needs to declare it's crypto usage |
Tue, 18 Mar, 13:01 |
Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-609) Allow Plugins to be Loaded from Jar File(s) |
Tue, 18 Mar, 14:51 |