Doğacan Güney |
Re: How to modify crawldb values |
Tue, 23 Jan, 15:06 |
Thomas Müller |
Re: [jira] Commented: (NUTCH-422) index-extra plugin creates additional fields in the index, based on configurable logic |
Wed, 03 Jan, 06:35 |
Thomas Müller |
Re: Next Nutch release |
Tue, 16 Jan, 16:37 |
Dogacan Güney (JIRA) |
[jira] Updated: (NUTCH-420) DeleteDuplicates.HashPartitioner depends on the order of IndexDocs |
Thu, 04 Jan, 09:30 |
Dogacan Güney (JIRA) |
[jira] Commented: (NUTCH-420) DeleteDuplicates.HashPartitioner depends on the order of IndexDocs |
Thu, 04 Jan, 09:30 |
Dogacan Güney (JIRA) |
[jira] Commented: (NUTCH-420) DeleteDuplicates.HashPartitioner depends on the order of IndexDocs |
Mon, 08 Jan, 15:28 |
Dogacan Güney (JIRA) |
[jira] Updated: (NUTCH-420) DeleteDuplicates.HashPartitioner depends on the order of IndexDocs |
Mon, 08 Jan, 15:28 |
Dogacan Güney (JIRA) |
[jira] Commented: (NUTCH-420) DeleteDuplicates.HashPartitioner depends on the order of IndexDocs |
Tue, 09 Jan, 08:42 |
Dogacan Güney (JIRA) |
[jira] Updated: (NUTCH-420) DeleteDuplicates.HashPartitioner depends on the order of IndexDocs |
Tue, 09 Jan, 08:42 |
AJ Chen |
Re: Reviving Nutch 0.7 |
Mon, 22 Jan, 20:42 |
Alan Tanaman |
New index-extra plugin and patch to IndexFilters |
Tue, 02 Jan, 10:24 |
Alan Tanaman |
RE: [jira] Commented: (NUTCH-422) index-extra plugin creates additional fields in the index, based on configurable logic |
Tue, 02 Jan, 11:52 |
Alan Tanaman |
Creating Lucence Compound Index |
Tue, 02 Jan, 12:57 |
Alan Tanaman |
RE: Creating Lucence Compound Index |
Tue, 02 Jan, 13:34 |
Alan Tanaman |
RE: Creating Lucence Compound Index |
Tue, 02 Jan, 14:12 |
Alan Tanaman |
RE: [jira] Commented: (NUTCH-422) index-extra plugin creates additional fields in the index, based on configurable logic |
Wed, 03 Jan, 12:09 |
Alan Tanaman |
RE: Next Nutch release |
Tue, 16 Jan, 17:48 |
Alan Tanaman |
RE: How to index in real time? |
Wed, 17 Jan, 15:09 |
Alan Tanaman |
RE: Reviving Nutch 0.7 |
Mon, 22 Jan, 10:37 |
Alan Tanaman |
RE: Reviving Nutch 0.7 |
Tue, 23 Jan, 09:41 |
Alan Tanaman |
RE: parse-rss make them items as different pages |
Sun, 28 Jan, 13:04 |
Alan Tanaman (JIRA) |
[jira] Commented: (NUTCH-422) index-extra plugin creates additional fields in the index, based on configurable logic |
Tue, 02 Jan, 22:57 |
Alan Tanaman (JIRA) |
[jira] Commented: (NUTCH-422) index-extra plugin creates additional fields in the index, based on configurable logic |
Mon, 15 Jan, 10:52 |
Andrew Groh (JIRA) |
[jira] Created: (NUTCH-436) Incorrect handling of relative paths when the embedded URL path is empty |
Fri, 26 Jan, 14:09 |
Andrew Groh (JIRA) |
[jira] Updated: (NUTCH-436) Incorrect handling of relative paths when the embedded URL path is empty |
Fri, 26 Jan, 14:13 |
Andrew Groh (JIRA) |
[jira] Commented: (NUTCH-436) Incorrect handling of relative paths when the embedded URL path is empty |
Fri, 26 Jan, 18:37 |
Andrzej Bialecki |
Re: Creating Lucence Compound Index |
Tue, 02 Jan, 13:07 |
Andrzej Bialecki |
Re: Creating Lucence Compound Index |
Tue, 02 Jan, 14:06 |
Andrzej Bialecki |
Re: Bug in Nutch, possibly due to issues-273 and 322 |
Wed, 03 Jan, 19:50 |
Andrzej Bialecki |
Re: How can I get one plugin's root dir |
Mon, 15 Jan, 17:33 |
Andrzej Bialecki |
Re: How can I get one plugin's root dir |
Tue, 16 Jan, 15:44 |
Andrzej Bialecki |
Re: Next Nutch release |
Tue, 16 Jan, 16:19 |
Andrzej Bialecki |
Re: How can I get one plugin's root dir |
Tue, 16 Jan, 16:55 |
Andrzej Bialecki |
Re: How can I get one plugin's root dir |
Tue, 16 Jan, 19:27 |
Andrzej Bialecki |
Re: Next Nutch release |
Wed, 17 Jan, 18:24 |
Andrzej Bialecki |
Fetcher2 |
Wed, 17 Jan, 21:18 |
Andrzej Bialecki |
Re: java.io.EOFException in latest nightly in mergesegs from hadoop.io.DataOutputBuffer |
Thu, 18 Jan, 21:44 |
Andrzej Bialecki |
Re: java.io.EOFException in latest nightly in mergesegs from hadoop.io.DataOutputBuffer |
Fri, 19 Jan, 09:29 |
Andrzej Bialecki |
Re: Next Nutch release |
Fri, 19 Jan, 10:32 |
Andrzej Bialecki |
Re: Next Nutch release |
Fri, 19 Jan, 21:22 |
Andrzej Bialecki |
Re: How to Become a Nutch Developer |
Sun, 21 Jan, 20:20 |
Andrzej Bialecki |
Re: Fetcher2 |
Mon, 22 Jan, 16:09 |
Andrzej Bialecki |
Re: How to Become a Nutch Developer |
Mon, 22 Jan, 16:22 |
Andrzej Bialecki |
Re: java.io.EOFException in latest nightly in mergesegs from hadoop.io.DataOutputBuffer |
Mon, 22 Jan, 20:17 |
Andrzej Bialecki |
Re: is crawldb format in Nutch 0.8 compatible with Nutch0.7 |
Tue, 23 Jan, 20:55 |
Andrzej Bialecki |
Re: Modified date in crawldb |
Thu, 25 Jan, 15:29 |
Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-427) protocol-smb: plugin protocol implementing the CIFS/SMB protocol. This protocol allows Nutch to crawl Microsoft Windows Shares remotely using the CIFS/SMB protocol implmentation. |
Fri, 05 Jan, 15:56 |
Andrzej Bialecki (JIRA) |
[jira] Closed: (NUTCH-425) parse-js pollutes anchor text with base URL of source page |
Fri, 05 Jan, 17:01 |
Andrzej Bialecki (JIRA) |
[jira] Closed: (NUTCH-426) parse-js skips parsing if found URL fails java.net.URL parse |
Fri, 05 Jan, 17:01 |
Andrzej Bialecki (JIRA) |
[jira] Closed: (NUTCH-420) DeleteDuplicates.HashPartitioner depends on the order of IndexDocs |
Thu, 11 Jan, 22:02 |
Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-61) Adaptive re-fetch interval. Detecting umodified content |
Wed, 17 Jan, 19:34 |
Andrzej Bialecki (JIRA) |
[jira] Closed: (NUTCH-68) A tool to generate arbitrary fetchlists |
Wed, 17 Jan, 19:57 |
Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-353) pages that serverside forwards will be refetched every time |
Sat, 20 Jan, 22:24 |
Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-434) Replace usage of ObjectWritable with something based on GenericWritable |
Wed, 24 Jan, 20:47 |
Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-339) Refactor nutch to allow fetcher improvements |
Thu, 25 Jan, 07:43 |
Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-433) java.io.EOFException in newer nightlies in mergesegs or indexing from hadoop.io.DataOutputBuffer |
Thu, 25 Jan, 17:41 |
Armel Nene (JIRA) |
[jira] Created: (NUTCH-427) protocol-smb: plugin protocol implementing the CIFS/SMB protocol. This protocol allows Nutch to crawl Microsoft Windows Shares remotely using the CIFS/SMB protocol implmentation. |
Fri, 05 Jan, 14:44 |
Armel Nene (JIRA) |
[jira] Updated: (NUTCH-427) protocol-smb: plugin protocol implementing the CIFS/SMB protocol. This protocol allows Nutch to crawl Microsoft Windows Shares remotely using the CIFS/SMB protocol implmentation. |
Fri, 05 Jan, 15:11 |
Armel Nene (JIRA) |
[jira] Commented: (NUTCH-427) protocol-smb: plugin protocol implementing the CIFS/SMB protocol. This protocol allows Nutch to crawl Microsoft Windows Shares remotely using the CIFS/SMB protocol implmentation. |
Fri, 05 Jan, 16:02 |
Armel Nene (JIRA) |
[jira] Commented: (NUTCH-61) Adaptive re-fetch interval. Detecting umodified content |
Mon, 15 Jan, 10:12 |
Armel Nene (JIRA) |
[jira] Updated: (NUTCH-61) Adaptive re-fetch interval. Detecting umodified content |
Thu, 18 Jan, 09:52 |
Armel Nene (JIRA) |
[jira] Commented: (NUTCH-61) Adaptive re-fetch interval. Detecting umodified content |
Thu, 18 Jan, 10:00 |
Armel T. Nene |
protocol-smb: a new protocol plugin for Windows Shares |
Fri, 05 Jan, 15:22 |
Armel T. Nene |
RE: Next Nutch release |
Wed, 17 Jan, 18:08 |
Armel T. Nene |
java.lang.IllegalStateException |
Fri, 19 Jan, 10:17 |
Armel T. Nene |
How to modify crawldb values |
Tue, 23 Jan, 14:50 |
Armel T. Nene |
RE: How to modify crawldb values |
Tue, 23 Jan, 16:18 |
Armel T. Nene |
is crawldb format in Nutch 0.8 compatible with Nutch0.7 |
Tue, 23 Jan, 19:26 |
Armel T. Nene |
RE: Fetcher2 |
Thu, 25 Jan, 00:06 |
Armel T. Nene |
RE: Fetcher2 |
Thu, 25 Jan, 11:43 |
Armel T. Nene |
Modified date in crawldb |
Thu, 25 Jan, 11:52 |
Armel T. Nene |
RE: Modified date in crawldb |
Thu, 25 Jan, 15:07 |
Armel T. Nene |
threads-safe methods in Nutch |
Thu, 25 Jan, 16:27 |
Brian Whitman |
java.io.EOFException in latest nightly in mergesegs from hadoop.io.DataOutputBuffer |
Thu, 18 Jan, 20:08 |
Brian Whitman |
Re: java.io.EOFException in latest nightly in mergesegs from hadoop.io.DataOutputBuffer |
Thu, 18 Jan, 22:09 |
Brian Whitman |
Re: java.io.EOFException in latest nightly in mergesegs from hadoop.io.DataOutputBuffer |
Fri, 19 Jan, 14:24 |
Brian Whitman |
Re: java.io.EOFException in latest nightly in mergesegs from hadoop.io.DataOutputBuffer |
Mon, 22 Jan, 16:36 |
Brian Whitman (JIRA) |
[jira] Created: (NUTCH-432) JAVA_PLATFORM with spaces (i.e. Mac OS X-ppc-32) breaks bin/nutch script |
Wed, 24 Jan, 17:39 |
Brian Whitman (JIRA) |
[jira] Commented: (NUTCH-432) JAVA_PLATFORM with spaces (i.e. Mac OS X-ppc-32) breaks bin/nutch script |
Wed, 24 Jan, 17:44 |
Brian Whitman (JIRA) |
[jira] Created: (NUTCH-433) java.io.EOFException in newer nightlies in mergesegs or indexing from hadoop.io.DataOutputBuffer |
Wed, 24 Jan, 17:53 |
Brian Whitman (JIRA) |
[jira] Commented: (NUTCH-433) java.io.EOFException in newer nightlies in mergesegs or indexing from hadoop.io.DataOutputBuffer |
Thu, 25 Jan, 17:08 |
Brian Whitman (JIRA) |
[jira] Commented: (NUTCH-433) java.io.EOFException in newer nightlies in mergesegs or indexing from hadoop.io.DataOutputBuffer |
Thu, 25 Jan, 17:53 |
Chee Wu |
nutch81 pages seems were not kept but no error message found |
Wed, 03 Jan, 12:30 |
Chris A. Mattmann (JIRA) |
[jira] Created: (NUTCH-431) Move plugin specific properties out of nutch-site.xml and into specific conf files for plugins |
Sat, 20 Jan, 22:03 |
Chris A. Mattmann (JIRA) |
[jira] Commented: (NUTCH-353) pages that serverside forwards will be refetched every time |
Sat, 20 Jan, 23:46 |
Chris A. Mattmann (JIRA) |
[jira] Assigned: (NUTCH-431) Move plugin specific properties out of nutch-site.xml and into specific conf files for plugins |
Fri, 26 Jan, 18:47 |
Chris A. Mattmann (JIRA) |
[jira] Commented: (NUTCH-258) Once Nutch logs a SEVERE log item, Nutch fails forevermore |
Fri, 26 Jan, 18:51 |
Chris A. Mattmann (JIRA) |
[jira] Work started: (NUTCH-390) Javadoc warnings |
Tue, 30 Jan, 05:55 |
Chris A. Mattmann (JIRA) |
[jira] Resolved: (NUTCH-390) Javadoc warnings |
Tue, 30 Jan, 05:57 |
Chris A. Mattmann (JIRA) |
[jira] Closed: (NUTCH-390) Javadoc warnings |
Tue, 30 Jan, 05:59 |
Chris A. Mattmann (JIRA) |
[jira] Work started: (NUTCH-384) Protocol-file plugin does not allow the parse plugins framework to operate properly |
Tue, 30 Jan, 06:05 |
Chris Mattmann |
Re: Next Nutch release |
Tue, 16 Jan, 16:40 |
Chris Mattmann |
Re: How to Become a Nutch Developer |
Sun, 21 Jan, 20:27 |
Chris Mattmann |
Re: Reviving Nutch 0.7 |
Mon, 22 Jan, 15:35 |
Chris Mattmann |
Re: [jira] Commented: (NUTCH-258) Once Nutch logs a SEVERE log item, Nutch fails forevermore |
Thu, 25 Jan, 18:43 |
Chris Mattmann |
Re: [jira] Commented: (NUTCH-258) Once Nutch logs a SEVERE log item, Nutch fails forevermore |
Thu, 25 Jan, 19:11 |
Chris Mattmann |
Re: RSS-fecter and index individul-how can i realize this function |
Wed, 31 Jan, 02:16 |
Chris Mattmann |
Re: RSS-fecter and index individul-how can i realize this function |
Wed, 31 Jan, 03:34 |
Chris Mattmann |
Re: RSS-fecter and index individul-how can i realize this function |
Wed, 31 Jan, 06:44 |
DS jha |
sort result on different set of terms |
Wed, 10 Jan, 15:02 |