Nutch development mailing list |
How do I use nuch tomerge multiple webdb? |
Fri, 09 Jun, 04:41 |
Bill de hÓra |
which web app? |
Fri, 16 Jun, 22:28 |
Björn Wilmsmann |
wildcard / regular expression searches |
Tue, 06 Jun, 22:12 |
Jérôme Charron |
Re: svn commit: r411943 - in /lucene/nutch/trunk/lib: commons-logging-1.0.4.jar hadoop-0.2.1.jar hadoop-0.3.1.jar log4j-1.2.13.jar |
Tue, 06 Jun, 09:02 |
Jérôme Charron |
Re: svn commit: r411943 - in /lucene/nutch/trunk/lib: commons-logging-1.0.4.jar hadoop-0.2.1.jar hadoop-0.3.1.jar log4j-1.2.13.jar |
Tue, 06 Jun, 15:04 |
Jérôme Charron |
Re: Status of language plugin |
Wed, 07 Jun, 08:58 |
Jérôme Charron |
Nutch logging questions |
Fri, 09 Jun, 17:35 |
Jérôme Charron |
Re: [jira] Resolved: (NUTCH-303) logging improvements |
Tue, 13 Jun, 19:44 |
Jérôme Charron |
Re: [Nutch-cvs] svn commit: r414681 - /lucene/nutch/trunk/src/java/org/apache/nutch/protocol/ProtocolFactory.java |
Fri, 16 Jun, 10:32 |
Jérôme Charron |
Re: <noinde>do not index</noindex> |
Thu, 22 Jun, 14:02 |
Jérôme Charron |
Re: svn commit: r416346 [1/3] - in /lucene/nutch/trunk/src: java/org/apache/nutch/analysis/ java/org/apache/nutch/clustering/ java/org/apache/nutch/crawl/ java/org/apache/nutch/fetcher/ java/org/apache/nutch/indexer/ java/org/apache/nutch/net/ java/o |
Thu, 22 Jun, 20:58 |
Jérôme Charron |
Plugin Repository caching |
Fri, 23 Jun, 09:10 |
Jérôme Charron |
Re: Possible memory leak? |
Wed, 28 Jun, 11:57 |
Lourival Júnior |
Re: resolving IP in... |
Wed, 07 Jun, 17:25 |
Lourival Júnior |
Re: resolving IP in... |
Wed, 07 Jun, 18:50 |
Lourival Júnior |
Adding new urls in WebDB |
Fri, 09 Jun, 11:46 |
Lourival Júnior |
Re: Adding new urls in WebDB |
Fri, 09 Jun, 13:17 |
Lourival Júnior |
Re: Adding new urls in WebDB |
Fri, 09 Jun, 17:26 |
Lourival Júnior |
nutch-default.xml configuration |
Mon, 12 Jun, 14:33 |
Lourival Júnior |
Re: nutch-default.xml configuration |
Mon, 12 Jun, 14:51 |
Lourival Júnior |
Re: Re[2]: nutch-default.xml configuration |
Mon, 12 Jun, 15:06 |
Uygar Yüzsüren |
parse OutOfMemoryError? |
Mon, 05 Jun, 12:35 |
AJ Chen |
does nutch follow HEAD <link> element? |
Fri, 16 Jun, 20:33 |
AJ Chen |
Re: does nutch follow HEAD <link> element? |
Sat, 17 Jun, 00:36 |
Andrzej Bialecki |
Re: summary |
Mon, 05 Jun, 10:55 |
Andrzej Bialecki |
Re: search engine spam detector |
Mon, 05 Jun, 13:12 |
Andrzej Bialecki |
Re: [Nutch-cvs] svn commit: r411594 - /lucene/nutch/trunk/contrib/web2/plugins/build.xml |
Mon, 05 Jun, 15:00 |
Andrzej Bialecki |
Re: [jira] Commented: (NUTCH-258) Once Nutch logs a SEVERE log item, Nutch fails forevermore |
Mon, 05 Jun, 16:50 |
Andrzej Bialecki |
Re: [jira] Commented: (NUTCH-258) Once Nutch logs a SEVERE log item, Nutch fails forevermore |
Mon, 05 Jun, 17:34 |
Andrzej Bialecki |
Re: [Nutch-cvs] svn commit: r411594 - /lucene/nutch/trunk/contrib/web2/plugins/build.xml |
Tue, 06 Jun, 23:20 |
Andrzej Bialecki |
Re: anchor text modifications |
Fri, 09 Jun, 08:30 |
Andrzej Bialecki |
Re: 0.8 release |
Sat, 10 Jun, 08:41 |
Andrzej Bialecki |
Re: [Nutch-cvs] svn commit: r414681 - /lucene/nutch/trunk/src/java/org/apache/nutch/protocol/ProtocolFactory.java |
Fri, 16 Jun, 08:59 |
Andrzej Bialecki |
Re: does nutch follow HEAD <link> element? |
Fri, 16 Jun, 23:07 |
Andrzej Bialecki |
Re: Possible memory leak? |
Wed, 28 Jun, 11:39 |
Andrzej Bialecki (JIRA) |
[jira] Created: (NUTCH-300) Clustering API improvements |
Mon, 05 Jun, 15:20 |
Andrzej Bialecki (JIRA) |
[jira] Updated: (NUTCH-300) Clustering API improvements |
Mon, 05 Jun, 18:23 |
Andrzej Bialecki (JIRA) |
[jira] Commented: (NUTCH-293) support for Crawl-delay in Robots.txt |
Wed, 07 Jun, 20:28 |
Andrzej Bialecki (JIRA) |
[jira] Created: (NUTCH-308) Maximum search time limit |
Thu, 22 Jun, 00:59 |
Andrzej Bialecki (JIRA) |
[jira] Updated: (NUTCH-308) Maximum search time limit |
Thu, 22 Jun, 01:01 |
Andrzej Bialecki (JIRA) |
[jira] Closed: (NUTCH-308) Maximum search time limit |
Mon, 26 Jun, 19:42 |
Andrzej Bialecki (JIRA) |
[jira] Updated: (NUTCH-61) Adaptive re-fetch interval. Detecting umodified content |
Mon, 26 Jun, 20:33 |
Andy Hedges (JIRA) |
[jira] Commented: (NUTCH-129) rtf-parser does not work when opened with wordpad files and saved |
Sun, 25 Jun, 20:22 |
Brian Higgins |
anchor text modifications |
Fri, 09 Jun, 05:36 |
Chris A. Mattmann (JIRA) |
[jira] Commented: (NUTCH-294) Topic-maps of related searchwords |
Sat, 03 Jun, 17:59 |
Chris A. Mattmann (JIRA) |
[jira] Commented: (NUTCH-258) Once Nutch logs a SEVERE log item, Nutch fails forevermore |
Sat, 03 Jun, 18:10 |
Chris A. Mattmann (JIRA) |
[jira] Assigned: (NUTCH-236) PdfParser and RSSParser Log4j appender redirection |
Sat, 03 Jun, 18:16 |
Chris A. Mattmann (JIRA) |
[jira] Commented: (NUTCH-236) PdfParser and RSSParser Log4j appender redirection |
Sat, 03 Jun, 18:18 |
Chris A. Mattmann (JIRA) |
[jira] Updated: (NUTCH-236) PdfParser and RSSParser Log4j appender redirection |
Sat, 03 Jun, 18:18 |
Chris A. Mattmann (JIRA) |
[jira] Updated: (NUTCH-187) Cannot start Nutch datanodes on Windows outside of a cygwin environment because of DF |
Sat, 03 Jun, 18:44 |
Chris A. Mattmann (JIRA) |
[jira] Resolved: (NUTCH-258) Once Nutch logs a SEVERE log item, Nutch fails forevermore |
Sun, 04 Jun, 18:20 |
Chris A. Mattmann (JIRA) |
[jira] Closed: (NUTCH-258) Once Nutch logs a SEVERE log item, Nutch fails forevermore |
Sun, 04 Jun, 18:22 |
Chris A. Mattmann (JIRA) |
[jira] Reopened: (NUTCH-258) Once Nutch logs a SEVERE log item, Nutch fails forevermore |
Mon, 05 Jun, 18:40 |
Chris A. Mattmann (JIRA) |
[jira] Updated: (NUTCH-236) PdfParser and RSSParser Log4j appender redirection |
Fri, 09 Jun, 04:08 |
Chris A. Mattmann (JIRA) |
[jira] Created: (NUTCH-304) Change JIRA email address for nutch issues from apache incubator |
Fri, 09 Jun, 04:13 |
Chris A. Mattmann (JIRA) |
[jira] Updated: (NUTCH-258) Once Nutch logs a SEVERE log item, Nutch fails forevermore |
Fri, 09 Jun, 20:02 |
Chris A. Mattmann (JIRA) |
[jira] Commented: (NUTCH-258) Once Nutch logs a SEVERE log item, Nutch fails forevermore |
Thu, 15 Jun, 19:09 |
Chris Mattmann |
Re: [jira] Commented: (NUTCH-258) Once Nutch logs a SEVERE log item, Nutch fails forevermore |
Mon, 05 Jun, 15:20 |
Chris Mattmann |
Re: [jira] Commented: (NUTCH-258) Once Nutch logs a SEVERE log item, Nutch fails forevermore |
Mon, 05 Jun, 17:01 |
Chris Schneider (JIRA) |
[jira] Created: (NUTCH-301) CommonGrams loads analysis.common.terms.file for each query |
Wed, 07 Jun, 02:50 |
Dagum, Leo |
nutch .72 out-of-the-box build issue |
Wed, 14 Jun, 17:34 |
Dawid Weiss (JIRA) |
[jira] Commented: (NUTCH-294) Topic-maps of related searchwords |
Tue, 06 Jun, 14:25 |
Dawid Weiss (JIRA) |
[jira] Commented: (NUTCH-294) Topic-maps of related searchwords |
Wed, 07 Jun, 07:35 |
Dawid Weiss (JIRA) |
[jira] Commented: (NUTCH-309) Uses commons logging Code Guards |
Thu, 29 Jun, 06:56 |
Dennis Kubes |
Re: resolving IP in... |
Wed, 07 Jun, 20:33 |
Dennis Kubes |
Re: resolving IP in... |
Thu, 08 Jun, 15:03 |
Dennis Kubes |
Re: How do I use nuch tomerge multiple webdb? |
Fri, 09 Jun, 05:02 |
Dennis Kubes |
Re: how to manipulate with MapWritable metaData in CrawlDatum structure |
Mon, 12 Jun, 03:32 |
Dennis Kubes |
Re: No space left on device |
Wed, 14 Jun, 13:46 |
Dennis Kubes (JIRA) |
[jira] Created: (NUTCH-295) More description for fetcher.threads.fetch property |
Fri, 02 Jun, 16:58 |
Dennis Kubes (JIRA) |
[jira] Updated: (NUTCH-295) More description for fetcher.threads.fetch property |
Fri, 02 Jun, 17:00 |
Dima Mazmanov |
Re: nutch-default.xml configuration |
Mon, 12 Jun, 15:38 |
Dima Mazmanov |
Re[2]: nutch-default.xml configuration |
Mon, 12 Jun, 15:54 |
Dima Mazmanov |
Re[4]: nutch-default.xml configuration |
Mon, 12 Jun, 16:11 |
Doug Cutting |
Re: svn commit: r411943 - in /lucene/nutch/trunk/lib: commons-logging-1.0.4.jar hadoop-0.2.1.jar hadoop-0.3.1.jar log4j-1.2.13.jar |
Tue, 06 Jun, 17:09 |
Doug Cutting |
Re: Nutch logging questions |
Fri, 09 Jun, 17:45 |
Doug Cutting |
IncrediBILL's Random Rants: How Much Nutch is TOO MUCH Nutch? |
Wed, 14 Jun, 17:03 |
Doug Cutting |
Re: svn commit: r416346 [1/3] - in /lucene/nutch/trunk/src: java/org/apache/nutch/analysis/ java/org/apache/nutch/clustering/ java/org/apache/nutch/crawl/ java/org/apache/nutch/fetcher/ java/org/apache/nutch/indexer/ java/org/apache/nutch/net/ java/org/apa... |
Thu, 22 Jun, 18:09 |
Doug Cutting (JIRA) |
[jira] Commented: (NUTCH-303) logging improvements |
Thu, 22 Jun, 18:26 |
Doug Cutting (JIRA) |
[jira] Resolved: (NUTCH-312) Fix for upcoming incompatibility with Hadoop-0.4 |
Wed, 28 Jun, 21:55 |
Enrico Triolo |
Possible memory leak? |
Wed, 28 Jun, 10:44 |
Enrico Triolo |
Re: Possible memory leak? |
Wed, 28 Jun, 12:12 |
Enrico Triolo (JIRA) |
[jira] Created: (NUTCH-314) Multiple language identifier instances |
Wed, 28 Jun, 12:30 |
Feng Ji |
how to manipulate with MapWritable metaData in CrawlDatum structure |
Mon, 12 Jun, 02:15 |
Francesco Cipriani |
webdb: old code <-> new code |
Wed, 21 Jun, 22:15 |
Fuad Efendi |
RE: following forms using nutch... |
Thu, 22 Jun, 05:15 |
Gal Nitzan |
RE: search speed |
Thu, 15 Jun, 09:32 |
Gal Nitzan |
RE: IncrediBILL's Random Rants: How Much Nutch is TOO MUCH Nutch? |
Thu, 15 Jun, 10:53 |
Grant Glouser (JIRA) |
[jira] Created: (NUTCH-306) DistributedSearch.Client liveAddresses concurrency problem |
Sat, 10 Jun, 01:34 |
Grant Glouser (JIRA) |
[jira] Updated: (NUTCH-306) DistributedSearch.Client liveAddresses concurrency problem |
Sat, 10 Jun, 01:36 |
Grant Glouser (JIRA) |
[jira] Updated: (NUTCH-306) DistributedSearch.Client liveAddresses concurrency problem |
Thu, 22 Jun, 00:49 |
Hasan Diwan (JIRA) |
[jira] Created: (NUTCH-299) Bittorrent Parser |
Sat, 03 Jun, 23:04 |
Hasan Diwan (JIRA) |
[jira] Updated: (NUTCH-299) Bittorrent Parser |
Sat, 03 Jun, 23:07 |
Hasan Diwan (JIRA) |
[jira] Commented: (NUTCH-299) Bittorrent Parser |
Sun, 04 Jun, 16:04 |
Jerome Charron (JIRA) |
[jira] Resolved: (NUTCH-298) if a 404 for a robots.txt is returned a NPE is thrown |
Mon, 05 Jun, 21:48 |
Jerome Charron (JIRA) |
[jira] Commented: (NUTCH-301) CommonGrams loads analysis.common.terms.file for each query |
Wed, 07 Jun, 08:29 |
Jerome Charron (JIRA) |
[jira] Commented: (NUTCH-275) Fetcher not parsing XHTML-pages at all |
Wed, 07 Jun, 11:49 |
Jerome Charron (JIRA) |
[jira] Resolved: (NUTCH-275) Fetcher not parsing XHTML-pages at all |
Wed, 07 Jun, 13:08 |
Jerome Charron (JIRA) |
[jira] Created: (NUTCH-303) logging improvements |
Wed, 07 Jun, 16:55 |
Jerome Charron (JIRA) |
[jira] Resolved: (NUTCH-301) CommonGrams loads analysis.common.terms.file for each query |
Wed, 07 Jun, 22:20 |