Rolando H. Martinelli - CoBuys, S.A. |
GETTING OUT OF MAILING LIST |
Tue, 20 Dec, 21:06 |
Rozina Sorathia |
problem in merging index |
Wed, 14 Dec, 05:32 |
Rozina Sorathia |
help!search yielding 0 hits with nutch 8 segments |
Thu, 29 Dec, 14:32 |
Sami Siren |
Re: [Fwd: Crawler submits forms?] |
Thu, 15 Dec, 21:39 |
Sami Siren (JIRA) |
[jira] Resolved: (NUTCH-146) mapred.job.tracker.info.port is defined 2 times in the nutch-default.xml |
Tue, 20 Dec, 18:27 |
Sami Siren (JIRA) |
[jira] Resolved: (NUTCH-145) build of war file fails on Chinese (zh) .xml files due to UTF-8 BOM |
Tue, 20 Dec, 18:33 |
Stefan Groschupf |
Re: NDFS/MapReduce? |
Thu, 01 Dec, 20:23 |
Stefan Groschupf |
Re: incremental crawling |
Fri, 02 Dec, 09:48 |
Stefan Groschupf |
Re: submitting a patch? |
Tue, 06 Dec, 17:05 |
Stefan Groschupf |
RCP known limitation or bug? |
Tue, 06 Dec, 17:13 |
Stefan Groschupf |
Re: RCP known limitation or bug? |
Wed, 07 Dec, 19:56 |
Stefan Groschupf |
Re: nutch questions |
Fri, 09 Dec, 10:27 |
Stefan Groschupf |
Re: nutch questions |
Fri, 09 Dec, 16:18 |
Stefan Groschupf |
Re: parse.getData().getMetadata().get("propName") is NULL? |
Fri, 09 Dec, 19:14 |
Stefan Groschupf |
Re: [jira] Commented: (NUTCH-135) http header meta data are case insensitive in the real world (e.g. Content-Type or content-type) |
Sat, 10 Dec, 14:41 |
Stefan Groschupf |
Re: [Nutch-dev] What are the limitations of nutch |
Mon, 12 Dec, 08:34 |
Stefan Groschupf |
Re: Hard-coded Content-type checks |
Tue, 13 Dec, 14:47 |
Stefan Groschupf |
Re: Standard metadata property names in the ParseData metadata |
Tue, 13 Dec, 18:59 |
Stefan Groschupf |
Re: [Fwd: Crawler submits forms?] |
Tue, 13 Dec, 19:02 |
Stefan Groschupf |
best file system for NDFS? |
Tue, 13 Dec, 19:22 |
Stefan Groschupf |
Re: [Fwd: Crawler submits forms?] |
Wed, 14 Dec, 11:18 |
Stefan Groschupf |
vote for issues to fix in 0.7.2 |
Wed, 14 Dec, 13:18 |
Stefan Groschupf |
Re: vote for issues to fix in 0.7.2 |
Wed, 14 Dec, 15:11 |
Stefan Groschupf |
Re: mapreduce fetcher doesn't fetch all urls |
Wed, 14 Dec, 22:50 |
Stefan Groschupf |
Re: mapreduce fetcher doesn't fetch all urls |
Wed, 14 Dec, 23:41 |
Stefan Groschupf |
Re: mapreduce fetcher doesn't fetch all urls |
Thu, 15 Dec, 00:04 |
Stefan Groschupf |
Re: mapreduce fetcher doesn't fetch all urls |
Thu, 15 Dec, 11:14 |
Stefan Groschupf |
Re: mapreduce fetcher doesn't fetch all urls |
Thu, 15 Dec, 11:22 |
Stefan Groschupf |
Re: Nutch design queries |
Thu, 15 Dec, 14:17 |
Stefan Groschupf |
vote results. |
Thu, 15 Dec, 16:14 |
Stefan Groschupf |
Re: vote results. |
Thu, 15 Dec, 16:58 |
Stefan Groschupf |
Re: mapreduce fetcher doesn't fetch all urls |
Fri, 16 Dec, 01:04 |
Stefan Groschupf |
Re: [Nutch-dev] distributed seach |
Fri, 16 Dec, 11:13 |
Stefan Groschupf |
"Something is Wrong with Google’s Mathematical Model" |
Fri, 16 Dec, 19:27 |
Stefan Groschupf |
[bug] overwriting job properties until runtime is not possible |
Sun, 18 Dec, 23:19 |
Stefan Groschupf |
Re: Latest version of Mapred |
Mon, 19 Dec, 18:00 |
Stefan Groschupf |
problems http-client |
Mon, 19 Dec, 18:37 |
Stefan Groschupf |
Re: problems http-client |
Mon, 19 Dec, 18:51 |
Stefan Groschupf |
Re: [Nutch-dev] distributed search |
Mon, 19 Dec, 23:38 |
Stefan Groschupf |
Re: nutch and google suggestion |
Tue, 20 Dec, 09:34 |
Stefan Groschupf |
Re: Static initializers |
Tue, 20 Dec, 14:48 |
Stefan Groschupf |
Re: [bug] overwriting job properties until runtime is not possible |
Wed, 21 Dec, 00:46 |
Stefan Groschupf |
Re: nutch-0.8-dev *mapred.input.subdir* problem ? |
Wed, 21 Dec, 11:05 |
Stefan Groschupf |
Re: nightly build |
Wed, 21 Dec, 11:10 |
Stefan Groschupf |
Re: nutch-0.8-dev *mapred.input.subdir* problem ? |
Wed, 21 Dec, 12:41 |
Stefan Groschupf |
Re: IndexSorter optimizer |
Wed, 21 Dec, 22:56 |
Stefan Groschupf |
Re: Static initializers |
Thu, 22 Dec, 00:03 |
Stefan Groschupf |
Commons HttpClient 3.0 released |
Thu, 22 Dec, 09:07 |
Stefan Groschupf |
Re: Removing old classes from trunk/ |
Fri, 23 Dec, 10:41 |
Stefan Groschupf |
Re: severe error in fetch |
Sun, 25 Dec, 22:44 |
Stefan Groschupf |
Fwd: bug in Nutch wiki - FAQ |
Mon, 26 Dec, 13:50 |
Stefan Groschupf |
[bug?] PRC called emthod require parameter |
Tue, 27 Dec, 18:17 |
Stefan Groschupf (JIRA) |
[jira] Created: (NUTCH-133) ParserFactory does not work as expected |
Tue, 06 Dec, 22:04 |
Stefan Groschupf (JIRA) |
[jira] Updated: (NUTCH-133) ParserFactory does not work as expected |
Tue, 06 Dec, 22:06 |
Stefan Groschupf (JIRA) |
[jira] Updated: (NUTCH-133) ParserFactory does not work as expected |
Tue, 06 Dec, 22:16 |
Stefan Groschupf (JIRA) |
[jira] Commented: (NUTCH-133) ParserFactory does not work as expected |
Wed, 07 Dec, 18:09 |
Stefan Groschupf (JIRA) |
[jira] Commented: (NUTCH-133) ParserFactory does not work as expected |
Wed, 07 Dec, 20:05 |
Stefan Groschupf (JIRA) |
[jira] Commented: (NUTCH-133) ParserFactory does not work as expected |
Thu, 08 Dec, 10:30 |
Stefan Groschupf (JIRA) |
[jira] Closed: (NUTCH-133) ParserFactory does not work as expected |
Thu, 08 Dec, 11:06 |
Stefan Groschupf (JIRA) |
[jira] Created: (NUTCH-135) http header meta data are case insensitive in the real world (e.g. Content-Type or content-type) |
Fri, 09 Dec, 20:51 |
Stefan Groschupf (JIRA) |
[jira] Updated: (NUTCH-135) http header meta data are case insensitive in the real world (e.g. Content-Type or content-type) |
Fri, 09 Dec, 21:14 |
Stefan Groschupf (JIRA) |
[jira] Commented: (NUTCH-135) http header meta data are case insensitive in the real world (e.g. Content-Type or content-type) |
Sat, 10 Dec, 03:56 |
Stefan Groschupf (JIRA) |
[jira] Assigned: (NUTCH-3) multi values of header discarded |
Sat, 10 Dec, 03:56 |
Stefan Groschupf (JIRA) |
[jira] Updated: (NUTCH-135) http header meta data are case insensitive in the real world (e.g. Content-Type or content-type) |
Sat, 10 Dec, 14:40 |
Stefan Groschupf (JIRA) |
[jira] Created: (NUTCH-136) mapreduce segment generator generates 50 % less than excepted urls |
Mon, 12 Dec, 22:50 |
Stefan Groschupf (JIRA) |
[jira] Commented: (NUTCH-140) Add alias capability in parse-plugins.xml file that allows mimeType->extensionId mapping |
Wed, 14 Dec, 10:36 |
Stefan Groschupf (JIRA) |
[jira] Commented: (NUTCH-143) Improper error numbers returned on exit |
Fri, 16 Dec, 10:55 |
Stefan Groschupf (JIRA) |
[jira] Updated: (NUTCH-3) multi values of header discarded |
Fri, 16 Dec, 19:37 |
Stefan Groschupf (JIRA) |
[jira] Commented: (NUTCH-3) multi values of header discarded |
Sat, 17 Dec, 10:17 |
Stefan Groschupf (JIRA) |
[jira] Commented: (NUTCH-3) multi values of header discarded |
Sat, 17 Dec, 16:34 |
Stefan Groschupf (JIRA) |
[jira] Commented: (NUTCH-3) multi values of header discarded |
Sat, 17 Dec, 16:58 |
Stefan Groschupf (JIRA) |
[jira] Commented: (NUTCH-144) corrupt language identifier tri files and bad language recognition for german |
Sat, 17 Dec, 17:01 |
Stefan Groschupf (JIRA) |
[jira] Reopened: (NUTCH-3) multi values of header discarded |
Sat, 17 Dec, 17:55 |
Stefan Groschupf (JIRA) |
[jira] Updated: (NUTCH-3) multi values of header discarded |
Sat, 17 Dec, 17:57 |
Stefan Groschupf (JIRA) |
[jira] Commented: (NUTCH-145) ant build of the war fie fails on Chinese (zh) .xml files due to UTF-8 BOM |
Tue, 20 Dec, 00:07 |
Stefan Groschupf (JIRA) |
[jira] Created: (NUTCH-146) mapred.job.tracker.info.port is defined 2 times in the nutch-default.xml |
Tue, 20 Dec, 17:42 |
Stefan Groschupf (JIRA) |
[jira] Commented: (NUTCH-148) org.apache.nutch.tools.CrawlTool throws error while doing deleteduplicates |
Fri, 23 Dec, 16:27 |
Stefan Groschupf (JIRA) |
[jira] Closed: (NUTCH-154) Unable to add/update new files to fetchlist/fetcher and thus index, when u rerun crawl tool on same db. |
Wed, 28 Dec, 13:19 |
Stefan Groschupf (JIRA) |
[jira] Closed: (NUTCH-55) Create dmoz.org search plugin - incorporate the dmoz.org title/category/description if available & |
Wed, 28 Dec, 13:21 |
Thomas Jaeger |
Re: Trunk is broken |
Fri, 30 Dec, 10:42 |
YourSoft |
Re: vote for issues to fix in 0.7.2 |
Wed, 14 Dec, 13:41 |
Zaheed Haque |
Re: [Fwd: Crawler submits forms?] |
Wed, 14 Dec, 08:25 |
Zaheed Haque |
Re: mapred merge to trunk |
Thu, 15 Dec, 23:30 |
an...@orbita1.ru |
Killing lines |
Tue, 06 Dec, 15:13 |
byron miller (JIRA) |
[jira] Commented: (NUTCH-134) Summarizer doesn't select the best snippets |
Wed, 07 Dec, 21:49 |
byron miller (JIRA) |
[jira] Commented: (NUTCH-95) DeleteDuplicates depends on the order of input segments |
Wed, 28 Dec, 04:28 |
byron miller (JIRA) |
[jira] Commented: (NUTCH-55) Create dmoz.org search plugin - incorporate the dmoz.org title/category/description if available & |
Wed, 28 Dec, 04:31 |
byron miller (JIRA) |
[jira] Commented: (NUTCH-61) Adaptive re-fetch interval. Detecting umodified content |
Wed, 28 Dec, 04:49 |
byron miller (JIRA) |
[jira] Commented: (NUTCH-61) Adaptive re-fetch interval. Detecting umodified content |
Thu, 29 Dec, 01:56 |
byron miller (JIRA) |
[jira] Commented: (NUTCH-153) TextParser is only supposed to parse plain text, but if given postscript, it can take hours and then fail |
Thu, 29 Dec, 02:35 |
byron miller (JIRA) |
[jira] Commented: (NUTCH-92) DistributedSearch incorrectly scores results |
Thu, 29 Dec, 02:45 |
byron miller (JIRA) |
[jira] Commented: (NUTCH-134) Summarizer doesn't select the best snippets |
Thu, 29 Dec, 03:32 |
byron miller (JIRA) |
[jira] Created: (NUTCH-158) Process Sitemap data in text, rss or xml format as well as OAI-PMH |
Thu, 29 Dec, 20:00 |
byron miller (JIRA) |
[jira] Commented: (NUTCH-155) Remove web gui from the distribution to "contrib" and use OpenSearch Servlet |
Thu, 29 Dec, 20:05 |
byron miller (JIRA) |
[jira] Created: (NUTCH-159) Specify temp/working directory for crawl |
Sat, 31 Dec, 18:06 |
byron miller (JIRA) |
[jira] Commented: (NUTCH-123) Cache.jsp some times generate NullPointerException |
Sat, 31 Dec, 20:45 |
byron miller (JIRA) |
[jira] Commented: (NUTCH-42) enhance search.jsp such that it can also returns XML |
Sat, 31 Dec, 20:57 |
charlie |
about the question of clustering-carrot2 |
Thu, 08 Dec, 10:02 |
h...@ai.univ-paris8.fr |
translation of Nutch search page |
Wed, 14 Dec, 15:57 |
karamjit (JIRA) |
[jira] Created: (NUTCH-157) Problem during parsing msword document . It fetching properly but parsing is not working. Please show me the way how can i parse it |
Thu, 29 Dec, 12:25 |