Simão Fontes |
Re: % of different content types out there on the web |
Sat, 28 Jan, 16:01 |
José Gil (Commented) (JIRA) |
[jira] [Commented] (NUTCH-1245) URL gone with 404 after db.fetch.interval.max stays db_unfetched in CrawlDb and is generated over and over again |
Tue, 10 Jan, 14:32 |
Ale |
make nutch plugin to get termfreqvectors |
Thu, 19 Jan, 23:36 |
Andreas Janning (Created) (JIRA) |
[jira] [Created] (NUTCH-1250) parse-html does not parse links with empty anchor |
Tue, 17 Jan, 16:27 |
Andrzej Bialecki (Commented) (JIRA) |
[jira] [Commented] (NUTCH-1247) CrawlDatum.retries should be int |
Fri, 13 Jan, 22:14 |
Andrzej Bialecki (Commented) (JIRA) |
[jira] [Commented] (NUTCH-1247) CrawlDatum.retries should be int |
Sat, 14 Jan, 13:55 |
Andrzej Bialecki (Commented) (JIRA) |
[jira] [Commented] (NUTCH-1201) Allow for different FetcherThread impls |
Tue, 17 Jan, 19:37 |
Apache Jenkins Server |
Jenkins build is back to normal : Nutch-nutchgora #117 |
Sun, 01 Jan, 04:15 |
Apache Jenkins Server |
Build failed in Jenkins: Nutch-trunk #1711 |
Sun, 01 Jan, 04:16 |
Apache Jenkins Server |
Build failed in Jenkins: Nutch-trunk #1713 |
Tue, 03 Jan, 04:10 |
Apache Jenkins Server |
Build failed in Jenkins: Nutch-trunk #1714 |
Wed, 04 Jan, 04:20 |
Apache Jenkins Server |
Build failed in Jenkins: Nutch-trunk #1715 |
Thu, 05 Jan, 04:17 |
Apache Jenkins Server |
Jenkins build is back to normal : Nutch-trunk #1716 |
Thu, 05 Jan, 12:48 |
Apache Jenkins Server |
Build failed in Jenkins: Nutch-nutchgora #124 |
Sun, 08 Jan, 04:16 |
Apache Jenkins Server |
Jenkins build is back to normal : Nutch-nutchgora #125 |
Mon, 09 Jan, 04:11 |
Apache Jenkins Server |
Build failed in Jenkins: Nutch-nutchgora #126 |
Tue, 10 Jan, 05:25 |
Apache Jenkins Server |
Jenkins build is back to normal : Nutch-trunk #1721 |
Tue, 10 Jan, 05:37 |
Apache Jenkins Server |
Build failed in Jenkins: Nutch-nutchgora #127 |
Wed, 11 Jan, 04:15 |
Apache Jenkins Server |
Build failed in Jenkins: Nutch-nutchgora #128 |
Wed, 11 Jan, 17:34 |
Apache Jenkins Server |
Jenkins build is back to normal : Nutch-nutchgora #129 |
Wed, 11 Jan, 18:20 |
Apache Jenkins Server |
Build failed in Jenkins: Nutch-trunk #1726 |
Sat, 14 Jan, 15:51 |
Apache Jenkins Server |
Build failed in Jenkins: nutch-trunk-maven #108 |
Sat, 14 Jan, 16:34 |
Apache Jenkins Server |
Jenkins build is back to normal : Nutch-trunk #1727 |
Sat, 14 Jan, 16:50 |
Apache Jenkins Server |
Jenkins build is back to normal : nutch-trunk-maven #109 |
Sat, 14 Jan, 17:26 |
Apache Jenkins Server |
Jenkins build is back to normal : Nutch-trunk #1730 |
Mon, 16 Jan, 04:30 |
Apache Jenkins Server |
Build failed in Jenkins: Nutch-nutchgora #146 |
Sun, 29 Jan, 04:11 |
Apache Jenkins Server |
Build failed in Jenkins: Nutch-trunk #1742 |
Sun, 29 Jan, 04:11 |
Apache Jenkins Server |
Jenkins build is back to normal : Nutch-nutchgora #147 |
Mon, 30 Jan, 04:14 |
Apache Jenkins Server |
Jenkins build is back to normal : Nutch-trunk #1743 |
Mon, 30 Jan, 04:26 |
Apache Wiki |
[Nutch Wiki] Trivial Update of "AdminGroup" by LewisJohnMcgibbney |
Mon, 09 Jan, 16:10 |
Apache Wiki |
[Nutch Wiki] Trivial Update of "bin/nutch_readdb" by MarkusJelsma |
Mon, 09 Jan, 16:14 |
Apache Wiki |
[Nutch Wiki] Trivial Update of "bin/nutch_readdb" by MarkusJelsma |
Mon, 09 Jan, 16:15 |
Apache Wiki |
[Nutch Wiki] Update of "bin/nutch_readdb" by MarkusJelsma |
Mon, 09 Jan, 16:15 |
Apache Wiki |
[Nutch Wiki] Trivial Update of "bin/nutch solrindex" by MarkusJelsma |
Mon, 09 Jan, 16:19 |
Apache Wiki |
[Nutch Wiki] Trivial Update of "bin/nutch solrindex" by MarkusJelsma |
Tue, 10 Jan, 13:51 |
Apache Wiki |
[Nutch Wiki] Trivial Update of "AdminGroup" by LewisJohnMcgibbney |
Fri, 13 Jan, 12:33 |
Apache Wiki |
[Nutch Wiki] Trivial Update of "PluginCentral" by ElisabethAdler |
Sat, 14 Jan, 14:17 |
Apache Wiki |
[Nutch Wiki] Update of "IndexMetatags" by ElisabethAdler |
Sat, 14 Jan, 15:10 |
Apache Wiki |
[Nutch Wiki] Trivial Update of "AdminGroup" by LewisJohnMcgibbney |
Tue, 17 Jan, 23:19 |
Arkadi Kosmynin (Commented) (JIRA) |
[jira] [Commented] (NUTCH-1251) Deletion of duplicates fails with org.apache.solr.client.solrj.SolrServerException |
Tue, 17 Jan, 23:13 |
Arkadi Kosmynin (Created) (JIRA) |
[jira] [Created] (NUTCH-1251) Deletion of duplicates fails with org.apache.solr.client.solrj.SolrServerException |
Tue, 17 Jan, 22:41 |
Arkadi Kosmynin (Updated) (JIRA) |
[jira] [Updated] (NUTCH-1251) Deletion of duplicates fails with org.apache.solr.client.solrj.SolrServerException |
Tue, 17 Jan, 22:45 |
Dean Del Ponte (Commented) (JIRA) |
[jira] [Commented] (NUTCH-809) Parse-metatags plugin |
Wed, 11 Jan, 20:51 |
Dennis Spathis (Created) (JIRA) |
[jira] [Created] (NUTCH-1253) Incompatible neko and xerces versions |
Wed, 18 Jan, 14:26 |
Eddie Drapkin |
I want to volunteer some time |
Tue, 17 Jan, 19:07 |
Eddie Drapkin |
Re: I want to volunteer some time |
Tue, 17 Jan, 21:31 |
Edward Drapkin |
Re: [DISCUSS] Issues with Fetcher |
Sat, 21 Jan, 16:17 |
Edward Drapkin (Commented) (JIRA) |
[jira] [Commented] (NUTCH-1242) Allow disabling of URL Filters in ParseSegment |
Tue, 17 Jan, 18:24 |
Edward Drapkin (Commented) (JIRA) |
[jira] [Commented] (NUTCH-1201) Allow for different FetcherThread impls |
Tue, 17 Jan, 18:50 |
Edward Drapkin (Commented) (JIRA) |
[jira] [Commented] (NUTCH-1201) Allow for different FetcherThread impls |
Tue, 17 Jan, 19:17 |
Edward Drapkin (Commented) (JIRA) |
[jira] [Commented] (NUTCH-1201) Allow for different FetcherThread impls |
Tue, 17 Jan, 20:01 |
Edward Drapkin (Commented) (JIRA) |
[jira] [Commented] (NUTCH-1201) Allow for different FetcherThread impls |
Wed, 18 Jan, 14:32 |
Edward Drapkin (Created) (JIRA) |
[jira] [Created] (NUTCH-1242) Allow disabling of URL Filters in ParseSegment |
Wed, 04 Jan, 22:31 |
Edward Drapkin (Updated) (JIRA) |
[jira] [Updated] (NUTCH-1242) Allow disabling of URL Filters in ParseSegment |
Wed, 04 Jan, 22:33 |
Edward Drapkin (Updated) (JIRA) |
[jira] [Updated] (NUTCH-1242) Allow disabling of URL Filters in ParseSegment |
Tue, 17 Jan, 18:00 |
Edward Drapkin (Updated) (JIRA) |
[jira] [Updated] (NUTCH-1242) Allow disabling of URL Filters in ParseSegment |
Tue, 17 Jan, 18:02 |
Edward Drapkin (Updated) (JIRA) |
[jira] [Updated] (NUTCH-1242) Allow disabling of URL Filters in ParseSegment |
Tue, 17 Jan, 18:02 |
Edward Drapkin (Updated) (JIRA) |
[jira] [Updated] (NUTCH-1242) Allow disabling of URL Filters in ParseSegment |
Tue, 17 Jan, 18:44 |
Elisabeth Adler (Commented) (JIRA) |
[jira] [Commented] (NUTCH-809) Parse-metatags plugin |
Sat, 14 Jan, 15:13 |
Elisabeth Adler (Updated) (JIRA) |
[jira] [Updated] (NUTCH-809) Parse-metatags plugin |
Thu, 12 Jan, 09:15 |
Ferdy Galema |
minor suggestion to ivy.xml of plugins (remove nutch.root property) |
Fri, 20 Jan, 11:21 |
Ferdy Galema (Closed) (JIRA) |
[jira] [Closed] (NUTCH-1255) Change ivy.xml of all plugins to remove "nutch.root" property |
Mon, 23 Jan, 14:04 |
Ferdy Galema (Commented) (JIRA) |
[jira] [Commented] (NUTCH-1205) Upgrade gora modules to 0.2-incubating in ivy/ivy.xml |
Fri, 20 Jan, 15:41 |
Ferdy Galema (Commented) (JIRA) |
[jira] [Commented] (NUTCH-1086) Rewrite protocol-httpclient |
Wed, 25 Jan, 13:08 |
Ferdy Galema (Commented) (JIRA) |
[jira] [Commented] (NUTCH-1253) Incompatible neko and xerces versions |
Wed, 25 Jan, 13:44 |
Ferdy Galema (Created) (JIRA) |
[jira] [Created] (NUTCH-1255) Change ivy.xml of all plugins to remove "nutch.root" property |
Fri, 20 Jan, 14:44 |
Ferdy Galema (Created) (JIRA) |
[jira] [Created] (NUTCH-1263) FetcherJob must put 'fetchTime' on input |
Tue, 31 Jan, 13:00 |
Ferdy Galema (Updated) (JIRA) |
[jira] [Updated] (NUTCH-1205) Upgrade gora modules to 0.2-incubating in ivy/ivy.xml |
Fri, 20 Jan, 10:03 |
Ferdy Galema (Updated) (JIRA) |
[jira] [Updated] (NUTCH-1205) Upgrade gora modules to 0.2-incubating in ivy/ivy.xml |
Fri, 20 Jan, 10:05 |
Ferdy Galema (Updated) (JIRA) |
[jira] [Updated] (NUTCH-1205) Upgrade gora modules to 0.2-incubating in ivy/ivy.xml |
Fri, 20 Jan, 10:05 |
Ferdy Galema (Updated) (JIRA) |
[jira] [Updated] (NUTCH-1255) Change ivy.xml of all plugins to remove "nutch.root" property |
Mon, 23 Jan, 14:02 |
Ferdy Galema (Updated) (JIRA) |
[jira] [Updated] (NUTCH-1263) FetcherJob must put 'fetchTime' on input |
Tue, 31 Jan, 13:02 |
Hudson (Commented) (JIRA) |
[jira] [Commented] (NUTCH-1239) Webgraph should remove deleted pages from segment input |
Mon, 02 Jan, 14:06 |
Hudson (Commented) (JIRA) |
[jira] [Commented] (NUTCH-1232) Remove host field from index-basic |
Mon, 02 Jan, 14:06 |
Hudson (Commented) (JIRA) |
[jira] [Commented] (NUTCH-1232) Remove host field from index-basic |
Tue, 03 Jan, 04:12 |
Hudson (Commented) (JIRA) |
[jira] [Commented] (NUTCH-1239) Webgraph should remove deleted pages from segment input |
Tue, 03 Jan, 04:12 |
Hudson (Commented) (JIRA) |
[jira] [Commented] (NUTCH-1146) Get rid of _success files in webgraph code |
Thu, 05 Jan, 12:03 |
Hudson (Commented) (JIRA) |
[jira] [Commented] (NUTCH-1243) Junit jar removed from lib |
Thu, 05 Jan, 12:49 |
Hudson (Commented) (JIRA) |
[jira] [Commented] (NUTCH-1146) Get rid of _success files in webgraph code |
Thu, 05 Jan, 12:49 |
Hudson (Commented) (JIRA) |
[jira] [Commented] (NUTCH-1243) Junit jar removed from lib |
Thu, 05 Jan, 13:13 |
Hudson (Commented) (JIRA) |
[jira] [Commented] (NUTCH-1237) Improve javac arguements for more verbose output |
Thu, 05 Jan, 15:25 |
Hudson (Commented) (JIRA) |
[jira] [Commented] (NUTCH-1237) Improve javac arguements for more verbose output |
Fri, 06 Jan, 04:10 |
Hudson (Commented) (JIRA) |
[jira] [Commented] (NUTCH-1237) Improve javac arguements for more verbose output |
Fri, 06 Jan, 04:24 |
Hudson (Commented) (JIRA) |
[jira] [Commented] (NUTCH-1244) CrawlDBDumper to filter by regex |
Mon, 09 Jan, 17:04 |
Hudson (Commented) (JIRA) |
[jira] [Commented] (NUTCH-1244) CrawlDBDumper to filter by regex |
Tue, 10 Jan, 05:38 |
Hudson (Commented) (JIRA) |
[jira] [Commented] (NUTCH-1139) Indexer to delete documents |
Tue, 10 Jan, 14:08 |
Hudson (Commented) (JIRA) |
[jira] [Commented] (NUTCH-1139) Indexer to delete documents |
Wed, 11 Jan, 04:28 |
Hudson (Commented) (JIRA) |
[jira] [Commented] (NUTCH-1138) remove LogUtil from trunk and nutch gora |
Wed, 11 Jan, 17:33 |
Hudson (Commented) (JIRA) |
[jira] [Commented] (NUTCH-1189) add commented out default settings to gora.properties files |
Thu, 12 Jan, 04:19 |
Hudson (Commented) (JIRA) |
[jira] [Commented] (NUTCH-1177) Generator to select on retry interval |
Fri, 13 Jan, 15:25 |
Hudson (Commented) (JIRA) |
[jira] [Commented] (NUTCH-1248) Generator to select on status |
Fri, 13 Jan, 17:18 |
Hudson (Commented) (JIRA) |
[jira] [Commented] (NUTCH-1177) Generator to select on retry interval |
Sat, 14 Jan, 04:26 |
Hudson (Commented) (JIRA) |
[jira] [Commented] (NUTCH-1248) Generator to select on status |
Sat, 14 Jan, 04:26 |
Hudson (Commented) (JIRA) |
[jira] [Commented] (NUTCH-1176) Fix all javadoc warnings from nightly builds |
Sat, 14 Jan, 15:53 |
Hudson (Commented) (JIRA) |
[jira] [Commented] (NUTCH-1176) Fix all javadoc warnings from nightly builds |
Sat, 14 Jan, 16:35 |
Hudson (Commented) (JIRA) |
[jira] [Commented] (NUTCH-1176) Fix all javadoc warnings from nightly builds |
Sat, 14 Jan, 18:49 |
Hudson (Commented) (JIRA) |
[jira] [Commented] (NUTCH-1176) Fix all javadoc warnings from nightly builds |
Sat, 14 Jan, 19:15 |
Hudson (Commented) (JIRA) |
[jira] [Commented] (NUTCH-1255) Change ivy.xml of all plugins to remove "nutch.root" property |
Tue, 24 Jan, 04:15 |
Hudson (Commented) (JIRA) |
[jira] [Commented] (NUTCH-1255) Change ivy.xml of all plugins to remove "nutch.root" property |
Tue, 24 Jan, 04:21 |
Hudson (Commented) (JIRA) |
[jira] [Commented] (NUTCH-1260) Fetcher should log fetching of redirects |
Sat, 28 Jan, 04:21 |