-christian (JIRA) |
[jira] [Created] (NUTCH-1607) Make inproper multiValued field configurable |
Mon, 08 Jul, 11:33 |
Ahmet Emre Aladağ |
Adding nutch stage |
Mon, 01 Jul, 12:31 |
Ahmet Emre Aladağ |
Re: Adding nutch stage |
Fri, 12 Jul, 06:33 |
Gül Ahmet Türkoğlu (JIRA) |
[jira] [Updated] (NUTCH-1228) Change mapred.task.timeout to mapreduce.task.timeout in fetcher |
Wed, 17 Jul, 10:45 |
Gül Ahmet Türkoğlu (JIRA) |
[jira] [Updated] (NUTCH-1228) Change mapred.task.timeout to mapreduce.task.timeout in fetcher |
Wed, 17 Jul, 10:45 |
Gül Ahmet Türkoğlu (JIRA) |
[jira] [Updated] (NUTCH-1228) Change mapred.task.timeout to mapreduce.task.timeout in fetcher |
Wed, 17 Jul, 10:46 |
Gül Ahmet Türkoğlu (JIRA) |
[jira] [Updated] (NUTCH-1228) Change mapred.task.timeout to mapreduce.task.timeout in fetcher |
Wed, 17 Jul, 10:48 |
Gül Ahmet Türkoğlu (JIRA) |
[jira] [Commented] (NUTCH-1228) Change mapred.task.timeout to mapreduce.task.timeout in fetcher |
Thu, 25 Jul, 06:37 |
Markus Jelsma |
RE: [ANNOUNCE] Apache Nutch v2.2.1 Released |
Wed, 03 Jul, 07:50 |
cihad güzel (JIRA) |
[jira] [Created] (NUTCH-1615) Implementing A Feature for Fetching From Websites Dump |
Fri, 19 Jul, 12:58 |
cihad güzel (JIRA) |
[jira] [Commented] (NUTCH-1615) Implementing A Feature for Fetching From Websites Dump |
Fri, 19 Jul, 13:02 |
cihad güzel (JIRA) |
[jira] [Updated] (NUTCH-1317) Max content length by MIME-type |
Mon, 29 Jul, 11:13 |
cihad güzel (JIRA) |
[jira] [Commented] (NUTCH-1317) Max content length by MIME-type |
Mon, 29 Jul, 11:23 |
Amit Yadav (JIRA) |
[jira] [Created] (NUTCH-1612) Getting URl Malformed exception with Nutch 2.2 and Hadoop 1.0.3 |
Tue, 16 Jul, 05:50 |
Antoinette (JIRA) |
[jira] [Commented] (NUTCH-1406) index-metadata plugin: conversion to Solr date format |
Fri, 05 Jul, 14:43 |
Apache Jenkins Server |
Build failed in Jenkins: Nutch-trunk #2262 |
Mon, 01 Jul, 04:05 |
Apache Jenkins Server |
Jenkins build is back to normal : Nutch-trunk #2263 |
Mon, 01 Jul, 11:33 |
Apache Jenkins Server |
Build failed in Jenkins: Nutch-trunk #2267 |
Thu, 04 Jul, 04:04 |
Apache Jenkins Server |
Jenkins build is back to normal : Nutch-trunk #2268 |
Thu, 04 Jul, 09:08 |
Apache Jenkins Server |
Build failed in Jenkins: Nutch-nutchgora #674 |
Fri, 05 Jul, 11:04 |
Apache Jenkins Server |
Build failed in Jenkins: Nutch-trunk #2274 |
Fri, 05 Jul, 11:04 |
Apache Jenkins Server |
Jenkins build is back to normal : Nutch-nutchgora #675 |
Sat, 06 Jul, 04:10 |
Apache Jenkins Server |
Jenkins build is back to normal : Nutch-trunk #2275 |
Sat, 06 Jul, 04:15 |
Apache Jenkins Server |
Build failed in Jenkins: Nutch-trunk #2285 |
Mon, 15 Jul, 04:04 |
Apache Jenkins Server |
Jenkins build is back to normal : Nutch-trunk #2286 |
Tue, 16 Jul, 04:12 |
Apache Wiki |
[Nutch Wiki] Trivial Update of "PluginCentral" by LewisJohnMcgibbney |
Mon, 01 Jul, 00:22 |
Apache Wiki |
[Nutch Wiki] Trivial Update of "Nutch2Tutorial" by LewisJohnMcgibbney |
Wed, 24 Jul, 21:16 |
Arvind Srini |
hello |
Tue, 23 Jul, 01:24 |
Arvind Srini |
hey. |
Tue, 23 Jul, 02:20 |
Brian (JIRA) |
[jira] [Commented] (NUTCH-1524) Internal links are not being saved even with change in parameter (db.ignore.internal.links) |
Wed, 03 Jul, 16:12 |
Brian (JIRA) |
[jira] [Comment Edited] (NUTCH-1524) Internal links are not being saved even with change in parameter (db.ignore.internal.links) |
Wed, 03 Jul, 16:14 |
Brian (JIRA) |
[jira] [Comment Edited] (NUTCH-1524) Internal links are not being saved even with change in parameter (db.ignore.internal.links) |
Wed, 03 Jul, 16:30 |
Brian (JIRA) |
[jira] [Updated] (NUTCH-1524) Internal links are not being saved even with change in parameter (db.ignore.internal.links) |
Wed, 03 Jul, 17:22 |
Brian (JIRA) |
[jira] [Created] (NUTCH-1608) SolrDeleteDuplicates bug: choosing preferred page when duplicates does not work |
Tue, 09 Jul, 15:15 |
Brian (JIRA) |
[jira] [Updated] (NUTCH-1608) SolrDeleteDuplicates bug: choosing preferred page when duplicates does not work |
Tue, 09 Jul, 15:21 |
Brian (JIRA) |
[jira] [Created] (NUTCH-1610) Can't run individual unit tests for plugins in nutch 2.x |
Fri, 12 Jul, 21:19 |
Brian (JIRA) |
[jira] [Created] (NUTCH-1613) Timeouts in protocol-httpclient when crawling same host with >2 threads and added cookie strings for both http protocols |
Tue, 16 Jul, 21:46 |
Brian (JIRA) |
[jira] [Updated] (NUTCH-1613) Timeouts in protocol-httpclient when crawling same host with >2 threads and added cookie strings for both http protocols |
Tue, 16 Jul, 21:50 |
Brian (JIRA) |
[jira] [Commented] (NUTCH-1613) Timeouts in protocol-httpclient when crawling same host with >2 threads and added cookie strings for both http protocols |
Wed, 17 Jul, 16:12 |
Brian (JIRA) |
[jira] [Created] (NUTCH-1614) Plugin to exclude URLs matching regex list from indexing - to enable crawl but do not index |
Wed, 17 Jul, 17:12 |
Brian (JIRA) |
[jira] [Updated] (NUTCH-1614) Plugin to exclude URLs matching regex list from indexing - to enable crawl but do not index |
Wed, 17 Jul, 17:16 |
Brian (JIRA) |
[jira] [Commented] (NUTCH-1614) Plugin to exclude URLs matching regex list from indexing - to enable crawl but do not index |
Wed, 17 Jul, 17:59 |
Brian (JIRA) |
[jira] [Comment Edited] (NUTCH-1614) Plugin to exclude URLs matching regex list from indexing - to enable crawl but do not index |
Wed, 17 Jul, 18:29 |
Brian (JIRA) |
[jira] [Commented] (NUTCH-1473) Column length too big for column 'text' (max = 21845); use BLOB or TEXT instead |
Wed, 24 Jul, 19:53 |
Brian (JIRA) |
[jira] [Commented] (NUTCH-1465) Support sitemaps in Nutch |
Fri, 26 Jul, 18:39 |
Ferdy Galema (JIRA) |
[jira] [Commented] (NUTCH-1457) Nutch2 Refactor the update process so that fetched items are only processed once |
Fri, 05 Jul, 16:29 |
Ferdy Galema (JIRA) |
[jira] [Commented] (NUTCH-1457) Nutch2 Refactor the update process so that fetched items are only processed once |
Wed, 10 Jul, 12:35 |
Ferdy Galema (JIRA) |
[jira] [Commented] (NUTCH-1457) Nutch2 Refactor the update process so that fetched items are only processed once |
Wed, 17 Jul, 19:43 |
Ferdy Galema (JIRA) |
[jira] [Commented] (NUTCH-1457) Nutch2 Refactor the update process so that fetched items are only processed once |
Tue, 30 Jul, 15:27 |
Hudson (JIRA) |
[jira] [Commented] (NUTCH-1593) normalize option missing in SegmentMerger's usage |
Mon, 01 Jul, 11:34 |
Hudson (JIRA) |
[jira] [Commented] (NUTCH-1594) count variable is never changed in ParseUtil class |
Mon, 01 Jul, 14:09 |
Hudson (JIRA) |
[jira] [Commented] (NUTCH-1327) QueryStringNormalizer |
Tue, 02 Jul, 09:51 |
Hudson (JIRA) |
[jira] [Commented] (NUTCH-1581) CrawlDB csv output to include metadata |
Tue, 02 Jul, 09:51 |
Hudson (JIRA) |
[jira] [Commented] (NUTCH-1600) Injector overwrite does not always work properly |
Thu, 04 Jul, 09:09 |
Hudson (JIRA) |
[jira] [Commented] (NUTCH-1601) ElasticSearchIndexer fails to properly delete documents |
Thu, 04 Jul, 09:09 |
Hudson (JIRA) |
[jira] [Commented] (NUTCH-1597) HeadingsParseFilter to trim and remove exess whitespace |
Thu, 04 Jul, 10:11 |
Hudson (JIRA) |
[jira] [Commented] (NUTCH-1596) HeadingsParseFilter not thread safe |
Thu, 04 Jul, 12:09 |
Hudson (JIRA) |
[jira] [Commented] (NUTCH-1602) improve the readability of metadata in readdb dump normal |
Thu, 04 Jul, 16:19 |
Hudson (JIRA) |
[jira] [Commented] (NUTCH-1520) SegmentMerger looses records |
Fri, 05 Jul, 09:11 |
Hudson (JIRA) |
[jira] [Commented] (NUTCH-1598) ElasticSearchIndexer to read ImmutableSettings from config |
Fri, 05 Jul, 09:11 |
Hudson (JIRA) |
[jira] [Commented] (NUTCH-1595) Upgrade to Tika 1.4 |
Fri, 05 Jul, 11:05 |
Hudson (JIRA) |
[jira] [Commented] (NUTCH-1595) Upgrade to Tika 1.4 |
Fri, 05 Jul, 11:05 |
Hudson (JIRA) |
[jira] [Commented] (NUTCH-1604) ProtocolFactory not thread-safe |
Mon, 08 Jul, 09:09 |
Hudson (JIRA) |
[jira] [Commented] (NUTCH-1604) ProtocolFactory not thread-safe |
Mon, 08 Jul, 09:11 |
Jason Howes (JIRA) |
[jira] [Updated] (NUTCH-1591) Incorrect conversion of ByteBuffer to String |
Sun, 07 Jul, 06:23 |
Julien Nioche |
Re: [ANNOUNCE] Apache Nutch v2.2.1 Released |
Tue, 02 Jul, 17:48 |
Julien Nioche (JIRA) |
[jira] [Commented] (NUTCH-1599) Obtain consensus on new description of Nutch |
Tue, 02 Jul, 17:43 |
Julien Nioche (JIRA) |
[jira] [Commented] (NUTCH-1595) Upgrade to Tika 1.4 |
Wed, 03 Jul, 11:03 |
Julien Nioche (JIRA) |
[jira] [Commented] (NUTCH-1595) Upgrade to Tika 1.4 |
Wed, 03 Jul, 14:30 |
Julien Nioche (JIRA) |
[jira] [Commented] (NUTCH-1602) improve the readability of metadata in readdb dump normal |
Thu, 04 Jul, 10:05 |
Julien Nioche (JIRA) |
[jira] [Commented] (NUTCH-1595) Upgrade to Tika 1.4 |
Fri, 05 Jul, 09:59 |
Julien Nioche (JIRA) |
[jira] [Created] (NUTCH-1604) ProtocolFactory not thread-safe |
Fri, 05 Jul, 14:28 |
Julien Nioche (JIRA) |
[jira] [Updated] (NUTCH-1604) ProtocolFactory not thread-safe |
Fri, 05 Jul, 14:28 |
Julien Nioche (JIRA) |
[jira] [Updated] (NUTCH-1604) ProtocolFactory not thread-safe |
Mon, 08 Jul, 08:07 |
Julien Nioche (JIRA) |
[jira] [Resolved] (NUTCH-1604) ProtocolFactory not thread-safe |
Mon, 08 Jul, 08:51 |
Julien Nioche (JIRA) |
[jira] [Created] (NUTCH-1606) Check that Factory classes use the cache in a thread safe way |
Mon, 08 Jul, 08:59 |
Julien Nioche (JIRA) |
[jira] [Resolved] (NUTCH-806) Merge CrawlDBScanner with CrawlDBReader |
Mon, 29 Jul, 13:41 |
Lewis John McGibbney (JIRA) |
[jira] [Created] (NUTCH-1599) Obtain consensus on new description of Nutch |
Tue, 02 Jul, 16:51 |
Lewis John McGibbney (JIRA) |
[jira] [Assigned] (NUTCH-1599) Obtain consensus on new description of Nutch |
Tue, 02 Jul, 16:51 |
Lewis John McGibbney (JIRA) |
[jira] [Updated] (NUTCH-1599) Obtain consensus on new description of Nutch |
Tue, 02 Jul, 16:51 |
Lewis John McGibbney (JIRA) |
[jira] [Resolved] (NUTCH-1599) Obtain consensus on new description of Nutch |
Wed, 03 Jul, 19:32 |
Lewis John McGibbney (JIRA) |
[jira] [Updated] (NUTCH-1608) SolrDeleteDuplicates bug: choosing preferred page when duplicates does not work |
Wed, 10 Jul, 18:49 |
Lewis John McGibbney (JIRA) |
[jira] [Commented] (NUTCH-1124) JUnit test for scoring-opic |
Wed, 10 Jul, 20:09 |
Lewis John McGibbney (JIRA) |
[jira] [Updated] (NUTCH-1124) JUnit test for scoring-opic |
Wed, 10 Jul, 20:17 |
Lewis John McGibbney (JIRA) |
[jira] [Updated] (NUTCH-1124) JUnit test for scoring-opic |
Wed, 10 Jul, 20:17 |
Lewis John McGibbney (JIRA) |
[jira] [Commented] (NUTCH-1124) JUnit test for scoring-opic |
Fri, 12 Jul, 18:21 |
Lewis John McGibbney (JIRA) |
[jira] [Commented] (NUTCH-1457) Nutch2 Refactor the update process so that fetched items are only processed once |
Fri, 12 Jul, 18:25 |
Lewis John McGibbney (JIRA) |
[jira] [Commented] (NUTCH-1605) mime type detector recognizes xlsx as zip file |
Fri, 12 Jul, 18:27 |
Lewis John McGibbney (JIRA) |
[jira] [Resolved] (NUTCH-1609) java.net.MalformedURLException when running nutch crawl with apache-nutch-2.1.jar with hadoop |
Fri, 12 Jul, 18:30 |
Lewis John McGibbney (JIRA) |
[jira] [Updated] (NUTCH-1610) Can't run individual unit tests for plugins in nutch 2.x |
Tue, 16 Jul, 22:04 |
Lewis John McGibbney (JIRA) |
[jira] [Commented] (NUTCH-1609) java.net.MalformedURLException when running nutch crawl with apache-nutch-2.1.jar with hadoop |
Tue, 16 Jul, 22:06 |
Lewis John McGibbney (JIRA) |
[jira] [Closed] (NUTCH-1612) Getting URl Malformed exception with Nutch 2.2 and Hadoop 1.0.3 |
Wed, 17 Jul, 19:15 |
Lewis John McGibbney (JIRA) |
[jira] [Updated] (NUTCH-1613) Timeouts in protocol-httpclient when crawling same host with >2 threads and added cookie strings for both http protocols |
Wed, 17 Jul, 19:15 |
Lewis John McGibbney (JIRA) |
[jira] [Commented] (NUTCH-1228) Change mapred.task.timeout to mapreduce.task.timeout in fetcher |
Wed, 17 Jul, 19:17 |
Lewis John McGibbney (JIRA) |
[jira] [Commented] (NUTCH-1124) JUnit test for scoring-opic |
Thu, 25 Jul, 22:13 |
Lewis John McGibbney (JIRA) |
[jira] [Updated] (NUTCH-1618) Fetches some websites multiple times for long lasting queues |
Tue, 30 Jul, 18:13 |
Lewis John McGibbney (JIRA) |
[jira] [Updated] (NUTCH-1611) Elastic Search Indexer Creates field in elastic search "boost" as a string value, so cannot be used in custom boost queries |
Tue, 30 Jul, 18:15 |
Lewis John Mcgibbney |
Re: [VOTE] Apache Nutch 2.2.1 RC#1 |
Tue, 02 Jul, 16:08 |
Lewis John Mcgibbney |
[RESULT] WAS Re: [VOTE] Apache Nutch 2.2.1 RC#1 |
Tue, 02 Jul, 16:28 |
Lewis John Mcgibbney |
[ANNOUNCE] Apache Nutch v2.2.1 Released |
Tue, 02 Jul, 16:32 |