Đạt Cao Mạnh |
Re: [jira] [Resolved] (NUTCH-1902) Missing nekohtml.jar |
Fri, 09 Jan, 06:50 |
Renato Marroquín Mogrovejo |
Re: [VOTE] Release Apache Nutch 2.3 |
Sat, 10 Jan, 19:36 |
Markus Jelsma |
RE: Option to disable Robots Rule checking |
Tue, 27 Jan, 23:58 |
Markus Jelsma |
RE: Option to disable Robots Rule checking |
Thu, 29 Jan, 11:35 |
Markus Jelsma |
RE: [DISCUSS] Release Apache Nutch 1.10 |
Sat, 31 Jan, 21:37 |
Radosław Stankiewicz (JIRA) |
[jira] [Commented] (NUTCH-1900) DockerFile for Nutch 2.x |
Tue, 27 Jan, 22:37 |
Radosław Stankiewicz (JIRA) |
[jira] [Commented] (NUTCH-1900) DockerFile for Nutch 2.x |
Tue, 27 Jan, 23:10 |
Radosław Stankiewicz (JIRA) |
[jira] [Work started] (NUTCH-1924) Nutch + HBase Docker |
Thu, 29 Jan, 22:19 |
Radosław Stankiewicz (JIRA) |
[jira] [Work stopped] (NUTCH-1924) Nutch + HBase Docker |
Thu, 29 Jan, 23:07 |
Radosław Stankiewicz (JIRA) |
[jira] [Commented] (NUTCH-1924) Nutch + HBase Docker |
Thu, 29 Jan, 23:12 |
Albinscode (JIRA) |
[jira] [Commented] (NUTCH-1870) Generic xsl parser plugin |
Sat, 10 Jan, 20:23 |
Alexander Kingson (JIRA) |
[jira] [Commented] (NUTCH-1922) DbUpdater overwrites fetch status for URLs from previous batches, causes repeated re-fetches |
Thu, 29 Jan, 20:49 |
Apache Jenkins Server |
Build failed in Jenkins: Nutch-trunk #2941 |
Tue, 20 Jan, 04:07 |
Apache Jenkins Server |
Jenkins build is back to normal : Nutch-trunk #2942 |
Wed, 21 Jan, 04:07 |
Apache Wiki |
[Nutch Wiki] Trivial Update of "Release_HOWTO" by LewisJohnMcgibbney |
Fri, 09 Jan, 08:31 |
Apache Wiki |
[Nutch Wiki] Trivial Update of "Release_HOWTO" by LewisJohnMcgibbney |
Fri, 09 Jan, 08:38 |
Apache Wiki |
[Nutch Wiki] Update of "NutchTutorial" by SebastianNagel |
Fri, 16 Jan, 22:10 |
Apache Wiki |
[Nutch Wiki] Trivial Update of "HttpAuthenticationSchemes" by LewisJohnMcgibbney |
Tue, 20 Jan, 22:51 |
Apache Wiki |
[Nutch Wiki] Trivial Update of "FrontPage" by LewisJohnMcgibbney |
Thu, 22 Jan, 03:12 |
Apache Wiki |
[Nutch Wiki] Trivial Update of "Nutch2Roadmap" by LewisJohnMcgibbney |
Tue, 27 Jan, 20:23 |
Chris A. Mattmann (JIRA) |
[jira] [Created] (NUTCH-1904) Schema for Solr4 doesn't include _version_ field |
Sun, 04 Jan, 20:35 |
Chris A. Mattmann (JIRA) |
[jira] [Work started] (NUTCH-1904) Schema for Solr4 doesn't include _version_ field |
Sun, 04 Jan, 20:35 |
Chris A. Mattmann (JIRA) |
[jira] [Resolved] (NUTCH-1904) Schema for Solr4 doesn't include _version_ field |
Sun, 04 Jan, 20:37 |
Chris A. Mattmann (JIRA) |
[jira] [Created] (NUTCH-1905) Nutch index tool should be resilient to segments that don't have crawl_* data |
Sun, 04 Jan, 20:46 |
Chris A. Mattmann (JIRA) |
[jira] [Work started] (NUTCH-1905) Nutch index tool should be resilient to segments that don't have crawl_* data |
Sun, 04 Jan, 20:46 |
Chris A. Mattmann (JIRA) |
[jira] [Commented] (NUTCH-1660) Index filter for Page's latitude and longitude |
Sat, 10 Jan, 23:20 |
Chris A. Mattmann (JIRA) |
[jira] [Commented] (NUTCH-1815) Metadata Parsed with parse-tika is Duplicated |
Sun, 11 Jan, 00:26 |
Chris A. Mattmann (JIRA) |
[jira] [Commented] (NUTCH-1912) Dump tool -mimetype parameter needs to be optional to prevent NPE |
Tue, 13 Jan, 19:49 |
Chris A. Mattmann (JIRA) |
[jira] [Work started] (NUTCH-1916) Apache Nutch CXF-based REST services |
Wed, 14 Jan, 14:21 |
Chris A. Mattmann (JIRA) |
[jira] [Created] (NUTCH-1916) Apache Nutch CXF-based REST services |
Wed, 14 Jan, 14:21 |
Chris A. Mattmann (JIRA) |
[jira] [Created] (NUTCH-1927) Create a whitelist of IPs/hostnames to allow skipping of RobotRules parsing |
Thu, 29 Jan, 20:56 |
Chris A. Mattmann (JIRA) |
[jira] [Work started] (NUTCH-1927) Create a whitelist of IPs/hostnames to allow skipping of RobotRules parsing |
Thu, 29 Jan, 20:57 |
Chris A. Mattmann (JIRA) |
[jira] [Assigned] (NUTCH-1927) Create a whitelist of IPs/hostnames to allow skipping of RobotRules parsing |
Thu, 29 Jan, 20:57 |
Gerhard Gossen (JIRA) |
[jira] [Created] (NUTCH-1922) DbUpdater overwrites fetch status for URLs from previous batches, causes repeated re-fetches |
Mon, 26 Jan, 11:02 |
Gerhard Gossen (JIRA) |
[jira] [Commented] (NUTCH-1922) DbUpdater overwrites fetch status for URLs from previous batches, causes repeated re-fetches |
Tue, 27 Jan, 07:58 |
Hudson (JIRA) |
[jira] [Commented] (NUTCH-1904) Schema for Solr4 doesn't include _version_ field |
Sun, 04 Jan, 20:50 |
Hudson (JIRA) |
[jira] [Commented] (NUTCH-1140) index-more plugin, resetTitle method creates multiple values in the Title field |
Wed, 07 Jan, 22:50 |
Hudson (JIRA) |
[jira] [Commented] (NUTCH-1856) Document webpage.avsc and host.avsc |
Fri, 09 Jan, 04:03 |
Hudson (JIRA) |
[jira] [Commented] (NUTCH-1907) Incorrect output of Outlinks to Hosts within HostDbUpdateReducer |
Fri, 09 Jan, 06:42 |
Hudson (JIRA) |
[jira] [Commented] (NUTCH-1779) Apply formatting to the code |
Fri, 09 Jan, 06:42 |
Hudson (JIRA) |
[jira] [Commented] (NUTCH-1660) Index filter for Page's latitude and longitude |
Sat, 10 Jan, 23:50 |
Hudson (JIRA) |
[jira] [Commented] (NUTCH-1881) ant target resolve-default to keep test libs |
Mon, 12 Jan, 22:25 |
Hudson (JIRA) |
[jira] [Commented] (NUTCH-1912) Dump tool -mimetype parameter needs to be optional to prevent NPE |
Tue, 13 Jan, 20:57 |
Hudson (JIRA) |
[jira] [Commented] (NUTCH-1919) Getting timeout when server returns Content-Length: 0 |
Fri, 16 Jan, 11:52 |
Hudson (JIRA) |
[jira] [Commented] (NUTCH-1920) Upgrade Nutch to use Java 1.7 |
Sat, 17 Jan, 02:50 |
Hudson (JIRA) |
[jira] [Commented] (NUTCH-1660) Index filter for Page's latitude and longitude |
Wed, 21 Jan, 22:10 |
Hudson (JIRA) |
[jira] [Commented] (NUTCH-1893) Parse-tika fails to parse feed files |
Tue, 27 Jan, 21:57 |
Hudson (JIRA) |
[jira] [Commented] (NUTCH-1893) Parse-tika fails to parse feed files |
Tue, 27 Jan, 22:45 |
Hudson (JIRA) |
[jira] [Commented] (NUTCH-1920) Upgrade Nutch to use Java 1.7 |
Wed, 28 Jan, 00:41 |
Hudson (JIRA) |
[jira] [Commented] (NUTCH-1889) Store all values from Tika metadata in Nutch metadata |
Fri, 30 Jan, 08:56 |
Hudson (JIRA) |
[jira] [Commented] (NUTCH-1918) TikaParser specifies a default namespace when generating DOM |
Fri, 30 Jan, 09:50 |
John Lafitte |
Re: [VOTE] Release Apache Nutch 2.3 |
Sun, 11 Jan, 04:26 |
Jorge Luis Betancourt Gonzalez (JIRA) |
[jira] [Created] (NUTCH-1928) Indexing filter of documents by the MIME type |
Fri, 30 Jan, 22:14 |
Jorge Luis Betancourt Gonzalez (JIRA) |
[jira] [Updated] (NUTCH-1928) Indexing filter of documents by the MIME type |
Fri, 30 Jan, 22:17 |
Julien Nioche (JIRA) |
[jira] [Commented] (NUTCH-1815) Metadata Parsed with parse-tika is Duplicated |
Fri, 09 Jan, 11:02 |
Julien Nioche (JIRA) |
[jira] [Created] (NUTCH-1918) TikaParser specifies a default namespace when generating DOM |
Thu, 15 Jan, 11:02 |
Julien Nioche (JIRA) |
[jira] [Updated] (NUTCH-1918) TikaParser specifies a default namespace when generating DOM |
Thu, 15 Jan, 11:04 |
Julien Nioche (JIRA) |
[jira] [Created] (NUTCH-1919) Getting timeout when server returns Content-Length: 0 |
Thu, 15 Jan, 11:25 |
Julien Nioche (JIRA) |
[jira] [Updated] (NUTCH-1919) Getting timeout when server returns Content-Length: 0 |
Thu, 15 Jan, 11:26 |
Julien Nioche (JIRA) |
[jira] [Resolved] (NUTCH-1919) Getting timeout when server returns Content-Length: 0 |
Fri, 16 Jan, 11:31 |
Julien Nioche (JIRA) |
[jira] [Updated] (NUTCH-1889) Store all values from Tika metadata in Nutch metadata |
Thu, 29 Jan, 09:09 |
Julien Nioche (JIRA) |
[jira] [Commented] (NUTCH-1889) Store all values from Tika metadata in Nutch metadata |
Thu, 29 Jan, 09:09 |
Julien Nioche (JIRA) |
[jira] [Updated] (NUTCH-1918) TikaParser specifies a default namespace when generating DOM |
Thu, 29 Jan, 09:10 |
Julien Nioche (JIRA) |
[jira] [Commented] (NUTCH-1918) TikaParser specifies a default namespace when generating DOM |
Thu, 29 Jan, 09:11 |
Julien Nioche (JIRA) |
[jira] [Updated] (NUTCH-685) Content-level redirect status lost in ParseSegment |
Thu, 29 Jan, 09:12 |
Julien Nioche (JIRA) |
[jira] [Updated] (NUTCH-477) Extend URLFilters to support different filtering chains |
Thu, 29 Jan, 09:12 |
Julien Nioche (JIRA) |
[jira] [Updated] (NUTCH-840) Port tests from parse-html to parse-tika |
Thu, 29 Jan, 09:13 |
Julien Nioche (JIRA) |
[jira] [Updated] (NUTCH-1269) Improve distribution of URLS with multi-segment generation |
Thu, 29 Jan, 09:13 |
Julien Nioche (JIRA) |
[jira] [Updated] (NUTCH-1267) urlmeta to delegate indexing to index-metadata |
Thu, 29 Jan, 09:13 |
Julien Nioche (JIRA) |
[jira] [Updated] (NUTCH-1197) Add statically configured field values to solrindex-mapping.xml |
Thu, 29 Jan, 09:13 |
Julien Nioche (JIRA) |
[jira] [Updated] (NUTCH-1477) NPE when injecting with DataFileAvroStore |
Thu, 29 Jan, 09:13 |
Julien Nioche (JIRA) |
[jira] [Updated] (NUTCH-1746) OutOfMemoryError in Mappers |
Thu, 29 Jan, 09:14 |
Julien Nioche (JIRA) |
[jira] [Updated] (NUTCH-1815) Metadata Parsed with parse-tika is Duplicated |
Thu, 29 Jan, 09:14 |
Julien Nioche (JIRA) |
[jira] [Updated] (NUTCH-1687) Pick queue in Round Robin |
Thu, 29 Jan, 09:14 |
Julien Nioche (JIRA) |
[jira] [Resolved] (NUTCH-1889) Store all values from Tika metadata in Nutch metadata |
Fri, 30 Jan, 08:39 |
Julien Nioche (JIRA) |
[jira] [Resolved] (NUTCH-1918) TikaParser specifies a default namespace when generating DOM |
Fri, 30 Jan, 09:07 |
Lewis John McGibbney (JIRA) |
[jira] [Commented] (NUTCH-1342) Read time out protocol-http |
Wed, 07 Jan, 03:00 |
Lewis John McGibbney (JIRA) |
[jira] [Commented] (NUTCH-1140) index-more plugin, resetTitle method creates multiple values in the Title field |
Wed, 07 Jan, 03:46 |
Lewis John McGibbney (JIRA) |
[jira] [Created] (NUTCH-1906) Typo in CrawlDbReader command line help |
Wed, 07 Jan, 15:39 |
Lewis John McGibbney (JIRA) |
[jira] [Commented] (NUTCH-1140) index-more plugin, resetTitle method creates multiple values in the Title field |
Wed, 07 Jan, 16:40 |
Lewis John McGibbney (JIRA) |
[jira] [Created] (NUTCH-1907) Incorrect output of Outlinks to Hosts within HostDbUpdateReducer |
Wed, 07 Jan, 16:45 |
Lewis John McGibbney (JIRA) |
[jira] [Commented] (NUTCH-1709) Generated classes o.a.n.storage.Host and o.a.n.storage.ProtocolStatus contain methods not defined in source .avsc |
Wed, 07 Jan, 17:45 |
Lewis John McGibbney (JIRA) |
[jira] [Resolved] (NUTCH-1755) Project name bug in build.xml |
Wed, 07 Jan, 18:09 |
Lewis John McGibbney (JIRA) |
[jira] [Updated] (NUTCH-1755) Project name bug in build.xml |
Wed, 07 Jan, 18:09 |
Lewis John McGibbney (JIRA) |
[jira] [Work started] (NUTCH-1856) Document webpage.avsc and host.avsc |
Wed, 07 Jan, 21:15 |
Lewis John McGibbney (JIRA) |
[jira] [Updated] (NUTCH-1856) Document webpage.avsc and host.avsc |
Wed, 07 Jan, 21:18 |
Lewis John McGibbney (JIRA) |
[jira] [Commented] (NUTCH-1899) upgrade restlet lib to prevent build failure |
Wed, 07 Jan, 21:20 |
Lewis John McGibbney (JIRA) |
[jira] [Commented] (NUTCH-1902) Missing nekohtml.jar |
Wed, 07 Jan, 21:21 |
Lewis John McGibbney (JIRA) |
[jira] [Updated] (NUTCH-1900) DockerFile for Nutch 2.x |
Wed, 07 Jan, 21:22 |
Lewis John McGibbney (JIRA) |
[jira] [Commented] (NUTCH-1900) DockerFile for Nutch 2.x |
Wed, 07 Jan, 21:22 |
Lewis John McGibbney (JIRA) |
[jira] [Commented] (NUTCH-1856) Document webpage.avsc and host.avsc |
Fri, 09 Jan, 03:42 |
Lewis John McGibbney (JIRA) |
[jira] [Resolved] (NUTCH-1899) upgrade restlet lib to prevent build failure |
Fri, 09 Jan, 03:43 |
Lewis John McGibbney (JIRA) |
[jira] [Closed] (NUTCH-1899) upgrade restlet lib to prevent build failure |
Fri, 09 Jan, 03:43 |
Lewis John McGibbney (JIRA) |
[jira] [Resolved] (NUTCH-1856) Document webpage.avsc and host.avsc |
Fri, 09 Jan, 03:54 |
Lewis John McGibbney (JIRA) |
[jira] [Work stopped] (NUTCH-1856) Document webpage.avsc and host.avsc |
Fri, 09 Jan, 03:54 |
Lewis John McGibbney (JIRA) |
[jira] [Updated] (NUTCH-1218) Improve trunk API documentation |
Fri, 09 Jan, 06:05 |
Lewis John McGibbney (JIRA) |
[jira] [Resolved] (NUTCH-1218) Improve trunk API documentation |
Fri, 09 Jan, 06:05 |
Lewis John McGibbney (JIRA) |
[jira] [Closed] (NUTCH-1218) Improve trunk API documentation |
Fri, 09 Jan, 06:06 |
Lewis John McGibbney (JIRA) |
[jira] [Resolved] (NUTCH-881) Good quality documentation for Nutch |
Fri, 09 Jan, 06:07 |
Lewis John McGibbney (JIRA) |
[jira] [Closed] (NUTCH-881) Good quality documentation for Nutch |
Fri, 09 Jan, 06:08 |