MengYing Wang | [Problem solved] Can't crawl filesystem with protocol-file plugin - java.lang.NullPointerException | Thu, 30 Oct, 05:15
MengYing Wang | Re: How could I make more metadata indexed in Solr? | Thu, 30 Oct, 19:11
MengYing Wang | Re: [Problem solved] Can't crawl filesystem with protocol-file plugin - java.lang.NullPointerException | Fri, 31 Oct, 06:50
Apache Jenkins Server | Build failed in Jenkins: Nutch-nutchgora #1213 | Sat, 01 Nov, 04:03
Lewis John McGibbney (JIRA) | [jira] [Created] (NUTCH-1886) Review and update default.properties | Sat, 01 Nov, 16:36
Lewis John McGibbney (JIRA) | [jira] [Updated] (NUTCH-1843) Upgrade to Gora 0.5 | Sat, 01 Nov, 16:45
Lewis John McGibbney (JIRA) | [jira] [Resolved] (NUTCH-1843) Upgrade to Gora 0.5 | Sat, 01 Nov, 16:46
Apache Jenkins Server | Jenkins build is back to normal : Nutch-nutchgora #1214 | Sat, 01 Nov, 17:41
Lewis John McGibbney (JIRA) | [jira] [Updated] (NUTCH-1709) Generated classes o.a.n.storage.Host and o.a.n.storage.ProtocolStatus contain methods not defined in source .avsc | Sat, 01 Nov, 17:53
Lewis John McGibbney (JIRA) | [jira] [Work started] (NUTCH-1709) Generated classes o.a.n.storage.Host and o.a.n.storage.ProtocolStatus contain methods not defined in source .avsc | Sat, 01 Nov, 17:54
Lewis John McGibbney (JIRA) | [jira] [Updated] (NUTCH-1709) Generated classes o.a.n.storage.Host and o.a.n.storage.ProtocolStatus contain methods not defined in source .avsc | Sat, 01 Nov, 17:54
Lewis John McGibbney (JIRA) | [jira] [Updated] (NUTCH-1483) Can't crawl filesystem with protocol-file plugin | Sat, 01 Nov, 18:07
Lewis John McGibbney (JIRA) | [jira] [Updated] (NUTCH-1791) Null pointer exceptions with gora-cassandra-0.4 | Sat, 01 Nov, 18:11
Lewis John McGibbney (JIRA) | [jira] [Updated] (NUTCH-1855) Upgrade Hadoop dependencies to Hadoop 2 | Sat, 01 Nov, 18:12
Lewis John McGibbney (JIRA) | [jira] [Updated] (NUTCH-840) Port tests from parse-html to parse-tika | Sat, 01 Nov, 18:12
Lewis John McGibbney (JIRA) | [jira] [Updated] (NUTCH-1886) Review and update default.properties | Sat, 01 Nov, 18:14
Lewis John McGibbney (JIRA) | [jira] [Updated] (NUTCH-1884) NullPointerException in parsechecker and indexchecker with symlinks in file URL | Sat, 01 Nov, 18:15
Lewis John McGibbney (JIRA) | [jira] [Updated] (NUTCH-1709) Generated classes o.a.n.storage.Host and o.a.n.storage.ProtocolStatus contain methods not defined in source .avsc | Sat, 01 Nov, 18:17
Lewis John McGibbney (JIRA) | [jira] [Assigned] (NUTCH-1820) remove field "orig" which duplicates "id" | Sat, 01 Nov, 18:33
Lewis John McGibbney (JIRA) | [jira] [Updated] (NUTCH-1820) remove field "orig" which duplicates "id" | Sat, 01 Nov, 18:33
Lewis John McGibbney (JIRA) | [jira] [Work started] (NUTCH-1820) remove field "orig" which duplicates "id" | Sat, 01 Nov, 18:45
Lewis John McGibbney (JIRA) | [jira] [Resolved] (NUTCH-1820) remove field "orig" which duplicates "id" | Sat, 01 Nov, 18:45
Lewis John McGibbney (JIRA) | [jira] [Updated] (NUTCH-1885) Protocol-file should treat symbolic links as redirects | Sat, 01 Nov, 18:46
Lewis John McGibbney (JIRA) | [jira] [Updated] (NUTCH-1878) urlnormalizer-regex to keep third slash in file:///path/index.html | Sat, 01 Nov, 18:47
Lewis John McGibbney (JIRA) | [jira] [Updated] (NUTCH-1644) Should have a parser that uses xpath | Sat, 01 Nov, 18:47
Lewis John Mcgibbney | Patch reviews for 2.X | Sat, 01 Nov, 18:51
Lewis John McGibbney (JIRA) | [jira] [Updated] (NUTCH-1679) UpdateDb using batchId, link may override crawled page. | Sat, 01 Nov, 18:51
Lewis John McGibbney (JIRA) | [jira] [Updated] (NUTCH-1823) Upgrade to elasticsearch 1.2/1.3 | Sat, 01 Nov, 18:52
Lewis John McGibbney (JIRA) | [jira] [Updated] (NUTCH-1880) URLUtil should not add additional slashes for file URLs | Sat, 01 Nov, 18:52
Lewis John McGibbney (JIRA) | [jira] [Updated] (NUTCH-1879) Regex URL normalizer should remove multiple slashes after file: protocol | Sat, 01 Nov, 18:52
Mattmann, Chris A (3980) | Re: Patch reviews for 2.X | Sat, 01 Nov, 19:00
Chris A. Mattmann (JIRA) | [jira] [Commented] (NUTCH-1483) Can't crawl filesystem with protocol-file plugin | Sat, 01 Nov, 19:02
Albinscode | Re: [jira] [Updated] (NUTCH-1644) Should have a parser that uses xpath | Sat, 01 Nov, 20:48
Talat UYARER (JIRA) | [jira] [Updated] (NUTCH-1644) Should have a parser that uses xpath | Sat, 01 Nov, 21:27
Lewis John McGibbney (JIRA) | [jira] [Commented] (NUTCH-1644) Should have a parser that uses xpath | Sun, 02 Nov, 00:15
Renato Javier Marroquín Mogrovejo (JIRA) | [jira] [Commented] (NUTCH-1791) Null pointer exceptions with gora-cassandra-0.4 | Sun, 02 Nov, 13:07
Mattmann, Chris A (3980) | NSF DataViz Hackathon for Polar CyberInfrastructure: New York, NY 11/3/2014 - 11/4/2014 Call for Remote Participation | Sun, 02 Nov, 17:37
Tom Barber | Re: NSF DataViz Hackathon for Polar CyberInfrastructure: New York, NY 11/3/2014 - 11/4/2014 Call for Remote Participation | Sun, 02 Nov, 18:44
Lewis John McGibbney (JIRA) | [jira] [Commented] (NUTCH-1791) Null pointer exceptions with gora-cassandra-0.4 | Mon, 03 Nov, 14:46
Renato Javier Marroquín Mogrovejo (JIRA) | [jira] [Commented] (NUTCH-1791) Null pointer exceptions with gora-cassandra-0.4 | Mon, 03 Nov, 17:36
Lewis John McGibbney (JIRA) | [jira] [Commented] (NUTCH-1791) Null pointer exceptions with gora-cassandra-0.4 | Mon, 03 Nov, 20:16
Renato Javier Marroquín Mogrovejo (JIRA) | [jira] [Commented] (NUTCH-1791) Null pointer exceptions with gora-cassandra-0.4 | Mon, 03 Nov, 21:12
Lewis John McGibbney (JIRA) | [jira] [Commented] (NUTCH-1791) Null pointer exceptions with gora-cassandra-0.4 | Mon, 03 Nov, 21:59
Sebastian Nagel | Re: [jira] [Updated] (NUTCH-1644) Should have a parser that uses xpath | Mon, 03 Nov, 22:50
Albin Vigier | Re: [jira] [Updated] (NUTCH-1644) Should have a parser that uses xpath | Tue, 04 Nov, 16:09
Albinscode | Re: [jira] [Updated] (NUTCH-1644) Should have a parser that uses xpath | Tue, 04 Nov, 16:16
amit sehas | Nutch 2.X question | Tue, 04 Nov, 18:26
Sebastian Nagel | Re: Patch reviews for 2.X | Tue, 04 Nov, 18:59
Albinscode (JIRA) | [jira] [Commented] (NUTCH-1870) Generic xsl parser plugin | Tue, 04 Nov, 19:54
Albinscode (JIRA) | [jira] [Updated] (NUTCH-1870) Generic xsl parser plugin | Tue, 04 Nov, 19:55
Albinscode (JIRA) | [jira] [Comment Edited] (NUTCH-1870) Generic xsl parser plugin | Tue, 04 Nov, 19:56
Albinscode (JIRA) | [jira] [Comment Edited] (NUTCH-1870) Generic xsl parser plugin | Tue, 04 Nov, 19:56
Sebastian Nagel (JIRA) | [jira] [Updated] (NUTCH-1885) Protocol-file should treat symbolic links as redirects | Tue, 04 Nov, 20:44
Sebastian Nagel (JIRA) | [jira] [Resolved] (NUTCH-1483) Can't crawl filesystem with protocol-file plugin | Tue, 04 Nov, 21:13
Sebastian Nagel (JIRA) | [jira] [Resolved] (NUTCH-1878) urlnormalizer-regex to keep third slash in file:///path/index.html | Tue, 04 Nov, 21:15
Sebastian Nagel (JIRA) | [jira] [Resolved] (NUTCH-1879) Regex URL normalizer should remove multiple slashes after file: protocol | Tue, 04 Nov, 21:15
Sebastian Nagel (JIRA) | [jira] [Resolved] (NUTCH-1880) URLUtil should not add additional slashes for file URLs | Tue, 04 Nov, 21:16
Sebastian Nagel (JIRA) | [jira] [Resolved] (NUTCH-1885) Protocol-file should treat symbolic links as redirects | Tue, 04 Nov, 21:16
Hudson (JIRA) | [jira] [Commented] (NUTCH-1879) Regex URL normalizer should remove multiple slashes after file: protocol | Tue, 04 Nov, 21:53
Hudson (JIRA) | [jira] [Commented] (NUTCH-1483) Can't crawl filesystem with protocol-file plugin | Tue, 04 Nov, 21:53
Hudson (JIRA) | [jira] [Commented] (NUTCH-1885) Protocol-file should treat symbolic links as redirects | Tue, 04 Nov, 21:53
Hudson (JIRA) | [jira] [Commented] (NUTCH-1880) URLUtil should not add additional slashes for file URLs | Tue, 04 Nov, 21:53
Hudson (JIRA) | [jira] [Commented] (NUTCH-1879) Regex URL normalizer should remove multiple slashes after file: protocol | Tue, 04 Nov, 21:58
Hudson (JIRA) | [jira] [Commented] (NUTCH-1483) Can't crawl filesystem with protocol-file plugin | Tue, 04 Nov, 21:58
Hudson (JIRA) | [jira] [Commented] (NUTCH-1885) Protocol-file should treat symbolic links as redirects | Tue, 04 Nov, 21:58
Hudson (JIRA) | [jira] [Commented] (NUTCH-1880) URLUtil should not add additional slashes for file URLs | Tue, 04 Nov, 21:58
amit sehas | Nutch 2.X question | Wed, 05 Nov, 01:10
Julien Nioche (JIRA) | [jira] [Created] (NUTCH-1887) Specify HTMLMapper to use in TikaParser | Wed, 05 Nov, 15:45
Julien Nioche (JIRA) | [jira] [Updated] (NUTCH-1887) Specify HTMLMapper to use in TikaParser | Wed, 05 Nov, 15:51
Julien Nioche (JIRA) | [jira] [Updated] (NUTCH-1887) Specify HTMLMapper to use in TikaParser | Wed, 05 Nov, 15:52
Lewis John Mcgibbney | Re: dev Digest 4 Nov 2014 21:53:35 -0000 Issue 1905 | Wed, 05 Nov, 20:19
Chris A. Mattmann (JIRA) | [jira] [Commented] (NUTCH-1887) Specify HTMLMapper to use in TikaParser | Wed, 05 Nov, 20:28
Sebastian Nagel (JIRA) | [jira] [Resolved] (NUTCH-1825) protocol-http may hang for certain web pages | Thu, 06 Nov, 21:54
Sebastian Nagel (JIRA) | [jira] [Updated] (NUTCH-1884) NullPointerException in parsechecker and indexchecker with symlinks in file URL | Thu, 06 Nov, 21:56
Sebastian Nagel (JIRA) | [jira] [Updated] (NUTCH-1884) NullPointerException in parsechecker and indexchecker with symlinks in file URL | Thu, 06 Nov, 21:57
Sebastian Nagel (JIRA) | [jira] [Resolved] (NUTCH-1884) NullPointerException in parsechecker and indexchecker with symlinks in file URL | Thu, 06 Nov, 22:01
Sebastian Nagel (JIRA) | [jira] [Commented] (NUTCH-1887) Specify HTMLMapper to use in TikaParser | Thu, 06 Nov, 22:25
Hudson (JIRA) | [jira] [Commented] (NUTCH-1825) protocol-http may hang for certain web pages | Thu, 06 Nov, 22:43
Hudson (JIRA) | [jira] [Commented] (NUTCH-1825) protocol-http may hang for certain web pages | Thu, 06 Nov, 22:52
Hudson (JIRA) | [jira] [Commented] (NUTCH-1884) NullPointerException in parsechecker and indexchecker with symlinks in file URL | Thu, 06 Nov, 22:52
Lewis John Mcgibbney | Re: Nutch 2.X question | Fri, 07 Nov, 02:37
Lewis John Mcgibbney | Re: Nutch 2.3 | Fri, 07 Nov, 03:31
Lewis John McGibbney (JIRA) | [jira] [Commented] (NUTCH-1709) Generated classes o.a.n.storage.Host and o.a.n.storage.ProtocolStatus contain methods not defined in source .avsc | Fri, 07 Nov, 04:04
Julien Nioche (JIRA) | [jira] [Created] (NUTCH-1888) Specify HTMLMapper to use in TikaParser | Fri, 07 Nov, 09:58
Julien Nioche (JIRA) | [jira] [Resolved] (NUTCH-1887) Specify HTMLMapper to use in TikaParser | Fri, 07 Nov, 10:00
Julien Nioche (JIRA) | [jira] [Created] (NUTCH-1889) Store all values from Tika metadata in Nutch metadata | Fri, 07 Nov, 10:25
Julien Nioche (JIRA) | [jira] [Updated] (NUTCH-1889) Store all values from Tika metadata in Nutch metadata | Fri, 07 Nov, 10:42
Hudson (JIRA) | [jira] [Commented] (NUTCH-1887) Specify HTMLMapper to use in TikaParser | Fri, 07 Nov, 10:50
Julien Nioche (JIRA) | [jira] [Updated] (NUTCH-1592) XPath works on documents parsed with parse-html but not parse-tika | Fri, 07 Nov, 14:43
Julien Nioche (JIRA) | [jira] [Commented] (NUTCH-1883) bin/crawl: use function to run bin/nutch and check exit value | Fri, 07 Nov, 14:50
kaveh minooie (JIRA) | [jira] [Updated] (NUTCH-1140) index-more plugin, resetTitle method creates multiple values in the Title field | Fri, 07 Nov, 19:07
Mattmann, Chris A (3980) | Re: svn commit: r1637236 - in /nutch: branches/2.x/ branches/2.x/src/plugin/protocol-http/src/java/org/apache/nutch/protocol/http/ trunk/ trunk/src/plugin/protocol-http/src/java/org/apache/nutch/protocol/http/ | Sat, 08 Nov, 17:17
Renato Javier Marroquín Mogrovejo (JIRA) | [jira] [Commented] (NUTCH-1791) Null pointer exceptions with gora-cassandra-0.4 | Sun, 09 Nov, 11:45
Sebastian Nagel (JIRA) | [jira] [Reopened] (NUTCH-1883) bin/crawl: use function to run bin/nutch and check exit value | Sun, 09 Nov, 16:04
Sebastian Nagel (JIRA) | [jira] [Updated] (NUTCH-1883) bin/crawl: use function to run bin/nutch and check exit value | Sun, 09 Nov, 17:28
Sebastian Nagel (JIRA) | [jira] [Reopened] (NUTCH-1829) Generator : unable to distinguish real errors | Sun, 09 Nov, 17:35
Sebastian Nagel (JIRA) | [jira] [Updated] (NUTCH-1829) Generator : unable to distinguish real errors | Sun, 09 Nov, 17:35
Sebastian Nagel (JIRA) | [jira] [Comment Edited] (NUTCH-1829) Generator : unable to distinguish real errors | Sun, 09 Nov, 17:36
Julien Nioche (JIRA) | [jira] [Commented] (NUTCH-1883) bin/crawl: use function to run bin/nutch and check exit value | Mon, 10 Nov, 10:10
kaveh minooie (JIRA) | [jira] [Updated] (NUTCH-1140) index-more plugin, resetTitle method creates multiple values in the Title field | Mon, 10 Nov, 19:12