Christian Kohlschütter (JIRA) | [jira] Created: (TIKA-420) [PATCH] Integration of boilerpipe: Boilerplate Removal and Fulltext Extraction from HTML pages | Fri, 07 May, 20:15
Christian Kohlschütter (JIRA) | [jira] Updated: (TIKA-420) [PATCH] Integration of boilerpipe: Boilerplate Removal and Fulltext Extraction from HTML pages | Fri, 07 May, 20:20
Christian Kohlschütter (JIRA) | [jira] Commented: (TIKA-420) [PATCH] Integration of boilerpipe: Boilerplate Removal and Fulltext Extraction from HTML pages | Fri, 14 May, 11:07
Alex Ott | Re: [jira] Updated: (TIKA-402) Support for Keynote and Pages documents | Mon, 31 May, 07:23
Alex Ott | Re: [jira] Updated: (TIKA-402) Support for Keynote and Pages documents | Mon, 31 May, 07:30
Alex Ott | Re: [jira] Updated: (TIKA-402) Support for Keynote and Pages documents | Mon, 31 May, 09:13
Alex Ott | Re: [jira] Updated: (TIKA-402) Support for Keynote and Pages documents | Mon, 31 May, 19:17
Andrew Khoury (JIRA) | [jira] Created: (TIKA-434) Bug in TagSoup causes IOException | Thu, 27 May, 21:09
Andrew Khoury (JIRA) | [jira] Commented: (TIKA-434) Bug in TagSoup causes IOException | Thu, 27 May, 21:47
Andrew Khoury (JIRA) | [jira] Updated: (TIKA-434) Bug in TagSoup causes IOException | Thu, 27 May, 21:47
Andrew Khoury (JIRA) | [jira] Updated: (TIKA-434) Bug in TagSoup causes IOException | Thu, 27 May, 21:49
Andrew Khoury (JIRA) | [jira] Updated: (TIKA-434) Bug in TagSoup causes IOException | Thu, 27 May, 21:49
Andrew Khoury (JIRA) | [jira] Updated: (TIKA-434) Bug in TagSoup causes IOException | Thu, 27 May, 21:51
Andrzej Bialecki | Re: Attributes in XHTML output | Tue, 11 May, 09:40
Andrzej Bialecki | Re: Attributes in XHTML output | Tue, 11 May, 15:04
Apache Hudson Server | Hudson build is back to normal : Tika-trunk #312 | Tue, 11 May, 14:20
Chris A. Mattmann (JIRA) | [jira] Assigned: (TIKA-379) Html elements and attributes not available in XHTML representation | Wed, 05 May, 04:44
Chris A. Mattmann (JIRA) | [jira] Created: (TIKA-421) DOAP file to recognize Tika on projects.a.o | Sat, 08 May, 17:44
Chris A. Mattmann (JIRA) | [jira] Resolved: (TIKA-421) DOAP file to recognize Tika on projects.a.o | Sat, 08 May, 17:56
Chris A. Mattmann (JIRA) | [jira] Commented: (TIKA-421) DOAP file to recognize Tika on projects.a.o | Sat, 08 May, 18:00
Chris A. Mattmann (JIRA) | [jira] Assigned: (TIKA-391) Intermittent errors detecting xls files | Fri, 21 May, 17:41
Chris A. Mattmann (JIRA) | [jira] Commented: (TIKA-391) Intermittent errors detecting xls files | Fri, 21 May, 19:42
Chris A. Mattmann (JIRA) | [jira] Created: (TIKA-432) Include NOTICE and LICENSE file updates for NCAR NetCDF parser lib | Fri, 21 May, 19:48
Chris A. Mattmann (JIRA) | [jira] Resolved: (TIKA-432) Include NOTICE and LICENSE file updates for NCAR NetCDF parser lib | Fri, 21 May, 19:54
Chris A. Mattmann (JIRA) | [jira] Updated: (TIKA-391) Intermittent errors detecting xls files | Fri, 21 May, 20:02
Chris A. Mattmann (JIRA) | [jira] Resolved: (TIKA-379) Html elements and attributes not available in XHTML representation | Sun, 30 May, 23:51
Chris A. Mattmann (JIRA) | [jira] Commented: (TIKA-402) Support for iWork documents | Mon, 31 May, 16:48
Christoph Weidling (JIRA) | [jira] Created: (TIKA-435) After using the GUI part of the cli sometimes temporary files are not removed. | Mon, 31 May, 12:18
Daan de Wit | Re: Tika now listed on projects.a.o | Wed, 12 May, 06:37
Dave Meikle | Re: [jira] Commented: (TIKA-396) Parser Attachements from Outlook Messages | Sun, 02 May, 16:40
David Tran (JIRA) | [jira] Created: (TIKA-423) Parse docx and output to text file missing words | Mon, 17 May, 03:04
David Tran (JIRA) | [jira] Updated: (TIKA-423) Parse docx and output to text file missing words | Mon, 17 May, 03:04
David Tran (JIRA) | [jira] Updated: (TIKA-423) Parse docx and output to text file missing words | Mon, 17 May, 03:06
Erik Hetzner (JIRA) | [jira] Created: (TIKA-425) Exception parsing mp3 | Wed, 19 May, 00:10
Erik Hetzner (JIRA) | [jira] Created: (TIKA-426) Parsing javascript as XML | Wed, 19 May, 00:30
Erik Hetzner (JIRA) | [jira] Created: (TIKA-427) Parsing CSS as XML | Wed, 19 May, 00:34
Erik Hetzner (JIRA) | [jira] Created: (TIKA-428) Unexpected RuntimeException when parsing PPTM (?) file | Wed, 19 May, 01:30
Erik Hetzner (JIRA) | [jira] Commented: (TIKA-428) Unexpected RuntimeException when parsing PPTM (?) file | Wed, 19 May, 01:30
Erik Hetzner (JIRA) | [jira] Created: (TIKA-429) Error parsing DTD | Wed, 19 May, 20:33
Erik Hetzner (JIRA) | [jira] Created: (TIKA-431) Tika currently misuses the HTTP Content-Encoding header, and does not seem to use the charset part of the Content-Type header properly. | Fri, 21 May, 17:59
Erik Hetzner (JIRA) | [jira] Commented: (TIKA-431) Tika currently misuses the HTTP Content-Encoding header, and does not seem to use the charset part of the Content-Type header properly. | Fri, 21 May, 18:01
Gerd Bremer (JIRA) | [jira] Commented: (TIKA-425) Exception parsing mp3 | Wed, 19 May, 11:14
Gerd Bremer (JIRA) | [jira] Issue Comment Edited: (TIKA-425) Exception parsing mp3 | Wed, 19 May, 11:16
Gerd Bremer (JIRA) | [jira] Updated: (TIKA-425) Exception parsing mp3 | Wed, 19 May, 12:33
Grant Ingersoll (JIRA) | [jira] Created: (TIKA-433) Tika + Hadoop | Tue, 25 May, 21:13
Grant Ingersoll (JIRA) | [jira] Commented: (TIKA-433) Tika + Hadoop | Wed, 26 May, 10:38
Grant Ingersoll (JIRA) | [jira] Commented: (TIKA-433) Tika + Hadoop | Wed, 26 May, 12:43
Ian Holsman | Re: confirm unsubscribe from dev@tika.apache.org | Thu, 27 May, 18:52
Jukka Zitting | Alternative RTF parsers (Was: [jira] Commented: (TIKA-422) Wrong charset conversion in some RTF documents.) | Wed, 12 May, 14:34
Jukka Zitting | Re: Boilerpipe issue with Maven central repository | Fri, 21 May, 08:20
Jukka Zitting | Re: Improved handling of attributes | Wed, 26 May, 15:02
Jukka Zitting | Re: Improved handling of attributes | Wed, 26 May, 15:28
Jukka Zitting (JIRA) | [jira] Created: (TIKA-419) Allow parser lookup from a custom class loader | Tue, 04 May, 15:56
Jukka Zitting (JIRA) | [jira] Resolved: (TIKA-419) Allow parser lookup from a custom class loader | Tue, 04 May, 16:09
Jukka Zitting (JIRA) | [jira] Commented: (TIKA-422) Wrong charset conversion in some RTF documents. | Wed, 12 May, 13:06
Jukka Zitting (JIRA) | [jira] Commented: (TIKA-420) [PATCH] Integration of boilerpipe: Boilerplate Removal and Fulltext Extraction from HTML pages | Wed, 12 May, 13:25
Jukka Zitting (JIRA) | [jira] Commented: (TIKA-242) Incremental configuration AutoDetectParser | Wed, 12 May, 13:41
Jukka Zitting (JIRA) | [jira] Commented: (TIKA-420) [PATCH] Integration of boilerpipe: Boilerplate Removal and Fulltext Extraction from HTML pages | Wed, 12 May, 13:58
Jukka Zitting (JIRA) | [jira] Resolved: (TIKA-415) Findbugs: XHTMLDowngradeHandler equals() comparing different types | Wed, 12 May, 14:38
Jukka Zitting (JIRA) | [jira] Resolved: (TIKA-417) Unable to parse the content for UCS2 Litte Endian encoded file | Wed, 12 May, 15:34
Jukka Zitting (JIRA) | [jira] Commented: (TIKA-418) RuntimeException while getting content for ppsx, ppsm, pptm, thmx and xps file types | Wed, 12 May, 15:46
Jukka Zitting (JIRA) | [jira] Commented: (TIKA-402) Support for Keynote and Pages documents | Wed, 12 May, 16:44
Jukka Zitting (JIRA) | [jira] Commented: (TIKA-431) Tika currently misuses the HTTP Content-Encoding header, and does not seem to use the charset part of the Content-Type header properly. | Wed, 26 May, 08:45
Jukka Zitting (JIRA) | [jira] Commented: (TIKA-430) Automatically let all valid XHTML 1.0 attributes through from HTML documents | Wed, 26 May, 08:47
Jukka Zitting (JIRA) | [jira] Commented: (TIKA-429) Error parsing DTD | Wed, 26 May, 08:51
Jukka Zitting (JIRA) | [jira] Resolved: (TIKA-425) Exception parsing mp3 | Wed, 26 May, 09:28
Jukka Zitting (JIRA) | [jira] Resolved: (TIKA-428) Unexpected RuntimeException when parsing PPTM (?) file | Wed, 26 May, 09:34
Jukka Zitting (JIRA) | [jira] Commented: (TIKA-418) RuntimeException while getting content for ppsx, ppsm, pptm, thmx and xps file types | Wed, 26 May, 09:46
Jukka Zitting (JIRA) | [jira] Commented: (TIKA-420) [PATCH] Integration of boilerpipe: Boilerplate Removal and Fulltext Extraction from HTML pages | Wed, 26 May, 10:16
Jukka Zitting (JIRA) | [jira] Commented: (TIKA-427) Parsing CSS as XML | Wed, 26 May, 10:20
Jukka Zitting (JIRA) | [jira] Resolved: (TIKA-424) Avoid ArrayIndexOutOfBoundsException on some mp3 files | Wed, 26 May, 10:24
Jukka Zitting (JIRA) | [jira] Commented: (TIKA-418) RuntimeException while getting content for ppsx, ppsm, pptm, thmx and xps file types | Wed, 26 May, 10:26
Jukka Zitting (JIRA) | [jira] Resolved: (TIKA-413) DWG Parser | Wed, 26 May, 12:14
Jukka Zitting (JIRA) | [jira] Commented: (TIKA-433) Tika + Hadoop | Wed, 26 May, 12:49
Jukka Zitting (JIRA) | [jira] Commented: (TIKA-402) Support for Keynote and Pages documents | Wed, 26 May, 15:13
Jukka Zitting (JIRA) | [jira] Commented: (TIKA-416) Out-of-process text extraction | Thu, 27 May, 21:51
Jukka Zitting (JIRA) | [jira] Updated: (TIKA-402) Support for iWork documents | Mon, 31 May, 15:04
Jukka Zitting (JIRA) | [jira] Commented: (TIKA-402) Support for iWork documents | Mon, 31 May, 16:38
Jukka Zitting (JIRA) | [jira] Resolved: (TIKA-435) After using the GUI part of the cli sometimes temporary files are not removed. | Mon, 31 May, 17:00
Julien Nioche (JIRA) | [jira] Updated: (TIKA-379) Html elements and attributes not available in XHTML representation | Tue, 04 May, 09:19
Julien Nioche (JIRA) | [jira] Commented: (TIKA-433) Tika + Hadoop | Wed, 26 May, 07:34
Julien Nioche (JIRA) | [jira] Commented: (TIKA-430) Automatically let all valid XHTML 1.0 attributes through from HTML documents | Wed, 26 May, 09:28
Julien Nioche (JIRA) | [jira] Commented: (TIKA-433) Tika + Hadoop | Wed, 26 May, 11:14
Ken Krugler | Attributes in XHTML output | Tue, 11 May, 00:56
Ken Krugler | Re: Attributes in XHTML output | Tue, 11 May, 13:22
Ken Krugler | Html5 parsing spec | Tue, 18 May, 19:54
Ken Krugler | Boilerpipe issue with Maven central repository | Fri, 21 May, 00:58
Ken Krugler | Improved handling of attributes | Fri, 21 May, 01:08
Ken Krugler | Re: Boilerpipe issue with Maven central repository | Fri, 21 May, 03:50
Ken Krugler | Re: Improved handling of attributes | Thu, 27 May, 16:16
Ken Krugler (JIRA) | [jira] Assigned: (TIKA-420) [PATCH] Integration of boilerpipe: Boilerplate Removal and Fulltext Extraction from HTML pages | Fri, 07 May, 20:45
Ken Krugler (JIRA) | [jira] Commented: (TIKA-420) [PATCH] Integration of boilerpipe: Boilerplate Removal and Fulltext Extraction from HTML pages | Fri, 07 May, 20:47
Ken Krugler (JIRA) | [jira] Commented: (TIKA-420) [PATCH] Integration of boilerpipe: Boilerplate Removal and Fulltext Extraction from HTML pages | Wed, 12 May, 13:43
Ken Krugler (JIRA) | [jira] Commented: (TIKA-420) [PATCH] Integration of boilerpipe: Boilerplate Removal and Fulltext Extraction from HTML pages | Wed, 12 May, 16:34
Ken Krugler (JIRA) | [jira] Commented: (TIKA-420) [PATCH] Integration of boilerpipe: Boilerplate Removal and Fulltext Extraction from HTML pages | Sun, 16 May, 13:36
Ken Krugler (JIRA) | [jira] Commented: (TIKA-420) [PATCH] Integration of boilerpipe: Boilerplate Removal and Fulltext Extraction from HTML pages | Sun, 16 May, 13:42
Ken Krugler (JIRA) | [jira] Created: (TIKA-430) Automatically let all valid XHTML 1.0 attributes through from HTML documents | Fri, 21 May, 01:04
Ken Krugler (JIRA) | [jira] Assigned: (TIKA-431) Tika currently misuses the HTTP Content-Encoding header, and does not seem to use the charset part of the Content-Type header properly. | Wed, 26 May, 17:10
Ken Krugler (JIRA) | [jira] Commented: (TIKA-431) Tika currently misuses the HTTP Content-Encoding header, and does not seem to use the charset part of the Content-Type header properly. | Wed, 26 May, 17:12
Martijn v Groningen | Re: [jira] Updated: (TIKA-402) Support for Keynote and Pages documents | Mon, 31 May, 09:10