tika-dev mailing list archives: March 2015

Tyler Palsulich (JIRA) [jira] [Resolved] (TIKA-1079) Word document hits AIOOBE in SummaryExtractor.parseSummaries Sat, 14 Mar, 00:51
Tyler Palsulich (JIRA) [jira] [Commented] (TIKA-1082) Incorrect date in Doc metadata Sat, 14 Mar, 00:54
Tyler Palsulich (JIRA) [jira] [Commented] (TIKA-1085) PDF header and mime detection Sat, 14 Mar, 00:56
Tyler Palsulich (JIRA) [jira] [Commented] (TIKA-1088) Unsupported AutoCAD drawing version: AC1009 Sat, 14 Mar, 00:57
Tyler Palsulich (JIRA) [jira] [Updated] (TIKA-1088) Unsupported AutoCAD drawing version: AC1009 Sat, 14 Mar, 00:57
Tyler Palsulich (JIRA) [jira] [Resolved] (TIKA-1094) Bugged WordExtractor#handleSpecialCharacterRun method Sat, 14 Mar, 01:03
Tyler Palsulich (JIRA) [jira] [Commented] (TIKA-1095) Only gibberish extracted from this PDF Sat, 14 Mar, 01:12
Tyler Palsulich (JIRA) [jira] [Commented] (TIKA-1098) not able to parse pdfs/docs/ppts using 1.1 tika parser Sat, 14 Mar, 01:18
Tyler Palsulich (JIRA) [jira] [Updated] (TIKA-1106) CLAVIN Integration Sat, 14 Mar, 01:19
Tyler Palsulich (JIRA) [jira] [Resolved] (TIKA-651) Unescaped attribute value generated Sat, 14 Mar, 03:12
Tyler Palsulich (JIRA) [jira] [Closed] (TIKA-942) HTTP Accept header evaluator Sat, 14 Mar, 03:13
Tyler Palsulich (JIRA) [jira] [Closed] (TIKA-1134) ContentHandler gets ignorable whitespace for <br> tags when parsing HTML Sat, 14 Mar, 03:20
Tyler Palsulich (JIRA) [jira] [Closed] (TIKA-1095) Only gibberish extracted from this PDF Sat, 14 Mar, 20:42
Tyler Palsulich (JIRA) [jira] [Updated] (TIKA-1107) Can't parse velocity file Sat, 14 Mar, 21:12
Tyler Palsulich (JIRA) [jira] [Commented] (TIKA-1107) Can't parse velocity file Sat, 14 Mar, 21:13
Tyler Palsulich (JIRA) [jira] [Commented] (TIKA-1108) Represent individual slides in pptx Sat, 14 Mar, 21:19
Tyler Palsulich (JIRA) [jira] [Closed] (TIKA-1111) Class loading issues when running in OSGi environment Sat, 14 Mar, 21:21
Tyler Palsulich (JIRA) [jira] [Commented] (TIKA-289) Add magic byte patterns from file(1) Sat, 14 Mar, 21:27
Tyler Palsulich (JIRA) [jira] [Commented] (TIKA-1114) sgml mime type is not detected when passed in as byte stream Sat, 14 Mar, 21:28
Tyler Palsulich (JIRA) [jira] [Commented] (TIKA-891) Use POST in addition to PUT on method calls in tika-server Sat, 14 Mar, 21:42
Tyler Palsulich (JIRA) [jira] [Commented] (TIKA-1116) Wrong detection of XLS/Doc fil Sat, 14 Mar, 21:44
Tyler Palsulich (JIRA) [jira] [Resolved] (TIKA-1117) IWorkPackageParser should not close the InputStream Sat, 14 Mar, 22:01
Tyler Palsulich (JIRA) [jira] [Closed] (TIKA-1120) Enable direct use of org.apache.tika.mime.MediaType.detect(...) Sat, 14 Mar, 22:19
Tyler Palsulich (JIRA) [jira] [Commented] (TIKA-1131) Output sentence-break "hints" for files such as PPT/X Sun, 15 Mar, 00:17
Tyler Palsulich (JIRA) [jira] [Resolved] (TIKA-1137) Wasted work in WontBeSerializedError.writeObject() Sun, 15 Mar, 02:10
Tyler Palsulich (JIRA) [jira] [Resolved] (TIKA-1138) Empty body and empty title with some TXT documents Sun, 15 Mar, 03:25
Tyler Palsulich (JIRA) [jira] [Commented] (TIKA-1140) Better table representation, cell spanning in Word Extractor Sun, 15 Mar, 03:27
Tyler Palsulich (JIRA) [jira] [Commented] (TIKA-1141) javascript files that contain "<html" are detected as text/html Sun, 15 Mar, 03:27
Tyler Palsulich (JIRA) [jira] [Resolved] (TIKA-1143) Fails to parse some PPT file Sun, 15 Mar, 03:30
Tyler Palsulich (JIRA) [jira] [Commented] (TIKA-1144) Changes in styling mechanism, inner table support and list support for Word Extractor Sun, 15 Mar, 03:31
Tyler Palsulich (JIRA) [jira] [Commented] (TIKA-1149) Improve parser lookup performance Sun, 15 Mar, 03:35
Tyler Palsulich (JIRA) [jira] [Created] (TIKA-1576) Upgrade metadata-extractor to version 2.7.2 Sun, 15 Mar, 05:05
Tyler Palsulich (JIRA) [jira] [Resolved] (TIKA-1576) Upgrade metadata-extractor to version 2.7.2 Sun, 15 Mar, 05:08
Tyler Palsulich (JIRA) [jira] [Commented] (TIKA-1154) Tika hangs on format detection of malformed HTML file. Sun, 15 Mar, 05:19
Tyler Palsulich (JIRA) [jira] [Commented] (TIKA-1155) Number Format is converted with an error Sun, 15 Mar, 18:28
Tyler Palsulich (JIRA) [jira] [Commented] (TIKA-1161) Dates incorrectly extracted from PDF Sun, 15 Mar, 18:34
Tyler Palsulich (JIRA) [jira] [Closed] (TIKA-1163) NPE thrown by TikaConfig.getDefaultConfig() Sun, 15 Mar, 18:36
Tyler Palsulich (JIRA) [jira] [Updated] (TIKA-1143) Fails to parse some PPT file Sun, 15 Mar, 19:56
Tyler Palsulich (JIRA) [jira] [Assigned] (TIKA-1143) Fails to parse some PPT file Sun, 15 Mar, 19:56
Tyler Palsulich (JIRA) [jira] [Commented] (TIKA-1143) Fails to parse some PPT file Sun, 15 Mar, 19:57
Tyler Palsulich (JIRA) [jira] [Updated] (TIKA-1165) Autodetect and parse Asciidoc Sun, 15 Mar, 20:00
Tyler Palsulich (JIRA) [jira] [Resolved] (TIKA-1168) The IWork NumbersContentHandler returns unsupported Metadata PropertyType Sun, 15 Mar, 20:01
Tyler Palsulich (JIRA) [jira] [Closed] (TIKA-1172) Out Of Memory exception occurring in GUI on 20MB pdf Sun, 15 Mar, 20:02
Tyler Palsulich (JIRA) [jira] [Commented] (TIKA-1174) Invalid characters in filtered PDF output Sun, 15 Mar, 20:42
Tyler Palsulich (JIRA) [jira] [Commented] (TIKA-1140) Better table representation, cell spanning in Word Extractor Sun, 15 Mar, 21:15
Tyler Palsulich (JIRA) [jira] [Updated] (TIKA-1180) Matroska (mkv, mka, webm) Detector Sun, 15 Mar, 21:16
Tyler Palsulich (JIRA) [jira] [Closed] (TIKA-1181) RTFParser not keeping HTML font colors and underscore tags. Sun, 15 Mar, 21:18
Tyler Palsulich (JIRA) [jira] [Resolved] (TIKA-1185) Date/Time/Timezone in Outlook 2010 messages Sun, 15 Mar, 21:31
Tyler Palsulich (JIRA) [jira] [Commented] (TIKA-1176) ChmDirectoryListingSet does not correctly enumerate directory entries Sun, 15 Mar, 22:07
Tyler Palsulich (JIRA) [jira] [Commented] (TIKA-1186) Missing sender mail address in Outlook 2010 Sun, 15 Mar, 22:08
Tyler Palsulich (JIRA) [jira] [Updated] (TIKA-1188) Microsoft Project Support Sun, 15 Mar, 22:09
Tyler Palsulich (JIRA) [jira] [Commented] (TIKA-1191) ForkParser / ClassLoaderProxy does not define package Sun, 15 Mar, 22:12
Tyler Palsulich (JIRA) [jira] [Closed] (TIKA-1194) Missing text from MS Word (DOC) file Sun, 15 Mar, 22:13
Tyler Palsulich (JIRA) [jira] [Updated] (TIKA-1195) Microsoft office filetype xlsb not threated correctly ( no content after indexed )) Sun, 15 Mar, 22:15
Tyler Palsulich (JIRA) [jira] [Updated] (TIKA-1195) XLSB support Sun, 15 Mar, 22:15
Tyler Palsulich (JIRA) [jira] [Commented] (TIKA-1195) XLSB support Sun, 15 Mar, 22:16
Tyler Palsulich (JIRA) [jira] [Commented] (TIKA-1199) Tika extracts weird signs instead of text Sun, 15 Mar, 22:19
Tyler Palsulich (JIRA) [jira] [Commented] (TIKA-1203) Some metadata not extracted from PDF files when NonSequentialPDFParser is used Sun, 15 Mar, 22:26
Tyler Palsulich (JIRA) [jira] [Updated] (TIKA-1204) DWFX files detection Sun, 15 Mar, 22:37
Tyler Palsulich (JIRA) [jira] [Updated] (TIKA-1206) rfc822 standard headers Sun, 15 Mar, 22:38
Tyler Palsulich (JIRA) [jira] [Updated] (TIKA-1220) Parser implementration for IFC files Sun, 15 Mar, 22:54
Tyler Palsulich (JIRA) [jira] [Resolved] (TIKA-1222) Tika does not extract attachments from RFC822 files Mon, 16 Mar, 00:16
Tyler Palsulich (JIRA) [jira] [Closed] (TIKA-1235) empty docx creates exception Mon, 16 Mar, 00:17
Tyler Palsulich (JIRA) [jira] [Resolved] (TIKA-1243) Support for 7z archives Mon, 16 Mar, 02:26
Tyler Palsulich (JIRA) [jira] [Closed] (TIKA-1245) Incorrect MIME type detection Mon, 16 Mar, 02:28
Tyler Palsulich (JIRA) [jira] [Commented] (TIKA-1255) WordExtractor - bold hyperlink not closed properly Mon, 16 Mar, 02:31
Tyler Palsulich (JIRA) [jira] [Commented] (TIKA-1203) Some metadata not extracted from PDF files when NonSequentialPDFParser is used Mon, 16 Mar, 12:07
Tyler Palsulich (JIRA) [jira] [Commented] (TIKA-1203) Some metadata not extracted from PDF files when NonSequentialPDFParser is used Mon, 16 Mar, 19:36
Tyler Palsulich (JIRA) [jira] [Commented] (TIKA-1266) Tika OSGI Bundle needs Bundle-ClassPath to work in Equinox Fri, 20 Mar, 19:09
Tyler Palsulich (JIRA) [jira] [Commented] (TIKA-1276) Missing embedded dependencies in tika-bundle Fri, 20 Mar, 19:14
Tyler Palsulich (JIRA) [jira] [Commented] (TIKA-1579) Add file type to NetCDFParser Fri, 20 Mar, 19:17
Tyler Palsulich (JIRA) [jira] [Commented] (TIKA-1578) Add file type description to HDFParsers Fri, 20 Mar, 19:18
Tyler Palsulich (JIRA) [jira] [Resolved] (TIKA-1154) Tika hangs on format detection of malformed HTML file. Fri, 20 Mar, 19:19
Tyler Palsulich (JIRA) [jira] [Comment Edited] (TIKA-1154) Tika hangs on format detection of malformed HTML file. Fri, 20 Mar, 19:20
Tyler Palsulich (JIRA) [jira] [Updated] (TIKA-1580) ISA-Tab parsers Fri, 20 Mar, 19:23
Tyler Palsulich (JIRA) [jira] [Commented] (TIKA-1194) Missing text from MS Word (DOC) file Fri, 20 Mar, 19:38
Tyler Palsulich (JIRA) [jira] [Commented] (TIKA-1114) sgml mime type is not detected when passed in as byte stream Fri, 20 Mar, 19:44
Tyler Palsulich (JIRA) [jira] [Commented] (TIKA-1267) Improve Mbox file detection Fri, 20 Mar, 19:49
Tyler Palsulich (JIRA) [jira] [Resolved] (TIKA-1273) old tika-server jar artifact contains no manifest so not able to invoke from shell Fri, 20 Mar, 19:54
Tyler Palsulich (JIRA) [jira] [Resolved] (TIKA-1287) Update NetCDF .jar file on Maven Central Fri, 20 Mar, 20:05
Tyler Palsulich (JIRA) [jira] [Commented] (TIKA-1288) Epub's content extracted partially Fri, 20 Mar, 20:12
Tyler Palsulich (JIRA) [jira] [Resolved] (TIKA-1289) Ligatures convert on text extraction Fri, 20 Mar, 20:15
Tyler Palsulich (JIRA) [jira] [Commented] (TIKA-1293) Netscape bookmark files are not being detected as HTML Fri, 20 Mar, 20:17
Tyler Palsulich (JIRA) [jira] [Commented] (TIKA-1304) Implement Metadata Property with PropertyType ALT Fri, 20 Mar, 20:21
Tyler Palsulich (JIRA) [jira] [Commented] (TIKA-1296) Add case insensitive matching for text/html mime type Fri, 20 Mar, 20:30
Tyler Palsulich (JIRA) [jira] [Updated] (TIKA-1306) ClassCastException WARN [main] (COSDocument.java:303) - java.lang.ClassCastException: org.apache.pdfbox.cos.COSString cannot be cast to org.apache.pdfbox.cos.COSName in o.a.t.parser.pdf.PDFParserTest Fri, 20 Mar, 20:33
Tyler Palsulich (JIRA) [jira] [Resolved] (TIKA-1306) ClassCastException WARN [main] (COSDocument.java:303) - java.lang.ClassCastException: org.apache.pdfbox.cos.COSString cannot be cast to org.apache.pdfbox.cos.COSName in o.a.t.parser.pdf.PDFParserTest Fri, 20 Mar, 20:35
Tyler Palsulich (JIRA) [jira] [Updated] (TIKA-1307) Jenkins Java7 job requires a profile in order to build 'tika-java7' module. Fri, 20 Mar, 20:36
Tyler Palsulich (JIRA) [jira] [Resolved] (TIKA-1307) Jenkins Java7 job requires a profile in order to build 'tika-java7' module. Fri, 20 Mar, 20:39
Tyler Palsulich (JIRA) [jira] [Updated] (TIKA-1308) Support in memory parse mode(don't create temp file): to support run Tika in GAE Fri, 20 Mar, 20:40
Tyler Palsulich (JIRA) [jira] [Commented] (TIKA-1308) Support in memory parse mode(don't create temp file): to support run Tika in GAE Fri, 20 Mar, 20:41
Tyler Palsulich (JIRA) [jira] [Commented] (TIKA-1314) An inappropriate comment of CharsetDetector.detect() Fri, 20 Mar, 20:45
Tyler Palsulich (JIRA) [jira] [Commented] (TIKA-1321) Add experimental Stax/Streaming XWPF/docx extractor Fri, 20 Mar, 20:48
Tyler Palsulich (JIRA) [jira] [Resolved] (TIKA-1324) Use a common path for the Tika Server unpacker resources Fri, 20 Mar, 20:49
Tyler Palsulich (JIRA) [jira] [Commented] (TIKA-1325) Move the font metadata definitions to properties Fri, 20 Mar, 20:52
Tyler Palsulich (JIRA) [jira] [Commented] (TIKA-1326) MSI file detection Fri, 20 Mar, 20:53
Tyler Palsulich (JIRA) [jira] [Commented] (TIKA-1329) Add RecursiveParserWrapper aka Jukka's (and Nick's) RecursiveMetadataParser Fri, 20 Mar, 20:54
Tyler Palsulich (JIRA) [jira] [Commented] (TIKA-1344) Ability to generate self-contained HTML with images Fri, 20 Mar, 20:55
Tyler Palsulich (JIRA) [jira] [Commented] (TIKA-1351) Parser implementations should accept null content handlers Fri, 20 Mar, 21:02
Tyler Palsulich (JIRA) [jira] [Commented] (TIKA-1354) ForkParser doesn't work in OSGI container Fri, 20 Mar, 21:05