tika-dev mailing list archives

From "Vinay Kawade (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (TIKA-2196) IllegalArgumentException on a valid Excel file
Date Tue, 09 Jan 2018 19:36:00 GMT

    [ https://issues.apache.org/jira/browse/TIKA-2196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16319016#comment-16319016 ]

Vinay Kawade commented on TIKA-2196:
------------------------------------

This seems to happen when a cell uses a custom format that contains two consecutive double quotes, for example:

{code:java}
""ddd,mmm dd
or
""dddd, mmmm dd, yyyy
{code}

As per https://bz.apache.org/bugzilla/show_bug.cgi?id=54786, each pair of double quotes ("") is replaced by one single quote (').
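
A minimal sketch of why that replacement could lead to the exception in the report. It uses only the JDK, not POI itself; the class and helper names are hypothetical. It assumes the rewritten string is then used as a Java date pattern, and that a number format ends up receiving the date value once the pattern is rejected. That second assumption is inferred from the stack trace below, not confirmed against the POI source.

```java
import java.text.DecimalFormat;
import java.text.SimpleDateFormat;
import java.util.Date;

public class QuoteFormatSketch {

    // Hypothetical helper: applies only the "" -> ' replacement described above.
    static String toJavaPattern(String excelFormat) {
        return excelFormat.replace("\"\"", "'");
    }

    // Returns the exception message if the pattern is rejected, or null if it compiles.
    static String compileError(String javaPattern) {
        try {
            new SimpleDateFormat(javaPattern);
            return null;
        } catch (IllegalArgumentException e) {
            return e.getMessage();
        }
    }

    // Assumption: after the date pattern is rejected, a number format is handed the date.
    // Returns the exception message, or null if formatting succeeds.
    static String numberFormatError(Object value) {
        try {
            new DecimalFormat("#").format(value);
            return null;
        } catch (IllegalArgumentException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        String pattern = toJavaPattern("\"\"ddd,mmm dd");
        // The pattern now opens a quoted literal that is never closed.
        System.out.println("pattern:       " + pattern);
        System.out.println("compile error: " + compileError(pattern));
        System.out.println("number error:  " + numberFormatError(new Date()));
    }
}
```

The lone single quote opens a quoted literal that is never closed, so SimpleDateFormat rejects the pattern. Formatting a Date with a DecimalFormat then fails with an IllegalArgumentException, the same exception type as in the trace.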


> IllegalArgumentException on a valid Excel file
> ----------------------------------------------
>
>                 Key: TIKA-2196
>                 URL: https://issues.apache.org/jira/browse/TIKA-2196
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 1.14
>         Environment: Windows 7 x64, JVM 1.8.0_101
>            Reporter: Seva Alekseyev
>         Attachments: 2007 Experiment watch.xls
>
>
> On the attached Excel file, which opens fine in Excel, Tika throws the following error:
> java.lang.IllegalArgumentException: Cannot format given Object as a Number
> 	at java.text.DecimalFormat.format:-1
> 	at org.apache.poi.ss.usermodel.ExcelGeneralNumberFormat.format:67
> 	at java.text.Format.format:-1
> 	at org.apache.poi.ss.usermodel.DataFormatter.performDateFormatting:736
> 	at org.apache.poi.ss.usermodel.DataFormatter.formatRawCellContents:804
> 	at org.apache.poi.ss.usermodel.DataFormatter.formatRawCellContents:785
> 	at org.apache.poi.hssf.eventusermodel.FormatTrackingHSSFListener.formatNumberDateCell:143
> 	at org.apache.tika.parser.microsoft.ExcelExtractor$TikaHSSFListener$TikaFormatTrackingHSSFListener.formatNumberDateCell:633
> 	at org.apache.tika.parser.microsoft.ExcelExtractor$TikaHSSFListener.internalProcessRecord:405
> 	at org.apache.tika.parser.microsoft.ExcelExtractor$TikaHSSFListener.processRecord:336
> 	at org.apache.poi.hssf.eventusermodel.FormatTrackingHSSFListener.processRecord:92
> 	at org.apache.poi.hssf.eventusermodel.HSSFRequest.processRecord:109
> 	at org.apache.poi.hssf.eventusermodel.HSSFEventFactory.genericProcessEvents:179
> 	at org.apache.poi.hssf.eventusermodel.HSSFEventFactory.processEvents:136
> 	at org.apache.tika.parser.microsoft.ExcelExtractor$TikaHSSFListener.processFile:312
> 	at org.apache.tika.parser.microsoft.ExcelExtractor.parse:169
> 	at org.apache.tika.parser.microsoft.OfficeParser.parse:177
> 	at org.apache.tika.parser.microsoft.OfficeParser.parse:130



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
