tika-dev mailing list archives

From "Seva Alekseyev (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (TIKA-2205) IllegalArgumentException on a valid Excel file
Date Tue, 13 Dec 2016 21:23:58 GMT

     [ https://issues.apache.org/jira/browse/TIKA-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Seva Alekseyev updated TIKA-2205:
---------------------------------
    Description: 
The attached file, which opens in Excel, errors out in Tika:

java.lang.IllegalArgumentException: Cannot format given Object as a Number
	at java.text.DecimalFormat.format:-1
	at java.text.Format.format:-1
	at org.apache.poi.ss.usermodel.DataFormatter.performDateFormatting:736
	at org.apache.poi.ss.usermodel.DataFormatter.formatRawCellContents:804
	at org.apache.poi.ss.usermodel.DataFormatter.formatRawCellContents:785
	at org.apache.poi.hssf.eventusermodel.FormatTrackingHSSFListener.formatNumberDateCell:143
	at org.apache.tika.parser.microsoft.ExcelExtractor$TikaHSSFListener$TikaFormatTrackingHSSFListener.formatNumberDateCell:633
	at org.apache.tika.parser.microsoft.ExcelExtractor$TikaHSSFListener.internalProcessRecord:432
	at org.apache.tika.parser.microsoft.ExcelExtractor$TikaHSSFListener.processRecord:336
	at org.apache.poi.hssf.eventusermodel.FormatTrackingHSSFListener.processRecord:92
	at org.apache.poi.hssf.eventusermodel.HSSFRequest.processRecord:109
	at org.apache.poi.hssf.eventusermodel.HSSFEventFactory.genericProcessEvents:179
	at org.apache.poi.hssf.eventusermodel.HSSFEventFactory.processEvents:136
	at org.apache.tika.parser.microsoft.ExcelExtractor$TikaHSSFListener.processFile:312
	at org.apache.tika.parser.microsoft.ExcelExtractor.parse:169
	at org.apache.tika.parser.microsoft.OfficeParser.parse:177
	at org.apache.tika.parser.microsoft.OfficeParser.parse:130
	at gov.nih.niaid.fscanner.Extract.ExtractContents:69


  was:
The attached file, which opens in Excel, errors out in Tika:

java.lang.IllegalArgumentException: Cannot format given Object as a Number
	at java.text.DecimalFormat.format:-1
	at java.text.Format.format:-1
	at org.apache.poi.ss.usermodel.DataFormatter.performDateFormatting:736
	at org.apache.poi.ss.usermodel.DataFormatter.formatRawCellContents:804
	at org.apache.poi.ss.usermodel.DataFormatter.formatRawCellContents:785
	at org.apache.poi.hssf.eventusermodel.FormatTrackingHSSFListener.formatNumberDateCell:143
	at org.apache.tika.parser.microsoft.ExcelExtractor$TikaHSSFListener$TikaFormatTrackingHSSFListener.formatNumberDateCell:633
	at org.apache.tika.parser.microsoft.ExcelExtractor$TikaHSSFListener.internalProcessRecord:432
	at org.apache.tika.parser.microsoft.ExcelExtractor$TikaHSSFListener.processRecord:336
	at org.apache.poi.hssf.eventusermodel.FormatTrackingHSSFListener.processRecord:92
	at org.apache.poi.hssf.eventusermodel.HSSFRequest.processRecord:109
	at org.apache.poi.hssf.eventusermodel.HSSFEventFactory.genericProcessEvents:179
	at org.apache.poi.hssf.eventusermodel.HSSFEventFactory.processEvents:136
	at org.apache.tika.parser.microsoft.ExcelExtractor$TikaHSSFListener.processFile:312
	at org.apache.tika.parser.microsoft.ExcelExtractor.parse:169
	at org.apache.tika.parser.microsoft.OfficeParser.parse:177
	at org.apache.tika.parser.microsoft.OfficeParser.parse:130
	at gov.nih.niaid.fscanner.Extract.ExtractContents:69
org.apache.tika.exception.TikaException for 63269/<\\ai-storm\FScan\Scan_2016-12-11_11-14-13\Folders\51541330\engelAPBD copy.pptx>: "Error creating OOXML extractor"
org.apache.tika.exception.TikaException: Error creating OOXML extractor
	at org.apache.tika.parser.microsoft.ooxml.OOXMLExtractorFactory.parse:120
	at org.apache.tika.parser.microsoft.ooxml.OOXMLParser.parse:87


> IllegalArgumentException on a valid Excel file
> ----------------------------------------------
>
>                 Key: TIKA-2205
>                 URL: https://issues.apache.org/jira/browse/TIKA-2205
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 1.14
>         Environment: Windows 7 x64, JVM 1.8.0_101
>            Reporter: Seva Alekseyev
>         Attachments: SAT19-11-25-09_Selected Dates.xls
>
>
> The attached file, which opens in Excel, errors out in Tika:
> java.lang.IllegalArgumentException: Cannot format given Object as a Number
> 	at java.text.DecimalFormat.format:-1
> 	at java.text.Format.format:-1
> 	at org.apache.poi.ss.usermodel.DataFormatter.performDateFormatting:736
> 	at org.apache.poi.ss.usermodel.DataFormatter.formatRawCellContents:804
> 	at org.apache.poi.ss.usermodel.DataFormatter.formatRawCellContents:785
> 	at org.apache.poi.hssf.eventusermodel.FormatTrackingHSSFListener.formatNumberDateCell:143
> 	at org.apache.tika.parser.microsoft.ExcelExtractor$TikaHSSFListener$TikaFormatTrackingHSSFListener.formatNumberDateCell:633
> 	at org.apache.tika.parser.microsoft.ExcelExtractor$TikaHSSFListener.internalProcessRecord:432
> 	at org.apache.tika.parser.microsoft.ExcelExtractor$TikaHSSFListener.processRecord:336
> 	at org.apache.poi.hssf.eventusermodel.FormatTrackingHSSFListener.processRecord:92
> 	at org.apache.poi.hssf.eventusermodel.HSSFRequest.processRecord:109
> 	at org.apache.poi.hssf.eventusermodel.HSSFEventFactory.genericProcessEvents:179
> 	at org.apache.poi.hssf.eventusermodel.HSSFEventFactory.processEvents:136
> 	at org.apache.tika.parser.microsoft.ExcelExtractor$TikaHSSFListener.processFile:312
> 	at org.apache.tika.parser.microsoft.ExcelExtractor.parse:169
> 	at org.apache.tika.parser.microsoft.OfficeParser.parse:177
> 	at org.apache.tika.parser.microsoft.OfficeParser.parse:130
> 	at gov.nih.niaid.fscanner.Extract.ExtractContents:69
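
The top three frames of the trace (DataFormatter.performDateFormatting calling Format.format, which lands in DecimalFormat.format) suggest one possible cause, which the report itself does not confirm: a date value may have been passed to a formatter that is actually a DecimalFormat. The JDK-only sketch below reproduces the same exception message under that assumption. The class name FormatMismatchDemo and the helper tryFormat are hypothetical names used for illustration; nothing here is Tika or POI code.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.text.Format;
import java.util.Date;
import java.util.Locale;

// Hypothetical demo class: shows how java.text.DecimalFormat reacts when
// it is asked to format an object that is not a Number.
class FormatMismatchDemo {

    // Formats the value, or returns the exception text instead of throwing,
    // so both outcomes can be printed side by side.
    static String tryFormat(Format format, Object value) {
        try {
            return format.format(value);
        } catch (IllegalArgumentException e) {
            return "IllegalArgumentException: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Locale.ROOT symbols keep the decimal separator a '.' on any machine.
        Format numeric = new DecimalFormat(
                "0.00", DecimalFormatSymbols.getInstance(Locale.ROOT));

        // A Number is accepted.
        System.out.println(tryFormat(numeric, 42.5));
        // prints "42.50"

        // A Date is not a Number, so DecimalFormat rejects it
        // (assumed to mirror what happens inside the reported trace).
        System.out.println(tryFormat(numeric, new Date()));
        // prints "IllegalArgumentException: Cannot format given Object as a Number"
    }
}
```

If this reading is right, a caller that cannot avoid the mismatch could check the value type against the formatter type before calling format, or catch IllegalArgumentException around the cell formatting and fall back to the raw cell value.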



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
