poi-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From bugzi...@apache.org
Subject DO NOT REPLY [Bug 48872] New: Issues with poi.ss.usermodel.DataFormatter and DateUtil
Date Mon, 08 Mar 2010 20:47:41 GMT

           Summary: Issues with poi.ss.usermodel.DataFormatter and
           Product: POI
           Version: 3.6
          Platform: PC
        OS/Version: Windows XP
            Status: NEW
          Severity: normal
          Priority: P2
         Component: POI Overall
        AssignedTo: dev@poi.apache.org
        ReportedBy: robert.kish@ncogroup.com

Created an attachment (id=25104)
 --> (https://issues.apache.org/bugzilla/attachment.cgi?id=25104)
Notes, code examples, and sample data.

I'm working on a project where I try to take an Excel 97 or an Excel 07 file
and export it as a CSV file using Java code. For testing, we use Excel's save
as CSV converter. I have noted many discrepancies and have provided some
corrections/additions/and hacks to address them. I would like to share my
findings with someone here so that someone who is willing and able can merge
them properly into the POI project.

DateUtil issue
1. The method isADateFormat does not know how to handle quoted values, for
example mm"/"dd"/"yyyy.

DataFormatter issues

2. Format strings for zero values are not properly handled. They should use the
third component (delimited by ;) if available.

3. Color checking is too simple - it removes all text in [] when it should
probably limit itself to the color words. This has an effect if trying to
implement elapsed time [h], [m], and [s].

4. Some stupid date formats are not handled by the createFormat method:
a. m\\/d\\/yyyy 
b. mm"/"dd"/"yyyy

5. Elapsed time processing is missing.

6. It appears that _( and _) processing does not match with Excel. Trying to
format using built in formats does not work as  expected when _( and _) are
corrected. (In other words, the built in format Strings have to be modified

7. * padding is not working as expected.

8. Fractions are not being processed, and I've tried and it's really ugly to
implement the same as Excel, but they could be replaced with spaces instead of
left as ?.

9. If the resulting format does not contain any # or 0, then the number still
gets appended to the format string. This happens most often when trying to
print a 0 value using a third position format - many formats just print - or
blank. and never "-0".

10. Excel appears to use half_up rounding scheme, and not the DecimalFormat

11. formatRawCellContents needs a way to send it the 1904 windowing parameter.

12. An invalid date in Excel processing should be returned as 255 #s (when
doing CSV conversion).

Attached items:
Shareable Notes.txt
Has some of these notes with a little more wording and some cut/paste code

is my example of this DataFormatter class coded with my hacks. It should be
used as a reference.

All Excel Formats.xls
An Excel 97 workbook containing 1 example of all formats available under
"custom" option. This is just a sampling, because I know more can be generated
if I go into individual options.

All Excel Formats.csv
The above file saved as a CSV.

All Excel Formats.xlsx
The above file saved as a 2007 Excel.

All Excel Formats.0.csv
My program's output from processing the XLS or XLSX file.
I custom formatted dates to show 4 digit years and 2 digit hours, etc.
The discrepancies are:
    There's no 1/0/1900 date but instead 12/31/1899.
    Exponent processing just doesn't work the same way - Java doesn't round
digits after decimal.
    Fractions... I tried using BigFraction class, but pulled it out from this
release - Excel is weird - it rounds down except when it gets close to next
whole number.

Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org

View raw message