poi-dev mailing list archives

From bugzi...@apache.org
Subject DO NOT REPLY [Bug 49599] Comment.setAuthor does not encode multi-byte characters (Chinese) well
Date Fri, 16 Jul 2010 11:54:47 GMT
https://issues.apache.org/bugzilla/show_bug.cgi?id=49599

--- Comment #1 from André-John Mas <andrejohn.mas@gmail.com> 2010-07-16 07:54:44 EDT ---
Just to sum up the thread:

The serialize() method in org.apache.poi.hssf.record.NoteRecord does not call
the StringUtil.putUnicodeLE() method, because the field_5_hasMultibyte instance
variable is false even when the author field contains double-byte characters.
In fact, field_5_hasMultibyte is never set to true except when a file is read.
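
To illustrate why the author ends up garbled, here is a minimal sketch that
does not use POI at all. It assumes the single-byte path behaves roughly like
encoding to ISO-8859-1; the class and method names are hypothetical stand-ins,
not the actual NoteRecord code.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for the single-byte write path; not POI code.
final class SingleByteLossSketch {

    // Writes one byte per character, as a writer would when the
    // "has multibyte" flag is false.
    static byte[] writeCompressed(String author) {
        return author.getBytes(StandardCharsets.ISO_8859_1);
    }

    // Reads the bytes back the same way.
    static String readCompressed(byte[] data) {
        return new String(data, StandardCharsets.ISO_8859_1);
    }

    // True if the text survives a write/read round trip unchanged.
    static boolean survivesRoundTrip(String author) {
        return author.equals(readCompressed(writeCompressed(author)));
    }
}
```

Under that assumption, a Latin-1 author name survives the round trip, while a
Chinese one comes back as replacement characters.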

Two possible solutions:
 - add logic to work out whether we have non-Latin characters, since the issue
does not only affect double-byte characters
 - set the field_5_hasMultibyte variable to true and always write out
Unicode characters, unless there is a usage scenario this could break.
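
The first option could be sketched roughly as follows. This is only an
illustration using the standard library: the class and method names are
hypothetical, and the 0xFF threshold is an assumption about which characters
fit the one-byte form, not something taken from NoteRecord.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of POI: picks the encoding from the content.
final class AuthorEncodingSketch {

    // True if any character falls outside the 8-bit range that a
    // one-byte-per-character form can hold (assumed threshold: 0xFF).
    static boolean needsUnicode(String value) {
        for (int i = 0; i < value.length(); i++) {
            if (value.charAt(i) > 0xFF) {
                return true;
            }
        }
        return false;
    }

    // Encodes the author as UTF-16LE when needed, otherwise as ISO-8859-1.
    static byte[] encodeAuthor(String author) {
        return needsUnicode(author)
                ? author.getBytes(StandardCharsets.UTF_16LE)
                : author.getBytes(StandardCharsets.ISO_8859_1);
    }
}
```

The second option amounts to skipping the check and always taking the
UTF-16LE branch, at the cost of two bytes per character for plain Latin text.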

I tested on Mac OS X 10.6.4 and used Excel 2008 to check the result. Changing the
variable to true made the Chinese text appear correctly for the author.

BTW, we should probably extend the unit tests to ensure that non-Latin
characters are stored properly.
