poi-dev mailing list archives

From Yegor Kozlov <yegor.koz...@dinom.ru>
Subject Re: Apache POI 3.8 (SXSSFWorkbook) - Unreadable Content
Date Wed, 03 Aug 2011 12:11:01 GMT
So far the plan is to release it in late August.

Yegor

On Wed, Aug 3, 2011 at 3:01 PM, Guilherme Vieira <jguilhermemv@gmail.com> wrote:
> Yegor,
>
> I'm glad I could help you find this issue. This is exactly the problem. For
> now I'm trying to work around it in my own code by replacing these
> characters. Although beta4 is not out yet, I'm still using it in my project
> because I need to write an Excel file with more than 100,000 lines. Do you
> know when beta4 will be out?
>
> Cheers,
> José Guilherme Macedo Vieira
>
>
>
> 2011/8/3 Yegor Kozlov <yegor.kozlov@dinom.ru>
>
>> The culprit is the non-break space (charcode=\u00a0). I was able to
>> reproduce the trouble with the following code:
>>
>>        Workbook wb = new SXSSFWorkbook();
>>        Sheet sh = wb.createSheet();
>>        Row row = sh.createRow(0);
>>        row.createCell(0).setCellValue("ALEXANDRE\u00a0MARINHO DE SOUZA");
>>        FileOutputStream out = new FileOutputStream("/temp/test.xlsx");
>>        wb.write(out);
>>        out.close();
>>
>> The fix is coming soon and will be included in 3.8-beta4.
>>
>> Cheers,
>> Yegor
>>
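Until a build with the fix is available, one possible stopgap is to replace the non-break space before the value reaches the cell. The following is only a minimal sketch: the class name CellText and the method sanitize are hypothetical helpers, not part of POI, and dropping the ASCII control characters that XML 1.0 disallows is an extra precaution that this thread did not ask for. Note that Java's default \s class does not match \u00a0, which may be why a \s-based replacement has no effect on it.

```java
// Hypothetical helper, not part of Apache POI.
final class CellText {

    // Replaces the non-break space (U+00A0) with an ordinary space and
    // removes ASCII control characters other than tab, LF and CR.
    static String sanitize(String s) {
        if (s == null) {
            return null;
        }
        return s.replace('\u00a0', ' ')
                .replaceAll("[\\x00-\\x08\\x0B\\x0C\\x0E-\\x1F]", "");
    }

    public static void main(String[] args) {
        String raw = "ALEXANDRE\u00a0MARINHO DE SOUZA";
        System.out.println(sanitize(raw));
    }
}
```

With such a helper in place, the call in the reproduction would become row.createCell(0).setCellValue(CellText.sanitize(name)), where name is the string read from the database.
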
>> On Wed, Aug 3, 2011 at 1:59 PM, Guilherme Vieira <jguilhermemv@gmail.com>
>> wrote:
>> > Dear Yegor,
>> >
>> > Your tip didn't work. So I guessed that there was a non-printable character
>> > instead of white space. I then tried to encode the name with
>> > URLEncoder.encode("the name goes here","ASCII"); and guess what? The encoded
>> > name is as below:
>> >
>> > ALEXANDRE%3BF+MARINHO+DE+SOUZA
>> >
>> > It's interesting because I can't remove it with replace all, since these are
>> > non-printable characters. So I'm trying to find a regular expression that
>> > matches these sequences (%3BF and +, respectively). It would be nice if
>> > I could find a regular expression that matches any special non-printable
>> > character. So, how do I proceed?
>> >
>> > And thanks in advance for your answer, as well as for your GREAT work in
>> > Apache POI with the Big Grid Demo approach. It is just wonderful. Can't wait
>> > for the final release (3.8-beta4).
>> >
>> > Best regards,
>> > José Guilherme Macedo Vieira
>> >
>> >
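The URL-encoding experiment above shows that something unusual is in the string, but encoding to ASCII can hide which character it actually is. A more direct check is to print the code point of every character outside the printable ASCII range. This is a rough diagnostic sketch; the class name SuspectChars and the method find are hypothetical, and the check also reports legitimate accented letters, so the output needs to be read by eye.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical diagnostic helper, not part of Apache POI.
final class SuspectChars {

    // Returns one entry per character outside printable ASCII (0x20 to 0x7E),
    // giving its index in the string and its code in U+XXXX form.
    static List<String> find(String s) {
        List<String> hits = new ArrayList<>();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < 0x20 || c > 0x7E) {
                hits.add(String.format(Locale.ROOT, "index %d: U+%04X", i, (int) c));
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        for (String hit : find("ALEXANDRE\u00a0MARINHO DE SOUZA")) {
            System.out.println(hit);
        }
    }
}
```

Running this over the value from the problematic row should point at the offending position, and the reported code can then be targeted with a plain String.replace call.
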
>> > 2011/8/3 Yegor Kozlov <yegor.kozlov@dinom.ru>
>> >
>> >> Tweak your report generator and try the following tricks before
>> >> passing strings to SXSSFCell:
>> >>
>> >>  (a) string.replaceAll("\\s+", " "); // collapse runs of white space into a single space
>> >>  (b) string.replace(' ', '_'); // replace white spaces with underscores
>> >>
>> >> Does either (a) or (b) help?
>> >>
>> >> My hunch is that the problem is in something else, not in double white
>> >> spaces. At least, I can't reproduce the problem with the following
>> >> code snippet:
>> >>
>> >>        Workbook wb = new SXSSFWorkbook();
>> >>        Sheet sh = wb.createSheet();
>> >>        for(int i = 0; i < 10000; i++) {
>> >>            Row row = sh.createRow(i);
>> >>            row.createCell(0).setCellValue("ALEXANDRE__MARINHO DE SOUZA");
>> >>            row.createCell(1).setCellValue("ALEXANDRE MARINHO DE SOUZA");
>> >>            row.createCell(2).setCellValue("ALEXANDRE  MARINHO DE SOUZA");
>> >>            row.createCell(3).setCellValue("ALEXANDRE   MARINHO DE SOUZA");
>> >>        }
>> >>
>> >>        FileOutputStream out = new FileOutputStream("/temp/test.xlsx");
>> >>        wb.write(out);
>> >>        out.close();
>> >>
>> >> The generated file is readable and all spaces are there.
>> >>
>> >> Yegor
>> >>
>> >> On Tue, Aug 2, 2011 at 11:49 PM, Guilherme Vieira
>> >> <jguilhermemv@gmail.com> wrote:
>> >> > So, I've searched column by column in the problematic line in order to
>> >> > identify the problem. The problem is quite weird. It's a string column in
>> >> > the database. This column stores people's names.
>> >> >
>> >> > In my case the name is: ALEXANDRE__MARINHO DE SOUZA
>> >> >
>> >> > Of course, without the underscore characters; each one stands for a
>> >> > whitespace character. So, with the double whitespace the file is
>> >> > corrupted. And when I manually remove one of the whitespace characters in
>> >> > the IDE, the file is still corrupted. But when I retype the whole name
>> >> > manually in the IDE, setting the value to ALEXANDRE_MARINHO DE SOUZA, it
>> >> > works. It's strange. I don't know why SXSSF is not accepting two
>> >> > whitespace characters.
>> >> >
>> >> > Does anyone have a clue?
>> >> >
>> >> >
>> >> >
>> >> > 2011/8/2 jguilhermemv <jguilhermemv@gmail.com>
>> >> >
>> >> >> I tried without merged regions and it didn't work. Then I noticed that
>> >> >> there is one line in the file that triggers the error. It's line 2451;
>> >> >> up to line 2450 everything works great. But for some reason, when it
>> >> >> reaches line 2450 it just doesn't work. I checked whether there were any
>> >> >> null values, but there weren't. The writing routine is right, otherwise
>> >> >> it wouldn't write up to line 2450.
>> >> >>
>> >> >> What can I do now?
>> >> >>
>> >> >> Best regards.
>> >> >> José Guilherme Macedo Vieira
>> >> >>
>> >> >>
>> >> >> 2011/8/2 Nick Burch-11 [via Apache POI] <
>> >> >> ml-node+4658878-753894702-237524@n5.nabble.com>
>> >> >>
>> >> >> > On Tue, 2 Aug 2011, jguilhermemv wrote:
>> >> >> > > Regarding the file, it makes use of some CellStyles and Merged Regions.
>> >> >> >
>> >> >> > Try without them, and see if that fixes it. You need to narrow your
>> >> >> > problem down before you can figure out what to correct. Try to identify
>> >> >> > the simplest file that fails, and the most complex one that works; the
>> >> >> > gap there is your issue.
>> >> >> >
>> >> >> > Nick
>> >> >> >
>> >> >>
>> >> >>
>> >> >>
>> >> >
>> >>
>> >> ---------------------------------------------------------------------
>> >> To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
>> >> For additional commands, e-mail: dev-help@poi.apache.org
>> >>
>> >>
>> >
>>
>>
>>
>


