cocoon-users mailing list archives

From "Flynn, Peter" <pfl...@ucc.ie>
Subject Re: [2.1] Overzealous escaping of high Unicode code points
Date Wed, 07 Jun 2017 08:43:50 GMT
I had a related problem with three or four CJK characters being converted to their &#hex; format.
It was very weird, but it turned out to be caused by the old and buggy copy of jtidy, and I
can't figure out how to replace it.

I haven't had the problem you describe, though, and I have a user who has implemented emoji
in Cocoon, see http://research.ucc.ie/emojis/

P

--
Peter Flynn | Academic and Collaborative Technologies | IT Services | University College Cork
| Ireland | pflynn@ucc.ie | http://research.ucc.ie/profiles/H505/pflynn


On 2017-06-06 17:08:51+01:00 Christopher Schultz wrote:

All,

I've been testing my application for use with high Unicode code points
such as emoji like 😍 which is this one:
http://www.fileformat.info/info/unicode/char/1F60D/index.htm

My application and database can handle this code point, but Cocoon
butchers it in a way that I have seen before -- the way that
commons-lang's StringEscapeUtils.escapeXml/escapeHtml seems to do.

Instead of letting the character through as-is, it tries to convert it
into these two numbered entities:

&#55357;&#56845;

Oddly enough, those are the two 16-bit UTF-16 code units (the surrogate
pair) you'd get, but they shouldn't be split up like that, I don't think.
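
To make the splitting concrete, here is a minimal Java sketch of the failure mode. It is an illustration only, not the actual commons-lang 2.x or Cocoon code: the class name SurrogateDemo and the method escapePerChar are hypothetical. It assumes an escaper that walks the string one UTF-16 code unit (char) at a time, which is enough to turn U+1F60D into two separate numeric references.

```java
public class SurrogateDemo {

    // Hypothetical naive escaper: iterates over UTF-16 code units (char),
    // so a character outside the BMP is emitted as two references,
    // one for each half of its surrogate pair.
    public static String escapePerChar(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c > 0x7F) {
                out.append("&#").append((int) c).append(';');
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Build U+1F60D without relying on the source file encoding.
        String emoji = new String(Character.toChars(0x1F60D));

        // One code point, but two chars: 0xD83D (high) and 0xDE0D (low).
        System.out.println(emoji.codePointCount(0, emoji.length()));
        System.out.println(emoji.length());
        System.out.println(Integer.toHexString(emoji.charAt(0)));
        System.out.println(Integer.toHexString(emoji.charAt(1)));

        // Per-char escaping splits the pair into two references.
        System.out.println(escapePerChar(emoji));
    }
}
```

A lone surrogate is not a legal XML character, so a conforming parser should reject references such as these rather than reassemble them into the emoji.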

I haven't found a version of commons-lang 2.x that doesn't break these
kinds of characters. commons-lang3 does the right thing, but they are
incompatible libraries.

Does anyone know the code well enough to know how difficult it would
be to change the way Cocoon 2.1 escapes its output? For example, by
using commons-lang3?

I haven't tried Cocoon 2.2 yet, and I can't tell what dependencies it
has. I also can't exactly tell what to do now that I've downloaded the
binary package. Can this just be used as a drop-in replacement for
Cocoon 2.1.x? Cocoon 2.1.x could build a WAR file that I then
customized for my own application, adding various libraries and
configuration files to it. I think I'll follow up with a separate post
about this.

-chris



