db-derby-user mailing list archives

From "Okken,Brett" <BOK...@CERNER.COM>
Subject RE: Executing SQL file: character encoding
Date Tue, 26 Mar 2013 12:06:41 GMT
Unicode is really a character set rather than a character encoding. It defines a set of
valid code points (characters), but that set alone does not say how to encode them (represent
them as bytes). Multiple encodings work with the Unicode character set. UTF-8 is probably
the best known, but UTF-16 is also pretty common.
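
To make the distinction concrete, here is a minimal sketch (plain Java, nothing Derby-specific; the class and method names are made up for illustration) that encodes the same code point, U+00E9, with three different encodings and prints the resulting bytes:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingDemo {
    // Encode the text with the given charset and render the bytes as hex.
    static String hex(String text, Charset charset) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(charset)) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // U+00E9 (e with acute accent), written as an escape so the source
        // file's own encoding does not matter.
        String text = "\u00E9";
        System.out.println("UTF-8:      " + hex(text, StandardCharsets.UTF_8));      // C3 A9
        System.out.println("UTF-16BE:   " + hex(text, StandardCharsets.UTF_16BE));   // 00 E9
        System.out.println("ISO-8859-1: " + hex(text, StandardCharsets.ISO_8859_1)); // E9
    }
}
```

One code point, three different byte sequences: that is why "the file is Unicode" does not tell a reader which bytes to expect.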

I think the documentation here[1] is simply stating that Derby supports all the characters
defined by Unicode. It is /NOT/ stating a character encoding for default use. The documentation[2]
for the derby.ui.codeset property references the "default system." I would read that as saying
that the default character encoding of the system will be used unless you specify some other
encoding. If the only characters you are having problems with are those outside of ASCII,
then probably some other 8-bit encoding is being used by default (such as one of the ISO-8859
variants or one of the Windows code pages).
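
As a rough illustration of that failure mode (a sketch only, not Derby's actual reading code; the class and method names are hypothetical, and ISO-8859-1 is just an assumed stand-in for whatever the platform default happens to be), this shows what happens when UTF-8 bytes are decoded with an 8-bit encoding:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class DefaultCharsetDemo {
    // Decode raw bytes with the given charset, as a file reader would.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // The encoding the JVM uses when none is specified explicitly.
        System.out.println("Default charset: " + Charset.defaultCharset());

        // U+00E9 as it would be stored in a UTF-8 file: bytes C3 A9.
        byte[] fileBytes = "\u00E9".getBytes(StandardCharsets.UTF_8);

        String right = decode(fileBytes, StandardCharsets.UTF_8);
        String wrong = decode(fileBytes, StandardCharsets.ISO_8859_1);

        System.out.println(right.length());               // 1
        System.out.println(wrong.length());               // 2
        System.out.println(wrong.equals("\u00C3\u00A9")); // true
    }
}
```

ASCII characters decode identically under both encodings, which would explain why only the accented characters come out wrong.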

[1] - http://www.javadb.net/sql-parser-support-for-unicode.html
[2] - http://www.javadb.net/derby.ui.codeset-property.html

Brett Okken | CAMM Platform Services | Lead Architect | 816.201.6112 | www.cerner.com | bokken@cerner.com

-----Original Message-----
From: rgasch [mailto:rgasch@gmail.com] 
Sent: Tuesday, March 26, 2013 4:15 AM
To: derby-user@db.apache.org
Subject: Executing SQL file: character encoding

I am importing a SQL file which originally was created by mysqldump. The character encoding
of this file is UTF-8. For reference, I am doing this on a Linux machine (although that should
not matter).

After massaging the file to account for the syntactic differences between MySQL and Derby, the
SQL is processed without errors. However, I am having trouble correctly loading non-ASCII
characters (such as À,Á,Â,Ã,Å,à,á,â,ã,å,Ò,Ó,Ô,Õ,Ø,ò,ó,ô,õ,ø,È,É,Ê,Ë,è,é,ê,ë,Ç,ç,Ì,Í,Î,Ï,ì,í,î,ï,Ù,Ú,Û,ù,ú,û,ÿ,Ñ,ñ,ß,ä,Ä,ö,Ö,ü,Ü).

I've managed to solve this by using the derby.ui.codeset=utf8 definition in a properties file;
however, I would prefer to have a solution which does not rely on this.

So, in order to achieve this, what character encoding do I have to save this SQL file in
to be able to import data containing such characters without having to rely on the
derby.ui.codeset setting? I know that the Derby docs say that Derby expects the files to be
encoded in UNICODE, but somehow that doesn't seem to be equal to UTF-8, so I'm at a loss as
to what exactly it expects.


