calcite-dev mailing list archives

From "Ye Ding (JIRA)" <j...@apache.org>
Subject [jira] [Created] (CALCITE-2163) Using "UTF16" as default charset failed
Date Sun, 04 Feb 2018 05:38:00 GMT
Ye Ding created CALCITE-2163:
--------------------------------

             Summary: Using "UTF16" as default charset failed
                 Key: CALCITE-2163
                 URL: https://issues.apache.org/jira/browse/CALCITE-2163
             Project: Calcite
          Issue Type: Bug
            Reporter: Ye Ding
            Assignee: Julian Hyde


I have a project that needs to handle non-ASCII characters, so I set the default charset to
"UTF16" via the "saffron.default.charset" property, but it failed with the error stack below:

{code:txt}
Caused by: java.nio.charset.UnsupportedCharsetException: UTF-16
	at org.apache.calcite.util.NlsString.<init>(NlsString.java:72)
	at org.apache.calcite.rex.RexBuilder.makeLiteral(RexBuilder.java:882)
	at org.apache.calcite.rex.RexBuilder.<init>(RexBuilder.java:117)
	at org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:1046)
	at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:154)
	... 29 more
{code}

After exploring the related source code, I found a suspicious piece of code that may cause the problem.

Here is a code block from RexBuilder, between L869 and L883.

{code:java}
case CHAR:
  // Character literals must have a charset and collation. Populate
  // from the type if necessary.
  assert o instanceof NlsString;
  NlsString nlsString = (NlsString) o;
  if ((nlsString.getCollation() == null)
      || (nlsString.getCharset() == null)) {
    assert type.getSqlTypeName() == SqlTypeName.CHAR;
    assert type.getCharset().name() != null;
    assert type.getCollation() != null;
    o = new NlsString(
        nlsString.getValue(),
        type.getCharset().name(),
        type.getCollation());
  }
{code}

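In the last statement, a *Java* charset name (the result of type.getCharset().name()) is used
to construct the NlsString.

But judging from the code of NlsString's constructor, charsetName is supposed to be a *SQL*
charset name, which is then translated to a Java charset name.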

{code:java}
  public NlsString(
      String value,
      String charsetName,
      SqlCollation collation) {
    assert value != null;
    if (null != charsetName) {
      charsetName = charsetName.toUpperCase(Locale.ROOT);
      this.charsetName = charsetName;
      String javaCharsetName =
          SqlUtil.translateCharacterSetName(charsetName);
      if (javaCharsetName == null) {
        throw new UnsupportedCharsetException(charsetName);
      }
      this.charset = Charset.forName(javaCharsetName);
      CharsetEncoder encoder = charset.newEncoder();
      ....
{code}
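
To make the suspected mismatch concrete, here is a minimal, self-contained sketch. It does not use Calcite: the class CharsetNameDemo, the SQL_TO_JAVA table, and the translate/resolve helpers are hypothetical stand-ins for SqlUtil.translateCharacterSetName and the NlsString constructor, and the table contents are an assumption, not Calcite's real mapping. The only fact it relies on is that Java reports "UTF-16" as the canonical name of that charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.UnsupportedCharsetException;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class CharsetNameDemo {
  // Hypothetical SQL-name -> Java-name table (assumption, not Calcite's actual table).
  private static final Map<String, String> SQL_TO_JAVA = new HashMap<>();
  static {
    SQL_TO_JAVA.put("UTF16", "UTF-16");
    SQL_TO_JAVA.put("LATIN1", "ISO-8859-1");
  }

  /** Returns the Java charset name for a SQL charset name, or null if unknown. */
  public static String translate(String sqlName) {
    return SQL_TO_JAVA.get(sqlName.toUpperCase(Locale.ROOT));
  }

  /** Mimics the constructor logic quoted above: translate first, then look up. */
  public static Charset resolve(String sqlName) {
    String javaName = translate(sqlName);
    if (javaName == null) {
      throw new UnsupportedCharsetException(sqlName);
    }
    return Charset.forName(javaName);
  }

  public static void main(String[] args) {
    // First pass: the SQL name resolves fine.
    Charset charset = resolve("UTF16");
    System.out.println("canonical Java name: " + charset.name());

    // Second pass: feeding the Java name back in as if it were a SQL name.
    try {
      resolve(charset.name());
    } catch (UnsupportedCharsetException e) {
      System.out.println("rejected: " + e.getCharsetName());
    }
  }
}
```

In this sketch the second call fails because "UTF-16" is not a key in the SQL-name table, which matches the shape of the exception in the stack trace above. Whether Calcite's real translation behaves the same way would need to be checked against SqlUtil.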

I have not read and fully understood the code, so I'm not sure whether this is the root cause
of the problem. For now I've managed to work around it by setting "saffron.default.charset" to
"UTF-16LE".
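
One possible reason the workaround helps, offered as a guess rather than a confirmed explanation: "UTF-16LE" is already the canonical Java name of its charset, so the name stays the same after a round trip through Charset, whereas "UTF16" comes back as "UTF-16". The sketch below checks only that Java-level property; the class WorkaroundDemo and the helper isCanonical are hypothetical names, and the assumption that the property has to be set before Calcite reads it has not been verified here.

```java
import java.nio.charset.Charset;

public class WorkaroundDemo {
  /** True if the given name is already the canonical Java name of its charset. */
  public static boolean isCanonical(String name) {
    return Charset.forName(name).name().equals(name);
  }

  public static void main(String[] args) {
    // Workaround from the report; assumed to be needed before Calcite is initialized.
    System.setProperty("saffron.default.charset", "UTF-16LE");

    System.out.println("UTF-16LE canonical? " + isCanonical("UTF-16LE"));
    System.out.println("UTF16 canonical?    " + isCanonical("UTF16"));
  }
}
```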



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
