freemarker-dev mailing list archives

From Daniel Dekany <ddek...@apache.org>
Subject Re: [FM3] Rename encoding to charset, use Charset instead of String
Date Mon, 27 Mar 2017 14:03:23 GMT
I have second thoughts regarding encoding VS charset... When it's
about the charset of a file, people always seem to use "encoding":

- In Eclipse the setting name is called "Text file encoding"
- In IntelliJ it's called "File Encoding"
- In Notepad++ the related top-level menu point is "Encoding"
- In TextMate it's called "File Encoding"

I didn't run into any case where an editor uses the term "charset" for
this.

Worse, there's this: <?xml version="1.0" encoding="UTF-8"?>. We also
have `<#ftl encoding="...">`. Because it's somewhat reminiscent of the
XML declaration, people will tend to write "encoding". Of course, we
can have "encoding" there and still call the setting sourceCharset,
but that would be a bit confusing.

Things get less obvious when it comes to settings like
URLEscapingCharset and outputEncoding (these are the FM2 names)...

For URL escaping... first of all, the FM2 name isn't very good, as
this kind of escaping is called "URL encoding", not escaping. (But FM
has auto-escaping, and all the related directives and built-ins, all
using the term escaping, so I guess that's how it got the wrong name.)
Anyway, I think the charset term is used more often in this context
(or rather the mutated forms of it, like "character set"). Certainly
because it would be confusing to talk about the encoding used for URL
encoding, as opposed to the charset used for URL encoding.
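
To make the "charset used for URL encoding" concrete, here is a minimal
sketch using only the JDK's java.net.URLEncoder. The class and method
names (UrlEncodingCharsetSketch, urlEncode) are made up for illustration
and are not FreeMarker API. It only shows that the same text yields
different percent-escapes depending on the charset that is chosen:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class UrlEncodingCharsetSketch {

    // Hypothetical helper: URL-encodes a value using the given charset.
    // The (String, String) overload of URLEncoder.encode is used so that
    // this also compiles on older JDKs; the charset name comes from a
    // Charset object, so it is known to be supported.
    public static String urlEncode(String value, Charset charset)
            throws UnsupportedEncodingException {
        return URLEncoder.encode(value, charset.name());
    }

    public static void main(String[] args) throws Exception {
        String text = "caf\u00e9"; // "café", written with an escape to avoid source-file encoding issues

        // Same text, different charset, different escapes.
        System.out.println(urlEncode(text, StandardCharsets.UTF_8));      // caf%C3%A9
        System.out.println(urlEncode(text, StandardCharsets.ISO_8859_1)); // caf%E9
    }
}
```

So whichever name the setting ends up with, its value is a charset, and
it has to match what the receiving side expects when decoding the URL.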

For the charset of the output, in Content-Type HTTP response header
you have "charset". So developers are often talking about the charset,
rather than about the encoding. But, web browsers call this thing the
encoding of the page, though that's because from their side it's
analogous to opening a file, so they inherit the terminology from file
editors.

So, yeah... you can't be consistent with everything. Maybe the charset
VS encoding terminology choices of FM2 were the right compromise.
Except that we will still say "sourceEncoding" instead of just
"encoding", and use the Charset type instead of String.
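
As a rough illustration of the "Charset type instead of String" part,
here is a sketch of how such a setting could look. Everything in it
(SourceEncodingSketch, setSourceEncoding, parseEncodingName) is
hypothetical and is not the actual FM3 API; only the java.nio.charset
calls are real. The idea is that the setting keeps the "encoding" name
while its type is Charset, so a bad charset name fails at the point
where it is parsed rather than later, when a template is read:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Objects;

public class SourceEncodingSketch {

    // Hypothetical setting: named "encoding", typed as Charset.
    private Charset sourceEncoding = StandardCharsets.UTF_8;

    public void setSourceEncoding(Charset sourceEncoding) {
        this.sourceEncoding = Objects.requireNonNull(sourceEncoding, "sourceEncoding");
    }

    public Charset getSourceEncoding() {
        return sourceEncoding;
    }

    // Decodes the raw bytes of a template file with the configured charset.
    public String decode(byte[] templateBytes) {
        return new String(templateBytes, sourceEncoding);
    }

    // Converts a textual name (for example one written by the user in a
    // template header) to a Charset. Charset.forName throws an
    // IllegalArgumentException subclass for illegal or unsupported names.
    public static Charset parseEncodingName(String name) {
        return Charset.forName(name);
    }

    public static void main(String[] args) {
        SourceEncodingSketch cfg = new SourceEncodingSketch();
        cfg.setSourceEncoding(parseEncodingName("ISO-8859-1"));

        // 0xE9 is "é" in ISO-8859-1.
        System.out.println(cfg.decode(new byte[] {(byte) 0xE9}));

        try {
            parseEncodingName("no-such-charset");
        } catch (IllegalArgumentException e) {
            // The mistake surfaces here, not when a template is loaded.
            System.out.println("Rejected: " + e);
        }
    }
}
```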


Friday, March 24, 2017, 4:50:09 PM, Woonsan Ko wrote:

> On Tue, Mar 21, 2017 at 2:39 PM, Daniel Dekany <ddekany@apache.org> wrote:
>> Tuesday, March 21, 2017, 3:31:56 PM, Woonsan Ko wrote:
>>
>>> +1 on both.
>>
>> Furthermore, as the "encoding" parameter of
>> getTemplate/#include/#import was removed in FM3, the
>> locale-to-encoding map (`Configuration.setEncoding(Locale, String)`)
>> was also removed. So now it should just be `charset`, not
>> `defaultCharset` (similarly as we have Template.charset). However,
>> that name is still pretty bad, as it doesn't tell the charset of
>> what it is. It's the charset of the template file when we read it.
>> So, maybe, it should be "sourceCharset"?
>
> Yes, "sourceCharset" helps clarify the meaning, indeed!
>
> Cheers,
>
> Woonsan
>
>>
>>> Woonsan
>>>
>>> On Sun, Mar 19, 2017 at 2:22 PM, Daniel Dekany <ddekany@freemail.hu> wrote:
>>>> We have this retro terminology where instead of charset we say
>>>> encoding. (I understand that encoding has a wider meaning, but we only
>>>> intend to support encoding/decoding via a charset.) So I think
>>>> cfg.setDefaultEncoding and <#ftl encoding=...> and such should be
>>>> renamed to cfg.setDefaultCharset and <#ftl charset=...>.
>>>>
>>>> Also, in the Java API-s we should use Charset instead of a String
>>>> (java.nio.charset.Charset didn't exist when FM 2.3 was created).
>>>>
>>>> --
>>>> Thanks,
>>>>  Daniel Dekany
>>>>
>>>
>>
>> --
>> Thanks,
>>  Daniel Dekany
>>
>

-- 
Thanks,
 Daniel Dekany

