freemarker-dev mailing list archives

From Daniel Dekany <ddek...@apache.org>
Subject Re: [FM3] Rename encoding to charset, use Charset instead of String
Date Wed, 05 Apr 2017 20:24:30 GMT
I would like to point out that because nobody has complained, I have
implemented and committed this (a week ago or something). That is, the
names are "sourceEncoding" and "outputEncoding" and
"URLEscapingCharset". Of course, all these settings have
java.nio.charset.Charset type (as opposed to String in FM2).
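
To illustrate what the String-to-Charset change means for calling code, here is a minimal Java sketch. It uses only the standard java.nio.charset API; the class CharsetSettingSketch and its methods are hypothetical names made up for this illustration and are not part of the FreeMarker API. The point is that a charset given as a String can be misspelled and fails only when it is resolved, while a Charset value is already resolved by the time it is passed around.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Illustration only: this class and its members are hypothetical and are
// not part of the FreeMarker API.
final class CharsetSettingSketch {

    // String-typed setting (as in FM2): the value is only a charset name.
    // A misspelled name is detected only when something resolves it.
    static boolean isResolvable(String charsetName) {
        try {
            Charset.forName(charsetName);
            return true;
        } catch (IllegalArgumentException e) {
            // Covers UnsupportedCharsetException and
            // IllegalCharsetNameException, which both extend it.
            return false;
        }
    }

    // Charset-typed setting (as in FM3): the caller hands over an already
    // resolved charset, so no name lookup can fail at this point.
    static byte[] encode(String text, Charset outputEncoding) {
        return text.getBytes(outputEncoding);
    }

    public static void main(String[] args) {
        System.out.println(isResolvable("UTF-8")); // true
        System.out.println(isResolvable("UFT-8")); // false (typo in the name)
        // U+00E9 takes two bytes in UTF-8 and one byte in ISO-8859-1.
        System.out.println(encode("\u00e9", StandardCharsets.UTF_8).length);      // 2
        System.out.println(encode("\u00e9", StandardCharsets.ISO_8859_1).length); // 1
    }
}
```

With a Charset-typed setting, constants such as StandardCharsets.UTF_8 can be used directly, and any name lookup (Charset.forName) happens once, at the place where the value is configured.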

But nothing is set in stone until there's a release. However,
some people besides me have to take the time to check this out
and criticize it, because more eyes see more. As always,
src/manual/en_US/FM3-CHANGE-LOG.txt contains the
(not-entirely-trivial) changes made so far.

BTW, right now I'm working towards immutable Configuration-s (the
builder thing) which kind of implies immutable TemplateConfiguration
and immutable Template-s. I'm in the middle of this, so that part is a
bit messy ATM, but it compiles and is supposed to work without bugs.
(But it's not backward compatible, mind you.)


Monday, March 27, 2017, 4:03:23 PM, Daniel Dekany wrote:

> I have second thoughts regarding encoding VS charset... When it's
> about the charset of a file, people always seem to use "encoding":
>
> - In Eclipse the setting name is called "Text file encoding"
> - In IntelliJ it's called "File Encoding"
> - In Notepad++ the related top-level menu point is "Encoding"
> - In TextMate it's called "File Encoding"
>
> I didn't run into any case where an editor uses the term "charset" for
> this.
>
> Worse, there's this: <?xml version="1.0" encoding="UTF-8"?>. We also
> have `<#ftl encoding="...">`. Because it's somewhat reminiscent of the
> XML declaration, people will tend to write "encoding". Of course, we
> can have "encoding" there and still call the setting sourceCharset,
> but that would be a bit confusing.
>
> Things get less obvious when it comes to settings like
> URLEscapingCharset and outputEncoding (these are the FM2 names)...
>
> For URL escaping... first of all, the FM2 name isn't very good, as
> this kind of escaping is called "URL encoding", not escaping. (But FM
> has auto-escaping, and all the related directives and built-ins, all
> using the term escaping, so I guess that's how it got the wrong name.)
> Anyway, I think the charset term is used more often in this context
> (or rather the mutated forms of it, like "character set"). Certainly
> because it would be confusing to talk about the encoding used for URL
> encoding, as opposed to the charset used for URL encoding.
>
> For the charset of the output, in the Content-Type HTTP response header
> you have "charset". So developers are often talking about the charset,
> rather than about the encoding. But, web browsers call this thing the
> encoding of the page, though that's because from their side it's
> analogous to opening a file, so they inherit the terminology from file
> editors.
>
> So, yeah... you can't be consistent with everything. Maybe the charset
> VS encoding terminology choices of FM2 were the right compromise.
> Except that we will still say "sourceEncoding" instead of just
> "encoding", and use the Charset type instead of String.
>
>
> Friday, March 24, 2017, 4:50:09 PM, Woonsan Ko wrote:
>
>> On Tue, Mar 21, 2017 at 2:39 PM, Daniel Dekany <ddekany@apache.org> wrote:
>>> Tuesday, March 21, 2017, 3:31:56 PM, Woonsan Ko wrote:
>>>
>>>> +1 on both.
>>>
>>> Furthermore, as the "encoding" parameter of
>>> getTemplate/#include/#import was removed in FM3, the
>>> locale-to-encoding map (`Configuration.setEncoding(Locale, String)`)
>>> was also removed. So now it should just be `charset`, not
>>> `defaultCharset` (similarly as we have Template.charset). However,
>>> that name is still pretty bad, as it doesn't tell what it is the
>>> charset of. It's the charset of the template file when we read it.
>>> So, maybe, it should be "sourceCharset"?
>>
>> Yes, "sourceCharset" helps clarify the meaning, indeed!
>>
>> Cheers,
>>
>> Woonsan
>>
>>>
>>>> Woonsan
>>>>
>>>> On Sun, Mar 19, 2017 at 2:22 PM, Daniel Dekany <ddekany@freemail.hu> wrote:
>>>>> We have this retro terminology where instead of charset we say
>>>>> encoding. (I understand that encoding has a wider meaning, but we only
>>>>> intend to support encoding/decoding via a charset.) So I think
>>>>> cfg.setDefaultEncoding and <#ftl encoding=...> and such should be
>>>>> renamed to cfg.setDefaultCharset and <#ftl charset=...>.
>>>>>
>>>>> Also, in the Java API-s we should use Charset instead of a String
>>>>> (java.nio.charset.Charset didn't exist when FM 2.3 was created).
>>>>>
>>>>> --
>>>>> Thanks,
>>>>>  Daniel Dekany
>>>>>
>>>>
>>>
>>> --
>>> Thanks,
>>>  Daniel Dekany
>>>
>>
>

-- 
Thanks,
 Daniel Dekany

