lucene-dev mailing list archives

From Stefan Wachter
Subject Re: encoding of german analyzer source files
Date Fri, 26 Nov 2004 12:00:14 GMT
Hi Daniel,

I am using NetBeans 3.6, which certainly is Unicode-aware. Yet NetBeans 
does not seem to detect automatically that the Lucene source files are 
UTF-8 encoded. I guess that it uses the platform-specific default 
encoding, which is ISO-8859-1 on my Linux system.

I think what Java lacks is a means to indicate the encoding of a source 
file inside the file itself (e.g. <?java encoding="ISO-8859-1"?> in an 
XML-like way). The encoding has to be fed into the system from the 
outside. What else could be the reason for the Java compiler having an 
encoding switch? Therefore I think it is best for Java source files to 
be plain ASCII.
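
To illustrate what I mean, here is a minimal sketch of how Unicode escapes keep a source file plain ASCII. The class name UnicodeEscapeDemo and the helper toAsciiEscapes are made up for this mail and are not part of Lucene; the helper only handles characters one char at a time, which is enough for umlauts and the sharp s.

```java
// Hypothetical demo class, not part of Lucene. The file itself is plain
// ASCII, so it compiles the same way under any ASCII-compatible encoding.
class UnicodeEscapeDemo {

    // Rewrites every non-ASCII char as a backslash-u escape with four
    // lowercase hex digits; ASCII chars are copied unchanged.
    static String toAsciiEscapes(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c < 128) {
                out.append(c);
            } else {
                out.append(String.format("\\u%04x", (int) c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The sharp s is written as an escape; the compiler turns it
        // into a single character before the literal is parsed.
        String word = "Stra\u00dfe";
        System.out.println(word.length());        // prints 6
        System.out.println(toAsciiEscapes(word)); // sharp s shown as an escape
    }
}
```

A file written this way no longer depends on the encoding switch of the compiler or on the editor settings, at the price of being harder to read.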


Daniel Naber wrote:

>On Friday 26 November 2004 11:42, Stefan Wachter wrote:
>>With UTF-8 encoding these source
>>files look rather strange when viewed in an "ISO-8859-1" development
>>environment because they contain German umlauts and the "sharp s".
>Your editor / IDE needs to be Unicode-aware and you have to set it up 
>accordingly. That and the fact that build.xml specifies the encoding 
>explicitly should make everything work, no matter what your default encoding 
>is.
>>In order to make the source of the German analyzer class platform
>>independent I propose to use the corresponding Java Unicode escapes where
>>the special characters are used.
>That makes the source more difficult to read.
> Daniel
