lucene-dev mailing list archives

From Stefan Wachter
Subject encoding of german analyzer source files
Date Fri, 26 Nov 2004 10:42:18 GMT
Hi all,

in the 1.4.2 distribution the source files of the German analyzer 
classes are encoded in UTF-8. (CHANGES.txt reports that Otis Gospodnetic 
changed the file encoding of these files to UTF-8. I guess that their 
original encoding was ISO-8859-1.) With UTF-8 encoding these source 
files look rather strange when viewed in an ISO-8859-1 development 
environment because they contain German umlauts and the "sharp s". In 
addition, they cannot be compiled directly in such an environment. (The 
Lucene build.xml sets the Java compiler encoding to "utf-8", which makes 
things fine.)

In order to make the source of the German analyzer classes platform 
independent, I propose to use the corresponding Java Unicode escapes 
wherever the special characters are used.
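
As a rough sketch of what I mean: the class below is made up purely for 
illustration (UmlautEscapes and foldUmlauts are hypothetical names, not 
the actual analyzer code). The source file contains only ASCII 
characters, so it should compile the same way whatever encoding the 
compiler assumes.

```java
// Hypothetical example class, not part of Lucene.
// The file is pure ASCII: every special character is a Unicode escape.
public class UmlautEscapes {
    public static final char A_UMLAUT = '\u00e4'; // a with diaeresis
    public static final char O_UMLAUT = '\u00f6'; // o with diaeresis
    public static final char U_UMLAUT = '\u00fc'; // u with diaeresis
    public static final char SHARP_S  = '\u00df'; // sharp s

    // Illustrative helper (an assumption, not the analyzer's real logic):
    // replaces umlauts by their base vowel and the sharp s by "ss".
    public static String foldUmlauts(String term) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < term.length(); i++) {
            char c = term.charAt(i);
            switch (c) {
                case A_UMLAUT: sb.append('a'); break;
                case O_UMLAUT: sb.append('o'); break;
                case U_UMLAUT: sb.append('u'); break;
                case SHARP_S:  sb.append("ss"); break;
                default:       sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "Stra\u00dfe" is the German word for "street", written with an escape.
        System.out.println(foldUmlauts("Stra\u00dfe")); // prints "Strasse"
    }
}
```

The escapes are less readable than the literal characters, so a short 
comment next to each one helps.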

Best regards,
