james-server-dev mailing list archives

From "Noel J. Bergman" <n...@devtech.com>
Subject RE: Encoding issue in BayesianAnalyzer?
Date Tue, 31 May 2005 22:13:10 GMT
Stefano Bagnara wrote:

> I think the code you are pointing is 

Yes, that's it.  And I am finding that something in our build process is corrupting it.  I
just did:

 $ rm src/java/org/apache/james/util/BayesianAnalyzer.java 
 $ svn up
 Restored 'src/java/org/apache/james/util/BayesianAnalyzer.java'
 At revision 179287.
 $ svn diff
 $ ./build.sh clean dist-lite
 $ svn diff
 Index: src/java/org/apache/james/util/BayesianAnalyzer.java
 ===================================================================
 --- src/java/org/apache/james/util/BayesianAnalyzer.java        (revision 179287)
 +++ src/java/org/apache/james/util/BayesianAnalyzer.java        (working copy)
 @@ -471,7 +471,7 @@
              if (Character.isLetter(ch)
              || ch == '-'
              || ch == '$'
 -            || ch == ''
 +            || ch == '�'
              || ch == '!'
              || ch == '\''
              ) {

Now, we run <fixcrlf> during the build process.  Could Ant be corrupting
the file?
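
For illustration only, here is one way a rewrite step could produce a diff like the one above. This is a sketch under assumptions, not a diagnosis: it assumes the source file holds the character as the single byte 0x80 (the euro sign in windows-1252) and that some tool decodes and re-encodes the file as UTF-8. The class and method names are hypothetical and are not part of James or Ant.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetRoundTripSketch {

    // Decode the bytes with the given charset and encode them again with the
    // same charset, the way a text-rewriting tool would when it reads a file
    // into Strings and writes it back out.
    static byte[] rewrite(byte[] original, Charset charset) {
        String decoded = new String(original, charset);
        return decoded.getBytes(charset);
    }

    public static void main(String[] args) {
        // Assumption: the character is stored as the single byte 0x80.
        byte[] source = {(byte) 0x80};

        // A lone 0x80 is malformed UTF-8, so the decoder substitutes U+FFFD,
        // which is then written back as three bytes: the file has changed.
        byte[] viaUtf8 = rewrite(source, StandardCharsets.UTF_8);
        System.out.println("UTF-8 round trip: " + viaUtf8.length + " byte(s)");

        // ISO-8859-1 maps every byte to a char and back, so nothing changes.
        byte[] viaLatin1 = rewrite(source, StandardCharsets.ISO_8859_1);
        System.out.println("ISO-8859-1 round trip: " + viaLatin1.length + " byte(s)");
    }
}
```

If something like this is what is happening, the first round trip turns one byte into the three-byte encoding of the replacement character, which would match the '�' in the working copy.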

> It probably is the EURO character (€, Unicode \u20AC I think).
> http://www.fileformat.info/info/unicode/char/20ac/index.htm

Perhaps the safest thing is to hex encode the character.
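
A minimal sketch of what hex encoding the character could look like, assuming Stefano's guess is right and the character is the euro sign (U+20AC). The class name, constant and method name are hypothetical; only the list of accepted characters comes from the diff above.

```java
public class TokenCharSketch {

    // Euro sign written as a Unicode escape, so the source file stays pure
    // ASCII and survives tools that re-encode it. Assumes the intended
    // character really is U+20AC.
    static final char EURO = '\u20AC';

    // Same set of characters as the condition shown in the diff.
    static boolean isTokenChar(char ch) {
        return Character.isLetter(ch)
            || ch == '-'
            || ch == '$'
            || ch == EURO
            || ch == '!'
            || ch == '\'';
    }

    public static void main(String[] args) {
        System.out.println(isTokenChar(EURO)); // true
        System.out.println(isTokenChar(' '));  // false
    }
}
```

With the escape in place the line contains nothing outside ASCII, so a charset mismatch in the build should no longer be able to alter it.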

	--- Noel

