lucene-java-user mailing list archives

From "Ian Parkin" <>
Subject RE: How to handle umlauts ?
Date Fri, 20 Sep 2002 04:16:40 GMT
Hello All,

> > I have 100,000+ small HTML files that are mainly in the English
> > language. I just noticed that we have some user names with umlauts.
> > These are seemingly stored and searchable as the '?' character.

This appears to be a Solaris thing. I develop under Solaris 9 and then burn 
my application onto a multi-platform CD (Unix/Win/Mac). It is only when I 
run the application under Solaris that the umlauts appear as the '?' 
character. On all other platforms the characters are correctly displayed. 
All platforms are running Java 1.3.1.
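
One possible explanation, which is only an assumption and not something confirmed in this thread, is that the JVM's default character encoding on the Solaris machine cannot represent 'ü'. When Java encodes a string into a charset that lacks a character, it substitutes '?'. The sketch below shows that substitution for the name "jürgen". The class and method names are made up for illustration, and it uses `Charset` calls that need Java 6 or later, so it will not compile on the 1.3.1 runtime mentioned above.

```java
import java.nio.charset.Charset;

public class UmlautEncodingDemo {

    // Encode the text to bytes in the named charset, then decode those
    // bytes again with the same charset. Characters the charset cannot
    // represent are replaced during encoding (with '?' for US-ASCII).
    static String roundTrip(String text, String charsetName) {
        Charset charset = Charset.forName(charsetName);
        byte[] bytes = text.getBytes(charset);
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // \u00fc is 'ü'; the escape keeps this source file pure ASCII.
        String name = "j\u00fcrgen";

        // The platform default is what Java falls back to when no
        // encoding is given explicitly for readers and writers.
        System.out.println("Default charset: " + Charset.defaultCharset());

        String[] charsets = {"US-ASCII", "ISO-8859-1", "UTF-8"};
        for (String cs : charsets) {
            String result = roundTrip(name, cs);
            System.out.println(cs + " preserved umlaut: " + result.equals(name));
        }
    }
}
```

If the default charset printed on Solaris differs from the one on the other platforms, passing an explicit encoding wherever the HTML files are read would be one thing to try.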

Two questions :-

1) Does anyone have experience with this behaviour? (Apologies for the
non-Lucene content.)

2) How can I search on text containing umlauts? At the moment a search on
"jürgen" returns no hits, but a search on "rgen" will return posts by user


