lucene-java-user mailing list archives

From Lukas Zapletal <l...@root.cz>
Subject Converting ISO88592 files to UTF8 and indexing`em
Date Fri, 03 Jan 2003 10:18:45 GMT
Hello all,

I have a problem. I need to index Czech content stored in HTML files encoded in
ISO-8859-2. Is there any way to convert them to UTF-8 and index them?
Which stream or reader should I use? Is it even possible?
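
For illustration, here is a minimal sketch of the conversion step using only the JDK, assuming the files really are ISO-8859-2 throughout. The class and method names (Latin2ToUtf8, openLatin2, convertToUtf8) are hypothetical and are not part of Lucene. The point is that an InputStreamReader with an explicit charset already yields Unicode characters, so the Reader from openLatin2 could probably be handed to the indexing code directly, and writing a UTF-8 copy is optional.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not a Lucene class.
final class Latin2ToUtf8 {
    // Assumption: the source files are encoded in ISO-8859-2.
    static final Charset LATIN2 = Charset.forName("ISO-8859-2");

    // Decodes the file's bytes into Unicode characters as they are read.
    static Reader openLatin2(Path file) throws IOException {
        return new BufferedReader(
                new InputStreamReader(Files.newInputStream(file), LATIN2));
    }

    // Copies an ISO-8859-2 file to a UTF-8 file, character by character.
    static void convertToUtf8(Path source, Path target) throws IOException {
        try (Reader in = openLatin2(source);
             Writer out = new OutputStreamWriter(
                     Files.newOutputStream(target), StandardCharsets.UTF_8)) {
            char[] buffer = new char[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        }
    }
}
```

Note that this does not strip HTML markup or look at any charset declared inside the HTML. It only changes the byte encoding.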

And how should I construct queries after that? Some systems use ISO-8859-2 and
some use Windows-1250.
Is there any way to convert a query string from the default (system) encoding to
UTF-8?
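
As a rough sketch of the query side, again using only the JDK: a Java String is already Unicode once it has been decoded, so the encoding only matters where raw bytes are turned into a String. The class QueryEncoding and its methods are hypothetical names. The example assumes the raw query bytes and their source charset are known. Charset.defaultCharset() can supply the system encoding when nothing better is available.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not a Lucene class.
final class QueryEncoding {
    // Decodes raw query bytes using the charset they were produced in.
    static String decodeQuery(byte[] raw, Charset source) {
        return new String(raw, source);
    }

    // Encodes a decoded query as UTF-8 bytes, if bytes are needed at all.
    static byte[] toUtf8(String query) {
        return query.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The letter z-caron is 0xBE in ISO-8859-2 and 0x9E in Windows-1250.
        byte[] fromLatin2 = {(byte) 0xBE};
        byte[] fromCp1250 = {(byte) 0x9E};

        String a = decodeQuery(fromLatin2, Charset.forName("ISO-8859-2"));
        String b = decodeQuery(fromCp1250, Charset.forName("windows-1250"));

        // Both decode to the same Unicode character, U+017E.
        System.out.println(a.equals(b));

        // Fallback when the source encoding is simply the system default.
        String c = decodeQuery("query".getBytes(), Charset.defaultCharset());
        System.out.println(c);
    }
}
```

If the decoded String is passed to the query parser, no further conversion to UTF-8 should be needed, provided the indexed text was decoded correctly as well.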

People programming ENGLISH systems are so happy... ;-)

-- 
Lukas Zapletal
http://www.tanecni-olomouc.cz/lzap
lzap@root.cz
