lucene-solr-user mailing list archives

From Walter Underwood <wunderw...@netflix.com>
Subject Re: solr utf 16 ?
Date Wed, 25 Apr 2007 17:44:10 GMT
UTF-16 support should not require any changes to the XML parsing.
All XML parsers are required to support that encoding. The real
change is implementing RFC 3023 (XML Media Types) so that the
encoding can be specified over HTTP.

wunder
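
As an illustration of the two points above, here is a minimal Python sketch, not anything from Solr itself. It assumes the standard-library ElementTree parser (backed by expat) and uses two hypothetical helper names, `charset_from_content_type` and `parse_xml_body`. It treats a `charset` parameter on the HTTP `Content-Type` header as authoritative, roughly in the spirit of RFC 3023, and otherwise lets the parser detect the encoding from the byte order mark and XML declaration. It deliberately ignores the RFC's separate defaulting rules for `text/xml`.

```python
import xml.etree.ElementTree as ET
from email.message import Message


def charset_from_content_type(header_value):
    """Return the charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_charset() lower-cases the value, e.g. "UTF-16" -> "utf-16".
    return msg.get_content_charset()


def parse_xml_body(body, content_type):
    """Parse an XML request body (bytes), honouring the HTTP charset if given."""
    charset = charset_from_content_type(content_type)
    if charset is None:
        # No charset on the transport: the parser detects UTF-16 on its own
        # from the byte order mark and the XML declaration.
        return ET.fromstring(body)
    # Charset supplied over HTTP: decode with it, then hand the parser UTF-8
    # and tell it to ignore whatever the XML declaration says.
    text = body.decode(charset)
    parser = ET.XMLParser(encoding="utf-8")
    return ET.fromstring(text.encode("utf-8"), parser=parser)
```

A caller would pass the raw request bytes together with the header value, for example `parse_xml_body(raw, "application/xml; charset=utf-16")`.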

On 4/23/07 11:13 AM, "Mike Klaas" <mike.klaas@gmail.com> wrote:

> On 4/23/07, brian beard <brian_s_beard@hotmail.com> wrote:
>> Yes. I'm assuming if you have UTF-16 encoded data in a document that needs
>> to be added to the index, that solr would not be able to handle this?
> 
> I believe that handling arbitrary encodings is on the list of future
> enhancements, but I couldn't give you a timeline.
> 
> For the time being, consider that
>  1. UTF-8 is the "lingua franca" of XML document encoding
>  2. it is very easy to convert the data to UTF-8 yourself (it would be
> a 3-4 line Python command-line filter, for instance).
> 
> -Mike
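
The conversion filter Mike mentions could look roughly like the sketch below. This is an assumption about what such a filter would do, not code from the thread, and the names `utf16_to_utf8` and `convert` are made up for illustration. Beyond re-encoding the bytes, it also rewrites the `encoding` value in a leading XML declaration, since a UTF-8 file that still declares UTF-16 would confuse a parser.

```python
import io
import re

# Matches the encoding pseudo-attribute inside a leading XML declaration.
_DECL_ENCODING = re.compile(r'^(<\?xml[^>]*?encoding=)(["\'])[^"\']*\2')


def utf16_to_utf8(data):
    """Re-encode UTF-16 bytes as UTF-8, fixing the XML declaration if present."""
    # The "utf-16" codec consumes a byte order mark when there is one;
    # without one it assumes the platform's native byte order.
    text = data.decode("utf-16")
    text = _DECL_ENCODING.sub(r'\1\2UTF-8\2', text, count=1)
    return text.encode("utf-8")


def convert(src, dst):
    """Read UTF-16 from binary stream src and write UTF-8 to binary stream dst."""
    dst.write(utf16_to_utf8(src.read()))
```

Used as a command-line filter, a script would call `convert(sys.stdin.buffer, sys.stdout.buffer)` after `import sys`, and be run as something like `python to_utf8.py < in.xml > out.xml` (the script name is hypothetical).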

