lucene-solr-user mailing list archives

From "Yonik Seeley" <yo...@apache.org>
Subject Re: Solr Update Handler Failes with Some Doc Characters
Date Wed, 09 May 2007 15:45:43 GMT
On 5/9/07, av_work@yahoo.com <av_work@yahoo.com> wrote:
> I run the example using Jetty on a Windows 2003 machine. When I submit some
> documents containing upper ASCII characters, the Solr update handler fails
> with an XML parsing error saying that it encountered an EOF before the
> closing tags.

Normally, if there is a charset mixup, you will just get weird-looking results
rather than an error. I suppose that if a byte with a value of 128 or higher
is sent, and Solr is treating the input as UTF-8, then the byte(s) that follow
could be treated as part of a single multibyte character.  So if that byte is
directly followed by XML markup, part of the markup would be swallowed, which
would explain the parse exception.
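
To make that guess concrete, here is a minimal Python sketch. It is not
Solr's or Jetty's actual decoder: naive_decode is a hypothetical lenient
decoder that trusts the length announced by the first byte and never checks
the continuation bytes, and the field name "t" is made up for illustration.

```python
def utf8_sequence_length(lead_byte: int) -> int:
    """Length a UTF-8 sequence claims to have, judged only by its first byte."""
    if lead_byte < 0x80:
        return 1  # plain ASCII
    if 0xC0 <= lead_byte < 0xE0:
        return 2
    if 0xE0 <= lead_byte < 0xF0:
        return 3
    if 0xF0 <= lead_byte < 0xF8:
        return 4
    return 1  # stray continuation byte or invalid lead byte


def naive_decode(data: bytes) -> str:
    """Hypothetical lenient decoder: skips as many bytes as the lead byte
    announces, without validating them, and emits '?' for the sequence."""
    out = []
    i = 0
    while i < len(data):
        n = utf8_sequence_length(data[i])
        if data[i] < 0x80:
            out.append(chr(data[i]))
        else:
            out.append("?")  # the whole claimed sequence becomes one placeholder
        i += n
    return "".join(out)


text = '<field name="t">café</field>'

# Sent as Latin-1, "é" is the single byte 0xE9, which UTF-8 reads as the
# start of a 3-byte sequence, so the "</" after it is consumed as well.
mangled = naive_decode(text.encode("latin-1"))

# Sent as real UTF-8, "é" is two bytes (0xC3 0xA9) and the markup survives.
intact = naive_decode(text.encode("utf-8"))

print(mangled)
print(intact)
```

With the Latin-1 bytes the closing tag loses its "</", so an XML parser would
never see the element closed and would run into EOF instead. A strict decoder
would reject or replace the bad byte rather than swallow the markup.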

In short, this is probably a character encoding issue.  What character
encoding are you using when posting to Solr, and is it declared in the
HTTP Content-Type header?
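
For reference, a rough sketch of a post where the body encoding and the
declared charset agree, using Python's standard urllib. The URL is an
assumption (the usual address of the example Jetty setup), the field names
are made up, and build_update_request is a hypothetical helper, not part of
any Solr client.

```python
import urllib.request

# Assumed address of the example server; adjust to your own setup.
SOLR_UPDATE_URL = "http://localhost:8983/solr/update"


def build_update_request(xml_text: str, charset: str = "utf-8") -> urllib.request.Request:
    """Encode the body with `charset` and declare that same charset in the
    Content-Type header, so the server does not have to guess."""
    body = xml_text.encode(charset)
    return urllib.request.Request(
        SOLR_UPDATE_URL,
        data=body,
        headers={"Content-Type": f"text/xml; charset={charset}"},
        method="POST",
    )


# Hypothetical document; "id" and "t" are illustrative field names.
doc = '<add><doc><field name="id">1</field><field name="t">café</field></doc></add>'
req = build_update_request(doc)

# urllib.request.urlopen(req) would actually send it; that call is left out
# so the sketch runs without a server.
print(req.get_header("Content-type"))
```

The point is only that the bytes on the wire and the charset named in the
header come from the same variable, so they cannot drift apart.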

-Yonik
