lucene-solr-user mailing list archives

From Chris Hostetter <hossman_luc...@fucit.org>
Subject Re: com.ctc.wstx.exc.WstxUnexpectedCharException error
Date Thu, 27 Aug 2009 17:14:46 GMT

: I have a valid xml document that begins:

how are you inspecting the document?

I suspect that what you actually have is a document containing the 
literal bytes "R&D" but some tool you are using to view the document is 
displaying the & to you as &amp;

	...OR...

your source document has the literal bytes "R&amp;D" in it, but some code 
you are using is parsing that as xml and then writing it (over the wire) to 
solr as a string literal without reencoding ("R&D")
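
a rough sketch of that second case, written in Python only for illustration (your indexing code may be in any language, and the helper name build_add_xml is made up here): parsing decodes the entity, so the value has to be escaped again before it is put back into the xml you send.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# The source document holds a properly escaped ampersand.
source = '<field name="id">R&amp;D</field>'

# Parsing decodes the entity: the text is now the plain string "R&D".
value = ET.fromstring(source).text

# Pasting the parsed value straight into a new document yields illegal XML.
broken = '<add><doc><field name="id">' + value + '</field></doc></add>'

def build_add_xml(doc_id):
    """Hypothetical helper: escape &, < and > before embedding the value."""
    return '<add><doc><field name="id">' + escape(doc_id) + '</field></doc></add>'

# Re-escaping restores "R&amp;D" in the bytes that go over the wire.
fixed = build_add_xml(value)
```

if your code builds the update message by string concatenation like "broken" above, that is the likely place where the escaping gets lost.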

try running "nc -l" in place of solr, and have your indexing code post to 
it -- then see what you get.
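
if nc isn't available, something like the following could stand in for it. this is only a sketch: the function names are invented for the example, the port 8983 is assumed from the default solr URL, and it never sends an HTTP response, so the client will eventually see a failed or hung request. the idle timeout is there because an HTTP client usually keeps the connection open while waiting for a reply.

```python
import socket

def open_listener(port=8983, host="127.0.0.1"):
    """Bind a TCP socket where solr would normally listen (solr must be stopped)."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind((host, port))
    server.listen(1)
    return server

def read_one_request(server, idle_timeout=2.0):
    """Accept one connection and return the raw bytes the client sent."""
    conn, _ = server.accept()
    with conn:
        conn.settimeout(idle_timeout)
        chunks = []
        try:
            while True:
                data = conn.recv(4096)
                if not data:
                    break  # client closed the connection
                chunks.append(data)
        except socket.timeout:
            pass  # client went quiet, probably waiting for a reply
        return b"".join(chunks)
```

calling print(read_one_request(open_listener()).decode("utf-8", "replace")) and then running your indexing code shows the exact request, headers and body, so you can check whether the body contains "R&D" or "R&amp;D".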

Solr certainly doesn't have a problem with properly escaped ampersands, but 
it will complain about illegal xml escape sequences...

$ java -Ddata=args -jar post.jar '<add><doc><field name="id">R&amp;D</field></doc></add>'
SimplePostTool: version 1.2
SimplePostTool: WARNING: Make sure your XML documents are encoded in 
UTF-8, other encodings are not currently supported
SimplePostTool: POSTing args to http://localhost:8983/solr/update..
SimplePostTool: COMMITting Solr index changes..

$ java -Ddata=args -jar post.jar '<add><doc><field name="id">R&D</field></doc></add>'
SimplePostTool: version 1.2
SimplePostTool: WARNING: Make sure your XML documents are encoded in 
UTF-8, other encodings are not currently supported
SimplePostTool: POSTing args to http://localhost:8983/solr/update..
SimplePostTool: FATAL: Solr returned an error: 
comctcwstxexcWstxLazyException_Unexpected_character__code_60_expected_a_semicolon_after_the_reference_for_entity_D__at_


-Hoss

