lucene-solr-user mailing list archives

From Phillip Farber <pfar...@umich.edu>
Subject com.ctc.wstx.exc.WstxUnexpectedCharException error
Date Tue, 25 Aug 2009 21:52:24 GMT
I have a valid XML document that begins:

<add><doc><field name="id">mdp.39015052775379</field>
<field name="rights">2</field>
<field name="title">Technology transfer and in-house R&amp;D in Indian 
industry : in the later 1990s / edited and with an introduction by Binay 
Kumar Pattnaik. v.1</field>
<field name="author">Not found</field>
<field name="ocr"> TECHNOLOGY
TRANSFER AND
IN.HOUSE R&amp;D
IN
INDIAN
INDUSTRY

I believe Solr is throwing an exception when it sees the line:

IN.HOUSE R&amp;D

The error message is:

SEVERE: [com.ctc.wstx.exc.WstxLazyException] com.ctc.wstx.exc.WstxUnexpectedCharException: Unexpected character ' ' (code 32); expected a semi-colon after the reference for entity 'D'

This seems wrong.  It is as though the parser has converted &amp;D to &D 
and then complains about a missing semi-colon.
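
One way to test that hypothesis outside Solr is to parse the field both ways. The following is only a minimal sketch: it uses Python's standard-library expat-based parser rather than Woodstox, so the error wording will differ, and the `decoded_once` string is a hypothetical stand-in for whatever the parser actually received, which this message does not show.

```python
import xml.etree.ElementTree as ET

def parse_field(xml_fragment):
    """Return (True, text) if the fragment parses, else (False, error message)."""
    try:
        return True, ET.fromstring(xml_fragment).text
    except ET.ParseError as err:
        return False, str(err)

# As quoted in the message: the ampersand is escaped.
escaped = '<field name="ocr">IN.HOUSE R&amp;D IN</field>'
# Hypothetical input after one premature round of entity decoding.
decoded_once = escaped.replace("&amp;", "&")

print(parse_field(escaped))       # parses; text is 'IN.HOUSE R&D IN'
print(parse_field(decoded_once))  # rejected: '&D ' is not a complete entity reference
```

If the escaped form parses and the decoded form fails, that is consistent with the bytes reaching Solr containing a bare "&D " rather than "&amp;D". It may also be worth noting that the reported character is a space (code 32): in the excerpt above, the R&amp;D in the title field is followed by a space, while the one in the ocr field is followed by a line break, so the title line could be the one actually failing.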

Can anyone make sense of this?

Full traceback follows.

Thanks!!

Phil

----
	
Solr Specification Version: 1.3.0.2008.12.04.08.06.02
Solr Implementation Version: nightly exported - yonik - 2008-12-04 08:06:02
Lucene Specification Version: 2.9-dev
Lucene Implementation Version: 2.9-dev 719313 - 2008-11-20 23:51:24
Current Time: Tue Aug 25 17:51:57 EDT 2009
Server Start Time: Tue Aug 25 17:10:44 EDT 2009

Aug 25, 2009 12:42:16 PM org.apache.solr.core.SolrCore execute
INFO: [] webapp=/mbooks-ls-shard-2 path=/update params={} status=500 QTime=4
Aug 25, 2009 12:42:16 PM org.apache.solr.common.SolrException log
SEVERE: [com.ctc.wstx.exc.WstxLazyException] com.ctc.wstx.exc.WstxUnexpectedCharException: Unexpected character ' ' (code 32); expected a semi-colon after the reference for entity 'D'
  at [row,col {unknown-source}]: [4,57]
	at com.ctc.wstx.exc.WstxLazyException.throwLazily(WstxLazyException.java:45)
	at com.ctc.wstx.sr.StreamScanner.throwLazyError(StreamScanner.java:729)
	at com.ctc.wstx.sr.BasicStreamReader.safeFinishToken(BasicStreamReader.java:3659)
	at com.ctc.wstx.sr.BasicStreamReader.getText(BasicStreamReader.java:809)
	at org.apache.solr.handler.XMLLoader.readDoc(XMLLoader.java:276)
	at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:139)
	at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:69)
	at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
	at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
	at org.apache.solr.core.SolrCore.execute(SolrCore.java:1313)
	at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:303)
	at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:232)
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215)
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)
	at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
	at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:174)
	at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:548)
	at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
	at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:117)
	at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:108)
	at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:174)
	at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:874)
	at org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:665)
	at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:528)
	at org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:81)
	at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:689)
	at java.lang.Thread.run(Thread.java:619)
Caused by: com.ctc.wstx.exc.WstxUnexpectedCharException: Unexpected character ' ' (code 32); expected a semi-colon after the reference for entity 'D'
  at [row,col {unknown-source}]: [4,57]
	at com.ctc.wstx.sr.StreamScanner.throwUnexpectedChar(StreamScanner.java:648)
	at com.ctc.wstx.sr.StreamScanner.parseEntityName(StreamScanner.java:1994)
	at com.ctc.wstx.sr.StreamScanner.fullyResolveEntity(StreamScanner.java:1496)
	at com.ctc.wstx.sr.BasicStreamReader.readTextSecondary(BasicStreamReader.java:4681)
	at com.ctc.wstx.sr.BasicStreamReader.readCoalescedText(BasicStreamReader.java:4126)
	at com.ctc.wstx.sr.BasicStreamReader.finishToken(BasicStreamReader.java:3701)
	at com.ctc.wstx.sr.BasicStreamReader.safeFinishToken(BasicStreamReader.java:3649)
	... 24 more


