lucene-solr-user mailing list archives

From Rupert Fiasco <rufia...@gmail.com>
Subject Re: Responses getting truncated
Date Tue, 25 Aug 2009 16:31:13 GMT
Using wt=json also yields an invalid document. After more
investigation, it appears that I can always "break" the response by
pulling back one specific field via the "fl" parameter. If I leave
that field out, the response is valid; if I include it, Solr returns
an invalid, truncated document. This happens in every response
format (xml, json, ruby).

I am using the SolrJ client to add documents to my index. The field
is a normal "text" field type, and the text itself is the first 1000
characters of an article.
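
One way to test the "bad data" theory is to scan the field text for
characters that XML 1.0 does not allow at all, even escaped (most
control characters, lone surrogates). This is only a minimal sketch:
the function name is hypothetical, and it assumes you can get the raw
text of the suspect field out of the indexer as a string. It does not
show that such characters are the cause here.

```python
def find_invalid_xml_chars(text):
    """Return (index, code point) pairs for characters XML 1.0 forbids."""

    def allowed(cp):
        # Char production from the XML 1.0 specification.
        return (
            cp in (0x9, 0xA, 0xD)
            or 0x20 <= cp <= 0xD7FF
            or 0xE000 <= cp <= 0xFFFD
            or 0x10000 <= cp <= 0x10FFFF
        )

    return [(i, ord(ch)) for i, ch in enumerate(text) if not allowed(ord(ch))]


if __name__ == "__main__":
    # Hypothetical sample: a vertical tab hidden in otherwise normal text.
    sample = "Arthritis\x0b is common"
    for index, code_point in find_invalid_xml_chars(sample):
        print(f"invalid character U+{code_point:04X} at index {index}")
```

An empty result would suggest the text is clean in this respect, and
that the problem lies elsewhere.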

> It can very well be an issue with the data itself. For example, if the data
> contains un-escaped characters which invalidates the response

When I look at the document using wt=xml, all XML entities are
escaped. When I look at it under wt=ruby, all single quotes are
escaped, and the same goes for json, so it appears that all escaping
is taking place. The core problem seems to be that the document is
simply truncated: the output just hits end of file. Jetty's log says
it's sending back an HTTP 200, so as far as the server is concerned
all is well.
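
To pin down exactly where a response stops, it may help to save the
wt=json body to a string and ask a parser where it gives up. The
sketch below uses only the Python standard library. The function name
and the sample body are made up for illustration, and it assumes the
body has already been captured (e.g. from curl output).

```python
import json


def describe_json_body(body, tail_len=60):
    """Report whether body parses as JSON and show how it ends."""
    report = {
        "ok": True,
        "length": len(body),
        "error": None,
        "error_pos": None,
        "tail": body[-tail_len:],  # the last characters actually received
    }
    try:
        json.loads(body)
    except json.JSONDecodeError as err:
        # err.pos is the character offset where parsing failed.
        report.update(ok=False, error=err.msg, error_pos=err.pos)
    return report


if __name__ == "__main__":
    # Hypothetical truncated body: the string value is never closed.
    truncated = '{"response": {"docs": [{"text_t": "first 1000 chars'
    print(describe_json_body(truncated))
```

Comparing the reported length with the Content-Length header (if one
is sent) and looking at the tail should show whether the cut always
falls at the same place in the field.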

Any ideas on how I can dig deeper?

Thanks
-Rupert


On Mon, Aug 24, 2009 at 4:31 PM, Uri Boness<uboness@gmail.com> wrote:
> It can very well be an issue with the data itself. For example, if the data
> contains un-escaped characters which invalidates the response. I don't know
> much about ruby, but what do you get with wt=json?
>
> Rupert Fiasco wrote:
>>
>> I am seeing our responses getting truncated if and only if I search on
>> our main text field.
>>
>> E.g. if I just do a basic search like
>>
>> title_t:arthritis
>>
>> Then I get a valid document back. But if I add in our larger text field:
>>
>> title_t:arthritis OR text_t:arthritis
>>
>> then the resultant document is NOT valid XML (if using wt=xml) or Ruby
>> (using wt=ruby). If I run these through curl on the command line,
>> the output is truncated, and if I run the search through the
>> web-based admin panel then I get an XML parse error.
>>
>> This appears to have just started recently and the only thing we have
>> done is change our indexer from a PHP one to a Java one, but
>> functionally they are identical.
>>
>> Any thoughts? Thanks in advance.
>>
>> - Rupert
>>
>>
>
