lucene-solr-user mailing list archives

From Jonathan Rochkind <rochk...@jhu.edu>
Subject Re: Solr: Images, Docs and Binary data
Date Wed, 06 Apr 2011 18:47:04 GMT
On 4/6/2011 2:39 PM, Markus Jelsma wrote:
>> Ha, there's a binary field type?!
>>
>> I've stored binary data in an ordinary "String" field type, and it's
>> worked.  But there were some headaches to get it to work, might have
>> been smoother if I had realized there was actually a binary field type.
> How? You can't just embed control characters in an XML body. They need to be at
> least encoded so as not to write tabs, deletes, backspaces and whatever garbage;
> base64 in Solr's case.

In my case, by using SolrJ with BinaryUpdateHandler, I think. That code was
actually written by someone else, a while ago.

However I managed to do it at indexing time -- ultimately getting it into
a String-type stored field -- my binary data comes back not UUEncoded,
but XML-escaped, i.e.:

&#30;

This works for me because my "binary" data is actually MOSTLY ASCII (so
this isn't as terribly inefficient as it could be), but it has some
control characters in it that need to be preserved. And nearly any
library you use for consuming XML responses will properly un-escape
things like &#30; when reading.
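
To make the escaping concrete, here is a minimal Python sketch of the round trip for mostly-ASCII data with a few control characters in it. The function names `escape_control_chars` and `unescape_numeric_refs` are hypothetical helpers for illustration only; they are not part of Solr or SolrJ, and this is not necessarily how Solr produces or consumes the escapes. The sketch decodes with a regular expression rather than an XML parser, since a strict XML 1.0 parser may refuse a reference to a control character such as &#30;.

```python
import re

# Matches decimal (&#30;) and hexadecimal (&#x1E;) numeric character references.
_NUMERIC_REF = re.compile(r"&#(x[0-9A-Fa-f]+|[0-9]+);")


def escape_control_chars(text):
    """Replace control characters (other than tab, LF, CR) with decimal references."""
    return "".join(
        "&#{};".format(ord(ch)) if ord(ch) < 32 and ch not in "\t\n\r" else ch
        for ch in text
    )


def unescape_numeric_refs(text):
    """Turn numeric character references back into the characters they name."""
    def _replace(match):
        body = match.group(1)
        # A leading "x" marks a hexadecimal reference; otherwise it is decimal.
        code = int(body[1:], 16) if body.startswith("x") else int(body)
        return chr(code)

    return _NUMERIC_REF.sub(_replace, text)


if __name__ == "__main__":
    record = "field one\x1efield two"      # 0x1E is the control character behind &#30;
    escaped = escape_control_chars(record)
    print(escaped)
    print(unescape_numeric_refs(escaped) == record)
```

Only the control characters grow (one byte becomes five), which is why this stays reasonably compact when the data is mostly ASCII. Note that the sketch does not escape a literal "&" in the data, so it is only a safe round trip if the data contains no text that already looks like a numeric reference.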
