uima-dev mailing list archives

From Richard Eckart de Castilho <richard.eck...@gmail.com>
Subject Re: How to use the new binary CAS (de)serialization?
Date Mon, 05 Aug 2013 16:23:17 GMT
I have it working by now. My last issue was the subtyping of 
the document annotation, which was a problem in a unit test
that I wrote, but is unlikely to be a problem in actual use.

-- Richard

On 05.08.2013 at 17:40, Marshall Schor <msa@schor.com> wrote:

> I think if you "pre-read" some info from a stream, and then pass that stream to
> the reinit (or other method of binary deserialization), it just continues
> reading from wherever the stream was positioned, so I think your approach ought
> to work...
> -Marshall
> On 8/2/2013 2:34 PM, Richard Eckart de Castilho wrote:
>> Hm, I just noticed that my problem analysis was not quite correct.
>> BinaryCasSerDes6 indeed is able to handle the header… so my problem
>> must be somewhere else.
>> -- Richard 
>> On 02.08.2013 at 20:29, Richard Eckart de Castilho <richard.eckart@gmail.com> wrote:
>>> Hi,
>>> I'm still trying to use the new serialization methods but continue
>>> running into problems.
>>> Last time we discussed that I need to know the original type system
>>> when I want to deserialize a format 6 binary CAS into a CAS.
>>> So when I serialize the CAS now, I first write a header, then I
>>> dump the type system into my output stream, and then the binary CAS
>>> using 
>>> serializeWithCompression(cas, outputStream, cas.getTypeSystem());
>>> When I read the data, I check for my header. If it is there, I
>>> read the type system.
>>> Now I wanted to call
>>> deserializeCAS(cas, inputStream, typeSystem, null);
>>> Unfortunately, that fails. The reason is that this signature of
>>> deserializeCAS immediately uses BinaryCasSerDes6 to read
>>> data from the input stream. However, serializeWithCompression
>>> writes a header before the data that BinaryCasSerDes6 handles. This
>>> header is read by deserializeCAS(cas, inputStream), but
>>> in this signature, I have no way of specifying the original
>>> type system.
>>> Of course I can copy the whole header checking code from CASImpl,
>>> but I don't think that is a good solution. I think the
>>> deserializeCAS methods that UIMA provides should either all deal
>>> with the header that the serializeWithCompression methods write,
>>> or none should.
>>> Maybe a solution for this dilemma is something that could also
>>> go into a 2.4.2 release.
>>> Cheers,
>>> -- Richard
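
A minimal sketch of the approach discussed in this thread: write a custom header and the type system first, then the binary CAS, and on reading consume the header before handing the same stream to the deserializer. It uses only java.io so that it runs without UIMA on the classpath. The class name, the MAGIC marker, the length-prefixed layout, and the methods write, readTypeSystem and readRest are all hypothetical; readRest merely stands in for a call such as deserializeCAS(cas, inputStream, typeSystem, null), and the byte arrays stand in for the serialized type system and CAS.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class HeaderThenPayload {
    // Hypothetical marker identifying "our" container format (an assumption).
    static final byte[] MAGIC = {'D', 'E', 'M', 'O'};

    // Writes: marker, length-prefixed type system bytes, then the payload.
    // In real code the payload would be produced by the binary CAS serializer.
    static void write(OutputStream out, byte[] typeSystemBytes, byte[] casBytes)
            throws IOException {
        DataOutputStream dos = new DataOutputStream(out);
        dos.write(MAGIC);
        dos.writeInt(typeSystemBytes.length);
        dos.write(typeSystemBytes);
        dos.write(casBytes);
        dos.flush();
    }

    // Pre-reads the marker and the type system. DataInputStream does not
    // read ahead, so "in" is left positioned at the first payload byte.
    // A BufferedInputStream wrapper created here could consume more bytes
    // than requested and would break the hand-over.
    static byte[] readTypeSystem(InputStream in) throws IOException {
        DataInputStream dis = new DataInputStream(in);
        byte[] magic = new byte[MAGIC.length];
        dis.readFully(magic);
        if (!Arrays.equals(magic, MAGIC)) {
            throw new IOException("custom header not found");
        }
        byte[] typeSystemBytes = new byte[dis.readInt()];
        dis.readFully(typeSystemBytes);
        return typeSystemBytes;
    }

    // Stand-in for the deserializer: reads whatever is left on the stream.
    static byte[] readRest(InputStream in) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buf.write(chunk, 0, n);
        }
        return buf.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        write(out,
              "types".getBytes(StandardCharsets.UTF_8),
              "cas-bytes".getBytes(StandardCharsets.UTF_8));

        InputStream in = new ByteArrayInputStream(out.toByteArray());
        byte[] typeSystem = readTypeSystem(in); // header consumed here
        byte[] payload = readRest(in);          // continues after the header

        System.out.println(new String(typeSystem, StandardCharsets.UTF_8));
        System.out.println(new String(payload, StandardCharsets.UTF_8));
    }
}
```

The point of the sketch is only the hand-over: as long as the pre-read consumes exactly the bytes of the custom header, whatever reads from the stream next starts at the first byte of the serialized CAS.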
