uima-dev mailing list archives

From "Marshall Schor (JIRA)" <...@uima.apache.org>
Subject [jira] [Created] (UIMA-2498) add lenient version for binary compressed serialization/deserialization
Date Mon, 12 Nov 2012 20:25:13 GMT
Marshall Schor created UIMA-2498:

             Summary: add lenient version for binary compressed serialization/deserialization
                 Key: UIMA-2498
                 URL: https://issues.apache.org/jira/browse/UIMA-2498
             Project: UIMA
          Issue Type: Improvement
          Components: Core Java Framework
    Affects Versions: 2.4.0SDK
            Reporter: Marshall Schor
            Assignee: Marshall Schor
            Priority: Minor
             Fix For: 2.4.1SDK

Extend binary compressed serialization and deserialization to support cases where the source
and target type systems are not exactly the same.

There are two use cases.
First: the source is a previously saved file. The goal is to deserialize it into, e.g., a tool
whose type system may differ somewhat from the type system used to create the file (for
instance, it may be at a different version level).

Second: the source is a client for a UIMA-AS service.  In this case, the client has read the
service's type system, and has merged it with its own.

Differences between the type systems could be:
A type exists in one, but not in the other;
A type exists in both, but with different features (including those inherited from supertypes).
Features could be added or removed. Features could have different ranges (incompatible ranges
should cause error messages).
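
The kinds of difference listed above can be sketched as a small comparison routine. This is
only an illustration, not UIMA code: it assumes each type system has been flattened into a
hypothetical map of type name -> (feature name -> range type name), with inherited features
already included, and the class and field names are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical stand-in: a type system is typeName -> (featureName -> rangeTypeName). */
final class TypeSystemDiff {
  final List<String> missingInTarget = new ArrayList<>(); // in source only
  final List<String> missingInSource = new ArrayList<>(); // in target only
  final List<String> rangeConflicts = new ArrayList<>();  // same feature, different range

  static TypeSystemDiff compare(Map<String, Map<String, String>> source,
                                Map<String, Map<String, String>> target) {
    TypeSystemDiff d = new TypeSystemDiff();
    // TreeMap copies give a deterministic (sorted) reporting order.
    for (Map.Entry<String, Map<String, String>> e : new TreeMap<>(source).entrySet()) {
      String type = e.getKey();
      Map<String, String> srcFeats = e.getValue();
      Map<String, String> tgtFeats = target.get(type);
      if (tgtFeats == null) {          // type exists in source, not in target
        d.missingInTarget.add(type);
        continue;
      }
      for (Map.Entry<String, String> f : new TreeMap<>(srcFeats).entrySet()) {
        String name = type + ":" + f.getKey();
        String tgtRange = tgtFeats.get(f.getKey());
        if (tgtRange == null) {
          d.missingInTarget.add(name); // feature dropped in target
        } else if (!tgtRange.equals(f.getValue())) {
          d.rangeConflicts.add(name);  // candidates for an error message
        }
      }
      for (String feat : new TreeMap<>(tgtFeats).keySet()) {
        if (!srcFeats.containsKey(feat)) {
          d.missingInSource.add(type + ":" + feat); // feature added in target
        }
      }
    }
    for (String type : new TreeMap<>(target).keySet()) {
      if (!source.containsKey(type)) {
        d.missingInSource.add(type);   // type exists in target, not in source
      }
    }
    return d;
  }
}
```

In this sketch any range mismatch is reported as a conflict; a real implementation might accept
some range differences as compatible.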

A suggested implementation approach: create a mapper that maps type codes and feature codes,
and set it up by comparing the two type systems.  For the first use case, implement a version
of deserialization that takes the source type system as an extra input, creates the mapper, and
then deserializes using its conversions.  For the second use case, at initialization time,
after the service's type system has been read (for merging into the client's type system
definition), use it to create the same kind of mapper between type codes / feature codes; when
sending a CAS via binary serialization, send it through the mapper so that type codes and
feature codes are converted.
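
A minimal sketch of such a mapper follows, under stated assumptions: each type system is
represented as a hypothetical list of names whose list index is the code, and the class name
CodeMapper, the NOT_PRESENT marker, and the translate method are invented for illustration.
None of this is the actual UIMA API. The same structure would apply to feature codes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical mapper from source codes to target codes, matched by name. */
final class CodeMapper {
  static final int NOT_PRESENT = -1; // source item has no counterpart in target

  private final int[] srcToTgt;

  CodeMapper(List<String> sourceNames, List<String> targetNames) {
    // Index the target names once so each source lookup is O(1).
    Map<String, Integer> tgtIndex = new HashMap<>();
    for (int i = 0; i < targetNames.size(); i++) {
      tgtIndex.put(targetNames.get(i), i);
    }
    srcToTgt = new int[sourceNames.size()];
    for (int i = 0; i < srcToTgt.length; i++) {
      srcToTgt[i] = tgtIndex.getOrDefault(sourceNames.get(i), NOT_PRESENT);
    }
  }

  /** Returns the target code for a source code, or NOT_PRESENT. */
  int map(int sourceCode) {
    return srcToTgt[sourceCode];
  }

  /** Lenient pass: convert codes, skipping items the target does not define. */
  List<Integer> translate(List<Integer> sourceCodes) {
    List<Integer> out = new ArrayList<>();
    for (int code : sourceCodes) {
      int mapped = map(code);
      if (mapped != NOT_PRESENT) {
        out.add(mapped);
      }
    }
    return out;
  }
}
```

Skipping unmapped items is one possible reading of "lenient"; whether to skip, warn, or fail
would be a policy decision for the real implementation.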

Try to arrange things so that the creation of the mapper can be done once per "set" of CASes,
rather than once per CAS.
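
One way to get "once per set" behavior is to cache the mapper keyed by the pair of type systems.
The sketch below assumes the same hypothetical name-list representation of a type system and
uses invented names (MapperCache, mapperFor, builds); a real implementation would more likely
key on the type system objects themselves.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical cache: builds a code mapping once per (source, target) pair. */
final class MapperCache {
  private final Map<List<List<String>>, int[]> cache = new ConcurrentHashMap<>();
  final AtomicInteger builds = new AtomicInteger(); // counts actual constructions

  int[] mapperFor(List<String> source, List<String> target) {
    // Defensive copies keep the key stable if the caller later mutates its lists.
    List<List<String>> key =
        Arrays.asList(new ArrayList<>(source), new ArrayList<>(target));
    return cache.computeIfAbsent(key, k -> build(source, target));
  }

  private int[] build(List<String> source, List<String> target) {
    builds.incrementAndGet();
    int[] mapping = new int[source.size()];
    for (int i = 0; i < mapping.length; i++) {
      // indexOf returns -1 when the target lacks the name.
      mapping[i] = target.indexOf(source.get(i));
    }
    return mapping;
  }
}
```

Every CAS in the same set then reuses the returned array, so the comparison cost is paid once.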

