uima-dev mailing list archives

From Richard Eckart de Castilho <...@apache.org>
Subject Re: Alternate CAS implementation
Date Thu, 02 Apr 2015 06:57:29 GMT
Hi Nick,

On 02.04.2015, at 01:37, Nick Hill <apache@nickhill.org> wrote:

>> From my point of view, it would be nice if it was possible to configure the UIMA
>> framework to produce either this new kind of CAS or the old one without having to exchange
>> a JAR - doing so statically at initialization time or even dynamically at runtime. E.g. to
>> allow easily running test cases against both implementations.
> When you say "produce", there shouldn't be any visible difference in anything output
> or persisted, the impl is just how the CAS is stored internally in memory while processing
> is happening.
> It won't be possible to switch the impl being used at runtime. There are classes for
> example with the same names but different impls (e.g. CASImpl). I know this isn't ideal for
> tests/comparisons between the two impls but quite a lot of things are currently tightly-coupled
> to the heap internals and so switching a jar doesn't seem too big a price to pay given no
> other code changes are needed.

What is the ultimate goal of this experiment? Is it to support different CAS
implementations, or to replace the existing CAS implementation with a totally different
one?

Most things in UIMA are created through factories (not the CAS so far). So theoretically,
one could replace most classes by custom classes by reconfiguring the framework to use different
factory classes or having the factories produce different implementations. Can you imagine
that as well for the CAS?
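
To make the factory question concrete, here is a minimal sketch of choosing a CAS implementation through a configurable factory. All names (`SimpleCas`, `HeapCas`, `AlternateCas`, `SimpleCasFactory`, and the keys "heap" and "alternate") are hypothetical and are not UIMA API; the real CAS interface is far larger, and this assumes the two implementations can live side by side under different class names.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical stand-in for the CAS interface; not the real UIMA CAS.
interface SimpleCas {
    String implName();
}

// Hypothetical stand-in for the existing heap-based implementation.
class HeapCas implements SimpleCas {
    public String implName() { return "heap"; }
}

// Hypothetical stand-in for the alternate implementation.
class AlternateCas implements SimpleCas {
    public String implName() { return "alternate"; }
}

// Hypothetical factory: maps a configuration key to an implementation.
class SimpleCasFactory {
    private static final Map<String, Supplier<SimpleCas>> IMPLS = new HashMap<>();

    static {
        IMPLS.put("heap", HeapCas::new);
        IMPLS.put("alternate", AlternateCas::new);
    }

    // Creates a new CAS of the requested kind, or fails for unknown keys.
    static SimpleCas create(String implKey) {
        Supplier<SimpleCas> supplier = IMPLS.get(implKey);
        if (supplier == null) {
            throw new IllegalArgumentException("Unknown CAS implementation: " + implKey);
        }
        return supplier.get();
    }
}

class CasFactoryDemo {
    public static void main(String[] args) {
        // The same calling code runs against both implementations.
        for (String key : new String[] {"heap", "alternate"}) {
            SimpleCas cas = SimpleCasFactory.create(key);
            System.out.println(key + " -> " + cas.getClass().getSimpleName());
        }
    }
}
```

With something along these lines, a test suite could loop over the configured keys and run each test case against both implementations without exchanging a JAR.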

Does it mean that the UIMA-C++ implementation is going to be discontinued officially?

>> Having to recompile the JCas classes is a bit of a blocker to me - but I remember
>> that Marshall was contemplating a way to generate JCas classes at runtime, so this might
>> just be a temporary blocker.
> When I say recompile, I don't mean regenerate using JCasGen, just recompile .class files
> from the existing jcas .java files. I would expect that you would typically only be using
> one version (other than for comparison purposes - to validate functional equivalence and/or
> compare performance), and so this isn't something that would need to be done often.

Compiled JCas classes tend to be shipped as part of frameworks. This means that it will not
be possible to switch to a new CAS impl just by replacing a JAR. It will also mean that components
from different UIMA-based frameworks cannot be mixed and matched anymore unless some broker
like UIMA-AS is used. 

>> In one context, we also rely heavily on CAS addresses serving as unique identifiers
>> of feature structures in the CAS. Does your implementation provide any stable feature structure
>> IDs, preferably ones that are part of the system and not actually declared as features?
> Yes, there are various cases where an 'equivalent' of an FS address is required (for
> example if the LL API is being used). In this case the id gets allocated on the fly and will
> subsequently be unique to that FS within the CAS. In many cases an FS might never have such
> an ID allocated (it's not really part of the non-LL "public" APIs), but you can always 'request'

I imagine that IDs would be necessary to implement stuff like delta-CAS later on too.
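
The on-the-fly ID allocation described in the quoted reply could look roughly like the sketch below. This is only an illustration under assumptions: `LazyIdRegistry` and its methods are hypothetical names, feature structures are modelled as plain objects, and starting IDs at 1 is an arbitrary choice, not something stated about the actual implementation.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical per-CAS registry that hands out IDs only when requested.
class LazyIdRegistry {
    // Identity-based map: two distinct objects never share an ID,
    // even if they happen to be equal according to equals().
    private final Map<Object, Integer> idsByFs = new IdentityHashMap<>();
    private final List<Object> fsById = new ArrayList<>();

    // Returns the ID of the given feature structure, allocating one on first request.
    int idOf(Object fs) {
        Integer existing = idsByFs.get(fs);
        if (existing != null) {
            return existing;
        }
        fsById.add(fs);
        int id = fsById.size(); // assumption: IDs start at 1
        idsByFs.put(fs, id);
        return id;
    }

    // Resolves a previously allocated ID back to its feature structure.
    Object lookup(int id) {
        return fsById.get(id - 1);
    }

    // True if an ID has already been allocated for this feature structure.
    boolean hasId(Object fs) {
        return idsByFs.containsKey(fs);
    }
}

class LazyIdDemo {
    public static void main(String[] args) {
        LazyIdRegistry registry = new LazyIdRegistry();
        Object fs = new Object();
        System.out.println("has id before request: " + registry.hasId(fs));
        int id = registry.idOf(fs);
        System.out.println("same id on second request: " + (registry.idOf(fs) == id));
        System.out.println("lookup resolves to same object: " + (registry.lookup(id) == fs));
    }
}
```

A feature structure that is never asked for its ID costs nothing here, while one that has been asked keeps the same ID for the lifetime of the registry.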

Are any of the changes so far in any way related to potentially allowing additions to the
type system at runtime?

What would be the incentive or benefit for developers of UIMA-based frameworks/applications,
or for the users of such frameworks/applications, to switch to the new implementation?


-- Richard