uima-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Thilo Goetz <twgo...@gmx.de>
Subject Re: Delta CAS
Date Thu, 10 Jul 2008 14:32:31 GMT
Eddie Epstein wrote:
> On Thu, Jul 10, 2008 at 5:30 AM, Thilo Goetz <twgoetz@gmx.de> wrote:
>> I would like to lift this discussion to a higher
>> level of abstraction, as Adam is trying to.  What
>> are the actual requirements against the CAS?  Here's
>> what I think I understood.
>> You want to be able to obtain from the CAS a marker
>> object.  Then you want to be able to query the CAS
>> with the marker and an FS and ask if the FS was
>> added before or after the marker was obtained.  Is
>> that right?
> That's right, for a simple delta CAS reply from a service, but actually many
> markers for journaling so that additions can be attributed to a specific
> annotator.

Ok, I think that shouldn't make a difference.

> Bhavani Iyer wrote:
>>> If we are thinking of Delta CAS in the context of service the largest xmi
>>> id
>>> works. But
>>> we were also using the same mechanism to support tracking CAS activity by
>>> component.
>>> I suppose in the second case the additional overhead of maintaining a list
>>> of the FSs that
>>> are added may be acceptable.
> A requirement for delta CAS is to identity new FS and modified FS. A low
> cost way to get modified FS is to add the FS-id to a simple list each time a
> feature in a preexisting FS is set, then sort and remove duplicates. Being
> able to identify and ignore new FS at setFeature() time allows huge
> reduction in overhead because modifications are so much less frequent than
> additions.

So you want to do this outside the CAS?  Or not?

> Yes, spreading FS across [power of 2 size] segments should eliminate holes
> and more complicated bookkeeping for large arrays.
> Eddie

View raw message