uima-dev mailing list archives

From "Eddie Epstein" <eaepst...@gmail.com>
Subject Re: Delta CAS
Date Thu, 10 Jul 2008 12:49:22 GMT
On Thu, Jul 10, 2008 at 5:30 AM, Thilo Goetz <twgoetz@gmx.de> wrote:

> I would like to lift this discussion to a higher
> level of abstraction, as Adam is trying to.  What
> are the actual requirements against the CAS?  Here's
> what I think I understood.
>
> You want to be able to obtain from the CAS a marker
> object.  Then you want to be able to query the CAS
> with the marker and an FS and ask if the FS was
> added before or after the marker was obtained.  Is
> that right?


That's right for a simple delta CAS reply from a service. Journaling,
however, actually needs many markers, so that additions can be attributed
to a specific annotator.
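
To make the marker idea concrete, here is a rough sketch, not actual UIMA
code. It assumes FS ids are handed out in strictly increasing order, so a
marker is just the id watermark at the time it was obtained. The names
MarkerSketch, createFs, mark, isNew and attribute are all made up for
illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch, not the UIMA API. Assumes FS ids only ever increase. */
final class MarkerSketch {
    private int nextFsId = 1;                                 // next id to hand out
    private final List<Integer> markers = new ArrayList<>();  // id watermark per marker
    private final List<String> owners = new ArrayList<>();    // annotator label per marker

    /** Allocates a new FS and returns its id. */
    int createFs() {
        return nextFsId++;
    }

    /** Records the current watermark under the given label; returns the marker index. */
    int mark(String owner) {
        markers.add(nextFsId);
        owners.add(owner);
        return markers.size() - 1;
    }

    /** True if the FS was added after the given marker was obtained. */
    boolean isNew(int marker, int fsId) {
        return fsId >= markers.get(marker);
    }

    /** Label of the latest marker at or below fsId, or null if the FS predates all markers. */
    String attribute(int fsId) {
        String owner = null;
        for (int i = 0; i < markers.size(); i++) {
            if (fsId < markers.get(i)) {
                break;                                        // watermarks are non-decreasing
            }
            owner = owners.get(i);
        }
        return owner;
    }
}
```

With a single marker this covers the service reply case; taking one marker
before each annotator runs gives the journaling case.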

Bhavani Iyer wrote:
>
>> If we are thinking of Delta CAS in the context of service the largest xmi
>> id works. But we were also using the same mechanism to support tracking
>> CAS activity by component. I suppose in the second case the additional
>> overhead of maintaining a list of the FSs that are added may be acceptable.
>
>
A requirement for delta CAS is to identify new FS and modified FS. A
low-cost way to get the modified FS is to add the FS-id to a simple list each
time a feature in a preexisting FS is set, then sort and remove duplicates.
Being able to identify and ignore new FS at setFeature() time allows a huge
reduction in overhead, because most feature sets are on newly added FS;
modifications of preexisting FS are much less frequent than additions.
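
As a minimal sketch of that bookkeeping (again hypothetical, not existing
UIMA code): it assumes a single marker expressed as an id watermark, with
FS ids at or above it counting as new. ModificationJournal, onSetFeature
and modifiedIds are invented names.

```java
import java.util.Arrays;

/** Hypothetical sketch: logs writes to preexisting FS only. */
final class ModificationJournal {
    private final int marker;        // assumption: FS ids >= marker are new
    private int[] log = new int[16]; // append-only list of modified FS ids
    private int size = 0;

    ModificationJournal(int marker) {
        this.marker = marker;
    }

    /** To be called from the setFeature() path with the id of the FS being written. */
    void onSetFeature(int fsId) {
        if (fsId >= marker) {
            return;                  // new FS: it goes out whole anyway, so skip it
        }
        if (size == log.length) {
            log = Arrays.copyOf(log, size * 2);
        }
        log[size++] = fsId;          // duplicates are fine here, cleaned up later
    }

    /** Sorted, duplicate-free ids of preexisting FS that were modified. */
    int[] modifiedIds() {
        return Arrays.stream(log, 0, size).sorted().distinct().toArray();
    }
}
```

The cost on the hot path is one integer comparison for new FS and one
array append for preexisting ones; the sort and dedup happen once, when
the delta is produced.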

Yes, spreading FS across [power of 2 size] segments should eliminate holes
and more complicated bookkeeping for large arrays.
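
One way to read the segment idea, as a sketch under my own assumptions
rather than a description of the actual heap: addresses are plain ints,
the segment index is the high bits and the offset is the low bits, and an
FS may straddle a segment boundary, so nothing is ever skipped to make a
large array fit. SegmentedHeap and its methods are hypothetical, and the
segment size is tiny only to keep the example readable.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a heap split into power-of-2 sized segments. */
final class SegmentedHeap {
    private static final int SEG_BITS = 4;              // 16 cells per segment (illustrative)
    private static final int SEG_SIZE = 1 << SEG_BITS;
    private static final int SEG_MASK = SEG_SIZE - 1;

    private final List<int[]> segments = new ArrayList<>();
    private int next = 0;                                // next free cell address

    /** Reserves a contiguous address range; it may span several segments. */
    int allocate(int cells) {
        int addr = next;
        next += cells;
        while (segments.size() * SEG_SIZE < next) {
            segments.add(new int[SEG_SIZE]);             // grow without copying old data
        }
        return addr;
    }

    int get(int addr) {
        return segments.get(addr >>> SEG_BITS)[addr & SEG_MASK];
    }

    void set(int addr, int value) {
        segments.get(addr >>> SEG_BITS)[addr & SEG_MASK] = value;
    }

    /** Cells handed out so far; equals the sum of all allocations, i.e. no holes. */
    int used() {
        return next;
    }

    int segmentCount() {
        return segments.size();
    }
}
```

Because the shift and mask locate any cell directly, a large array needs
no special-case bookkeeping beyond its start address and length.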

Eddie
