uima-dev mailing list archives

From "Bhavani Iyer" <bhavan...@gmail.com>
Subject Re: Delta CAS
Date Thu, 10 Jul 2008 15:33:51 GMT
The journaling information is associated with a CAS but is not part of the
CAS. The marker object is fine for identifying new FSs. To identify
modifications to pre-existing FSs, the activity has to be logged by the
CAS APIs that set feature values and add or remove FSs from the index.

In addition, we'll need APIs to access this journaling information, both for
XMI serialization and for viewing activity by component when debugging.
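
To make the marker-plus-log idea concrete, here is a minimal Java sketch. It
is not the UIMA API: the class DeltaJournal and its methods (createFs, mark,
isNew, onSetFeature, modifiedFsIds) are hypothetical names, and it assumes
that FS ids are ints handed out in increasing order, so that a marker can
simply be the next unused id.

```java
import java.util.Arrays;

/** Hypothetical journal kept alongside a CAS, not inside it. */
class DeltaJournal {
    private int nextFsId = 1;   // stand-in for the CAS allocating FS ids in increasing order
    private int marker = -1;    // -1 means no marker has been obtained yet
    private int[] modified = new int[16];
    private int modifiedCount = 0;

    /** Simulates FS creation; a real CAS would allocate the id itself. */
    int createFs() {
        return nextFsId++;
    }

    /** Records the high-water mark: every id from here on belongs to a new FS. */
    void mark() {
        marker = nextFsId;
    }

    /** True if the FS was created after the marker was obtained. */
    boolean isNew(int fsId) {
        return marker >= 0 && fsId >= marker;
    }

    /** Meant to be called by the APIs that set a feature value. */
    void onSetFeature(int fsId) {
        if (marker < 0 || isNew(fsId)) {
            return; // nothing to journal yet, or a new FS that is reported whole anyway
        }
        if (modifiedCount == modified.length) {
            modified = Arrays.copyOf(modified, modified.length * 2);
        }
        modified[modifiedCount++] = fsId; // cheap append; duplicates are removed later
    }

    /** Sorted, duplicate-free ids of pre-existing FSs that were modified. */
    int[] modifiedFsIds() {
        return Arrays.stream(modified, 0, modifiedCount).sorted().distinct().toArray();
    }
}
```

The append in onSetFeature stays cheap because new FSs are filtered out first;
sorting and de-duplication are deferred until the list is actually read, for
example at serialization time. Index add/remove logging and per-component
markers would need further bookkeeping that this sketch leaves out.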


On Thu, Jul 10, 2008 at 10:32 AM, Thilo Goetz <twgoetz@gmx.de> wrote:

> Eddie Epstein wrote:
>> On Thu, Jul 10, 2008 at 5:30 AM, Thilo Goetz <twgoetz@gmx.de> wrote:
>>> I would like to lift this discussion to a higher
>>> level of abstraction, as Adam is trying to.  What
>>> are the actual requirements against the CAS?  Here's
>>> what I think I understood.
>>> You want to be able to obtain from the CAS a marker
>>> object.  Then you want to be able to query the CAS
>>> with the marker and an FS and ask if the FS was
>>> added before or after the marker was obtained.  Is
>>> that right?
>> That's right for a simple delta CAS reply from a service, but actually
>> many markers are needed for journaling, so that additions can be
>> attributed to a specific annotator.
> Ok, I think that shouldn't make a difference.
>> Bhavani Iyer wrote:
>>> If we are thinking of Delta CAS in the context of a service, the largest
>>> xmi id works. But we were also using the same mechanism to support
>>> tracking CAS activity by component.
>>> I suppose in the second case the additional overhead of maintaining a
>>> list of the FSs that are added may be acceptable.
>> A requirement for delta CAS is to identify new FSs and modified FSs. A
>> low-cost way to get the modified FSs is to add the FS id to a simple list
>> each time a feature in a preexisting FS is set, then sort and remove
>> duplicates. Being able to identify and ignore new FSs at setFeature() time
>> allows a huge reduction in overhead, because modifications are so much
>> less frequent than additions.
> So you want to do this outside the CAS?  Or not?
>> Yes, spreading FS across [power of 2 size] segments should eliminate holes
>> and more complicated bookkeeping for large arrays.
>> Eddie
