uima-dev mailing list archives

From Thilo Goetz <twgo...@gmx.de>
Subject Re: Delta CAS
Date Tue, 08 Jul 2008 20:32:08 GMT
Marshall Schor wrote:
> Thilo Goetz wrote:
>> Bhavani Iyer wrote:
>>> Hi Thilo,
>>> There are two separate requirements being addressed here:
>>> 1) Delta CAS for optimizing remote services.
>>>      Here it's agreed that there should be no measurable overhead
>>>      when there is no remoting.  There will be a single test against
>>>      the high water mark.  The high water mark defaults to 0.  Only
>>>      when the high water mark is set to a value greater than 0 is
>>>      logging of CAS operations on FSs below the high water mark
>>>      enabled.
>>> 2) Journaling for debugging aggregate components.
>>>      This capability is for Core UIMA as well as for remote services.
>>>      It will have some additional overhead and will have to be
>>>      explicitly enabled by the aggregate controller for a component.
>>>      Basically, the aggregate controller enables journaling by setting
>>>      the high water mark before the call to process.
>>> Regarding using the high water mark, this is already being used for
>>> merging CAS.
>> That's not a good thing, and certainly no justification of using
>> the same design here.  
> Can you say more about why this is not a good thing?  I see it as an 
> internal design detail.

Precisely.  It's an implementation detail of the CAS heap that
we should be able to change -- that we must be able to change
if we would like to improve on the heap.  The CAS heap and
in particular the way it grows is a major performance bottleneck
for large documents.  If we have other parts of UIMA depend on
the (bad) implementation details now, we'll never be able to
improve on the design.

>> I thought you needed to keep a list of the
>> added FSs anyway.
> I don't think such a list is kept.  There is a list of modified FSs
> below the high water mark, and a list of things added/removed/modified
> with the indexes (in other words, if new feature structures were added,
> but not indexed, they would not be in any list).
> -Marshall
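
To make the design under discussion concrete, here is a minimal Java sketch of the "single test against the high water mark" that Bhavani describes, with the mark hidden behind an opaque marker object so that callers never see heap addresses, which is the kind of decoupling asked for above. Everything in it is an assumption made for illustration: ToyCas, DeltaMarker, createMarker and modifiedBelowMark are hypothetical names, not UIMA APIs, and the list-backed "heap" is a stand-in, not the real CAS heap.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical marker: tells whether an FS existed before the mark was set. */
interface DeltaMarker {
    boolean isPreexisting(int fsRef);
}

/** Toy stand-in for a CAS heap (assumption; not the UIMA implementation). */
class ToyCas {
    // One int[] of feature values per feature structure (FS).
    private final List<int[]> heap = new ArrayList<>();
    // Journal of FSs below the mark that were modified after the mark was set.
    private final Set<Integer> modifiedBelowMark = new LinkedHashSet<>();
    // 0 means "no mark set": FS refs start at 1, so the test below never fires.
    private int highWaterMark = 0;

    /** Allocates a new FS and returns its ref (1-based). */
    int createFs(int featureCount) {
        heap.add(new int[featureCount]);
        return heap.size();
    }

    /** Sets the mark at the next free ref and hands back an opaque marker. */
    DeltaMarker createMarker() {
        highWaterMark = heap.size() + 1;
        final int mark = highWaterMark;
        return fsRef -> fsRef < mark;
    }

    /** Writes a feature value; journals the FS only if it lies below the mark. */
    void setFeature(int fsRef, int feature, int value) {
        heap.get(fsRef - 1)[feature] = value;
        if (fsRef < highWaterMark) { // the single test; always false while the mark is 0
            modifiedBelowMark.add(fsRef);
        }
    }

    /** Read-only view of the journal. */
    Set<Integer> modifiedBelowMark() {
        return Collections.unmodifiableSet(modifiedBelowMark);
    }
}

class DeltaCasSketch {
    public static void main(String[] args) {
        ToyCas cas = new ToyCas();
        int a = cas.createFs(2);            // created before the mark
        DeltaMarker marker = cas.createMarker();
        int b = cas.createFs(2);            // created after the mark
        cas.setFeature(b, 0, 1);            // new FS: not journaled
        cas.setFeature(a, 1, 9);            // preexisting FS: journaled
        System.out.println(cas.modifiedBelowMark()); // prints [1]
        System.out.println(marker.isPreexisting(b)); // prints false
    }
}
```

In this sketch only ToyCas knows that the mark is a position in the heap. A caller holding a DeltaMarker could keep working unchanged if the heap were later replaced by a structure that grows differently, as long as the marker can still answer the same question. As in Marshall's description, FSs created after the mark are not put in any list here; they are recognized only by failing the marker test.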
