uima-dev mailing list archives

From Marshall Schor <...@schor.com>
Subject Re: small memory footprint tradeoff configuration
Date Fri, 20 Feb 2009 12:04:02 GMT
One of the ideas for GC was to change the basic heap design to use Java
objects for feature structures.  I'm thinking of some kind of explicit
GC, called by the user, at a point where they know a bunch of objects is
no longer needed (because they've just deleted things out of the index,
for instance).  The use case is one where some set of annotators might
generate many "alternatives", and then a subsequent annotator "picks"
one, and removes the others from the index.

I'm thinking that the implementation might be based on the deep CAS copy
code we already have, modified in an attempt to avoid needing extra space. 
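
To make the idea concrete, here is a rough sketch of a user-triggered
"compacting copy". It is an illustration only and does not use the UIMA
API: Node, compact and countReachable are hypothetical names, and Node is
a toy stand-in for a feature structure. Only what is reachable from the
index gets copied, so anything removed from the index (and not otherwise
referenced) is left behind. Note that this naive version does need extra
space for the copy, which is exactly what a real implementation would
try to avoid.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

public class CompactSketch {

    /** Hypothetical stand-in for a feature structure: a label plus references. */
    static final class Node {
        final String label;
        final List<Node> refs = new ArrayList<>();
        Node(String label) { this.label = label; }
    }

    /** Deep-copies everything reachable from the indexed roots; the rest is not copied. */
    static List<Node> compact(List<Node> indexed) {
        Map<Node, Node> copies = new IdentityHashMap<>();
        List<Node> newIndex = new ArrayList<>();
        for (Node root : indexed) {
            newIndex.add(copy(root, copies));
        }
        return newIndex;
    }

    // Recursive for brevity; a very deep graph would want an explicit stack.
    private static Node copy(Node src, Map<Node, Node> copies) {
        Node done = copies.get(src);
        if (done != null) {
            return done;                 // already copied: keep sharing intact
        }
        Node dst = new Node(src.label);
        copies.put(src, dst);            // register before recursing so cycles terminate
        for (Node r : src.refs) {
            dst.refs.add(copy(r, copies));
        }
        return dst;
    }

    /** Counts distinct nodes reachable from the given roots. */
    static int countReachable(List<Node> roots) {
        Map<Node, Boolean> seen = new IdentityHashMap<>();
        ArrayDeque<Node> stack = new ArrayDeque<>(roots);
        while (!stack.isEmpty()) {
            Node n = stack.pop();
            if (seen.put(n, Boolean.TRUE) == null) {
                for (Node r : n.refs) {
                    stack.push(r);
                }
            }
        }
        return seen.size();
    }

    public static void main(String[] args) {
        Node picked = new Node("picked");
        Node detail = new Node("detail");
        picked.refs.add(detail);
        Node rejected = new Node("rejected");   // an "alternative" that was removed from the index
        rejected.refs.add(detail);

        List<Node> index = new ArrayList<>();
        index.add(picked);

        List<Node> compacted = compact(index);
        System.out.println(countReachable(compacted));  // 2: picked and detail
    }
}
```

The point of the sketch is the reachability rule: the user decides when
to call compact, and whatever the index no longer leads to is dropped.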

I think this would avoid many of the other issues mentioned in the
previous thread http://markmail.org/thread/aolbz4nrvmgjhuyb.  If there
are issues/concerns with this kind of approach, please post/discuss.


Thilo Goetz wrote:
> Marshall Schor wrote:
>> Some users are beginning to ask for the ability to shift the internal
>> tradeoffs UIMA takes toward having a smaller memory footprint, at some
>> cost in performance.
>> Several areas in particular have come up: 
>>   1) "interning" string objects, so that only one copy exists
>>   2) having some way to "compact" or garbage-collect the CAS
> My suggestions for garbage collection in the CAS met with strong
> resistance on this list in the past.  I'll be interested to see
> what you'll come up with to overcome that resistance.
>> Are there other things that should be considered for trade-off here?
>> -Marshall
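
For item 1 in the quoted list ("interning" string objects), a minimal
sketch of the idea might look like the following. It is not UIMA code:
StringPoolSketch and canonical are hypothetical names, and the pool is an
ordinary HashMap. Each string value is looked up once and every later
equal string is replaced by the first copy, which trades a hash lookup
per string for holding only one copy of each value.

```java
import java.util.HashMap;
import java.util.Map;

public class StringPoolSketch {

    // Maps each distinct string value to the single copy we keep.
    private final Map<String, String> pool = new HashMap<>();

    /** Returns the pooled copy of s, adding s to the pool if it is new. */
    String canonical(String s) {
        if (s == null) {
            return null;
        }
        String previous = pool.putIfAbsent(s, s);
        return previous != null ? previous : s;
    }

    /** Number of distinct string values currently pooled. */
    int size() {
        return pool.size();
    }

    public static void main(String[] args) {
        StringPoolSketch strings = new StringPoolSketch();
        String a = new String("NounPhrase");   // two equal but distinct objects
        String b = new String("NounPhrase");
        System.out.println(a == b);                                        // false
        System.out.println(strings.canonical(a) == strings.canonical(b));  // true
    }
}
```

Java's built-in String.intern() does the same job against a JVM-wide
pool; a per-CAS pool like the one sketched here would let the pooled
strings be released together when the pool itself is dropped.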
