uima-dev mailing list archives

From Thilo Goetz <twgo...@gmx.de>
Subject Re: small memory footprint tradeoff configuration
Date Fri, 20 Feb 2009 12:23:54 GMT
Marshall Schor wrote:
> One of the ideas for GC was to change the basic heap design to use Java
> objects for feature structures.  I'm thinking of some kind of explicit
> GC, called by the user, at a point where they know a bunch of objects is
> no longer needed (because they've just deleted things out of the index,
> for instance).  The use case is one where some set of annotators might
> generate many "alternatives", and then a subsequent annotator "picks"
> one, and removes the others from the index.
> I'm thinking that the implementation might be based on the deep CAS copy
> code we already have, modified in an attempt to avoid needing extra space. 
> I think this would avoid many of the other issues mentioned in the
> previous thread http://markmail.org/thread/aolbz4nrvmgjhuyb.  If there
> are issues/concerns with this kind of approach, please post/discuss.

It would change the internal IDs of FSs, which was always a
big no-no for some people.
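
To make the ID concern concrete, here is a minimal sketch of a copy-based
compaction over a toy heap. It is not UIMA code: the class ToyHeapCompaction,
the method compact, and the modelling of an FS ID as a position in a list are
all assumptions made for illustration only. The point is just that copying the
live entries into a fresh heap gives survivors new positions, so anything that
held on to an old ID needs the old-to-new mapping.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical toy model, not the UIMA CAS: an "ID" is just a list position.
final class ToyHeapCompaction {

    /**
     * Copies the live entries of the toy heap into 'compacted', preserving order.
     * Returns a map from each surviving entry's old ID to its new ID.
     */
    static Map<Integer, Integer> compact(List<String> heap,
                                         Set<Integer> liveIds,
                                         List<String> compacted) {
        Map<Integer, Integer> oldToNew = new HashMap<>();
        for (int oldId = 0; oldId < heap.size(); oldId++) {
            if (liveIds.contains(oldId)) {
                // The new ID is the next free position in the compacted heap.
                oldToNew.put(oldId, compacted.size());
                compacted.add(heap.get(oldId));
            }
        }
        return oldToNew;
    }

    public static void main(String[] args) {
        // Two alternatives were dropped from the index; entries 2 and 3 survive.
        List<String> heap = Arrays.asList("alt1", "alt2", "picked", "other");
        Set<Integer> live = new HashSet<>(Arrays.asList(2, 3));
        List<String> compacted = new ArrayList<>();
        Map<Integer, Integer> oldToNew = compact(heap, live, compacted);
        System.out.println("compacted heap: " + compacted);
        System.out.println("old ID -> new ID: " + oldToNew);
    }
}
```

In this toy model, any code that cached the old ID of a surviving entry would
now point at the wrong slot, or past the end of the heap, unless it is
translated through the returned map.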

> -Marshall
> Thilo Goetz wrote:
>> Marshall Schor wrote:
>>> Some users are beginning to ask for the ability to shift the internal
>>> tradeoffs UIMA takes toward having a smaller memory footprint, at some
>>> cost in performance.
>>> Several areas in particular have come up: 
>>>   1) "interning" string objects, so that only one copy exists
>>>   2) having some way to "compact" or garbage-collect the CAS
>> My suggestions for garbage collection in the CAS met with strong
>> resistance on this list in the past.  I'll be interested to see
>> what you'll come up with to overcome that resistance.
>>> Are there other things that should be considered for trade-off here?
>>> -Marshall
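
On item 1 in the quoted list, a minimal sketch of what "interning" could look
like. This is not UIMA code: the StringPool class and its methods are
hypothetical names chosen for illustration. Each distinct string value is
stored once, and callers get back the shared instance, which trades a hash
lookup per string for less memory when many values repeat.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of UIMA: keeps one canonical copy per string value.
final class StringPool {
    private final Map<String, String> pool = new HashMap<>();

    /** Returns the canonical instance equal to s, adding s to the pool if it is new. */
    String intern(String s) {
        if (s == null) {
            return null;
        }
        String existing = pool.putIfAbsent(s, s);
        return existing != null ? existing : s;
    }

    /** Number of distinct string values currently held. */
    int size() {
        return pool.size();
    }
}
```

The JDK's own String.intern() does something similar against a JVM-wide table.
A pool object like the one above could instead be dropped as a whole once it is
no longer needed, which is one possible reason to prefer it here.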
