uima-dev mailing list archives

From Thilo Goetz <twgo...@gmx.de>
Subject Re: Delta CAS
Date Thu, 10 Jul 2008 09:26:15 GMT
Eddie Epstein wrote:
> No opinions, but a few observations:
> 1M is way too big for some applications that need very small, but very many
> CASes.

I agree.

> Large arrays may be bigger than whatever segment size is chosen, making
> segment management a bit more complicated.
> There will be holes at the top of every segment when the next FS doesn't
> fit.

Not necessarily.  Why couldn't you spread FSs and arrays
across segments?
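
To make the question concrete, here is a minimal sketch of an array that straddles a segment boundary. It is not UIMA code: the class `SpanningHeapSketch`, its methods, and the tiny 16-entry segments are all assumptions chosen for illustration. Because every cell is reached through its logical address, a write can simply be split into one chunk per segment, leaving no hole at the top of a segment.

```java
// Hypothetical sketch, not UIMA code: a heap whose arrays may span segments.
final class SpanningHeapSketch {
    private final int lowerBits;     // number of low bits used as the in-segment offset
    private final int segmentSize;   // entries per segment, always a power of two
    private final int[][] segments;  // segment table; segments are allocated on demand

    SpanningHeapSketch(int lowerBits, int segmentCount) {
        this.lowerBits = lowerBits;
        this.segmentSize = 1 << lowerBits;
        this.segments = new int[segmentCount][];
    }

    int segmentOf(int addr) {
        return addr >>> lowerBits;
    }

    // Writes values at consecutive logical addresses starting at 'start',
    // one chunk per segment touched. Bounds checks are omitted for brevity.
    void writeArray(int start, int[] values) {
        int written = 0;
        while (written < values.length) {
            int addr = start + written;
            int seg = addr >>> lowerBits;
            int off = addr & (segmentSize - 1);
            if (segments[seg] == null) {
                segments[seg] = new int[segmentSize];
            }
            // Copy as much as fits in the remainder of this segment.
            int chunk = Math.min(segmentSize - off, values.length - written);
            System.arraycopy(values, written, segments[seg], off, chunk);
            written += chunk;
        }
    }

    // Reads one cell; cells in segments that were never allocated read as 0.
    int read(int addr) {
        int[] seg = segments[addr >>> lowerBits];
        return seg == null ? 0 : seg[addr & (segmentSize - 1)];
    }

    public static void main(String[] args) {
        SpanningHeapSketch heap = new SpanningHeapSketch(4, 4); // 4 segments of 16 entries
        heap.writeArray(14, new int[] {1, 2, 3, 4, 5});         // crosses from segment 0 into 1
        System.out.println(heap.read(15) + " " + heap.read(16));
    }
}
```

The cost of this approach is that no code could assume an FS or array sits contiguously in a single `int[]`; every access would have to go through the lookup.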

> Eddie
> On Wed, Jul 9, 2008 at 2:37 PM, Marshall Schor <msa@schor.com> wrote:
>> Here's a suggestion prompted by previous posts and by a common hardware
>> design for segmented memory.
>> Take the int values that represent feature structure (fs) references.
>>  Today, these are positive numbers from 1 (I think) to around 4 billion.
>>  These values are used directly as an index into the heap.
>> Change this to split the bits in these int values into two parts, let's
>> call them upper and lower.  For example
>> xxxx xxxx xxxx yyyy yyyy yyyy yyyy yyyy
>> where the x's are the upper bits (each x or y represents one bit), and the
>> y's the lower bits.  The 20 y's in this case can represent numbers up to 1
>> million (approx), and the 12 x's represent 4096 values.
>> Then allocate the heap using multiple 1M-entry tables, and store each
>> one in the 4096-entry reference array.  A heap reference would then need
>> some bit-wise shifting and an indexed lookup in addition to what we have now.
>> That would probably be very fast, and could be optimized for the xxx=0 case
>> to be even faster.
>> This breaks heaps of over 1M entries into separate parts, which would make
>> them more manageable, I think, and keeps the high-water mark method viable, too.
>> Opinions?
>> -Marshall
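
For reference, the 12/20 bit split proposed above might be sketched as follows. This is an illustration only, not UIMA code: the class `SegmentedHeapSketch` and all of its members are hypothetical names, and the sketch assumes the int reference is treated as unsigned (hence `>>>`), which is what lets 12 upper bits address all 4096 segments.

```java
// Hypothetical sketch, not UIMA code: heap references split into a 12-bit
// segment number (the x bits) and a 20-bit offset (the y bits).
final class SegmentedHeapSketch {
    static final int LOWER_BITS = 20;                        // the y bits
    static final int SEGMENT_SIZE = 1 << LOWER_BITS;         // 1,048,576 entries per segment
    static final int LOWER_MASK = SEGMENT_SIZE - 1;          // 0x000FFFFF
    static final int SEGMENT_COUNT = 1 << (32 - LOWER_BITS); // 4096 segments

    // The 4096-entry reference array; each segment is allocated on first write.
    private final int[][] segments = new int[SEGMENT_COUNT][];

    // Unsigned shift, so references with the top bit set still map to 2048..4095.
    static int segmentOf(int ref) {
        return ref >>> LOWER_BITS;
    }

    static int offsetOf(int ref) {
        return ref & LOWER_MASK;
    }

    // Reads one heap cell; cells in segments never written read as 0.
    int get(int ref) {
        int[] seg = segments[ref >>> LOWER_BITS];
        return seg == null ? 0 : seg[ref & LOWER_MASK];
    }

    void set(int ref, int value) {
        int s = ref >>> LOWER_BITS;
        if (segments[s] == null) {
            segments[s] = new int[SEGMENT_SIZE];
        }
        segments[s][ref & LOWER_MASK] = value;
    }

    // Number of segments actually backed by memory.
    int allocatedSegments() {
        int count = 0;
        for (int[] seg : segments) {
            if (seg != null) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        SegmentedHeapSketch heap = new SegmentedHeapSketch();
        int ref = (3 << LOWER_BITS) | 7;   // segment 3, offset 7
        heap.set(ref, 99);
        System.out.println(segmentOf(ref) + " " + offsetOf(ref) + " " + heap.get(ref));
    }
}
```

A small CAS that never grows past the first segment touches only `segments[0]`, which is where the xxx=0 fast path mentioned above would apply. The segment size is fixed at 1M entries here only to match the proposal.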
