uima-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Marshall Schor <...@schor.com>
Subject Re: Delta CAS
Date Wed, 09 Jul 2008 18:37:08 GMT
Here's a suggestion suggested by previous posts, and common hardware 
design for segmented memory.

Take the int values that represent feature structure (fs) references.  
Today, these are positive numbers from 1 (I think) to around 4 billion.  
These values are used directly as an index into the heap.

Change this to split the bits in these int values into two parts, let's 
call them upper and lower.  For example
xxxx xxxx xxxx yyyy yyyy yyyy yyyy yyyy

where the xxx's are the upper bits (each x represents a hex digit), and 
the y's the lower bits.  The y's in this case can represent numbers up 
to 1 million (approx), and the xxx's represent 4096 values.

Then allocate the heap using multiple 1 meg entry tables, and store each 
one in the 4096 entry reference array.  The heap reference would be some 
bit-wise shifting and indexed lookup in addition to what we have now and 
would probably be very fast, and could be optimized for the xxx=0 case 
to be even faster.

This breaks heaps of over 1 meg into separate parts, which would make 
them more managable, I think, and keeps the high-water mark method 
viable, too.

Opinions?

-Marshall



Mime
View raw message