uima-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Thilo Goetz <twgo...@gmx.de>
Subject Re: Delta CAS
Date Thu, 10 Jul 2008 09:24:42 GMT
Pretty much what I had in mind.  Don't forget that
ints are signed, so unless you just drop the high
bit, the watermark may not work.  That's my concern.

I agree with Eddie that a fixed scheme may not work
for all deployments, so we would need to be flexible.

--Thilo

Marshall Schor wrote:
> Here's a suggestion suggested by previous posts, and common hardware 
> design for segmented memory.
> 
> Take the int values that represent feature structure (fs) references.  
> Today, these are positive numbers from 1 (I think) to around 4 billion.  
> These values are used directly as an index into the heap.
> 
> Change this to split the bits in these int values into two parts, let's 
> call them upper and lower.  For example
> xxxx xxxx xxxx yyyy yyyy yyyy yyyy yyyy
> 
> where the xxx's are the upper bits (each x represents a hex digit), and 
> the y's the lower bits.  The y's in this case can represent numbers up 
> to 1 million (approx), and the xxx's represent 4096 values.
> 
> Then allocate the heap using multiple 1 meg entry tables, and store each 
> one in the 4096 entry reference array.  The heap reference would be some 
> bit-wise shifting and indexed lookup in addition to what we have now and 
> would probably be very fast, and could be optimized for the xxx=0 case 
> to be even faster.
> 
> This breaks heaps of over 1 meg into separate parts, which would make 
> them more managable, I think, and keeps the high-water mark method 
> viable, too.
> 
> Opinions?
> 
> -Marshall
> 

Mime
View raw message