phoenix-dev mailing list archives

From maryannxue <...@git.apache.org>
Subject [GitHub] phoenix pull request: Phoenix-2405
Date Wed, 01 Jun 2016 02:12:08 GMT
Github user maryannxue commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/171#discussion_r65293136
  
    --- Diff: phoenix-core/src/main/java/org/apache/phoenix/iterate/DeferredByteBufferSegmentQueue.java ---
    @@ -0,0 +1,123 @@
    +package org.apache.phoenix.iterate;
    +
    +import org.apache.commons.io.output.DeferredFileOutputStream;
    +import org.apache.phoenix.memory.MemoryManager;
    +import org.apache.phoenix.memory.MemoryManager.MemoryChunk;
    +
    +import java.io.*;
    +import java.util.*;
    +
    +public abstract class DeferredByteBufferSegmentQueue<T> extends BufferSegmentQueue<T> {
    +
    +    final MemoryChunk chunk;
    +
    +    public DeferredByteBufferSegmentQueue(int index, int thresholdBytes,
    +                                          boolean hasMaxQueueSize, MemoryManager memoryManager) {
    +        super(index, thresholdBytes, hasMaxQueueSize);
    +        chunk = memoryManager.allocate(thresholdBytes);
    --- End diff --
    
    "thresholdBytes" might be confusing here. There are actually two kinds of memory usage
involved. The first is the in-memory priority queue used for sorting. Its size is an estimate
(based on the priority queue data structure) rather than an actual value, and once it reaches
the threshold, the priority queue content should be written to some kind of file OutputStream,
which is now DeferredFileOutputStream. The second is the memory used by
DeferredFileOutputStream itself, since its content first stays in memory until its own
threshold is reached.
    Therefore, we might need to allocate twice (it's not a real allocation anyway; it only
tracks memory usage). But a better way to do this is to reuse the SpoolingResultIterator logic
to handle the entire second part described above. The logic should be exactly the same, except
that SpoolingResultIterator writes and reads Tuples and what you need here is something that
writes and reads ResultEntry. So see if you can apply some abstraction here.
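
    To make the suggested abstraction a bit more concrete, here is a rough sketch using only
java.io. It is not Phoenix code: EntryCodec, SpoolingBuffer, StringCodec and SpoolingDemo are
hypothetical names, and the sketch does not use MemoryManager, SpoolingResultIterator or
DeferredFileOutputStream. It only shows the shape of the idea: the spill-to-file logic is
written once, and the element type (Tuple, ResultEntry, ...) is plugged in through a codec.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical: how one element type is written to and read back from a stream. */
interface EntryCodec<T> {
    void write(DataOutputStream out, T entry) throws IOException;
    T read(DataInputStream in) throws IOException;
}

/** Hypothetical: keeps serialized entries in memory, spills to a temp file past the threshold. */
final class SpoolingBuffer<T> implements AutoCloseable {
    private final int thresholdBytes;
    private final EntryCodec<T> codec;
    private ByteArrayOutputStream memory = new ByteArrayOutputStream();
    private DataOutputStream out = new DataOutputStream(memory);
    private File spillFile; // null while everything is still in memory
    private int count;

    SpoolingBuffer(int thresholdBytes, EntryCodec<T> codec) {
        this.thresholdBytes = thresholdBytes;
        this.codec = codec;
    }

    void add(T entry) throws IOException {
        codec.write(out, entry);
        count++;
        if (spillFile == null && memory.size() > thresholdBytes) {
            spill();
        }
    }

    /** Copies the in-memory bytes to a temp file and redirects later writes there. */
    private void spill() throws IOException {
        spillFile = File.createTempFile("spool", ".bin");
        spillFile.deleteOnExit();
        OutputStream file = new BufferedOutputStream(Files.newOutputStream(spillFile.toPath()));
        memory.writeTo(file);
        memory = null; // the in-memory copy is no longer needed
        out = new DataOutputStream(file);
    }

    boolean isSpilled() {
        return spillFile != null;
    }

    /** Bytes currently held in memory; this is the number a memory tracker would care about. */
    int inMemoryBytes() {
        return spillFile == null ? memory.size() : 0;
    }

    /** Reads every entry back in insertion order, from memory or from the spill file. */
    List<T> readAll() throws IOException {
        out.flush();
        InputStream source = spillFile == null
                ? new ByteArrayInputStream(memory.toByteArray())
                : new BufferedInputStream(Files.newInputStream(spillFile.toPath()));
        List<T> entries = new ArrayList<>(count);
        try (DataInputStream in = new DataInputStream(source)) {
            for (int i = 0; i < count; i++) {
                entries.add(codec.read(in));
            }
        }
        return entries;
    }

    @Override
    public void close() throws IOException {
        out.close();
        if (spillFile != null) {
            Files.deleteIfExists(spillFile.toPath());
        }
    }
}

/** Stand-in codec for the demo; a real one would serialize Tuple or ResultEntry. */
final class StringCodec implements EntryCodec<String> {
    public void write(DataOutputStream out, String s) throws IOException {
        out.writeUTF(s);
    }

    public String read(DataInputStream in) throws IOException {
        return in.readUTF();
    }
}

class SpoolingDemo {
    public static void main(String[] args) throws IOException {
        try (SpoolingBuffer<String> buffer = new SpoolingBuffer<>(16, new StringCodec())) {
            buffer.add("row-1");
            buffer.add("row-2");
            buffer.add("row-3");
            System.out.println("spilled=" + buffer.isSpilled() + " entries=" + buffer.readAll());
        }
    }
}
```

    With something along these lines, the queue would only need to supply a codec for
ResultEntry, and the memory tracking for the spooled part would live in one place instead of
being duplicated.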


