hbase-user mailing list archives

From takeshi <takeshi.m...@gmail.com>
Subject Re: HTableUtil and equivalent
Date Tue, 01 Oct 2013 07:45:25 GMT
Hi,

I think your assumption is right. Each HTable instance has its own
'writeAsyncBuffer', which holds Put objects until the buffered size exceeds
'writeBufferSize' or 'flushCommits()' is called.

Here are the docs:
http://hbase.apache.org/book/perf.writing.html#perf.hbase.client.autoflush

Here is the code snippet for o.a.h.hbase.client.HTable
{code:java}
public class HTable implements HTableInterface {
  ...
  @Override
  public void put(final Put put)
      throws InterruptedIOException, RetriesExhaustedWithDetailsException {
    doPut(put);
    if (autoFlush) {
      flushCommits();
    }
  }
  ...
  /**
   * Add the put to the buffer. If the buffer is already too large, sends the buffer to the
   *  cluster.
   * @throws RetriesExhaustedWithDetailsException if there is an error on the cluster.
   * @throws InterruptedIOException if we were interrupted.
   */
  private void doPut(Put put) throws InterruptedIOException,
      RetriesExhaustedWithDetailsException {
    if (ap.hasError()){
      backgroundFlushCommits(true);
    }

    validatePut(put);

    currentWriteBufferSize += put.heapSize();
    writeAsyncBuffer.add(put);

    while (currentWriteBufferSize > writeBufferSize) {
      backgroundFlushCommits(false);
    }
  }
}
{code}
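
To make the per-instance behaviour concrete, here is a small standalone
sketch. It is not HBase code: the class BufferedTableSketch and its methods
are hypothetical names I made up, and the flush just clears a list instead
of talking to a cluster. It only models the size check from doPut() above,
to show that two instances fill and flush their buffers independently.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Simplified model of a per-instance client write buffer.
 * NOT the HBase implementation; every name here is hypothetical.
 */
class BufferedTableSketch {
    private final long writeBufferSize;      // flush threshold in bytes
    private final boolean autoFlush;         // flush after every put when true
    private final List<String> writeBuffer = new ArrayList<>(); // pending row keys
    private long currentWriteBufferSize = 0; // bytes currently buffered
    private int flushCount = 0;              // number of non-empty flushes so far

    public BufferedTableSketch(long writeBufferSize, boolean autoFlush) {
        this.writeBufferSize = writeBufferSize;
        this.autoFlush = autoFlush;
    }

    /** Buffer one put; flush if autoFlush is on or the threshold is exceeded. */
    public void put(String rowKey, long heapSize) {
        writeBuffer.add(rowKey);
        currentWriteBufferSize += heapSize;
        if (autoFlush || currentWriteBufferSize > writeBufferSize) {
            flushCommits();
        }
    }

    /** "Send" everything buffered in this instance (here: just clear it). */
    public void flushCommits() {
        if (writeBuffer.isEmpty()) {
            return;
        }
        writeBuffer.clear();
        currentWriteBufferSize = 0;
        flushCount++;
    }

    public int pendingPuts() { return writeBuffer.size(); }

    public int flushCount() { return flushCount; }

    public static void main(String[] args) {
        BufferedTableSketch a = new BufferedTableSketch(100, false);
        BufferedTableSketch b = new BufferedTableSketch(100, false);
        a.put("row-1", 60);   // 60 bytes buffered in a, below the threshold
        a.put("row-2", 60);   // 120 > 100, so a flushes
        b.put("row-9", 60);   // b has its own buffer and is unaffected by a
        System.out.println("a: pending=" + a.pendingPuts() + " flushes=" + a.flushCount());
        System.out.println("b: pending=" + b.pendingPuts() + " flushes=" + b.flushCount());
    }
}
```

In this model, instance a flushes once while b still holds its single
pending put, which is the behaviour you would rely on if you used one
HTable per group of keys. The real client flushes in the background and
tracks errors, so treat this only as an illustration of the bookkeeping.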



Best regards

takeshi


2013/10/1 Graeme Wallace <graeme.wallace@farecompare.com>

> Hi,
>
> I've got a scenario whereby I'm pulling a stream of messages off a Kafka
> topic, reformatting them and then I want to write them into HBase.
>
> I've seen various suggestions on how to improve performance - but wondered
> if there was a way whereby I could do something equivalent to HTableUtil
> but without having to maintain my own lists of Puts.
>
> Is the underlying output buffer (assuming autoFlush=false) associated with
> an HTable instance, or do all HTable instances share the same output
> buffer? I.e. if it was one per HTable, I could make sure that only keys in
> the same region get written through that HTable.
>
>
>
>
> --
> Graeme Wallace
> CTO
> FareCompare.com
> O: 972 588 1414
> M: 214 681 9018
>
