hbase-user mailing list archives

From Nick Dimiduk <ndimi...@gmail.com>
Subject Re: significant scan performance difference between Thrift(c++) and Java: 4X slower
Date Sat, 07 Mar 2015 19:00:13 GMT
You can try the REST gateway, though it has the same basic architecture as
the Thrift gateway. Maybe the details work out in your favor with REST.

On Fri, Mar 6, 2015 at 11:31 PM, nidmgg <nidmgg@gmail.com> wrote:

> Stack,
>
> Thanks for the quick response. Well, the extra layer really kills the
> performance. The 'hop' is so expensive.
>
> Is there another C/C++ API to try out?  I saw there is a JIRA, HBASE-1015,
> but it has been inactive for a while.
>
> Demai
>
> Stack <stack@duboce.net> wrote:
>
> >Is it because of the 'hop'?  Java goes against RS. The thrift C++ goes to a
> >thriftserver which hosts a java client and then it goes to the RS?
> >St.Ack
> >
> >On Fri, Mar 6, 2015 at 4:46 PM, Demai Ni <nidmgg@gmail.com> wrote:
> >
> >> hi, guys,
> >>
> >> I am trying to get a rough idea of how the C++ and Java clients compare
> >> in performance when accessing an HBase table, and am surprised to find
> >> that Thrift (C++) is 4X slower.
> >>
> >> The performance result is:
> >> C++:  real    *16m11.313s*; user    5m3.642s; sys    2m21.388s
> >> Java: real    *4m6.012s*; user    0m31.228s; sys    0m8.018s
> >>
> >>
> >> I have a single-node HBase (98.6) cluster with 1X TPC-H loaded, and use the
> >> largest table, lineitem, which has 6M rows, roughly 600MB of data.
> >>
> >> For the C++ client, I used the Thrift example provided by hbase-examples;
> >> the C++ code looks like:
> >>
> >> >  std::string t("lineitem");
> >> >  int scanner =  client.scannerOpenWithScan(t, tscan, dummyAttributes);
> >> >  int count = 0;
> >> > ..
> >> >  while (true) {
> >> >    std::vector<TRowResult> value;
> >> >    client.scannerGet(value, scanner);
> >> >    if (value.size() == 0) break;
> >> >    count ++;
> >> >  }
> >> >
> >> >  std::cout << count << " rows scanned"<< std::endl;
> >> >
> >>
> >> The Java client is the simplest one:
> >>
> >> >     HTable table = new HTable(conf,"lineitem");
> >> >
> >> >     Scan scan = new Scan();
> >> >     ResultScanner resScanner;
> >> >     resScanner = table.getScanner(scan);
> >> >     int count = 0;
> >> >     for (Result res: resScanner) {
> >> >       count ++;
> >> >     }
> >> >
> >>
> >>
> >>
> >> Since most of the time should be spent on I/O, I didn't expect any
> >> significant difference between Thrift (C++) and Java. Any ideas? Many thanks.
> >>
> >> Demai
> >>
>
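
One thing that might be worth trying, though it was not measured in this thread: the quoted C++ loop calls scannerGet, which in the HBase Thrift interface returns one row per call, so a 6M-row scan pays roughly 6M round trips to the thriftserver on top of the extra hop. The same interface also has scannerGetList(id, nbRows), which returns up to nbRows rows per call. Below is a minimal sketch of a batched counting loop built on that call. The name countRowsBatched, the batch parameter, and the template parameters are hypothetical and only keep the sketch free of generated headers; in the real program Client would be the Thrift-generated HbaseClient and Row would be TRowResult.

```cpp
#include <cstdint>
#include <vector>

// Counts rows by fetching up to `batch` rows per Thrift call instead of one.
// Assumes Client exposes scannerGetList(out, scannerId, nbRows) and
// scannerClose(scannerId), as the Thrift-generated HbaseClient does.
// countRowsBatched itself is a hypothetical helper, not part of any library.
template <typename Row, typename Client>
long countRowsBatched(Client& client, std::int32_t scanner, std::int32_t batch) {
  long count = 0;
  std::vector<Row> rows;
  while (true) {
    rows.clear();
    client.scannerGetList(rows, scanner, batch);
    if (rows.empty()) break;  // an empty batch means the scan is exhausted
    count += static_cast<long>(rows.size());
  }
  client.scannerClose(scanner);  // release the server-side scanner
  return count;
}
```

With the generated client this would be called as countRowsBatched<TRowResult>(client, scanner, 1000). The batch size of 1000 is an arbitrary starting point: for 6M rows it means about 6,000 calls instead of about 6M. Whether that closes the 4X gap is an open question, since the extra hop through the thriftserver remains.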
