hbase-user mailing list archives

From Ryan Rawson <ryano...@gmail.com>
Subject Re: java driver versus thrift
Date Thu, 22 Jul 2010 21:24:13 GMT

The situation is fairly complex and there is no super clear answer.
Here are some facts:

- Thrift requires a shared client/server gateway tier, which is one
more component you have to scale
- Thrift servers live a long time and thus can more effectively amortize
the HTable cache across multiple short client runs
- Thrift servers don't have as advanced a batch put, so you could run
into scaling issues
- The Java API parallelizes with the number of client JVMs you have -
there is no real limit (other than your cluster's ability to handle
the requests)
- The API is probably the way to go if you are in Java
- Stumbleupon uses the Thrift API with PHP and it works like a charm.
The overhead it adds is surprisingly negligible. We deploy thrift
servers on all the regionservers
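
To make the cache-amortization point above concrete, here is a rough
sketch in plain Java. It does not use the HBase client at all:
LocationCache, the first-character "region" split, and both counting
methods are hypothetical stand-ins, made up only to show why a
long-lived process pays for fewer cold lookups than many short client
runs do.

```java
import java.util.HashMap;
import java.util.Map;

public class CacheAmortizationSketch {

    /** Hypothetical stand-in for a client-side region location cache (not the HBase API). */
    static final class LocationCache {
        private final Map<String, String> locations = new HashMap<>();
        private int metaLookups = 0;

        String locate(String rowKey) {
            // Assumption: bucket rows by first character to mimic a coarse region split.
            String region = rowKey.isEmpty() ? "" : rowKey.substring(0, 1);
            return locations.computeIfAbsent(region, r -> {
                metaLookups++; // simulated round trip to look up the region's server
                return "server-for-" + r;
            });
        }

        int metaLookups() {
            return metaLookups;
        }
    }

    /** Every short client run starts with a cold cache, so lookups repeat across runs. */
    static int lookupsWithShortLivedClients(String[][] runs) {
        int total = 0;
        for (String[] run : runs) {
            LocationCache cache = new LocationCache();
            for (String key : run) {
                cache.locate(key);
            }
            total += cache.metaLookups();
        }
        return total;
    }

    /** One long-lived process (for example a gateway) keeps a warm cache across runs. */
    static int lookupsWithLongLivedGateway(String[][] runs) {
        LocationCache cache = new LocationCache();
        for (String[] run : runs) {
            for (String key : run) {
                cache.locate(key);
            }
        }
        return cache.metaLookups();
    }

    public static void main(String[] args) {
        String[][] runs = {
            {"apple", "avocado", "banana"},
            {"apricot", "blueberry"},
            {"cherry", "banana"}
        };
        System.out.println("short-lived clients: " + lookupsWithShortLivedClients(runs));
        System.out.println("long-lived gateway:  " + lookupsWithLongLivedGateway(runs));
    }
}
```

With this toy input the short-lived clients perform 6 simulated lookups
and the long-lived process performs 3. Real numbers depend on region
count, client lifetime and access pattern, so treat this as an
illustration of the shape of the effect only.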

On Thu, Jul 22, 2010 at 2:18 PM, Sylvain Hellegouarch <sh@defuze.org> wrote:
> On Thu, Jul 22, 2010 at 8:22 PM, S Ahmed <sahmed1020@gmail.com> wrote:
>> Can someone explain, at a high level, how the hbase service is exposed?
>> Is it a Java socket or? (sorry not that well versed in this)
>> Does anyone have any numbers on the performance differences between using
>> the native java driver (that presumably connects 'directly') versus the
>> Thrift route?
> Basically when you use the thrift API, you're conversing with a thrift
> server that is written using the Java API. In other words, it's similar to
> some RPC mechanism, which means you'll introduce some overhead compared to
> using the Java API directly.
> I don't have numbers at hand unfortunately.
> --
> - Sylvain
> http://www.defuze.org
> http://twitter.com/lawouach
