thrift-user mailing list archives

From Alex Boisvert <>
Subject Re: heterogeneous collections
Date Mon, 03 May 2010 17:30:42 GMT
On Mon, May 3, 2010 at 10:03 AM, Mayan Moudgill <> wrote:

> The idea of marshalling to strings seems somewhat counter-productive; after
> all, you're marshalling the data using Thrift, which then gets sent to a
> server, and demarshalls it. Now, on top of that you're adding another layer
> of marshalling.

Understood; that's why, ideally, I'd rather have Thrift handle this for me
transparently and more efficiently.

> If, however, you're encoding the data for demarshalling at the server, it
> sounds like you want a different RPC framework. For instance, do you really
> need the version flexibility that is provided by Thrift? Are your types
> fixed at source & destination? Do you need a leaner transport? In fact, why
> did you pick Thrift in the first place?

Yes, I want to version my service interfaces.   I chose Thrift because we
already use it for other services -- in the spirit of consistency and
minimizing the number of RPC frameworks we use.

>> In this case, I'm actually not looking for something optimal in terms of
>> efficiency.  The data structures I'm passing in are small and the services
>> I'm calling are coarse-grained so the transport+marshalling costs should be
>> relatively insignificant compared to what happens in the service.
> Have you actually measured this? Why do you think that this might be the
> case?

No, but the service I'm calling is IO+CPU intensive so it's a safe
assumption that any data marshalling will be only a fraction of I/O and
processing costs.

> I understand that, but being sloppy about performance can lead to real pain.
> Consider the case where an efficient implementation fits in a single CPU.
> Being a little sloppy means that you have to go to a multi-threaded
> implementation, but within a chip. Being a lot sloppy means that you may
> have to go to a distributed implementation....

Ugh?  I'm talking about serializing simple data types like int32 and int64
to/from strings.  I think you're getting carried away ;)

Mind you, the service itself is already a multi-threaded and distributed
implementation -- it provides query processing for a "big data"
multi-dimensional cube.
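
To make the string marshalling concrete, here is a minimal sketch of what I
have in mind, assuming the heterogeneous values travel in a plain Thrift
list<string>. The "i32:", "i64:" and "str:" tags and the encode/decode
helper names are hypothetical, made up for illustration; nothing here is
part of Thrift itself.

```python
# Sketch only: tag each value with its type so the receiver can rebuild it.
# The tag names and helpers below are hypothetical, not a Thrift API.

I32_MIN, I32_MAX = -2**31, 2**31 - 1
I64_MIN, I64_MAX = -2**63, 2**63 - 1


def encode(value):
    """Encode one int or str as a tagged string, e.g. 42 -> 'i32:42'."""
    if isinstance(value, bool):
        # bool is a subclass of int in Python; reject it explicitly.
        raise TypeError("bool is not supported in this sketch")
    if isinstance(value, int):
        if I32_MIN <= value <= I32_MAX:
            return "i32:%d" % value
        if I64_MIN <= value <= I64_MAX:
            return "i64:%d" % value
        raise OverflowError("integer does not fit in 64 bits")
    if isinstance(value, str):
        return "str:" + value
    raise TypeError("unsupported type: %s" % type(value).__name__)


def decode(text):
    """Inverse of encode(): split on the first ':' and dispatch on the tag."""
    tag, sep, body = text.partition(":")
    if not sep:
        raise ValueError("missing type tag")
    if tag in ("i32", "i64"):
        return int(body)
    if tag == "str":
        return body
    raise ValueError("unknown type tag: %s" % tag)


def encode_all(values):
    """Encode a heterogeneous list into a list of strings."""
    return [encode(v) for v in values]


def decode_all(strings):
    """Decode a list of tagged strings back into Python values."""
    return [decode(s) for s in strings]
```

The cost per value is one string format and one int parse, which is the
extra marshalling layer being discussed; whether that is negligible next to
the query itself would still need to be measured.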

