thrift-user mailing list archives

From Abhay M <iter...@gmail.com>
Subject Re: Serializing large data sets
Date Fri, 11 Jun 2010 20:04:33 GMT
Thanks! This is helpful.

I'll try to get a sense of how much RAM it takes to deserialize messages of
this type:

struct TRecordList {
  1: list<TRec> records,
}

Assuming the message is first parsed into TRec beans (because it is defined
as a list of beans), which are in turn converted into application beans, I am
guessing peak usage of approximately 3 times the size of the serialized
message (probably more).
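
As a back-of-envelope check, that guess can be written out as arithmetic. This is only a sketch: the function name and the idea of counting three live copies (serialized buffer, TRec beans, application beans) are assumptions for illustration, and the ~30M figure from the original question is read here as 30 MB.

```python
MB = 1024 * 1024


def estimate_peak_bytes(serialized_bytes, copies=3):
    """Rough peak RAM if `copies` representations of the data are alive at once.

    copies=3 is an assumed count: the serialized buffer, the TRec beans,
    and the application beans. Real usage is probably higher because of
    per-object overhead and temporary buffers.
    """
    return serialized_bytes * copies


# Assumed message size: 30 MB, taken from the ~30M figure in the question.
peak = estimate_peak_bytes(30 * MB)
print(peak // MB, "MB")
```

Under those assumptions a 30 MB message needs on the order of 90 MB while it is being converted, so the multiplier matters more than the wire size.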

Thanks again


On Fri, Jun 11, 2010 at 11:32 AM, Dave Engberg <dengberg@evernote.com> wrote:

>
> Evernote uses Thrift for all client-server communications, including
> third-party API integrations (http://www.evernote.com/about/developer/api/).
>  We serialize messages up to 55MB via Thrift.  This is very efficient on the
> wire, but marshalling and unmarshalling objects can take a fair amount of
> RAM due to various temporary buffers built into the networking and IO
> runtime libraries.
>
>
>
> On 6/11/10 8:26 AM, Abhay M wrote:
>
>> Hi,
>>
>> Are there any known concerns with serializing large data sets with Thrift?
>> I am looking to serialize messages with 10-150K records, sometimes
>> resulting in ~30M per message. These messages are serialized for storage.
>>
>> I have been experimenting with Google protobuf and saw this in the
>> documentation (
>> http://code.google.com/apis/protocolbuffers/docs/techniques.html) -
>> "Protocol Buffers are not designed to handle large messages. As a general
>> rule of thumb, if you are dealing in messages larger than a megabyte each,
>> it may be time to consider an alternate strategy."
>> FWIW, I did switch to the delimited write/parse API (Java only) as
>> recommended in the doc and it works well. But the Python protobuf impl
>> lacks this API and is slow.
>>
>> Thanks
>> Abhay
>>
>>
>>
>
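
The delimited write/parse idea mentioned in the original question does not depend on any one library: each record is written with a length prefix, so a reader can pull records off the stream one at a time instead of holding the whole list in memory. The sketch below is a minimal standard-library illustration of that framing, not the protobuf or Thrift API. The function names are hypothetical, and it uses a fixed 4-byte big-endian length where protobuf's Java API uses a varint. The payloads stand in for individually serialized records.

```python
import io
import struct


def write_delimited(stream, payloads):
    """Write each payload preceded by its length as a 4-byte big-endian int."""
    for payload in payloads:
        stream.write(struct.pack(">I", len(payload)))
        stream.write(payload)


def read_delimited(stream):
    """Yield payloads one at a time; only one record is in memory at once."""
    while True:
        header = stream.read(4)
        if not header:
            return  # clean end of stream
        if len(header) < 4:
            raise EOFError("truncated length prefix")
        (size,) = struct.unpack(">I", header)
        payload = stream.read(size)
        if len(payload) < size:
            raise EOFError("truncated record")
        yield payload


# Round trip through an in-memory buffer; a file opened in binary mode
# would work the same way.
buf = io.BytesIO()
write_delimited(buf, [b"rec-1", b"rec-22", b""])
buf.seek(0)
records = list(read_delimited(buf))
```

Because the reader is a generator, each record can be deserialized and converted to an application object before the next one is read, which keeps peak memory closer to the size of one record than to the size of the whole message.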
