thrift-user mailing list archives

From Bjørn Borud <bbo...@gmail.com>
Subject Re: Serializing large data sets
Date Fri, 11 Jun 2010 16:37:34 GMT
On Fri, Jun 11, 2010 at 5:32 PM, Dave Engberg <dengberg@evernote.com> wrote:

>
> Evernote uses Thrift for all client-server communications, including
> third-party API integrations (http://www.evernote.com/about/developer/api/).
>  We serialize messages up to 55MB via Thrift.  This is very efficient on the
> wire, but marshalling and unmarshalling objects can take a fair amount of
> RAM due to various temporary buffers built into the networking and IO
> runtime libraries.
>

Do you use TFramedTransport?  If so, I assume you have set the maximum
frame size to 55MB to avoid OOM errors?  I've been thinking a bit about
this lately, since I may want to expose a Thrift API to the outside
world.  Not setting a limit makes a server exceptionally susceptible to
denial-of-service (just connect a socket and say "asdf" and boom).  Setting
the limit too high only means it takes about 5 minutes more hacking to
write a program that sucks up lots of resources on the server.
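
To make the "asdf" case concrete: a framed transport prefixes each message
with a 4-byte big-endian length, so the four bytes of "asdf" are read as a
frame size of 0x61736466, roughly 1.5GB.  Below is a minimal sketch of the
kind of check a size limit performs before any buffer is allocated.  It is
not Thrift's own code; FrameGuard, checkedFrameLength and MAX_FRAME_BYTES
are hypothetical names, and the 55MB limit is just the figure from the
quoted message.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Thrift: validates a 4-byte frame header
// before any buffer is allocated for the frame body.
final class FrameGuard {
    // Assumed limit, taken from the 55MB message size mentioned in the thread.
    static final int MAX_FRAME_BYTES = 55 * 1024 * 1024;

    // Interprets the header as a big-endian signed 32-bit length and rejects
    // negative or oversized values with an unchecked exception.
    static int checkedFrameLength(byte[] header, int maxFrameBytes) {
        if (header.length != 4) {
            throw new IllegalArgumentException("frame header must be 4 bytes");
        }
        int length = ByteBuffer.wrap(header).getInt(); // ByteBuffer is big-endian by default
        if (length < 0 || length > maxFrameBytes) {
            throw new IllegalArgumentException("frame length out of range: " + length);
        }
        return length;
    }

    public static void main(String[] args) {
        // The bytes of "asdf" read as a length: 0x61736466 = 1634952294.
        byte[] bogus = "asdf".getBytes(StandardCharsets.US_ASCII);
        System.out.println(ByteBuffer.wrap(bogus).getInt());
        try {
            checkedFrameLength(bogus, MAX_FRAME_BYTES);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The check only bounds a single frame.  A client can still declare a frame
just under the limit on each of many connections, which is the "limit too
high" problem described above.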

(I guess this problem is also why TFramedTransport avoids using
direct-allocated ByteBuffer?)

One improvement would be to have the ability to do sanity checks on frames
over a certain size -- so that connections writing bogus data can be killed
off early.  But it isn't a quick fix and I am not entirely convinced that it
is worthwhile either.
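
One possible shape for such a sanity check, purely as a sketch: for frames
above some threshold, look at the first four bytes of the payload and drop
the connection unless they look like the start of a message.  This assumes
TBinaryProtocol in strict mode, where a message begins with the 32-bit word
0x80010000 | messageType.  FrameSanity, looksPlausible and the 1MB threshold
are hypothetical, not anything in Thrift.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, not part of Thrift.
final class FrameSanity {
    // Assumption: TBinaryProtocol strict mode, where the first 32-bit word
    // of a message is VERSION_1 | messageType.
    static final int VERSION_MASK = 0xffff0000;
    static final int VERSION_1 = 0x80010000;

    // Assumed threshold above which a frame gets the extra check.
    static final int SUSPICIOUS_BYTES = 1024 * 1024;

    // Returns true when the connection may continue, given the declared
    // frame length and the payload bytes received so far.
    static boolean looksPlausible(int declaredLength, byte[] firstBytes) {
        if (declaredLength <= SUSPICIOUS_BYTES) {
            return true; // small frames are not inspected
        }
        if (firstBytes.length < 4) {
            return false; // not enough data to judge a large frame
        }
        int word = ByteBuffer.wrap(firstBytes, 0, 4).getInt();
        int type = word & 0xff;
        // Message types 1..4 are CALL, REPLY, EXCEPTION and ONEWAY.
        return (word & VERSION_MASK) == VERSION_1 && type >= 1 && type <= 4;
    }
}
```

A client that knows the protocol can fake those four bytes trivially, so
this only weeds out accidental garbage, which fits the doubt above about
whether it is worthwhile.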

-Bjørn
