thrift-user mailing list archives

From Vincent <pho...@gmail.com>
Subject Re: Why thrift serialization is so much slow-slow?
Date Wed, 09 Jan 2013 14:44:30 GMT
Hi, Henrique

The data consists of Thrift struct instances, which can't be encoded by json.

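I switched to CompactProtocol instead of BinaryProtocol:
from thrift.protocol import TCompactProtocol
from thrift.transport import TTransport

def ser1(data):
    # Serialize a Thrift struct into an in-memory buffer and return the bytes.
    transportOut = TTransport.TMemoryBuffer()
    protocolOut = TCompactProtocol.TCompactProtocol(transportOut)
    data.write(protocolOut)
    return transportOut.getvalue()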

But it is even slower (8.61 s versus 7.76 s before). Test results:

Test thrift
start: 1357742179.9
File length: 316378
File length: 316378
File length: 316378
File length: 316378
File length: 316378
File length: 316378
File length: 316378
File length: 316378
File length: 316378
File length: 316378
end: 1357742188.51
elapse: 8.61072683334


Test json
start: 1357742188.51
File length: 217252
File length: 217252
File length: 217252
File length: 217252
File length: 217252
File length: 217252
File length: 217252
File length: 217252
File length: 217252
File length: 217252
end: 1357742188.58
elapse: 0.0665349960327
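
A harness of roughly the following shape would produce output in the format shown above. This is only a minimal sketch under assumptions, not the code from the gist: `run_benchmark`, its parameters, and the `sample` payload are hypothetical names, and it uses only the standard library so it runs without Thrift installed. Note that the print calls sit inside the timed region.

```python
import json
import time


def run_benchmark(name, serialize, payload, rounds=10):
    """Time `rounds` calls of serialize(payload) and print a report."""
    print("Test %s" % name)
    start = time.time()
    print("start: %s" % start)
    lengths = []
    for _ in range(rounds):
        encoded = serialize(payload)
        lengths.append(len(encoded))
        print("File length: %d" % len(encoded))
    end = time.time()
    print("end: %s" % end)
    elapsed = end - start
    print("elapse: %s" % elapsed)
    return elapsed, lengths


# Hypothetical stand-in payload; the real test data lives in the gist.
sample = {"items": [{"id": i, "name": "item-%d" % i} for i in range(1000)]}
elapsed, lengths = run_benchmark("json", json.dumps, sample)
```

To time the Thrift path the same way, pass `ser1` and a Thrift struct instance in place of `json.dumps` and `sample`.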

On Wed, Jan 9, 2013 at 7:34 AM, Henrique Mendonça <henrique@apache.org> wrote:

> Hi Vincent,
>
> Do you have any reason to use test_json(rawdata) instead of test_json(data)
> ?
> It looks like the object is a lot smaller on rawdata...
> Anyway you're trying to compare a built-in serialisation function with the
> binary implementation, and in this case I would expect some loss of
> performance.
> You can also try to use the compact protocol, as it's also supposed to be a
> little faster, and/or the new json protocol. If you have a benchmark on
> that, please report back to us.
>
> Regards,
> Henrique
>
> On 6 January 2013 09:43, Vincent <phostu@gmail.com> wrote:
>
> > Hi, all
> >
> > I'm using thrift in python, and I found that serializing structured data
> > in thrift-python is very slow.
> >
> > I wrote a serialization test comparing thrift and json:
> >
> > *testing thrift definition:*
> > https://gist.github.com/4465825
> > https://gist.github.com/4465826
> >
> > *python testing code:*
> > https://gist.github.com/4465830
> >
> > *testing data:*
> > https://gist.github.com/4465834
> >
> > *testing results:*
> > https://gist.github.com/4465853
> >
> >
> > testing results:
> >
> > > Test thrift
> > > start: 1357457502.17
> > > File length: 796500
> > > File length: 796500
> > > File length: 796500
> > > File length: 796500
> > > File length: 796500
> > > File length: 796500
> > > File length: 796500
> > > File length: 796500
> > > File length: 796500
> > > File length: 796500
> > > end: 1357457509.93
> > > elapse: 7.7634768486
> > >
> > > Test json
> > > start: 1357457509.93
> > > File length: 217252
> > > File length: 217252
> > > File length: 217252
> > > File length: 217252
> > > File length: 217252
> > > File length: 217252
> > > File length: 217252
> > > File length: 217252
> > > File length: 217252
> > > File length: 217252
> > > end: 1357457510.01
> > > elapse: 0.0743980407715
> >
> > As the results above show, thrift serialization is too slow for me to
> > accept.
> >
> > And my questions are:
> >
> >    - Am I using thrift incorrectly?
> >    - Or was thrift not designed to transport big data (200k, is that big?)
> >
> >
> > Thanks.
> > --
> > Vincent.Wen
> >
>



-- 
Vincent.Wen
