thrift-user mailing list archives

From 刘畅 <>
Subject Need help! Re: Hi there, I have got a question about the serialization performance using Thrift
Date Tue, 22 Dec 2015 18:35:15 GMT
Honestly, we’ve done something to ‘improve’ the RPC.

And somehow it seems that the modification is slowing the whole system down.

At first, I thought that the complexity of the response and its serialization was the problem
causing the performance drop.

But it turns out that it wasn’t. The structure is like below:

	struct DBResult {
	    1: i32 errorno,
	    2: string errordesp,
	    3: list<DBField> column,
	    4: list<list<string>> data
	}

With only 2000 rows in the DBResult, the whole response of the current query is about 300 KB.

We separated out the serialization using TMemoryBuffer, and it shows that serialization is not
the issue. It is the network that costs us about 0.3 seconds in each query.
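For anyone who wants to reproduce this kind of measurement, here is a minimal stand-in sketch using only the Python standard library (`pickle` and `io.BytesIO` play the roles of the Thrift serializer and TMemoryBuffer here, and the 2000-row payload is synthetic; this is illustrative, not our actual code):

```python
import io
import pickle
import time

# Synthetic stand-in for a 2000-row DBResult (assumption: the real
# payload is Thrift-serialized; pickle is just an illustrative proxy).
result = {
    "errorno": 0,
    "errordesp": "",
    "column": ["id", "name", "value"],
    "data": [[str(i), "name-%d" % i, "value-%d" % i] for i in range(2000)],
}

# Time serialization alone, into an in-memory buffer -- the same idea
# as serializing into a TMemoryBuffer instead of a socket transport.
t0 = time.perf_counter()
buf = io.BytesIO()
pickle.dump(result, buf)
serialize_secs = time.perf_counter() - t0

payload = buf.getvalue()
print("serialized %d bytes in %.6f s" % (len(payload), serialize_secs))
```

Comparing the serialization time against the end-to-end query time is how we concluded that serialization itself is cheap and the remaining ~0.3 seconds is spent in the transport.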

The modification we made is described below:

Objective: there are a lot of Thrift servers (processors) in each process, and we need to narrow
all the socket connections from one process to another down to a single socket connection, in
order to save file descriptors.

We’ve implemented a virtual transport: on flush, it finds a currently available connection,
sends through it, and then waits on a condition keyed by the seqid.
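The flush-and-wait side can be sketched roughly like this (hypothetical Python; the class and method names are made up for illustration, and it is not our production code):

```python
import itertools
import threading

class SeqIdWaiter:
    """Match responses to requests multiplexed over one shared connection.

    Hypothetical sketch.  A single shared Condition is used for brevity;
    note that it wakes every waiter on each response, which can itself
    become a bottleneck with many concurrent requests.
    """

    def __init__(self):
        self._cond = threading.Condition()
        self._seq = itertools.count(1)
        self._pending = {}  # seqid -> response payload, once delivered

    def next_seqid(self):
        return next(self._seq)

    def wait_for(self, seqid, timeout=5.0):
        # Called by the requesting thread right after flushing the request.
        with self._cond:
            if not self._cond.wait_for(lambda: seqid in self._pending, timeout):
                raise TimeoutError("no response for seqid %d" % seqid)
            return self._pending.pop(seqid)

    def deliver(self, seqid, payload):
        # Called by the connection manager when a framed response arrives.
        with self._cond:
            self._pending[seqid] = payload
            self._cond.notify_all()

waiter = SeqIdWaiter()
seqid = waiter.next_seqid()
# Simulate the connection manager delivering a response on another thread.
threading.Thread(target=waiter.deliver, args=(seqid, b"ok")).start()
print(waiter.wait_for(seqid))
```

One thing worth checking in a design like this is the cost of `notify_all()` under high concurrency; a per-seqid `Event` would avoid waking every waiting thread on every response.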

On the other side, we developed a connection manager that epolls all the connections and
notifies the waiting caller when the response with a given seqid is received.
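The manager side can be sketched roughly like this (hypothetical Python using the stdlib `selectors` module, which is epoll-backed on Linux; the 4-byte length + 4-byte seqid framing is an assumption for illustration, not our real wire format):

```python
import selectors
import socket
import struct
import threading
import time

def recv_exact(conn, n):
    # recv() can return short reads; loop until exactly n bytes arrive.
    data = b""
    while len(data) < n:
        chunk = conn.recv(n - len(data))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        data += chunk
    return data

def reader_loop(sel, on_response, stop):
    # Poll all registered connections; whenever a complete frame arrives,
    # hand its payload to on_response keyed by the frame's seqid.
    while not stop.is_set():
        for key, _events in sel.select(timeout=0.1):
            conn = key.fileobj
            (length,) = struct.unpack(">I", recv_exact(conn, 4))
            body = recv_exact(conn, length)
            (seqid,) = struct.unpack(">I", body[:4])
            on_response(seqid, body[4:])

# Demo on a socketpair: endpoint `a` plays the remote server.
a, b = socket.socketpair()
sel = selectors.DefaultSelector()  # epoll-backed on Linux
sel.register(b, selectors.EVENT_READ)
stop = threading.Event()
responses = {}
manager = threading.Thread(
    target=reader_loop, args=(sel, responses.__setitem__, stop))
manager.start()

payload = b"query result bytes"
a.sendall(struct.pack(">II", 4 + len(payload), 42) + payload)

deadline = time.time() + 2.0
while 42 not in responses and time.time() < deadline:
    time.sleep(0.01)
stop.set()
manager.join()
sel.close()
a.close()
b.close()
print(responses[42])
```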

The implementation is quite like TNonblockingServer, except that it is on the client side.

But as you can see, there is a performance problem.

Do you have any suggestions? Due to our confidentiality policy, I cannot show you the details,
but if you have any questions, feel free to ask.

I need your help, thanks a lot.

> On Dec 20, 2015, at 10:35 PM, Matt Chambers <> wrote:
> 200k is still a pretty small result.  You should generate the C++ client and hit your
server with it to see whether the delay is maybe on the server side.  Maybe you're doing
something that is slowing down the building of the result?
> If 200k is somehow causing an issue:
> Check out:
> <>
> What I do for Python is generate the C++ code, then wrap it with Cython. You can do things
like dropping the GIL around each network request, and the extra layer is a great place to
abstract all the connection logic away into under-the-hood thread-local connections.  Exception
translation is tricky, but I can show you how to do it if you're interested.
> -Matt
>> On Dec 20, 2015, at 3:18 AM, 刘畅 <> wrote:
>> The background is that we used to use Pyro4 to do the magic of RPC between our two
Python processes.
>> The serializer in Pyro4 is cPickle. However, in our new approach, we changed one of
the processes to C++, using Thrift instead.
>> When sending small requests/responses, the performance of Thrift is fine.
>> But in one of the cases, the size of the response is over 200 KB, and the performance
drops rapidly.
>> Our implementation details are listed below:
>> The server is in C++ and the client in Python, using TFramedTransport as the transport
and TThreadedServer as the server.
>> Unfortunately, our project cannot tolerate this performance drop between the two versions,
and I'm here for help.
>> I’d like to know whether there is any way to improve the serialization performance?
