thrift-user mailing list archives

From Will Lowe <will.l...@gmail.com>
Subject Re: Using thrift as part of a game network protocol
Date Sat, 04 Apr 2009 03:48:05 GMT
I'm actually considering using Thrift in a similar way: as a fast,  
cross-language serialization-and-transport mechanism between a bunch  
of different apps in a pub/sub architecture.

There are a number of possible message types -- 100? -- and it won't  
always be possible for each consumer to know if the other supports a  
given message type,  so I'd like to avoid each app having an RPC  
service for each possible message type;  I'd rather hand them objects  
and let them just ignore the ones they don't care about.

I see a few different ways to do this:

1. Define only one struct with all possible fields for all possible  
messages,  and a "type" field that lets you figure out what it is.  It  
seems kinda stupid to do this,  since one of the major reasons I'm  
interested in Thrift is type-awareness.

2. Modify the TService layer so that RPC arguments aren't statically  
typed:  make it possible to declare an RPC call that accepts any  
struct.  Feels like (void*),  and probably also irritates purists who  
like the simple no-object-inheritance, no-function-overloading model  
Thrift uses today.

3. Build something custom at the TProcessor layer and skip TService  
altogether.

Both #2 and #3 require changing some guts.  I think that would go  
something like this:

* Add a TMessageType (T_STRUCT?) that indicates "I'm sending you data,  
not calling a function!".  Or is that what T_ONEWAY is for?
* Modify TBinaryProtocol so that writeStructBegin() and  
writeStructEnd() aren't noops -- otherwise the receiver doesn't know  
what he's receiving!
* Implement a TProcessor that can read the struct type, instantiate
it, and do some sort of dispatch to the client app (rough sketch
below).
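
That last TProcessor piece might look roughly like this in Java
(completely untested; it cheats by reusing the message name as the
struct type, so it sidesteps the protocol changes above, and
MessageHandler plus the PlayerMoved/ChatLine structs are made up --
also note the exact TProcessor signature differs between Thrift
versions):

  import java.util.HashMap;
  import java.util.Map;

  import org.apache.thrift.TBase;
  import org.apache.thrift.TException;
  import org.apache.thrift.TProcessor;
  import org.apache.thrift.protocol.TMessage;
  import org.apache.thrift.protocol.TProtocol;

  public class StructDispatchProcessor implements TProcessor {
    // Message-header name -> struct class to instantiate.
    private final Map<String, Class<? extends TBase>> types =
        new HashMap<String, Class<? extends TBase>>();
    private final MessageHandler handler;  // app-level callback (made up)

    public StructDispatchProcessor(MessageHandler handler) {
      this.handler = handler;
      types.put("PlayerMoved", PlayerMoved.class);
      types.put("ChatLine", ChatLine.class);
      // ... the other ~100 message types ...
    }

    public boolean process(TProtocol in, TProtocol out) throws TException {
      TMessage msg = in.readMessageBegin();   // msg.name says which struct follows
      Class<? extends TBase> cls = types.get(msg.name);
      if (cls == null) {
        // Unknown type: a real version would TProtocolUtil.skip() the
        // body here so the stream stays in sync.
        in.readMessageEnd();
        return true;
      }
      try {
        TBase struct = cls.newInstance();
        struct.read(in);                     // deserialize the struct body
        in.readMessageEnd();
        handler.handle(struct);              // consumer ignores what it doesn't want
      } catch (InstantiationException e) {
        throw new TException("couldn't instantiate " + msg.name);
      } catch (IllegalAccessException e) {
        throw new TException("couldn't instantiate " + msg.name);
      }
      return true;
    }
  }

The sending side would then just do writeMessageBegin(new
TMessage("PlayerMoved", TMessageType.CALL, 0)), write the struct, and
writeMessageEnd().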

Any thoughts on this?  Has someone else already solved this?

Will


On Apr 3, 2009, at 6:22 PM, Brian Hammond wrote:

> That's neat, Joel.  However, does this scale?  I mean, the underlying
> assumption here is that clients are using persistent connections to
> the service, and you [not so simply] are sending messages back to
> the client over that same connection.  Thus, your service now has to
> handle a potentially large number of client connections.  Unless
> you're using something like libev[ent] I don't see this scaling
> beyond, say, 20K connections.  Two things: I could be missing
> something here, and this level of scalability is probably just fine
> for *many* types of services (perhaps not for a chat server though)!
>
> I'm curious to hear other people's thoughts on this and how it could
> be made scalable since, well, I'm planning on using polling in my
> project: I'm expecting a potentially very large number of
> simultaneous users of the service, and my servers can only handle so
> many connections.
>
> Thanks for sharing this.
>
> Brian
>
> On Apr 3, 2009, at 7:50 PM, Joel Meyer wrote:
>
>> On Thu, Apr 2, 2009 at 4:17 PM, Joel Meyer <joel.meyer@gmail.com>  
>> wrote:
>>
>>> On Tue, Mar 24, 2009 at 5:01 PM, Doug Daniels <daniels.douglas@gmail.com> wrote:
>>>
>>>> Ok I definitely plan on giving the Async RPC methods a try  
>>>> tonight, but I
>>>> figured I'd just throw out some questions before I get home to  
>>>> start
>>>> hacking
>>>> on this stuff.
>>>>
>>>> The one-to-one message-to-RPC-call async solution will let a client
>>>> send messages of any given type in my defined protocol, but how
>>>> would a server respond to a client with a message that the client
>>>> didn't request? For example, say I was trying to write an FPS like
>>>> Quake and I want the server to send position updates for all
>>>> clients to all clients; how would I model that as a client RPC
>>>> request? With the async RPC solutions I could make an RPC call like
>>>> Map<Integer, Position> getPositionUpdates(). Now say that the
>>>> client needs to request 50 other message types to be notified of. I
>>>> guess the solution would be to make an async RPC call requesting
>>>> those updates, respond to it when I receive it asynchronously, and
>>>> then reissue another async RPC call for the next set of updates. It
>>>> just seems inefficient to make the client actively request data
>>>> when the server could implicitly know that, once connected on this
>>>> game protocol, it can just send these messages to the clients
>>>> without them asking for it. Not to mention you'd have to make sure
>>>> you don't "miss" sending a client a message if they finished their
>>>> async call but haven't reestablished a new one.
>>>>
>>>
>>> I think I've done something similar to what you're trying to do,  
>>> and as
>>> long as you can commit to using only async messages it's possible  
>>> to pull it
>>> off without having to start a server on the client to accept RPCs  
>>> from the
>>> server.
>>>
>>> When your RPC is marked as async the server doesn't send a  
>>> response and the
>>> client doesn't try to read one. So, if all your RPC calls from the  
>>> client to
>>> the server are async you have effectively freed up the inbound  
>>> half of the
>>> socket connection. That means that you can use it for receiving  
>>> async
>>> messages from the server - the only catch is that you have to  
>>> start a new
>>> thread to read and dispatch the incoming async RPC calls.
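>>>
>>> Concretely, the client end looks roughly like this (untested
>>> sketch; MyHandler is whatever class implements the generated
>>> MyService.Iface on the client, and process() returns boolean in the
>>> Thrift version I'm using):
>>>
>>>   TSocket socket = new TSocket("gameserver.example.com", 9090);
>>>   socket.open();
>>>   final TProtocol prot = new TBinaryProtocol(socket);
>>>
>>>   // Outgoing async calls from this client:
>>>   MyService.Client client = new MyService.Client(prot);
>>>
>>>   // A separate thread drains async calls pushed by the server over
>>>   // the same socket:
>>>   final MyService.Processor processor =
>>>       new MyService.Processor(new MyHandler());
>>>   new Thread(new Runnable() {
>>>     public void run() {
>>>       try {
>>>         while (processor.process(prot, prot)) { }
>>>       } catch (TException e) {
>>>         // server closed the connection
>>>       }
>>>     }
>>>   }).start();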
>>>
>>> In a typical Thrift RPC system you'd create a MyService.Processor  
>>> on your
>>> server and a MyService.Client on your client. To do bidirectional  
>>> async
>>> message sending you'll need to go a step further and create a
>>> MyService.Client on your server for each client that connects  
>>> (this can be
>>> accomplished by providing your own TProcessorFactory) and then on  
>>> each
>>> client you create a MyService.Processor. (This assumes that you've  
>>> gone with
>>> a generic MyService definition like you described above that has a  
>>> bunch of
>>> optional messages, another option would be to define separate  
>>> service
>>> definitions for the client and server.) With two clients connected  
>>> the
>>> objects in existence would look something like this:
>>>
>>> Server:
>>> MyService.Processor mainProcessor - handles incoming async RPCs
>>> MyService.Client clientA - used to send outgoing async RPCs to  
>>> ClientA
>>> MyService.Client clientB - used to send outgoing async RPCs to  
>>> ClientB
>>>
>>> ClientA:
>>> MyService.Client - used to send messages to Server
>>> MyService.Processor clientProcessor - used (by a separate thread) to
>>> process incoming async RPCs
>>>
>>> ClientB:
>>> MyService.Client - used to send messages to Server
>>> MyService.Processor clientProcessor - used (by a separate thread) to
>>> process incoming async RPCs
>>>
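>>> The only non-standard piece on the server is the TProcessorFactory,
>>> roughly (again untested; MyHandler here is the server-side Iface
>>> implementation, and I'm passing it the per-connection Client so it
>>> can push messages back to that client):
>>>
>>>   public class PushProcessorFactory extends TProcessorFactory {
>>>     public PushProcessorFactory() { super(null); }
>>>     public TProcessor getProcessor(TTransport trans) {
>>>       // One Client per connection, bound to that connection's
>>>       // transport, for sending async calls back to that client.
>>>       MyService.Client pushClient =
>>>           new MyService.Client(new TBinaryProtocol(trans));
>>>       return new MyService.Processor(new MyHandler(pushClient));
>>>     }
>>>   }
>>>
>>> You then hand that factory to whatever TServer you're using instead
>>> of a fixed processor.
>>>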
>>> Hopefully that explains the concept. If you need example code I  
>>> can try and
>>> pull something together (it will be in Java). The nice thing about  
>>> this
>>> method is that you don't have to establish two connections, so you  
>>> can get
>>> around the firewall issues others have mentioned. I've been using  
>>> this
>>> method on a service in production and haven't had any problems.  
>>> When you
>>> have a separate thread in your client running a Processor you're  
>>> basically
>>> blocking on a read, waiting for a message from the server. The  
>>> benefit of
>>> this is that you're notified immediately when the server shuts  
>>> down instead
>>> of having to wait until you send a message and then finding out  
>>> that the TCP
>>> connection was reset.
>>>
>>> Cheers,
>>> Joel
>>>
>>
>> Thanks for the feedback. I've created a simple example in Java  
>> demonstrating
>> this in action:
>> http://www.joelpm.com/wp-content/uploads/2009/04/bidimessages.tgz
>>
>> Post with a few details on the implementation:
>> http://www.joelpm.com/2009/04/03/thrift-bidirectional-async-rpc/
>>
>> Please add me to the list of people who think there's value in a
>> full async transport that provides (optional?) synchronization at
>> the API level using futures/deferreds/etc.
>>
>> Cheers,
>> Joel
>>
>>
>>>
>>>
>>>>
>>>> The biggest issue is that not all client requests will result in a
>>>> single response (e.g. shooting a bullet may blow up an entity and
>>>> damage all players in the area; those events are separate messages
>>>> sent from the respective entities).
>>>>
>>>> At a game development studio I used to work at, we developed a
>>>> cross-language IDL network protocol definition (C++, Java) very
>>>> similar to Protocol Buffers and Thrift (without some of the more
>>>> mature features like being transport agnostic, since we explicitly
>>>> built it for binary TCP socket transport, or protocol versioning).
>>>> The stream of packets would contain, as the first 32 bits, a
>>>> message ID that served as a key into a map of Message classes, each
>>>> with methods to read that message type in from a byte[] stream.
>>>>
>>>> Looking through the Thrift code, in TBinaryProtocol's writeMessage
>>>> it looks like it's including the name of the message being sent and
>>>> its type (is the concept of a Message in Thrift the same as an
>>>> RPC?). If so, what's the corresponding code pathway for the client
>>>> waiting for an RPC response? If I could just use this message name
>>>> or type to key into what I need to deserialize off the network on
>>>> both the client and server end, that would be perfect.
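>>>>
>>>> (For reference, the send path I'm looking at is roughly this,
>>>> paraphrasing the generated Java client from memory, so the exact
>>>> shape may vary by version; it's why I'm hoping the name/type in the
>>>> header is enough to key dispatch off of:
>>>>
>>>>   oprot_.writeMessageBegin(
>>>>       new TMessage("getPositionUpdates", TMessageType.CALL, seqid_));
>>>>   getPositionUpdates_args args = new getPositionUpdates_args();
>>>>   args.write(oprot_);
>>>>   oprot_.writeMessageEnd();
>>>>   oprot_.getTransport().flush();
>>>> )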
>>>>
>>>>
>>>>
>>>> On Tue, Mar 24, 2009 at 1:51 PM, Ted Dunning  
>>>> <ted.dunning@gmail.com>
>>>> wrote:
>>>>
>>>>> I really think that using async service methods which are  
>>>>> matched one to
>>>>> one
>>>>> with the message types that you want to send gives you exactly the
>>>>> semantics
>>>>> that are being requested with very simple implementation cost.
>>>>>
>>>>> It is important to not get toooo hung up on what RPC stands  
>>>>> for.  I use
>>>>> async methods all the time to stream data structures for logging  
>>>>> and it
>>>>> works great.  Moreover, it provides a really simple way of  
>>>>> building
>>>>> extractors and processors for this data since I have an interface
>>>>> sitting there that will tell me about all of the methods (data
>>>>> types) that I need to handle or explicitly ignore.
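>>>>>
>>>>> In other words the pattern is just this (made-up names, assuming
>>>>> the usual generated Iface):
>>>>>
>>>>>   // One async method per message type in the IDL; each consumer
>>>>>   // implements the generated interface and ignores what it
>>>>>   // doesn't care about.
>>>>>   public class ClickCounter implements EventLog.Iface {
>>>>>     private long clicks = 0;
>>>>>     public void logClick(ClickEvent e)           { clicks++; }
>>>>>     public void logImpression(ImpressionEvent e) { }  // ignored
>>>>>     public void logSearch(SearchQuery q)         { }  // ignored
>>>>>   }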
>>>>>
>>>>> So the trick works and works really well.  Give it a try!
>>>>>
>>>>> On Tue, Mar 24, 2009 at 8:23 AM, Bryan Duxbury <bryan@rapleaf.com> wrote:
>>>>>
>>>>>> Optional fields are not serialized onto the wire. There is a  
>>>>>> slight
>>>>>> performance penalty at serialization time if you have a ton of  
>>>>>> unset
>>>>> fields,
>>>>>> but that's it.
>>>>>>
>>>>>> Am I over complicating things
>>>>>>>
>>>>>>
>>>>>> Personally, sounds like it to me. Why do you need this streaming
>>>>>> behavior or whatnot? Hotwiring the rpc stack to let you send any
>>>>>> message you want is going to be a ton of work and not really that
>>>>>> much of a functionality improvement.
>>>>>>
>>>>>> -Bryan
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Ted Dunning, CTO
>>>>> DeepDyve
>>>>>
>>>>
>>>
>>>
>

