thrift-user mailing list archives

From Anthony Molinaro <>
Subject Re: erlang server/client closing connections
Date Fri, 13 Aug 2010 18:08:56 GMT
Okay, another update: the problem is the recv_timeout.  It's almost possible
to get it to work the way I want it to, but it required a hack.

I switched to thrift_socket_server.  For those who were using thrift_server,
instead of creating it like

thrift_server:start_link(Port, ServiceModule, HandlerModule).

you do the following

thrift_socket_server:start ([{port, Port},
                             {service, ServiceModule},
                             {handler, HandlerModule}]).

By default the recv_timeout is set to 500ms, so the connections shut down
almost immediately.  You can set a higher recv_timeout like this:

thrift_socket_server:start ([{port, Port},
                             {service, ServiceModule},
                             {handler, HandlerModule},
                             {socket_opts, [{recv_timeout, 60*60*1000}]}]).

However, then you get a timeout on gen_server:call/3, which crashes the
process.  I tracked the timeout down to this call:

read(Transport, Len) when is_integer(Len) ->
    gen_server:call(Transport, {read, Len}, _Timeout=10000).

So to check I just changed 10000 to 60*60*1000 and connections seem to
stay around now, at least for an hour of inactivity, which is fine for
my testing.
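
To see the failure in isolation, here is a minimal sketch that is not Thrift
code.  The module name call_timeout_demo is made up, and the server simply
never replies, which stands in for a transport still waiting on an idle
socket.  It shows that gen_server:call/3 exits with a timeout reason once its
own timeout expires, regardless of how long the socket is allowed to wait.

```erlang
-module(call_timeout_demo).
-behaviour(gen_server).

-export([run/0]).
-export([init/1, handle_call/3, handle_cast/2]).

%% Start a server that never answers, call it with a 100ms timeout,
%% and turn the resulting exit into a plain atom.
run() ->
    {ok, Pid} = gen_server:start(?MODULE, [], []),
    Result = try
                 gen_server:call(Pid, {read, 4}, 100)
             catch
                 exit:{timeout, _} -> timed_out
             end,
    exit(Pid, kill),
    Result.

init([]) ->
    {ok, nostate}.

%% Returning noreply without ever calling gen_server:reply/2 means
%% the caller is left waiting until its own timeout fires.
handle_call({read, _Len}, _From, State) ->
    {noreply, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.
```

Without the try/catch the exit propagates and takes the calling process down,
which matches the crash described above.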

I think the appropriate fix would be to somehow expose that timeout value
as an option to the server, maybe something like idle_timeout or
read_timeout.  The trick is then getting it tunneled to that call.  Currently
thrift_buffered_transport doesn't accept any options.  The timeout could be
added as a third parameter to read, but that would have to happen in all the
transports, which, looking through the code, doesn't seem that bad.  The
only minor issue would be with thrift_socket_transport, which uses recv_timeout
right now as a read_timeout, so you would have 2 timeouts to choose from.
Also, you still need to get that timeout to the places where read is called.
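
A rough sketch of what the third parameter could look like follows.  This is
only an illustration under assumptions: the read_timeout option name, the
read/3 function and the module name read_timeout_sketch are hypothetical and
do not exist in Thrift today.  The existing read/2 keeps its current 10 second
behaviour by delegating to read/3.

```erlang
-module(read_timeout_sketch).

-export([read/2, read/3, read_timeout/1]).

%% The value currently hard-coded in the call shown earlier.
-define(DEFAULT_READ_TIMEOUT, 10000).

%% Pull a (hypothetical) read_timeout out of the server options,
%% falling back to the existing default when it is not given.
read_timeout(Options) ->
    proplists:get_value(read_timeout, Options, ?DEFAULT_READ_TIMEOUT).

%% Existing two-argument form, unchanged for current callers.
read(Transport, Len) when is_integer(Len) ->
    read(Transport, Len, ?DEFAULT_READ_TIMEOUT).

%% Proposed three-argument form; Timeout is in milliseconds or the
%% atom infinity, exactly as gen_server:call/3 accepts it.
read(Transport, Len, Timeout) when is_integer(Len) ->
    gen_server:call(Transport, {read, Len}, Timeout).
```

The server would compute read_timeout(Options) once at startup and pass the
result down to wherever read is called.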

Well, I'm not sure if it's worth it or not.  If I have a chance I can hack at
it, but for the moment I have to finish some things off, so I will have to get
back to this later.


On Fri, Aug 13, 2010 at 12:12:49AM -0700, Anthony Molinaro wrote:
> On Thu, Aug 12, 2010 at 10:41:50PM -0700, David Reiss wrote:
> > usually, this sort of thing happens because the server has a recv timeout
> > set.  I see that thrift_socket_server sets a recv timeout, but I can't tell
> > if thrift_server is doing so.  One possibility might be to put some debugging
> > code in thrift_processor to determine if it is terminating and closing the
> > connection.
> So looking again, it looks like I was mistaken about keepalive being true.
> It's inherited from the listen process, but there doesn't seem to be a way
> to pass options in (this is for the thrift_server).  I hardcoded it and
> passed the option to the client, but it doesn't seem to help.
> So a receive timeout might be a problem as I create connections at startup
> but in my dev env don't really use them for a while.  So if the server decides
> the client isn't going to send anything it might close down its connection.
> I tried to dig down and see this happen but I don't see the processor break
> out of its loop.  I dropped in some io:formats, but it doesn't seem to trigger
> any branch of the case, so I'm not certain what is happening.  I think I'll
> have to see if I can trace it and see what I find.
> > I'm not sure if thrift_server is supposed to be deprecated in favor of
> > thrift_socket_server.  Chris Piro or Todd Lipcon might know.
> I got this usage of thrift_server from Todd's thrift_erl_skel, but
> maybe it's out of date.  I'll take a look at thrift_socket_server
> tomorrow to see what it looks like.  A quick glance and it looks very
> different from thrift_server.
> I may just try to rewrite my pooling mechanism so that instead of
> starting processes when my server starts, start them the first time
> a request is made.  The only problem is since the only way for the
> client to know the server has hung up on him is to make a call, I'll
> have to retry if I create a process, stick it into the pool to reuse,
> pull it out a few seconds later, get an exception then have to re-connect
> and rerun the call :(
> -Anthony
> > On 08/12/2010 10:22 PM, Anthony Molinaro wrote:
> > > Hi,
> > > 
> > >   I'm trying to use pg2 to cache several thrift client connections so I
> > > can spread load across them.  This seems to work great however, the
> > > connections seem to go stale, I think the server is dropping them, however
> > > looking through the thrift code it seems like keepalive is true, so I'm
> > > not sure why this would be the case.
> > > 
> > > I start my server with
> > > 
> > > thrift_server:start_link/3
> > > 
> > > and the client processes are started with
> > > 
> > > thrift_client:start_link/3
> > > 
> > > The process stays alive fine on the client, but goes away after about
> > > 30 seconds or so on the server (probably less they seem to go away
> > > quick).  Since the client is alive, when I do a call I get this
> > > exception.
> > > 
> > > {{case_clause,{error,closed}},
> > >  [{thrift_client,read_result,3},
> > >   {thrift_client,catch_function_exceptions,2},
> > >   {thrift_client,handle_call,3},
> > >   {gen_server,handle_msg,5},
> > >   {proc_lib,init_p_do_apply,3}]}
> > > 
> > > Is there any way to keep this from happening?
> > > 
> > > Thanks,
> > > 
> > > -Anthony
> > > 
> -- 
> ------------------------------------------------------------------------
> Anthony Molinaro                           <>

Anthony Molinaro                           <>
