tinkerpop-dev mailing list archives

From "Florian Hockmann (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (TINKERPOP-2268) Prevent Connection Failure from Hanging
Date Fri, 09 Aug 2019 08:54:00 GMT

    [ https://issues.apache.org/jira/browse/TINKERPOP-2268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16903714#comment-16903714 ]

Florian Hockmann commented on TINKERPOP-2268:
---------------------------------------------

{quote}my understanding of the concept here is that such socket code tries to optimize connections
until they timeout for lack of activity, so client-side.
{quote}
This shouldn't happen normally as the driver sends pings every 30s to ensure that the WebSocket
connection stays alive. Are you using Cosmos DB by any chance? We had some reports from users
that Cosmos DB terminates idle connections despite these keepalive pings.
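
For reference, the ping interval of a .NET WebSocket client is exposed as {{ClientWebSocketOptions.KeepAliveInterval}}, and the {{GremlinClient}} constructor shown in the issue description accepts a {{webSocketConfiguration}} callback that receives these options. The following is only a minimal sketch under that assumption: the class name {{KeepAliveSketch}} and the 10-second value are made up for illustration, and whether a shorter interval actually helps against a server that drops idle connections is untested.

```csharp
using System;
using System.Net.WebSockets;

public static class KeepAliveSketch
{
    // Hypothetical callback of the kind that could be passed as the
    // webSocketConfiguration argument; the 10 s value is arbitrary.
    public static readonly Action<ClientWebSocketOptions> Configure =
        options => options.KeepAliveInterval = TimeSpan.FromSeconds(10);

    // Applies the callback to a fresh, unconnected socket and reports the result.
    public static TimeSpan ConfiguredInterval()
    {
        using (var socket = new ClientWebSocket())
        {
            Configure(socket.Options);
            return socket.Options.KeepAliveInterval;
        }
    }
}
```

The callback only changes options on the socket it is handed, so it has to run before the connection is opened.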
{quote}i just checked if first connection connected (because the code always checks pool in
that order)
{quote}
That's only the case for the first request coming over a connection. The next request will
try the second connection and so on, as the pool uses round-robin-like scheduling
to distribute the load over the available connections. (This is implemented in the [{{TryGetAvailableConnection}}
method|https://github.com/apache/tinkerpop/blob/a5486357c610630e35afe6e15419266847d11c00/gremlin-dotnet/src/Gremlin.Net/Driver/ConnectionPool.cs#L118].)
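
To illustrate the idea, here is a simplified sketch of round-robin selection that skips closed connections. It is not the actual {{ConnectionPool}} code: {{FakeConnection}} and {{RoundRobinPicker}} are hypothetical names, and the sketch leaves out the limit on in-process requests per connection.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical stand-in for a driver connection.
public sealed class FakeConnection
{
    public FakeConnection(string name, bool isOpen) { Name = name; IsOpen = isOpen; }
    public string Name { get; }
    public bool IsOpen { get; }
}

public sealed class RoundRobinPicker
{
    private readonly IReadOnlyList<FakeConnection> _connections;
    private int _next = -1; // cursor shared by all callers

    public RoundRobinPicker(IReadOnlyList<FakeConnection> connections)
    {
        _connections = connections;
    }

    // Returns the next open connection, or null if every connection is closed.
    public FakeConnection TryGetNext()
    {
        for (var attempt = 0; attempt < _connections.Count; attempt++)
        {
            // Interlocked.Increment keeps the cursor consistent across threads;
            // the uint casts keep the index non-negative after int overflow.
            var index = (int)((uint)Interlocked.Increment(ref _next) % (uint)_connections.Count);
            var candidate = _connections[index];
            if (candidate.IsOpen) return candidate;
        }
        return null;
    }
}
```

Because the cursor advances on every call, checking only the first connection says little about which connection the next request will actually use.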
{quote}the more elegant solution may be that when we find a connection ! IsOpen, then re-connect
that connection and then use it for that request.  but that is more complicated because we
have to allow for other concurrent requests against that client.  what say you?
{quote}
Yes, that's the solution I would prefer. We currently already perform [the {{!connection.IsOpen}}
check|https://github.com/apache/tinkerpop/blob/a5486357c610630e35afe6e15419266847d11c00/gremlin-dotnet/src/Gremlin.Net/Driver/ConnectionPool.cs#L127]
and close the connection in that case. The replacement however happens asynchronously [when
a new request comes in|https://github.com/apache/tinkerpop/blob/a5486357c610630e35afe6e15419266847d11c00/gremlin-dotnet/src/Gremlin.Net/Driver/ConnectionPool.cs#L58]
and the pool doesn't have enough connections. Since it's happening asynchronously, you should
get a {{ConnectionPoolBusyException}} if no connection is open instead of a timeout. I don't
understand why you don't get that exception and how you can end up with the timeout error
instead.
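
A rough sketch of what reconnecting in place could look like while still allowing concurrent requests is shown below. This is not how Gremlin.Net implements its pool; {{SketchConnection}} and {{ReconnectingSlot}} are hypothetical types, and the connect delegate stands in for whatever the connection factory would do.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a driver connection.
public sealed class SketchConnection
{
    public bool IsOpen { get; private set; } = true;
    public void Close() => IsOpen = false;
}

// One pool slot that replaces its connection when it is found closed.
public sealed class ReconnectingSlot
{
    private readonly Func<Task<SketchConnection>> _connectAsync;
    private readonly SemaphoreSlim _gate = new SemaphoreSlim(1, 1);
    private SketchConnection _current;

    public ReconnectingSlot(Func<Task<SketchConnection>> connectAsync)
    {
        _connectAsync = connectAsync;
    }

    public async Task<SketchConnection> GetOpenConnectionAsync()
    {
        // Fast path: an open connection is handed out without taking the gate.
        var snapshot = Volatile.Read(ref _current);
        if (snapshot != null && snapshot.IsOpen) return snapshot;

        await _gate.WaitAsync().ConfigureAwait(false);
        try
        {
            // Re-check under the gate: another caller may have reconnected already.
            var current = Volatile.Read(ref _current);
            if (current == null || !current.IsOpen)
            {
                current = await _connectAsync().ConfigureAwait(false);
                Volatile.Write(ref _current, current);
            }
            return current;
        }
        finally
        {
            _gate.Release();
        }
    }
}
```

The gate ensures that concurrent callers that all observe the closed connection trigger only one reconnect; the price is that those callers wait for the reconnect instead of failing fast with an exception.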
{quote}as you know the code will throw a throw new ConnectionPoolBusyException(_poolSize,
_maxInProcessPerConnection) if no connection has less than max requests and an open connection,
which may conflict with RemoveConnectionFromPool, which is calling DefinitelyDestroyConnection.
{quote}
What do you mean by conflict? Dead connections are removed here, which can lead to that exception
if all connections are dead, but where is there a conflict?
{quote}what connection pools allow for is scaling.  a more elegant solution would improve
scaling, but may complicate the code to handle the edge case.  i don't know the trade-offs.
{quote}
Yes, scaling the pool up or down based on load could avoid situations where connections become
idle, but as you already said, that would make the pool a lot more complicated. We also think
that a connection pool with a fixed size should work just fine, which is why we didn't want
to add that complexity for now. The pool would also probably still keep one or two connections
even when it's completely idle so that new requests can be executed quickly without having to create
new connections for them first. So, the problem of having only idle connections in the pool
can still occur.

> Prevent Connection Failure from Hanging
> ---------------------------------------
>
>                 Key: TINKERPOP-2268
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-2268
>             Project: TinkerPop
>          Issue Type: Improvement
>          Components: dotnet, driver
>         Environment: .Net Core
>            Reporter: MichaelZ
>            Priority: Major
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> When a consumer of the Gremlin.Net client calls to execute a Gremlin query, i.e. "SubmitAsync,"
> and there is no valid connection, there will be a costly timeout error. I have experienced
> 30 to 90 second timeouts.
> I was on vacation, so I didn't do this earlier, but I have written a little patch that
> will refresh the connection pool when there is no valid connection, and it works flawlessly.
> This is quick code, and there is a more elegant solution, but what I did is to check IsOpen
> on the first connection snapshot, and create a new pool if it was stale.  Here is the
> code on the GremlinClient object:
> {code}
> // Fragment of the patched GremlinClient class (Debug needs "using System.Diagnostics;").
> private ConnectionPool _connectionPool; // changed: used to be readonly
>
> // Constructor arguments are kept so that the pool can be rebuilt later.
> private readonly GremlinServer _gremlinServer;
> private readonly GraphSONReader _graphSONReader;
> private readonly GraphSONWriter _graphSONWriter;
> private readonly string _mimeType;
> private readonly ConnectionPoolSettings _connectionPoolSettings;
> private readonly Action<ClientWebSocketOptions> _webSocketConfiguration;
>
> public GremlinClient(GremlinServer gremlinServer, GraphSONReader graphSONReader = null,
>     GraphSONWriter graphSONWriter = null, string mimeType = null,
>     ConnectionPoolSettings connectionPoolSettings = null,
>     Action<ClientWebSocketOptions> webSocketConfiguration = null)
> {
>     _gremlinServer = gremlinServer;
>     _graphSONReader = graphSONReader;
>     _graphSONWriter = graphSONWriter;
>     _mimeType = mimeType;
>     _connectionPoolSettings = connectionPoolSettings;
>     _webSocketConfiguration = webSocketConfiguration;
>     NewConnectionPool(); // changed: pool creation moved into a helper
> }
>
> private void NewConnectionPool()
> {
>     var reader = _graphSONReader ?? new GraphSON3Reader();
>     var writer = _graphSONWriter ?? new GraphSON3Writer();
>     var connectionFactory = new ConnectionFactory(_gremlinServer, reader, writer,
>         _mimeType ?? DefaultMimeType, _webSocketConfiguration);
>     _connectionPool = new ConnectionPool(connectionFactory,
>         _connectionPoolSettings ?? new ConnectionPoolSettings());
> }
>
> /// <summary>
> /// Provides whether the first available connection snapshot in pool is still open.
> /// </summary>
> // changed: new private member; "?? false" treats a missing pool or snapshot
> // as "not open" instead of throwing when a null bool? is cast to bool.
> private bool HasOpenConnection => _connectionPool?.FirstConnectionSnapshot?.IsOpen ?? false;
>
> /// <inheritdoc />
> public async Task<ResultSet<T>> SubmitAsync<T>(RequestMessage requestMessage)
> {
>     if (!HasOpenConnection)
>     {
>         Debug.WriteLine("=====================================");
>         Debug.WriteLine("new connection pool");
>         // changed: replace the stale pool (not synchronized across concurrent callers)
>         NewConnectionPool();
>     }
>     using (var connection = await _connectionPool.GetAvailableConnectionAsync().ConfigureAwait(false))
>     {
>         return await connection.SubmitAsync<T>(requestMessage).ConfigureAwait(false);
>     }
> }
> {code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
