tinkerpop-dev mailing list archives

From "Mark Broadmore (Jira)" <j...@apache.org>
Subject [jira] [Created] (TINKERPOP-2352) Gremlin Python driver default pool size makes Gremlin keep-alive difficult
Date Fri, 20 Mar 2020 11:14:00 GMT
Mark Broadmore created TINKERPOP-2352:
-----------------------------------------

             Summary: Gremlin Python driver default pool size makes Gremlin keep-alive difficult
                 Key: TINKERPOP-2352
                 URL: https://issues.apache.org/jira/browse/TINKERPOP-2352
             Project: TinkerPop
          Issue Type: Bug
          Components: python
    Affects Versions: 3.4.5, 3.3.5
         Environment: AWS Lambda, Python 3.7 runtime, AWS Neptune.
(AWS Lambda functions can remain in memory and thus hold connections open for many minutes
between invocations)
            Reporter: Mark Broadmore


I'm working with a Gremlin database that (like many) terminates connections if they don't
execute any transactions within a timeout period.  When we want to run a traversal, we first
check our `GraphTraversalSource` by running `g.V().limit(1).count().next()`; if that raises
an exception, we know we need to reconnect before running the actual traversal.
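For clarity, here's roughly that pattern as a sketch (the endpoint URL and helper name are
placeholders, not our real code):

```python
from gremlin_python.process.anonymous_traversal import traversal
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection

# Placeholder endpoint; substitute your own Neptune/Gremlin Server URL.
NEPTUNE_URL = 'wss://your-neptune-endpoint:8182/gremlin'

def checked_traversal_source(g=None):
    """Return a usable GraphTraversalSource, reconnecting if the test fails."""
    if g is not None:
        try:
            g.V().limit(1).count().next()  # the "connection test"
            return g
        except Exception:
            pass  # connection looks stale: fall through and rebuild
    return traversal().withRemote(DriverRemoteConnection(NEPTUNE_URL, 'g'))
```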

We've been very confused that this hasn't worked as expected: traversals intermittently
fail with `WebSocketClosed` or other connection-related errors immediately after the "connection
test" passes.

I've (finally) found that the cause of this inconsistency is the default pool size in `gremlin_python.driver.client.Client`
being 4.  This means there's no visibility outside the `Client` of which connection in the
pool is tested and/or used, and in fact no way for the application (via the `GraphTraversalSource`)
to run keep-alive-type traversals reliably.  Any time an application passes in a pool size
of `None` or a number > 1, there's no way to make sure that each and every connection
in the pool actually sends keep-alive traversals to the remote, _except_ in the case of a
single-threaded application, where a tight loop could issue `pool_size` of them.  In that
latter case, since the application is single-threaded, a `pool_size` above 1 won't provide
much benefit anyway.
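To illustrate, here's a sketch of that single-threaded workaround.  It relies on the pool
handing connections out in rotation as requests complete, which as far as I can tell isn't
a documented contract:

```python
from gremlin_python.driver.client import Client

# Placeholder endpoint; pool_size is left at its default of 4.
client = Client('wss://your-neptune-endpoint:8182/gremlin', 'g')

POOL_SIZE = 4  # has to be kept in sync with the Client's implicit default

# Best-effort keep-alive: issue one traversal per pooled connection and
# hope each request is served by a different connection.
for _ in range(POOL_SIZE):
    client.submit('g.V().limit(1).count()').all().result()
```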

I've raised this as a bug because I think a default `pool_size` of 1 would give much more
predictable behaviour, and in the specific case of the Python driver is probably more appropriate,
because Python applications tend to run single-threaded by default, with multi-threading carefully
added when performance requires it.  Perhaps this is really a wish, but as the behaviour of
the default option is quite confusing, it feels more like a bug, at least.  If it would help,
I'm happy to raise a PR with some updated function header comments, or perhaps updated documentation
about multi-threaded / multi-async-loop usage of gremlin-python.
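In the meantime, the predictable behaviour can be opted into explicitly by passing `pool_size=1`
(a sketch; the URL is a placeholder):

```python
from gremlin_python.process.anonymous_traversal import traversal
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection

# pool_size is forwarded to the underlying Client, so the connection test
# and the subsequent real traversal share the single pooled connection.
conn = DriverRemoteConnection('wss://your-neptune-endpoint:8182/gremlin', 'g',
                              pool_size=1)
g = traversal().withRemote(conn)
g.V().limit(1).count().next()  # now genuinely exercises the one connection
```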

(This is my first issue here, apologies if it has some fields wrong.)



