cassandra-commits mailing list archives

From "Sylvain Lebresne (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-4449) Make prepared statement global rather than connection based
Date Mon, 24 Sep 2012 20:40:08 GMT


Sylvain Lebresne commented on CASSANDRA-4449:

bq. Actually I don't see a reason to use something as heavyweight as MD5.

The advantage of using a hash of the query string as the ID is that you only ever store one prepared
statement for a given query. That saves memory in practice, because a node will be connected to
by many clients that will usually all prepare the same set of queries. It also gives you some
protection against buggy/crappy clients that re-prepare the same query again and again, though
that's a more minor point. As for the heavyweightness of MD5, I don't think this matters in
the case of prepared statements.
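
To make the idea concrete, here is a minimal sketch of a node-wide statement map keyed by the MD5 of the query string. It is not the attached patch: the class name GlobalPreparedCache, the methods statementId/prepare/size, and the use of a plain String as a stand-in for a parsed statement are all hypothetical, chosen only for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch: these names are illustrative, not Cassandra's actual classes.
public class GlobalPreparedCache {
    // One map shared by every connection to the node. The String value is a
    // stand-in for whatever parsed statement object the server would keep.
    private final ConcurrentMap<String, String> statements = new ConcurrentHashMap<>();

    // The statement ID is the hex-encoded MD5 digest of the query string.
    static String statementId(String query) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(query.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a mandatory algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Preparing the same query again, from any connection, stores nothing new
    // and returns the same ID.
    String prepare(String query) {
        String id = statementId(query);
        statements.putIfAbsent(id, query);
        return id;
    }

    int size() {
        return statements.size();
    }
}
```

With this shape, two clients that both prepare the same query string get back the same ID and the map still holds a single entry. The MD5 cost is paid once per prepare call, not per execution. For comparison, String.hashCode() is only 32 bits and collides easily: "Aa" and "BB" have the same hashCode() but different MD5 digests.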

> Make prepared statement global rather than connection based
> -----------------------------------------------------------
>                 Key: CASSANDRA-4449
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Sylvain Lebresne
>            Assignee: Sylvain Lebresne
>              Labels: binary_protocol
>             Fix For: 1.2.0 beta 2
>         Attachments: 4449.txt, 4449-v2.txt
> Currently, prepared statements are connection based. A client can only use a prepared
statement on the connection it prepared it on, and if you prepare the same statement
on multiple connections, we'll keep multiple copies of the same prepared statement. This is potentially
inefficient, but it can also be fairly painful for client libraries with a pool of connections (a.k.a.
every reasonable client library ever), as it means you need to make sure you prepare each statement
on every connection of the pool, including the connections that don't exist yet but might be
created later.
> This ticket suggests making prepared statements global (at least for CQL3), i.e. moving
them out of ClientState. This will likely reduce the number of stored statements on a given
node quite a bit, since it's very likely that all clients of a given node will prepare the
same statements (and potentially on all of their connections to the node). And given that
prepared statement identifiers are the hashCode() of the string, this should be fairly trivial.
> I will note that while I think using a hash of the string as the identifier is a very good
idea, I don't know if the default Java hashCode() is good enough. If that's a concern, maybe
we should use a safer (but longer) hash like MD5 or SHA-1. But we'd better do that now.
