cassandra-commits mailing list archives

From "Nicolas Favre-Felix (JIRA)" <j...@apache.org>
Subject [jira] [Created] (CASSANDRA-8254) Query parameters (and more) are limited to 65,536 entries
Date Tue, 04 Nov 2014 18:33:34 GMT
Nicolas Favre-Felix created CASSANDRA-8254:
----------------------------------------------

             Summary: Query parameters (and more) are limited to 65,536 entries
                 Key: CASSANDRA-8254
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8254
             Project: Cassandra
          Issue Type: Bug
          Components: API
            Reporter: Nicolas Favre-Felix


Parameterized queries are sent over the wire as a string followed by a list of arguments.
This list is decoded in QueryOptions.Codec by CBUtil.readValueList(body), which in turn reads
a 16-bit short value from the wire as the number of values to deserialize.
Sending more values than a 16-bit count can represent silently overflows the count. The driver
sometimes reports this as a protocol error, because the remaining values are then deserialized incorrectly.
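
To illustrate the overflow described above, here is a minimal Python sketch. It is a simulation, not Cassandra's code: write_value_list and read_value_list are hypothetical helpers, the encoder is assumed to truncate the count to 16 bits without complaint, and each value is assumed to be framed as a 4-byte length followed by its bytes.

```python
import struct


def write_value_list(values):
    """Hypothetical encoder: 16-bit count, then each value as [int length][bytes]."""
    # Assumption: the count is masked to 16 bits, mimicking a silent overflow.
    parts = [struct.pack(">H", len(values) & 0xFFFF)]
    for v in values:
        parts.append(struct.pack(">i", len(v)))
        parts.append(v)
    return b"".join(parts)


def read_value_list(buf):
    """Hypothetical decoder: trusts the 16-bit count found on the wire."""
    (count,) = struct.unpack_from(">H", buf, 0)
    offset = 2
    values = []
    for _ in range(count):
        (size,) = struct.unpack_from(">i", buf, offset)
        offset += 4
        values.append(buf[offset:offset + size])
        offset += size
    # Bytes left over would be misread as whatever field comes next.
    return values, len(buf) - offset


if __name__ == "__main__":
    sent = [b"x"] * 65537
    decoded, leftover = read_value_list(write_value_list(sent))
    print(len(sent), "values sent,", len(decoded), "decoded,", leftover, "bytes left unread")
```

In this sketch the reader has no way to tell that the count wrapped around, so the unread bytes would be interpreted as the next fields of the message.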

64k sounds like a lot, but tables with a large number of clustering dimensions can hit this
limit when fetching only a few thousand CQL rows with an IN query, e.g.

{code}
SELECT * FROM sensor_data WHERE a=? and (b,c,d,e,f,g,h,i) IN ((?,?,?,?,?,?,?,?), (?,?,?,?,?,?,?,?),
(?,?,?,?,?,?,?,?), (?,?,?,?,?,?,?,?) ... )
{code}

Here, having 8 dimensions in the clustering key plus 1 in the partitioning key restricts the
read to 8,191 CQL rows.
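
The arithmetic behind that figure can be checked with a short Python sketch. It assumes the largest count a 16-bit unsigned value can carry is 65,535; max_in_rows is a hypothetical helper name.

```python
MAX_SHORT_COUNT = 0xFFFF  # 65,535: largest count an unsigned 16-bit value can hold


def max_in_rows(clustering_columns, other_params=1, limit=MAX_SHORT_COUNT):
    """Number of IN tuples that fit once the other bound parameters are counted."""
    return (limit - other_params) // clustering_columns


if __name__ == "__main__":
    # 8 clustering columns per tuple, plus 1 parameter for the partition key.
    print(max_in_rows(8))
```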

Some other parts of Cassandra still use 16-bit sizes, for example preventing users from fetching
all elements of a large collection (CASSANDRA-6428). The suggestion at the time was "we'll
fix it in the next iteration of the binary protocol", so I'd like to suggest switching to
variable-length integers, as this would solve such issues while keeping messages short.
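
As a rough illustration of why variable-length integers keep messages short, here is a Python sketch of one common scheme (unsigned LEB128-style, 7 payload bits per byte). This issue does not specify an encoding, so the scheme and the names encode_varint and decode_varint are assumptions made for the example.

```python
def encode_varint(n):
    """Encode a non-negative integer using 7 bits per byte, low-order group first."""
    if n < 0:
        raise ValueError("only non-negative integers are supported")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)  # high bit clear: last byte
            return bytes(out)


def decode_varint(buf, offset=0):
    """Return (value, bytes_consumed) for a varint starting at offset."""
    value = 0
    shift = 0
    pos = offset
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, pos - offset
        shift += 7


if __name__ == "__main__":
    for count in (3, 127, 128, 65535, 70000):
        print(count, "->", len(encode_varint(count)), "byte(s)")
```

With this scheme, counts below 128 take a single byte (less than the current 16-bit count), while counts above 65,535 remain representable at the cost of a third byte.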



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
