cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jason Harvey (JIRA)" <>
Subject [jira] [Created] (CASSANDRA-5099) get_count sometimes returns value smaller than actual column count
Date Wed, 02 Jan 2013 22:52:14 GMT
Jason Harvey created CASSANDRA-5099:

             Summary: get_count sometimes returns value smaller than actual column count
                 Key: CASSANDRA-5099
             Project: Cassandra
          Issue Type: Bug
    Affects Versions: 1.1.7
            Reporter: Jason Harvey
            Priority: Minor

We have a CF where keys have thousands of TTLd columns. The columns are continually added
at a regular rate, and TTL out after 15 minutes. We continually run a 'get_count' on these
keys to get a count of the number of live columns.

Since we upgrade from 1.0 to 1.1.7, "get_count" regularly returns much smaller values than
are possible. For example, with  roughly 15,000 columns that have well-distributed TTLs, running
a get_count 10 times will result in 1 or 2 results that are up to half the actual column count.
Using a normal 'get' to count those columns always results in proper values. 

For example:

(all of these counts were ran within a second or less of eachother)
[default@reddit] count  AccountsActiveBySR['2qh0u'];
13665 columns
[default@reddit] count  AccountsActiveBySR['2qh0u'];
13665 columns
[default@reddit] count  AccountsActiveBySR['2qh0u'];
13666 columns
[default@reddit] count  AccountsActiveBySR['2qh0u'];
3069 columns
[default@reddit] count  AccountsActiveBySR['2qh0u'];
13660 columns
[default@reddit] count  AccountsActiveBySR['2qh0u'];
13661 columns

I should note that this issue happens much more frequently with larger (>10k columns) keys
than smaller keys. It never seems to happen with keys less than 1k columns.

There are no supercolumns in use. The key names and column names are very short, and there
are no column values. The CF is LCS, and due to the TTL only hovers around a few MB in size.

It appears that there was an issue (CASSANDRA-4833) fixed in 1.1.7 regarding get_count. Some
logic was added to prevent an infinite loop case. Could that change have resulted in this
problem somehow? I can't find any other relevant 1.1 changes that might explain this issue.

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see:

View raw message