cassandra-commits mailing list archives

From "Dobrin (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-8649) CAS per (partition key + clustering key) and not only per (partition key)
Date Wed, 21 Jan 2015 12:57:35 GMT


Dobrin commented on CASSANDRA-8649:

Hi Sylvain,

Thank you very much for sharing your opinion!
2. it's unclear to me how to expose such feature...

I started with Cassandra at CQL3, from the very beginning. I have never used Thrift or the
old query language/model.
Honestly, for me serializability per CQL row is more natural, and I believe it will probably
be the same for all users coming from the SQL world.

I like the table option where you can choose serializability per CQL row or per partition.
And I think that having serializability per CQL row as the default should not lead to any
(or too many) confused users, as they would need to change it explicitly.


> CAS per (partition key + clustering key) and not only per (partition key)
> -------------------------------------------------------------------------
>                 Key: CASSANDRA-8649
>                 URL:
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Dobrin
>             Fix For: 3.0
> Reading the description at
> ...
> * The columns updated do NOT have to be the same as the columns in the IF clause.
> * Lightweight transactions are restricted to a single partition; this is the granularity
>   at which we keep the internal Paxos state. As a corollary, transactions in different
>   partitions will never interrupt each other.
> ...
> So my understanding of the above is that multiple writers performing, for example, CAS
> inserts (INSERT ... IF NOT EXISTS) with the same partition key but different clustering keys
> will interrupt/interfere with each other?
> Is this understanding correct? (My tests seem to confirm it.)
> For example, suppose I want to model users from different countries/cities/areas and be
> able to list all the users from a given country ordered by (city, area). I know that a
> single Cassandra node will be able to store all the users from a given country, but I need
> to partition users by country because a single Cassandra node will not be enough for all of them:
> CREATE TABLE user (
> 	country text,
> 	city text,
> 	area text,
> 	id text,
> 	json text,
> 	version bigint,
> 	PRIMARY KEY ((country), city, area, id)
> );
> Where id is the user id and json is a JSON-serialized user object (an aggregate) containing
> more information about the user.
> I want to be able to CAS insert many users into the same country concurrently using
> 	INSERT INTO user(country, city, area, id, json, version) VALUES ('x',...) IF NOT EXISTS;
> and be able to CAS update users from the same country concurrently:
> 	UPDATE user SET json='{...}', version=18 WHERE country='x' AND city='y' AND area='z' AND id='123' IF version=17;
> As I understand it, this will not be efficient, because all the above concurrent statements
> will have to be "ordered" by the same Paxos instance/state for country 'x'? (And trying it
> results in a lot of WriteTimeoutExceptions.)
> If so, can we make Paxos support IF statements per column/cell?
> By cell/column I mean all the underlying persistent state that is behind the compound
> primary key (partition key + clustering key). In the above example:
> 	the state is json and version
> 	the partition key is the country
> 	and the clustering key is (city, area, id)
> (
> I'm stating this explicitly because I'm not completely sure whether this is a single cell
> or two cells underneath in the storage engine; references used:
> )
> In other words, is it possible to make CAS work per (partition key + clustering key) and
> not only per (partition key)?
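
The contention described in the quoted report can be illustrated with a toy model. This is only a sketch under stated assumptions, not Cassandra code: the names paxos_instance, contending_ops and the per_row flag are hypothetical, and the model merely counts which concurrent CAS operations would share one Paxos instance. With per_row=False it follows the per-partition granularity described in the quoted documentation; with per_row=True it follows the per-(partition key + clustering key) granularity requested in this ticket.

```python
from collections import Counter

# Hypothetical toy model: each concurrent CAS operation is identified by the
# primary key of the row it touches, (country, city, area, id).
def paxos_instance(op, per_row):
    """Return the key of the Paxos instance assumed to order this operation."""
    country, city, area, user_id = op
    # per_row=False: one instance per partition key (country).
    # per_row=True: one instance per partition key + clustering key.
    return (country, city, area, user_id) if per_row else (country,)

def contending_ops(ops, per_row):
    """Count operations that share a Paxos instance with at least one other."""
    load = Counter(paxos_instance(op, per_row) for op in ops)
    return sum(n for n in load.values() if n > 1)

# Four concurrent CAS operations; three of them target country 'x'.
ops = [
    ("x", "y", "z", "123"),
    ("x", "y", "z", "124"),
    ("x", "q", "r", "125"),
    ("w", "a", "b", "126"),
]

print(contending_ops(ops, per_row=False))  # 3: all rows in 'x' share one instance
print(contending_ops(ops, per_row=True))   # 0: every row has its own instance
```

In this model the three operations on country 'x' contend only because the instance key is the partition key alone; it says nothing about the cost of keeping Paxos state at a finer granularity.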

This message was sent by Atlassian JIRA
