Cassandra: Mixing regular writes with LWT writes
Mixing regular writes with LWT (lightweight transaction) writes on the same column in Cassandra is a sure way to lose data. DataStax warns that data loss can occur if LWT writes and normal writes are used together on the same partition. If you are seeing missing updates in Cassandra, check whether the table receives mixed operations.
Cassandra provides atomic compare-and-set (CAS) operations through LWT. Linearisable consistency holds only when all operations use LWT (serial reads and LWT writes). Cassandra cannot provide this consistency in mixed mode because LWT and regular writes use different timestamp mechanisms.
For example, consider the case where we insert a record the regular way (without IF NOT EXISTS) and then do a conditional update on it. Even if the conditional update is reported as applied, the update might not be reflected in a subsequent select. One might assume this is rare, but I have seen it happen regularly when there is clock skew, especially if the update immediately follows the insert. Cassandra drivers default to client-side timestamps for regular writes, and the driver might generate timestamps that lie in the future if we issue more than one query per microsecond or if the system clock itself is skewed.
INSERT ...
UPDATE ... IF ...
SELECT ...
The select query might not return the updated value; instead you get the value inserted in the first step. (This does not always happen; it requires clock skew. If you want to reproduce it, follow this procedure; the insert needs a future timestamp for the case to reproduce consistently.)
This might come as a surprise to developers, because from the application's perspective the operations are serial: they happen one after another according to the wall clock. From Cassandra's perspective the operations may not be serial, because it uses different clocks for regular writes and LWT. We can understand this with the following figure. (Cassandra's internals might differ; this is a rough approximation of what is going on, and the clock is skewed only occasionally.)
Although B’s update happened after A’s insert, B’s update is lost because A’s insert is registered with a future timestamp (due to clock skew or timestamp generation drift). For this reason, if linearisable consistency is important to your application, always use LWT for every operation, even at the cost of performance (LWT requires 4 round trips within the Cassandra cluster).
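The timestamp drift mentioned earlier can be illustrated with a small sketch. It is a toy model, not the driver's actual code: the class name `MonotonicTimestamps` and the helper `drift_after` are hypothetical, and the sketch assumes a client-side generator that refuses to hand out the same microsecond twice. With the clock frozen, every extra call within the same microsecond pushes the generated timestamp one step further into the future.

```python
class MonotonicTimestamps:
    """Toy client-side generator (hypothetical, for illustration only).

    It returns the current time in microseconds, but never the same value
    twice, so bursts of calls push timestamps ahead of the real clock.
    """

    def __init__(self, clock):
        self._clock = clock  # callable returning "now" in microseconds
        self._last = 0

    def next(self):
        now = self._clock()
        # Use the clock if it has advanced, otherwise bump the last value.
        self._last = max(now, self._last + 1)
        return self._last


def drift_after(calls, frozen_now=1_000_000):
    """Return how far ahead of the clock the last timestamp is, in microseconds.

    The clock is frozen at `frozen_now` to model many queries issued
    within a single microsecond.
    """
    gen = MonotonicTimestamps(lambda: frozen_now)
    last = frozen_now
    for _ in range(calls):
        last = gen.next()
    return last - frozen_now
```

Under these assumptions a single call shows no drift, while a burst of calls against an unmoving clock runs ahead by roughly one microsecond per call. A skewed system clock has the same effect without any burst.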
INSERT ... IF NOT EXISTS
UPDATE ... IF ...
SELECT ...
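To see why the update is lost in the mixed sequence, here is a minimal sketch of last-write-wins reconciliation. It is a simplification and not Cassandra's implementation: `Cell`, `reconcile`, and `replay` are hypothetical names, timestamps are plain integers, and ties simply keep the existing value. The sketch assumes the regular insert carries a client-generated timestamp while the conditional update carries a timestamp from a different clock.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Cell:
    value: str
    timestamp: int  # write timestamp in microseconds


def reconcile(current: Optional[Cell], incoming: Cell) -> Cell:
    """Last-write-wins: keep whichever cell has the higher timestamp.

    Simplification: on a tie the existing cell is kept.
    """
    if current is None or incoming.timestamp > current.timestamp:
        return incoming
    return current


def replay(insert_ts: int, update_ts: int) -> str:
    """Apply a regular insert, then a conditional update, and read the result.

    The update is always applied second in wall-clock order; only the
    timestamps decide which value a later read returns.
    """
    cell = reconcile(None, Cell("inserted", insert_ts))
    cell = reconcile(cell, Cell("updated", update_ts))
    return cell.value
```

With `insert_ts` below `update_ts`, the read returns the updated value. With an insert timestamp that lies in the future, the same two operations in the same order leave the inserted value in place, which is the lost update described above. When every write draws its timestamp from the same mechanism, as in the all-LWT sequence, this inversion is not expected to arise.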