spark-user mailing list archives

From Priya Ch <>
Subject Writing streaming data to cassandra creates duplicates
Date Sun, 26 Jul 2015 18:19:54 GMT
Hi All,

 I have a problem when writing streaming data to Cassandra. Our existing
product runs on an Oracle DB, where locks are maintained while writing data
so that duplicates in the DB are avoided.

But since Spark has a parallel processing architecture, if more than one
thread tries to write the same data, i.e. rows with the same primary key, is
there any scope for creating duplicates? If yes, how can this problem be
addressed, either from the Spark side or from the Cassandra side?
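To make the scenario concrete, here is a minimal CQL sketch (keyspace, table, and column names are hypothetical) of two writers inserting rows with the same primary key:

```sql
-- Hypothetical table; names are illustrative only
CREATE TABLE ks.events (
    event_id text PRIMARY KEY,
    payload  text
);

-- Two parallel writers inserting the same primary key:
INSERT INTO ks.events (event_id, payload) VALUES ('e1', 'first');
INSERT INTO ks.events (event_id, payload) VALUES ('e1', 'second');

-- Cassandra treats INSERT as an upsert: the table ends up with a
-- single row for 'e1' (last write wins by timestamp), not two rows.
```

So the question is whether this last-write-wins upsert behavior is enough, or whether additional coordination (such as lightweight transactions with IF NOT EXISTS) is needed on the Spark or Cassandra side.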

Padma Ch
