[ https://issues.apache.org/jira/browse/CASSANDRA-4782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14966396#comment-14966396 ]
Hervé Toulan edited comment on CASSANDRA-4782 at 10/21/15 8:22 AM:
-------------------------------------------------------------------
Hi,
I know this bug is fixed now, but I lost data in production on several nodes
running Cassandra 1.1.0 (Java 6 + Red Hat).
I've been told that NTP and network issues occurred, and that the Cassandra servers were restarted,
probably due to a power outage.
How can I determine whether the data loss is due to the bug described here? Is it reproducible?
I can't decide to upgrade my production servers without solid evidence...
Thanks,
Hervé
was (Author: htoulan):
Hi,
I know this bug is fixed now, but I lost data in production on several nodes
running Cassandra 1.1.0.
I've been told that NTP and network issues occurred, and that the Cassandra servers were restarted,
probably due to a power outage.
How can I determine whether the data loss is due to the bug described here? Is it reproducible?
I can't decide to upgrade my production servers without solid evidence...
Thanks,
Hervé
> Commitlog not replayed after restart
> ------------------------------------
>
> Key: CASSANDRA-4782
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4782
> Project: Cassandra
> Issue Type: Bug
> Affects Versions: 1.1.0
> Reporter: Fabien Rousseau
> Assignee: Jonathan Ellis
> Priority: Critical
> Fix For: 1.1.6
>
> Attachments: 4782.txt
>
>
> It seems that there are two corner cases where the commitlog is not replayed after a restart:
> - After a reboot of a server + restart of Cassandra (1.1.0 to 1.1.4)
> - After an upgrade from Cassandra 1.1.X to Cassandra 1.1.5
> This is because the commitlog segment id is expected to always be an incrementing number
> (see this condition: https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java#L247 )
> But this assumption can be broken:
> In the first case, the id is generated by System.nanoTime(), but it seems that System.nanoTime()
> uses the boot time as its base/reference (at least on Java 6 & Linux), so after a
> reboot System.nanoTime() can return a lower number than before the reboot (and the javadoc
> says the reference is a relative point in time...)
> In the second case, this was introduced by #4601 (which replaces System.nanoTime() with
> System.currentTimeMillis(), so people starting with 1.1.5 are safe)
> This could explain the following tickets: #4741 and #4481
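
For illustration, here is a rough sketch of the two cases described in the issue. It is not the Cassandra code: the class SegmentIdSketch, the method wouldReplay and all the id values are hypothetical, and the check is a simplified model that assumes a segment is replayed only if its id is not lower than the last flushed segment id.

```java
public class SegmentIdSketch {
    // Simplified stand-in for the replay condition (an assumption, not the
    // real CommitLogReplayer logic): segments whose id is lower than the
    // last flushed segment id are treated as already flushed and skipped.
    public static boolean wouldReplay(long segmentId, long lastFlushedSegmentId) {
        return segmentId >= lastFlushedSegmentId;
    }

    public static void main(String[] args) {
        // Hypothetical nanoTime-based id taken after 30 days of uptime.
        long nanoIdBeforeReboot = 30L * 24 * 3600 * 1000000000L;
        // Hypothetical nanoTime-based id taken 5 minutes after a reboot,
        // assuming the counter restarts from the boot time.
        long nanoIdAfterReboot = 5L * 60 * 1000000000L;
        // Hypothetical currentTimeMillis-based ids (around October 2012).
        long millisId = 1350000000000L;
        long laterMillisId = millisId + 1000L;

        // Case 1: reboot with nanoTime-based ids, new segment looks older.
        System.out.println(wouldReplay(nanoIdAfterReboot, nanoIdBeforeReboot)); // false
        // Case 2: upgrade from nanoTime-based to millis-based ids.
        System.out.println(wouldReplay(millisId, nanoIdBeforeReboot)); // false
        // Millis-based ids only: wall-clock time keeps increasing across reboots.
        System.out.println(wouldReplay(laterMillisId, millisId)); // true
    }
}
```

In both "false" cases the newer segment has a smaller id than the one recorded before the restart, so under this model it would be skipped instead of replayed.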
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)