cassandra-commits mailing list archives

From "Brandon Williams (Assigned) (JIRA)"
Subject [jira] [Assigned] (CASSANDRA-3248) CommitLog writer should call fdatasync instead of fsync
Date Mon, 28 Nov 2011 21:33:40 GMT


Brandon Williams reassigned CASSANDRA-3248:

    Assignee: Rick Branson  (was: Brandon Williams)
> CommitLog writer should call fdatasync instead of fsync
> -------------------------------------------------------
>                 Key: CASSANDRA-3248
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 0.6.13, 0.7.9, 0.8.6, 1.0.0, 1.1
>         Environment: Linux
>            Reporter: Zhu Han
>            Assignee: Rick Branson
>   Original Estimate: 48h
>  Remaining Estimate: 48h
> CommitLogSegment uses SequentialWriter to flush the buffered data to the log device. It depends
on FileDescriptor#sync(), which invokes fsync() and so also forces the file attributes to disk.
> However, at least on Linux, fdatasync() is good enough for commit log flush:
> bq. fdatasync() is similar to fsync(), but does not flush modified metadata unless that
metadata is needed in order to allow a subsequent data retrieval to be correctly handled.
For example, changes to st_atime or st_mtime (respectively, time of last access and time
of last modification; see stat(2)) do not require flushing because they are not necessary
for a subsequent data read to be handled correctly. On the other hand, a change to the file
size (st_size, as made by say ftruncate(2)), would require a metadata flush.
> The file size is also synced to disk by fdatasync(). Although the commit log recovery logic
sorts the commit log segments by their modification timestamp, this sort can be removed safely, IMHO.
> I checked the native code of JRE 6. On Linux and Solaris, FileChannel#force(false) invokes
fdatasync(). On Windows, the false flag does not have any impact.
> On my log device (commodity SATA HDD, write cache disabled), there is a large performance
gap between fsync() and fdatasync():
> {quote}
> $ sysbench --test=fileio --num-threads=1 --file-num=1 --file-total-size=10G --file-fsync-all=on
--file-fsync-mode=fdatasync --file-test-mode=seqwr --max-time=600 --file-block-size=2K
--max-requests=0 run
> 54.90 Requests/sec executed
>    per-request statistics:
>          min:                                  8.29ms
>          avg:                                 18.18ms
>          max:                                108.36ms
>          approx.  95 percentile:              25.02ms
> $ sysbench --test=fileio --num-threads=1 --file-num=1 --file-total-size=10G --file-fsync-all=on
--file-fsync-mode=fsync --file-test-mode=seqwr --max-time=600 --file-block-size=2K
--max-requests=0 run
> 28.08 Requests/sec executed
>     per-request statistics:
>          min:                                 33.28ms
>          avg:                                 35.61ms
>          max:                                911.87ms
>          approx.  95 percentile:              41.69ms
> {quote}
> I do think this is a very critical performance improvement.
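
The difference the report describes can be sketched from the Java side as follows. This is only an illustration, not Cassandra's CommitLogSegment or SequentialWriter code: the class name CommitLogSyncSketch and the method appendAndSync are hypothetical, and it uses the NIO.2 file API from later Java versions rather than JRE 6. Per the report, FileChannel#force(false) maps to fdatasync() on Linux and Solaris, while force(true) also asks for file metadata to be flushed, as FileDescriptor#sync() does.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch; not part of Cassandra.
public class CommitLogSyncSketch {

    /**
     * Appends one record to the log file and forces it to the storage device.
     * syncMetadata = false requests a data-only sync (fdatasync() on Linux,
     * according to the report); true also requests a metadata sync.
     * Returns the file size after the append.
     */
    public static long appendAndSync(Path log, byte[] record, boolean syncMetadata)
            throws IOException {
        try (FileChannel channel = FileChannel.open(log,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.APPEND)) {
            ByteBuffer buffer = ByteBuffer.wrap(record);
            // A single write() may be partial, so loop until the buffer is drained.
            while (buffer.hasRemaining()) {
                channel.write(buffer);
            }
            channel.force(syncMetadata);
            return channel.size();
        }
    }

    public static void main(String[] args) throws IOException {
        Path log = Files.createTempFile("commitlog-sketch", ".log");
        try {
            byte[] record = "mutation\n".getBytes(StandardCharsets.UTF_8);
            long sizeAfterDataSync = appendAndSync(log, record, false);
            long sizeAfterFullSync = appendAndSync(log, record, true);
            System.out.println("size after force(false): " + sizeAfterDataSync);
            System.out.println("size after force(true):  " + sizeAfterFullSync);
        } finally {
            Files.deleteIfExists(log);
        }
    }
}
```

The sketch shows only which call is made; any latency difference between the two modes depends on the device and filesystem and would need to be measured, as the sysbench runs above do.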

This message is automatically generated by JIRA.