cassandra-commits mailing list archives

From "Alan Boudreault (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-7386) JBOD threshold to prevent unbalanced disk utilization
Date Tue, 18 Nov 2014 21:57:35 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14216873#comment-14216873
] 

Alan Boudreault commented on CASSANDRA-7386:
--------------------------------------------

devs, I've tested this issue with and without the patch and analysed the disk usage in 3 scenarios.
The patch works well and fixes important issues related to multiple directories. I'm sharing
the results with you, along with the graphs (attached below):

For all my tests, I was able to reproduce the issues using multiple directories. There is no
need to *hammer* the node with compaction and repair; I simply limited concurrent_compactors
and compaction_throughput_mb_per_sec to slow things down. This keeps the disk busy during
disk selection.

h4. Test 1

* 2 Disks of the same size
* Goal: stress the server to fill all disks

h5. Result - No Patch

Only one disk is filled; the other one is never filled. cassandra-stress crashed with a WriteTimeoutException
while the second disk remained at ~20% disk usage.

h5. Result - With Patch

Success. Both disks are filled at approximately the same speed.

h4. Test 2

* 5 disks total of the same size
* 2 disks initially filled at ~20% 
* 3 disks added later
* Goal: stress the server to fill all disks

h5. Result - No Patch

* The first 2 disks aren't used at the beginning since they are already at 20% disk usage.
(That's OK.)
* Some new data is written.
* 2 of the newly added disks are used for the initial data; once they reach 20% disk usage,
all 4 disks are filled at approximately the same speed.
* The last disk, which is running a compaction, is almost never used and remains at 15% disk
usage when cassandra-stress crashes with write timeouts.

h5. Result - With Patch

Success. All disks were filled at approximately the same speed. I noticed that Cassandra
doesn't wait until all 3 newly added disks are at 20% before re-using disks 1 and 2, but it
keeps things OK and reduces the difference over the run.

h4. Test 3

* 5 disks total. 
* 4 disks of 2G each
* 1 disk of 10G (5x larger than the others)
* Goal: stress the server to fill all disks

h5. Result - No Patch

* Disk #5 (10G) is used initially, then an internal compaction is started.
* The other 4 disks are completely filled and disk 5 is never used again. cassandra-stress
crashes with a write timeout while disk 5 remains at 15% disk usage with more than 8G of free
space.

h5. Result - With Patch 

Success. All 5 disks are filled at approximately the same speed.

See the result images attached below.

> JBOD threshold to prevent unbalanced disk utilization
> -----------------------------------------------------
>
>                 Key: CASSANDRA-7386
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7386
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Chris Lohfink
>            Assignee: Robert Stupp
>            Priority: Minor
>             Fix For: 2.1.3
>
>         Attachments: 7386-2.0-v3.txt, 7386-2.1-v3.txt, 7386-v1.patch, 7386v2.diff, Mappe1.ods,
mean-writevalue-7disks.png, patch_2_1_branch_proto.diff, sstable-count-second-run.png, test1_no_patch.jpg,
test1_with_patch.jpg, test2_no_patch.jpg, test2_with_patch.jpg, test3_no_patch.jpg, test3_with_patch.jpg
>
>
> Currently the disks are picked first by number of current tasks, then by free
space.  This helps with performance but can lead to large differences in utilization in some
(unlikely but possible) scenarios.  I've seen 55% to 10% and heard reports of 90% to 10% on
IRC.  This happens with both LCS and STCS (although my suspicion is that STCS makes it worse
since it is harder to keep balanced).
> I propose the algorithm change a little to have some maximum range of utilization where
it will pick by free space over load (acknowledging it can be slower).  So if a disk A is
30% full and disk B is 5% full it will never pick A over B until it balances out.
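
The selection rule proposed in the description above can be illustrated with a minimal sketch. This is not the code from the attached patches and does not use Cassandra's internal classes: `DiskPicker`, `Disk`, `pick` and the `maxSpread` parameter are hypothetical names introduced here, and the sketch is only one possible reading of "maximum range of utilization". It assumes that a disk whose utilization exceeds that of the least-utilized disk by more than `maxSpread` is skipped, and that the remaining disks are ordered as today (fewest current tasks first, then most free space).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Illustrative only: hypothetical names, not Cassandra's actual directory-selection code.
final class DiskPicker {

    /** Hypothetical snapshot of one data directory. */
    static final class Disk {
        final String name;
        final long totalBytes;
        final long usedBytes;
        final int pendingTasks;

        Disk(String name, long totalBytes, long usedBytes, int pendingTasks) {
            this.name = name;
            this.totalBytes = totalBytes;
            this.usedBytes = usedBytes;
            this.pendingTasks = pendingTasks;
        }

        double utilization() { return (double) usedBytes / totalBytes; }

        long freeBytes() { return totalBytes - usedBytes; }
    }

    // Ordering described in the issue: fewest current tasks first, then most free space.
    static final Comparator<Disk> BY_LOAD_THEN_FREE =
        Comparator.comparingInt((Disk d) -> d.pendingTasks)
                  .thenComparing(Comparator.comparingLong((Disk d) -> d.freeBytes()).reversed());

    /**
     * Picks a disk from a non-empty list. Disks whose utilization is more than
     * maxSpread (a fraction, e.g. 0.10) above the least-utilized disk are excluded;
     * the remaining candidates are ordered by load, then by free space.
     */
    static Disk pick(List<Disk> disks, double maxSpread) {
        double minUtil = Double.MAX_VALUE;
        for (Disk d : disks) {
            minUtil = Math.min(minUtil, d.utilization());
        }
        List<Disk> candidates = new ArrayList<>();
        for (Disk d : disks) {
            if (d.utilization() - minUtil <= maxSpread) {
                candidates.add(d);
            }
        }
        // The least-utilized disk is always a candidate, so the list is never empty here.
        return Collections.min(candidates, BY_LOAD_THEN_FREE);
    }

    public static void main(String[] args) {
        // Disk A is 30% full and idle; disk B is 5% full and busy.
        Disk a = new Disk("A", 100, 30, 0);
        Disk b = new Disk("B", 100, 5, 3);
        System.out.println(pick(Arrays.asList(a, b), 0.10).name);
    }
}
```

With the 30% / 5% example from the description and an assumed `maxSpread` of 0.10, this sketch picks B even though A has fewer current tasks. Once the two disks are within the allowed range, the load-based ordering applies again.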



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
