[ https://issues.apache.org/jira/browse/CASSANDRA-14239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16433637#comment-16433637 ]
Jürgen Albersdorfer commented on CASSANDRA-14239:
-------------------------------------------------
I changed the following settings in cassandra.yaml:
disk_optimization_strategy: ssd
memtable_heap_space_in_mb: 2048
memtable_offheap_space_in_mb: 2048
Streaming was much faster and produced less CPU pressure than before:
{code:java}
-dsk/total- ---system-- ----total-cpu-usage---- --io/total- -net/total-
read writ| int csw |usr sys idl wai hiq siq| read writ| recv send
9830B 31M| 48k 7751 | 67 2 31 0 0 1|0.20 85.8 | 30M 380k
0 28M| 51k 7838 | 65 2 32 0 0 1| 0 80.9 | 33M 511k
32k 35M| 54k 9024 | 66 2 31 0 0 1|0.60 102 | 37M 540k
0 28M| 41k 7072 | 62 2 36 0 0 1| 0 78.1 | 26M 265k
1638B 25M| 41k 6606 | 62 1 36 0 0 0|0.10 67.6 | 25M 110k
1638B 26M| 41k 7251 | 57 1 41 0 0 0|0.10 69.9 | 27M 138k
819B 24M| 40k 6129 | 56 1 42 0 0 1|0.20 61.5 | 25M 127k
0 25M| 38k 7273 | 56 1 42 0 0 0| 0 66.9 | 26M 162k
1024k 24M| 35k 6501 | 56 1 42 0 0 0|25.2 62.8 | 25M 128k
0 24M| 37k 7238 | 56 1 42 0 0 0| 0 62.6 | 26M 164k
0 24M| 35k 6349 | 56 1 42 0 0 0| 0 63.5 | 25M 145k
410B 26M| 40k 6979 | 56 2 42 0 0 0|0.10 73.1 | 28M 341k
0 28M| 41k 7042 | 56 1 42 0 0 0| 0 70.8 | 30M 350k
2048B 31M| 44k 7334 | 56 2 42 0 0 0|0.20 85.4 | 32M 347k
0 31M| 46k 6515 | 56 1 42 0 0 1| 0 86.0 | 33M 383k
0 30M| 47k 7572 | 56 1 42 0 0 1| 0 82.3 | 33M 466k
7373B 31M| 41k 5742 | 56 1 42 0 0 0|0.20 84.3 | 30M 319k
0 30M| 43k 7146 | 56 2 42 0 0 1| 0 87.4 | 28M 423k
{code}
When `Received complete` was reported for all nodes, the bootstrap still did not finish, and I can observe
* a stalled `Completed` count for MutationStage,
* while the `Pending` count for MutationStage seems to skyrocket.
* The rest looks fine to me :(
{code:java}
nodetool tpstats
Pool Name                      Active    Pending    Completed   Blocked   All time blocked
ReadStage                           0          0            0         0                  0
MiscStage                           0          0            0         0                  0
CompactionExecutor                  2          7           53         0                  0
MutationStage                     128    5722021    593964000         0                  0
MemtableReclaimMemory               0          0         2194         0                  0
PendingRangeCalculator              0          0           19         0                  0
GossipStage                         0          0        25736         0                  0
SecondaryIndexManagement            0          0            0         0                  0
HintsDispatcher                     0          0            0         0                  0
RequestResponseStage                0          0       167108         0                  0
ReadRepairStage                     0          0            0         0                  0
CounterMutationStage                0          0            0         0                  0
MigrationStage                      0          0           40         0                  0
MemtablePostFlush                   1         11         2344         0                  0
PerDiskMemtableFlushWriter_0        0          0         2194         0                  0
ValidationExecutor                  0          0            0         0                  0
Sampler                             0          0            0         0                  0
MemtableFlushWriter                 2         11         2194         0                  0
InternalResponseStage               0          0           31         0                  0
ViewMutationStage                   0          0            0         0                  0
AntiEntropyStage                    0          0            0         0                  0
CacheCleanupExecutor                0          0            0         0                  0
Message type Dropped
READ 0
RANGE_SLICE 0
_TRACE 0
HINT 0
MUTATION 0
COUNTER_MUTATION 0
BATCH_STORE 0
BATCH_REMOVE 0
REQUEST_RESPONSE 0
PAGED_RANGE 0
READ_REPAIR 0
{code}
*Why does `MutationStage` now hang while staying busy? At the same time:*
* The SlabPoolCleaner thread permanently uses a single logical CPU at 100%.
* G1 Old Gen increases linearly over time and goes far beyond 50GB.
* See the attached [^gc.log.201804111141.zip] at [gceasy.io|http://gceasy.io/diamondgc-report.jsp?oTxnId_value=5c97d52f-1d06-4d28-8ab7-dd9bd58311b7]
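To watch for the pattern described above (a `Pending` count that keeps growing while `Completed` stays put), something like the following could be run against two `nodetool tpstats` snapshots. This is only a rough sketch: the function names and the `pending_threshold` default are made up for illustration, and the sample text is a shortened copy of the output above.

```python
# Shortened copy of the tpstats output from this comment.
SAMPLE = """\
Pool Name                      Active    Pending    Completed   Blocked   All time blocked
CompactionExecutor                  2          7           53         0                  0
MutationStage                     128    5722021    593964000         0                  0
MemtableFlushWriter                 2         11         2194         0                  0
Message type           Dropped
MUTATION                     0
"""


def parse_tpstats(text):
    """Return {pool name: counters} for the thread-pool rows of tpstats output."""
    pools = {}
    for line in text.splitlines():
        parts = line.split()
        # A pool row is a name followed by exactly five integer columns;
        # the header and the "Message type" rows do not match this shape.
        if len(parts) == 6 and all(p.isdigit() for p in parts[1:]):
            active, pending, completed, blocked, all_time_blocked = map(int, parts[1:])
            pools[parts[0]] = {
                "active": active,
                "pending": pending,
                "completed": completed,
                "blocked": blocked,
                "all_time_blocked": all_time_blocked,
            }
    return pools


def backlogged(pools, pending_threshold=1000):
    """Pools whose pending count exceeds the (arbitrary, hypothetical) threshold."""
    return sorted(n for n, s in pools.items() if s["pending"] > pending_threshold)


def stalled(before, after):
    """Pools whose completed count did not move while pending grew."""
    return sorted(
        n
        for n in before.keys() & after.keys()
        if after[n]["completed"] == before[n]["completed"]
        and after[n]["pending"] > before[n]["pending"]
    )


if __name__ == "__main__":
    print(backlogged(parse_tpstats(SAMPLE)))
```

With the sample above, only MutationStage exceeds the threshold. `stalled` needs a second snapshot taken some time later; it is not applied here because only one snapshot was posted.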
> OutOfMemoryError when bootstrapping with less than 100GB RAM
> ------------------------------------------------------------
>
> Key: CASSANDRA-14239
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14239
> Project: Cassandra
> Issue Type: Bug
> Environment: Details of the bootstrapping Node
> * ProLiant BL460c G7
> * 56GB RAM
> * 2x 146GB 10K HDD (One dedicated for Commitlog, one for Data, Hints and saved_caches)
> * CentOS 7.4 on SD-Card
> * /tmp and /var/log on tmpfs
> * Oracle JDK 1.8.0_151
> * Cassandra 3.11.1
> Cluster
> * 10 existing Nodes (Up and Normal)
> Reporter: Jürgen Albersdorfer
> Priority: Major
> Attachments: Objects-by-class.csv, Objects-with-biggest-retained-size.csv, cassandra-env.sh, cassandra.yaml, gc.log.0.current.zip, gc.log.201804111141.zip, jvm.options, jvm_opts.txt, stack-traces.txt
>
>
> Hi, I face an issue when bootstrapping a node with less than 100GB RAM on our 10-node C* 3.11.1 cluster.
> During bootstrap, when I watch cassandra.log, I observe growth in the JVM heap Old Gen that is no longer significantly freed.
> I know that the JVM collects Old Gen only when really needed. I can see collections, but there is always a remainder that seems to grow forever without ever being freed.
> After the node has successfully joined the cluster, I can remove the extra RAM I gave it for bootstrapping without any further effect.
> It feels as if Cassandra never forgets a single byte streamed over the network during bootstrapping, which would be a memory leak and a major problem, too.
> I was able to produce a HeapDumpOnOutOfMemoryError from a 56GB node (40 GB assigned to the JVM heap). YourKit Profiler shows a huge amount of memory allocated for org.apache.cassandra.db.Memtable (22 GB), org.apache.cassandra.db.rows.BufferCell (19 GB), and java.nio.HeapByteBuffer (11 GB).