cassandra-commits mailing list archives

From "Alexander Simmerl (JIRA)" <j...@apache.org>
Subject [jira] Commented: (CASSANDRA-1177) OutOfMemory on heavy inserts
Date Thu, 10 Jun 2010 15:13:15 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-1177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12877445#action_12877445 ]

Alexander Simmerl commented on CASSANDRA-1177:
----------------------------------------------

We tried reducing MemtableOperationsInMillions from 1 to 0.1 and setting MemtableFlushAfterMinutes
to 1. I also increased and decreased the heap size. As you can see in the attachment, all nodes
are roughly evenly loaded. Only 10.12.22.117 shows a large difference, but that appeared after
the crashes; before them it matched the other nodes.

None of these changes helped. We also saw the Gossiper flapping:


 INFO [GC inspection] 2010-06-10 16:10:38,790 GCInspector.java (line 110) GC for ConcurrentMarkSweep: 23943 ms, 8915640 reclaimed leaving 2151863720 used; max is 2263941120
 INFO [GMFD:1] 2010-06-10 16:10:38,790 Gossiper.java (line 568) InetAddress /10.12.22.116 is now UP
 INFO [Timer-1] 2010-06-10 16:10:55,846 Gossiper.java (line 179) InetAddress /10.12.22.116 is now dead.
 INFO [GC inspection] 2010-06-10 16:10:55,846 GCInspector.java (line 110) GC for ConcurrentMarkSweep: 16730 ms, 8592904 reclaimed leaving 2152186664 used; max is 2263941120
 INFO [GMFD:1] 2010-06-10 16:10:55,846 Gossiper.java (line 568) InetAddress /10.12.22.116 is now UP
 INFO [Timer-1] 2010-06-10 16:11:20,004 Gossiper.java (line 179) InetAddress /10.12.22.116 is now dead.
 INFO [GC inspection] 2010-06-10 16:11:20,004 GCInspector.java (line 110) GC for ConcurrentMarkSweep: 24118 ms, 8148936 reclaimed leaving 2152641776 used; max is 2263941120
 INFO [Timer-1] 2010-06-10 16:11:20,004 Gossiper.java (line 179) InetAddress /10.12.22.115 is now dead.
 INFO [GMFD:1] 2010-06-10 16:11:20,004 Gossiper.java (line 568) InetAddress /10.12.22.116 is now UP
 INFO [GMFD:1] 2010-06-10 16:11:20,004 Gossiper.java (line 568) InetAddress /10.12.22.115 is now UP
 INFO [Timer-1] 2010-06-10 16:11:36,610 Gossiper.java (line 179) InetAddress /10.12.22.116 is now dead.
 INFO [GC inspection] 2010-06-10 16:11:36,910 GCInspector.java (line 110) GC for ConcurrentMarkSweep: 16591 ms, 7905120 reclaimed leaving 2152871040 used; max is 2263941120
 INFO [GMFD:1] 2010-06-10 16:11:36,910 Gossiper.java (line 568) InetAddress /10.12.22.116 is now UP
 INFO [Timer-1] 2010-06-10 16:12:01,268 Gossiper.java (line 179) InetAddress /10.12.22.116 is now dead.
 INFO [Timer-1] 2010-06-10 16:12:01,268 Gossiper.java (line 179) InetAddress /10.12.22.115 is now dead.
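
To put numbers on the GCInspector entries in the log above, here is a minimal Python sketch that extracts the pause length and heap occupancy from each entry. It is only an illustration: the names GC_LINE and summarize_gc are hypothetical, and the regular expression assumes the exact line layout shown in this log, with each wrapped entry joined back onto one line.

```python
import re

# Assumed layout of a GCInspector entry, taken from the log in this comment
# (wrapped halves joined onto one line). Not an official format description.
GC_LINE = re.compile(
    r"GC for (?P<collector>\w+): (?P<pause_ms>\d+) ms, "
    r"(?P<reclaimed>\d+) reclaimed leaving (?P<used>\d+) used; "
    r"max is (?P<max>\d+)"
)


def summarize_gc(line):
    """Return pause and heap figures for one GCInspector line, or None."""
    m = GC_LINE.search(line)
    if m is None:
        return None  # not a GC entry, e.g. a Gossiper line
    heap_max = int(m.group("max"))
    return {
        "collector": m.group("collector"),
        "pause_s": int(m.group("pause_ms")) / 1000.0,
        # Share of the maximum heap still in use after the collection.
        "used_fraction": int(m.group("used")) / heap_max,
        # Share of the maximum heap that the collection freed.
        "reclaimed_fraction": int(m.group("reclaimed")) / heap_max,
    }


# The four GC entries from the log above.
SAMPLE = [
    "INFO [GC inspection] 2010-06-10 16:10:38,790 GCInspector.java (line 110) GC for ConcurrentMarkSweep: 23943 ms, 8915640 reclaimed leaving 2151863720 used; max is 2263941120",
    "INFO [GC inspection] 2010-06-10 16:10:55,846 GCInspector.java (line 110) GC for ConcurrentMarkSweep: 16730 ms, 8592904 reclaimed leaving 2152186664 used; max is 2263941120",
    "INFO [GC inspection] 2010-06-10 16:11:20,004 GCInspector.java (line 110) GC for ConcurrentMarkSweep: 24118 ms, 8148936 reclaimed leaving 2152641776 used; max is 2263941120",
    "INFO [GC inspection] 2010-06-10 16:11:36,910 GCInspector.java (line 110) GC for ConcurrentMarkSweep: 16591 ms, 7905120 reclaimed leaving 2152871040 used; max is 2263941120",
]

if __name__ == "__main__":
    for entry in SAMPLE:
        s = summarize_gc(entry)
        print(
            f"{s['collector']}: pause {s['pause_s']:.1f} s, "
            f"heap {s['used_fraction']:.1%} full, "
            f"reclaimed {s['reclaimed_fraction']:.2%}"
        )
```

For the first entry this works out to a pause of roughly 24 seconds with the heap still about 95% full and well under 1% of it reclaimed. Each Gossiper "is now UP" line in the log carries the same timestamp as a GC entry, so the flapping lines up with the end of these pauses.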

> OutOfMemory on heavy inserts
> ----------------------------
>
>                 Key: CASSANDRA-1177
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1177
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.6.2
>         Environment: SunOS 5.10, x86 32bit, Java HotSpot Server VM 11.2-b01 mixed mode
> Sun SDK 1.6.0_12-b04
>            Reporter: Torsten Curdt
>            Priority: Critical
>         Attachments: bug report.zip
>
>
> We have a cluster of 6 Cassandra 0.6.2 nodes running under SunOS (see environment).
> On initial import (using the Thrift API) we see some weird behavior on half of the cluster.
> While cas04-06 look fine, as you can see from the attached munin graphs, the other 3 nodes
> kept on GCing (see log file) until they became unreachable and went OOM. (This is also why
> the stats are so spotty: munin could no longer reach the boxes.) We have seen the same behavior
> on 0.6.2 and 0.6.1. This started after around 100 million inserts.
> Looking at the hprof (which is of course too big to attach) we see lots of ConcurrentSkipListMap$Node's
> and quite a few Column objects. Please see the stats attached.
> This looks similar to https://issues.apache.org/jira/browse/CASSANDRA-1014 but we are
> not sure it really is the same.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

