[ https://issues.apache.org/jira/browse/CASSANDRA-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14939944#comment-14939944 ]
Fernando Gonçalves commented on CASSANDRA-10233:
------------------------------------------------
Hi [~nutbunnies], I work together with Eiti Kimura at Movile, and this issue is happening
in one of our Cassandra clusters.
I'll try to answer your questions:
- how many nodes?
Currently we are running with 15 nodes, in 2 racks, in the same datacenter. One rack has 7
nodes and the other has 8 nodes.
- assuming rolling upgrade
I didn't understand whether this is a question, but what I can say is that we already upgraded
to version 2.1.9 yesterday, and the problem started when we added 7 new nodes to the cluster
a week ago. We added one node at a time, waiting for each node to join the cluster before
starting to join the next one.
- jdk change?
We have been using the same version for a long time, Java HotSpot 1.8.0_45-b14.
- roughly how long was each node unavailable
pompeia1 14:52:37 up 126 days
pompeia2 14:52:37 up 126 days
pompeia3 14:52:37 up 126 days
pompeia4 14:52:37 up 126 days
pompeia5 14:52:37 up 126 days
pompeia6 14:52:37 up 126 days
pompeia7 14:52:37 up 82 days
pompeia8 14:52:37 up 82 days
pompeia9 14:52:37 up 7 days
pompeia10 14:52:37 up 7 days
pompeia11 14:52:37 up 7 days
pompeia12 14:52:37 up 7 days
pompeia13 14:52:37 up 7 days
pompeia14 14:52:37 up 7 days
pompeia15 14:52:37 up 7 days
- gc_grace value of table with broken hint
- values of max_hint_window_in_ms, max_hints_delivery_threads, hinted_handoff_enabled,
hinted_handoff_throttle_in_kb in cassandra.yaml
We are not sure which table is the problematic one, but we think it is our largest (by record
count and number of columns) and most heavily used table, so here are its values:
-- gc_grace_seconds = 864000
The values in cassandra.yaml:
-- max_hint_window_in_ms: 10800000
-- max_hints_delivery_threads: 2
-- hinted_handoff_enabled: true
-- hinted_handoff_throttle_in_kb: 1024
- what type of mutation was the hint without a target_id?
I don't know how to get the type of mutation, only the mutation value, which is a blob in the
table. Can you help me here?
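For anyone comparing these settings, the two time-based values listed above are easier to
read in larger units. A minimal Java sketch of the conversion follows; the class and method
names are hypothetical and only the two numbers come from this comment.

```java
import java.time.Duration;

public class HintSettings {
    // Values copied from the settings listed in this comment.
    static final long MAX_HINT_WINDOW_IN_MS = 10_800_000L;
    static final long GC_GRACE_SECONDS = 864_000L;

    // Hint window expressed in whole hours.
    static long hintWindowHours() {
        return Duration.ofMillis(MAX_HINT_WINDOW_IN_MS).toHours();
    }

    // gc_grace_seconds expressed in whole days.
    static long gcGraceDays() {
        return Duration.ofSeconds(GC_GRACE_SECONDS).toDays();
    }

    public static void main(String[] args) {
        System.out.println("max_hint_window_in_ms = " + hintWindowHours() + " hours");
        System.out.println("gc_grace_seconds      = " + gcGraceDays() + " days");
    }
}
```

So the hint window is 3 hours and gc_grace is 10 days.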
If you need any other information, I can send it to you!
Thank you!
> IndexOutOfBoundsException in HintedHandOffManager
> -------------------------------------------------
>
> Key: CASSANDRA-10233
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10233
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Environment: Cassandra 2.2.0
> Reporter: Omri Iluz
> Assignee: Andrew Hust
> Attachments: cassandra-2.1.8-10233-v2.txt, cassandra-2.1.8-10233.txt
>
>
> After upgrading our cluster to 2.2.0, the following error started showing exactly every
10 minutes on every server in the cluster:
> {noformat}
> INFO [CompactionExecutor:1381] 2015-08-31 18:31:55,506 CompactionTask.java:142 - Compacting
(8e7e1520-500e-11e5-b1e3-e95897ba4d20) [/cassandra/data/system/hints-2666e20573ef38b390fefecf96e8f0c7/la-540-big-Data.db:level=0,
]
> INFO [CompactionExecutor:1381] 2015-08-31 18:31:55,599 CompactionTask.java:224 - Compacted
(8e7e1520-500e-11e5-b1e3-e95897ba4d20) 1 sstables to [/cassandra/data/system/hints-2666e20573ef38b390fefecf96e8f0c7/la-541-big,]
to level=0. 1,544,495 bytes to 1,544,495 (~100% of original) in 93ms = 15.838121MB/s. 0
total partitions merged to 4. Partition merge counts were {1:4, }
> ERROR [HintedHandoff:1] 2015-08-31 18:31:55,600 CassandraDaemon.java:182 - Exception
in thread Thread[HintedHandoff:1,1,main]
> java.lang.IndexOutOfBoundsException: null
> at java.nio.Buffer.checkIndex(Buffer.java:538) ~[na:1.7.0_79]
> at java.nio.HeapByteBuffer.getLong(HeapByteBuffer.java:410) ~[na:1.7.0_79]
> at org.apache.cassandra.utils.UUIDGen.getUUID(UUIDGen.java:106) ~[apache-cassandra-2.2.0.jar:2.2.0]
> at org.apache.cassandra.db.HintedHandOffManager.scheduleAllDeliveries(HintedHandOffManager.java:515)
~[apache-cassandra-2.2.0.jar:2.2.0]
> at org.apache.cassandra.db.HintedHandOffManager.access$000(HintedHandOffManager.java:88)
~[apache-cassandra-2.2.0.jar:2.2.0]
> at org.apache.cassandra.db.HintedHandOffManager$1.run(HintedHandOffManager.java:168)
~[apache-cassandra-2.2.0.jar:2.2.0]
> at org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor$UncomplainingRunnable.run(DebuggableScheduledThreadPoolExecutor.java:118)
~[apache-cassandra-2.2.0.jar:2.2.0]
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [na:1.7.0_79]
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304) [na:1.7.0_79]
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
[na:1.7.0_79]
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
[na:1.7.0_79]
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_79]
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_79]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_79]
> {noformat}
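The top frames of the trace (HeapByteBuffer.getLong called from UUIDGen.getUUID) are
consistent with a UUID being decoded from a buffer shorter than 16 bytes, which fits the
question earlier in this thread about a hint without a target_id. Below is a minimal Java
sketch of that failure mode. It assumes the decoder reads two absolute 8-byte longs; the
class and method names are hypothetical and this is not the Cassandra source.

```java
import java.nio.ByteBuffer;
import java.util.UUID;

public class ShortBufferUuid {
    // Hypothetical decoder: reads the most and least significant halves
    // as two absolute 8-byte longs starting at the buffer's position.
    static UUID toUuid(ByteBuffer raw) {
        return new UUID(raw.getLong(raw.position()), raw.getLong(raw.position() + 8));
    }

    // Hypothetical guard: decode only when a full 16 bytes remain.
    static UUID toUuidOrNull(ByteBuffer raw) {
        return raw.remaining() >= 16 ? toUuid(raw) : null;
    }

    public static void main(String[] args) {
        // A well-formed 16-byte buffer decodes back to the original UUID.
        UUID original = UUID.randomUUID();
        ByteBuffer full = ByteBuffer.allocate(16);
        full.putLong(original.getMostSignificantBits());
        full.putLong(original.getLeastSignificantBits());
        full.flip();
        System.out.println("round trip ok: " + original.equals(toUuid(full)));

        // An empty buffer (standing in for a missing id) fails the bounds check.
        ByteBuffer empty = ByteBuffer.allocate(0);
        try {
            toUuid(empty);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("unguarded decode threw IndexOutOfBoundsException");
        }

        // The guarded variant returns null instead of throwing.
        System.out.println("guarded decode: " + toUuidOrNull(empty));
    }
}
```

Whether a length check like this is the right fix is a separate question; the sketch only
shows how an empty or truncated id value can surface as the exception in the trace.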
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)