cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Pedro Gordo (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-12750) TRACE messages get dropped
Date Thu, 18 Jan 2018 22:25:00 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-12750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16331346#comment-16331346
] 

Pedro Gordo edited comment on CASSANDRA-12750 at 1/18/18 10:24 PM:
-------------------------------------------------------------------

If you run the same query from within cqlsh (with tracing on), do you get a message of the
sort:
{{Trace session id: blablabla [more]}}
{{Unable to retrieve Query Trace information [more]}} ?


was (Author: pedro_gordo):
If you run the same query from within cqlsh (with tracing on), do you get a message of the
sort:
{{Trace session id: blablabla [more]
Unable to retrieve Query Trace information [more]}} ?

> TRACE messages get dropped
> --------------------------
>
>                 Key: CASSANDRA-12750
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12750
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local Write-Read Paths, Observability
>         Environment: Ubuntu 14.04 LTS
>            Reporter: Kuku1
>            Priority: Major
>
> Two of my nodes drop TRACE messages for some reason.
> I'm using Cassandra 3.0.8.
> I'm running a "select * from ... " query that will return 300.000 rows. I want to measure
the total query runtime. 
> My application runs the following code. 
> {code}
>     PreparedStatement preparedStatement = connection.prepareStatement(query).enableTracing();
>     if (preparedStatement == null) {
>         return;
>     }
>     BoundStatement bs = preparedStatement.bind();
>     ResultSet rs = connection.execute(bs);
>     while (!rs.isFullyFetched()) {
>         rs.fetchMoreResults();
>     }
>     //all results have been fetched at this point
>     //sleep 15s to wait for C* to write the query traces
>     Thread.sleep(15000);
>     long totalQueryRuntime = 0;
>     List<ExecutionInfo> executionInfos = rs.getAllExecutionInfo();
>     for (ExecutionInfo ex : executionInfos) {
>         QueryTrace queryTrace = ex.getQueryTrace();
>         totalQueryRuntime += queryTrace.getDurationMicros();
>         if (verbose) {
>             for (Event event : queryTrace.getEvents()) {
>                 //some logging
>             }
>         }
>     }
> {code}
> The application fails with this exception:
> {code}
> com.datastax.driver.core.exceptions.TraceRetrievalException: Unable to retrieve complete
query trace for id 88dae7d0-8ada-11e6-9f80-d9082a06988f after 5 tries
>         at com.datastax.driver.core.QueryTrace.doFetchTrace(QueryTrace.java:228)
>         at com.datastax.driver.core.QueryTrace.maybeFetchTrace(QueryTrace.java:177)
>         at com.datastax.driver.core.QueryTrace.getDurationMicros(QueryTrace.java:103)
> {code}
> In my system.logs, I can find these output (similar) on two of my six nodes:
> {code}
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,650 MessagingService.java:983 - _TRACE messages
were dropped in last 5000 ms: 511 for internal timeout and 0 for cross node timeout
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,651 StatusLogger.java:52 - Pool Name   
                Active   Pending      Completed   Blocked  All Time Blocked
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,657 StatusLogger.java:56 - MutationStage
                    0         0          35036         0                 0
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,658 StatusLogger.java:56 - ViewMutationStage
                0         0              0         0                 0
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,658 StatusLogger.java:56 - ReadStage   
                     0         0            399         0                 0
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,659 StatusLogger.java:56 - RequestResponseStage
             0         0           7158         0                 0
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,659 StatusLogger.java:56 - ReadRepairStage
                  0         0              2         0                 0
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,660 StatusLogger.java:56 - CounterMutationStage
             0         0              0         0                 0
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,661 StatusLogger.java:56 - MiscStage   
                     0         0              0         0                 0
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,661 StatusLogger.java:56 - CompactionExecutor
               0         0             52         0                 0
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,662 StatusLogger.java:56 - MemtableReclaimMemory
            0         0             21         0                 0
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,662 StatusLogger.java:56 - PendingRangeCalculator
           0         0              8         0                 0
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,663 StatusLogger.java:56 - GossipStage 
                     0         0            476         0                 0
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,663 StatusLogger.java:56 - SecondaryIndexManagement
         0         0              0         0                 0
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,664 StatusLogger.java:56 - HintsDispatcher
                  0         0              0         0                 0
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,664 StatusLogger.java:56 - MigrationStage
                   0         0              0         0                 0
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,665 StatusLogger.java:56 - MemtablePostFlush
                0         0             40         0                 0
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,666 StatusLogger.java:56 - ValidationExecutor
               0         0              0         0                 0
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,666 StatusLogger.java:56 - Sampler     
                     0         0              0         0                 0
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,667 StatusLogger.java:56 - MemtableFlushWriter
              0         0             21         0                 0
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,668 StatusLogger.java:56 - InternalResponseStage
            0         0              0         0                 0
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,669 StatusLogger.java:56 - AntiEntropyStage
                 0         0              0         0                 0
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,670 StatusLogger.java:56 - CacheCleanupExecutor
             0         0              0         0                 0
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,671 StatusLogger.java:56 - Native-Transport-Requests
        0         0            116         0                 0
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,671 StatusLogger.java:66 - CompactionManager
                0         1
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,671 StatusLogger.java:78 - MessagingService
               n/a       0/0
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,671 StatusLogger.java:88 - Cache Type  
                  Size                 Capacity               KeysToSave
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,671 StatusLogger.java:90 - KeyCache    
                  3160                104857600                      all
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,671 StatusLogger.java:96 - RowCache    
                     0                        0                      all
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,672 StatusLogger.java:103 - Table      
                Memtable ops,data
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,672 StatusLogger.java:106 - system_distributed.parent_repair_history
                0,0
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,672 StatusLogger.java:106 - system_distributed.repair_history
                0,0
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,672 StatusLogger.java:106 - system.compaction_history
              2,384
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,672 StatusLogger.java:106 - system.hints
                             0,0
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,672 StatusLogger.java:106 - system.schema_aggregates
                 0,0
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,673 StatusLogger.java:106 - system.IndexInfo
                         0,0
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,673 StatusLogger.java:106 - system.schema_columnfamilies
                0,0
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,673 StatusLogger.java:106 - system.schema_triggers
                   0,0
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,673 StatusLogger.java:106 - system.size_estimates
            2120,252650
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,673 StatusLogger.java:106 - system.schema_functions
                  0,0
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,673 StatusLogger.java:106 - system.paxos
                             0,0
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,673 StatusLogger.java:106 - system.views_builds_in_progress
                0,0
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,673 StatusLogger.java:106 - system.built_views
                       0,0
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,674 StatusLogger.java:106 - system.peer_events
                       0,0
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,674 StatusLogger.java:106 - system.range_xfers
                       0,0
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,674 StatusLogger.java:106 - system.peers
                        97,46227
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,674 StatusLogger.java:106 - system.batches
                           0,0
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,674 StatusLogger.java:106 - system.schema_keyspaces
                  0,0
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,674 StatusLogger.java:106 - system.schema_usertypes
                  0,0
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,674 StatusLogger.java:106 - system.local
                             0,0
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,675 StatusLogger.java:106 - system.sstable_activity
              28,1100
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,675 StatusLogger.java:106 - system.available_ranges
                  0,0
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,675 StatusLogger.java:106 - system.batchlog
                          0,0
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,675 StatusLogger.java:106 - system.schema_columns
                    0,0
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,675 StatusLogger.java:106 - fleetstream.data
                         0,0
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,675 StatusLogger.java:106 - system_schema.columns
              168,25858
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,675 StatusLogger.java:106 - system_schema.types
                     2,16
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,675 StatusLogger.java:106 - system_schema.indexes
                   2,16
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,676 StatusLogger.java:106 - system_schema.keyspaces
                4,242
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,676 StatusLogger.java:106 - system_schema.dropped_columns
               2,16
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,676 StatusLogger.java:106 - system_schema.aggregates
                2,16
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,676 StatusLogger.java:106 - system_schema.triggers
                  2,16
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,676 StatusLogger.java:106 - system_schema.tables
                27,17400
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,676 StatusLogger.java:106 - system_schema.views
                     2,16
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,676 StatusLogger.java:106 - system_schema.functions
                 2,16
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,677 StatusLogger.java:106 - test.data  
                              0,0
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,677 StatusLogger.java:106 - system_auth.roles
                        0,0
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,677 StatusLogger.java:106 - system_auth.role_members
                 0,0
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,677 StatusLogger.java:106 - system_auth.resource_role_permissons_index
                0,0
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,677 StatusLogger.java:106 - system_auth.role_permissions
                0,0
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,677 StatusLogger.java:106 - system_traces.sessions
               58,9106
> INFO  [ScheduledTasks:1] 2016-10-05 12:51:47,677 StatusLogger.java:106 - system_traces.events
               320,62299
> {code}
> nodetool tpstats on one of the two nodes that drop the traces shows:
> {code}
> nodetool tpstats
> Pool Name                    Active   Pending      Completed   Blocked  All time blocked
> MutationStage                     0         0          32911         0              
  0
> ViewMutationStage                 0         0              0         0              
  0
> ReadStage                         0         0           1065         0              
  0
> RequestResponseStage              0         0          10969         0              
  0
> ReadRepairStage                   0         0              6         0              
  0
> CounterMutationStage              0         0              0         0              
  0
> MiscStage                         0         0              0         0              
  0
> CompactionExecutor                0         0            972         0              
  0
> MemtableReclaimMemory             0         0             21         0              
  0
> PendingRangeCalculator            0         0              7         0              
  0
> GossipStage                       0         0           6278         0              
  0
> SecondaryIndexManagement          0         0              0         0              
  0
> HintsDispatcher                   0         0              0         0              
  0
> MigrationStage                    0         0              0         0              
  0
> MemtablePostFlush                 0         0             40         0              
  0
> ValidationExecutor                0         0              0         0              
  0
> Sampler                           0         0              0         0              
  0
> MemtableFlushWriter               0         0             21         0              
  0
> InternalResponseStage             0         0              0         0              
  0
> AntiEntropyStage                  0         0              0         0              
  0
> CacheCleanupExecutor              0         0              0         0              
  0
> Native-Transport-Requests         0         0            209         0              
  0
> Message type           Dropped
> READ                         0
> RANGE_SLICE                  0
> _TRACE                    2789
> HINT                         0
> MUTATION                     0
> COUNTER_MUTATION             0
> BATCH_STORE                  0
> BATCH_REMOVE                 0
> REQUEST_RESPONSE             0
> PAGED_RANGE                  0
> READ_REPAIR                  0
> {code}
> It should not be a memory problem because my nodes have ~20GB of free RAM:
> {code}
>              total       used       free     shared    buffers     cached
> Mem:      49450072   27703928   21746144        944     181696   17040856
> -/+ buffers/cache:   10481376   38968696
> Swap:     50319356        148   50319208
>              total       used       free     shared    buffers     cached
> Mem:      49450072   30405324   19044748       1124     225256   17627620
> -/+ buffers/cache:   12552448   36897624
> Swap:     50319356          0   50319356
> {code}
> I tried with default fetchSize (5000) as well as fetchSize 50000 - both did not help.
Restarting all C* services did not help either. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@cassandra.apache.org
For additional commands, e-mail: commits-help@cassandra.apache.org


Mime
View raw message