cassandra-commits mailing list archives

From "ASF GitHub Bot (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-13983) Support a means of logging all queries as they were invoked
Date Tue, 31 Oct 2017 18:15:00 GMT


ASF GitHub Bot commented on CASSANDRA-13983:

GitHub user aweisberg opened a pull request:

    Support a means of logging all queries as they were invoked. CASSANDRA-13983

You can merge this pull request into a Git repository by running:

    $ git pull cassandra-13983-trunk

Alternatively you can review and apply these changes as the patch at:

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #169

commit aff958405b1853bb005aa12f02beafac28311cc8
Author: Ariel Weisberg <>
Date:   2017-10-27T21:16:45Z

    Support a means of logging all queries as they were invoked.


> Support a means of logging all queries as they were invoked
> -----------------------------------------------------------
>                 Key: CASSANDRA-13983
>                 URL:
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: CQL, Observability, Testing, Tools
>            Reporter: Ariel Weisberg
>            Assignee: Ariel Weisberg
>             Fix For: 4.0
> For correctness testing it's useful to be able to capture production traffic so that
it can be replayed against both the old and new versions of Cassandra while comparing the
results.
> Implementing this functionality once inside the database is high performance and presents
less operational complexity.
> In [this patch|]
there is an implementation of a full query log that uses chronicle-queue (Apache licensed;
the Maven artifacts are labeled incorrectly in some cases, and the dependencies are also
Apache licensed) to implement a rotating log of queries.
> * Single thread asynchronously writes log entries to disk to reduce impact on query latency
> * Heap memory usage bounded by a weighted queue with configurable maximum weight sitting
in front of logging thread
> * If the weighted queue is full producers can be blocked or samples can be dropped
> * Disk utilization is bounded by deleting old log segments once a configurable size is
reached
> * The on disk serialization uses a flexible schema binary format (chronicle-wire) making
it easy to skip unrecognized fields, add new ones, and omit old ones.
> * Can be enabled, configured, disabled, and reset (deleting on-disk data) via JMX; the
logging path is configurable via both JMX and YAML
> * Introduce new {{fqltool}} in /bin that currently implements {{Dump}}, which can dump
full query logs in a human-readable format as well as follow active full query logs
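
The weighted-queue bullets above (bounded heap usage, with producers either blocked or
samples dropped when the queue is full) can be illustrated with a minimal sketch. This is
not the code from the patch: the class name WeightedQueueSketch, its method names, and the
use of a ToIntFunction as the weigher are all assumptions made for illustration only.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.function.ToIntFunction;

/**
 * Hypothetical sketch of a queue bounded by total weight rather than element count.
 * Names and structure are illustrative assumptions, not taken from the patch.
 */
final class WeightedQueueSketch<T> {
    private final Queue<T> items = new ArrayDeque<>();
    private final ToIntFunction<T> weigher; // must return the same weight for the same item
    private final long maxWeight;
    private long currentWeight = 0;
    private long dropped = 0;

    WeightedQueueSketch(long maxWeight, ToIntFunction<T> weigher) {
        this.maxWeight = maxWeight;
        this.weigher = weigher;
    }

    /** Drop policy: never blocks; returns false and counts a drop when there is no room. */
    synchronized boolean offerOrDrop(T item) {
        int weight = weigher.applyAsInt(item);
        if (currentWeight + weight > maxWeight) {
            dropped++;
            return false;
        }
        enqueue(item, weight);
        return true;
    }

    /** Block policy: the producer waits until the consumer has freed enough weight. */
    synchronized void putBlocking(T item) throws InterruptedException {
        int weight = weigher.applyAsInt(item);
        if (weight > maxWeight) {
            throw new IllegalArgumentException("item is heavier than the queue's maximum weight");
        }
        while (currentWeight + weight > maxWeight) {
            wait();
        }
        enqueue(item, weight);
    }

    /** Called by the single consumer (the logging thread); blocks until an item is available. */
    synchronized T take() throws InterruptedException {
        while (items.isEmpty()) {
            wait();
        }
        T item = items.remove();
        currentWeight -= weigher.applyAsInt(item);
        notifyAll(); // wake producers waiting for room
        return item;
    }

    synchronized long droppedCount() {
        return dropped;
    }

    synchronized long currentWeight() {
        return currentWeight;
    }

    // Caller must hold the monitor.
    private void enqueue(T item, int weight) {
        items.add(item);
        currentWeight += weight;
        notifyAll(); // wake the consumer if it is waiting for an item
    }
}
```

In this sketch a producer would pick offerOrDrop when query latency matters more than
log completeness and putBlocking when every query must be captured; weighing by something
like the serialized size of the entry is what keeps heap usage bounded.
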
> Follow up work:
> * Introduce new {{fqltool}} command Replay which can replay N full query logs to two
different clusters and compare the result and check for inconsistencies. <- Actively working
on getting this done
> * Log not just queries but their results to facilitate a comparison between the original
query result and the replayed result. <- Really just don't have a specific use case at the
moment
> * "Consistent" query logging allowing replay to fully replicate the original order of
execution and completion even in the face of races (including CAS). <- This is more speculative

This message was sent by Atlassian JIRA
