cassandra-pr mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] [cassandra] yifan-c commented on a change in pull request #404: New/fql
Date Thu, 30 Jan 2020 00:08:20 GMT
yifan-c commented on a change in pull request #404: New/fql
URL: https://github.com/apache/cassandra/pull/404#discussion_r372699098
 
 

 ##########
 File path: doc/source/new/fqllogging.rst
 ##########
 @@ -0,0 +1,2096 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+.. or more contributor license agreements.  See the NOTICE file
+.. distributed with this work for additional information
+.. regarding copyright ownership.  The ASF licenses this file
+.. to you under the Apache License, Version 2.0 (the
+.. "License"); you may not use this file except in compliance
+.. with the License.  You may obtain a copy of the License at
+..
+..     http://www.apache.org/licenses/LICENSE-2.0
+..
+.. Unless required by applicable law or agreed to in writing, software
+.. distributed under the License is distributed on an "AS IS" BASIS,
+.. WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+.. See the License for the specific language governing permissions and
+.. limitations under the License.
+
+Full Query Logging
+------------------ 
+
+Apache Cassandra 4.0 adds a new feature to support a means of logging all queries as they
were invoked (`CASSANDRA-13983
+<https://issues.apache.org/jira/browse/CASSANDRA-13983>`_). For correctness testing
it's useful to be able to capture production traffic so that it can be replayed against both
the old and new versions of Cassandra while comparing the results.
+
+Cassandra 4.0 includes an implementation of a full query logging (FQL) that uses chronicle-queue
to implement a rotating log of queries. Some of the features of FQL are:
+
+- Single thread asynchronously writes log entries to disk to reduce impact on query latency
+- Heap memory usage bounded by a weighted queue with configurable maximum weight sitting
in front of logging thread
+- If the weighted queue is full producers can be blocked or samples can be dropped
+- Disk utilization is bounded by deleting old log segments once a configurable size is reached
+- The on disk serialization uses a flexible schema binary format (chronicle-wire) making
it easy to skip unrecognized fields, add new ones, and omit old ones.
+- Can be enabled and configured via JMX, disabled, and reset (delete on disk data), logging
path is configurable via both JMX and YAML
+- Introduce new ``fqltool`` in ``/bin`` that currently implements Dump which can dump in
a readable format full query logs as well as follow active full query logs
+
+Cassandra 4.0 has a binary full query log based on Chronicle Queue that can be controlled
using ``nodetool enablefullquerylog``, ``disablefullquerylog``, and ``resetfullquerylog``.
The log contains all queries invoked, approximate time they were invoked, any parameters necessary
to bind wildcard values, and all query options. A readable version of the log can be dumped
or tailed using the new ``bin/fqltool`` utility. The full query log is designed to be safe
to use in production and limits utilization of heap memory and disk space with limits you
can specify when enabling the log.
+
+Objective
+^^^^^^^^^^ 
+Full Query Logging logs all requests to the CQL interface. The full query logs could be used
for debugging, performance benchmarking,  testing and auditing CQL queries. The audit logs
also include CQL requests but full query logging is dedicated to CQL requests only with features
such as FQL replay and FQL compare that are not available in audit logging. Audit logging
is for auditing the database activity and may be improved to add auditing of other database
activity beside authorization and CQL in future versions. 
+
+Full Query Logger
+^^^^^^^^^^^^^^^^^^ 
+The Full Query Logger is a logger that logs entire query contents after the query finishes
(or times out). Queries are logged in one of two modes: single query or batch of queries.
The log for an invocation of a batch of queries includes the following attributes:
+
+::
+
+ type - The type of the batch
+ queries - CQL text of the queries
+ values - Values to bind to as parameters for the queries
+ queryOptions - Options associated with the query invocation
+ queryState - Timestamp state associated with the query invocation
+ batchTimeMillis - Approximate time in milliseconds since the epoch since the batch was invoked
+
+Bin log is a  quick and dirty binary log that is kind of a NIH version of binary logging
with a traditional logging framework. It's goal is good enough performance, predictable footprint,
simplicity in terms of implementation and configuration and most importantly minimal impact
on producers of log records. Performance safety is accomplished by feeding items to the binary
log using a weighted queue and dropping records if the binary log falls sufficiently far behind.
Simplicity and good enough performance is achieved by using a single log writing thread as
well as Chronicle Queue to handle writing the log, making it available for readers, as well
as log rolling.
 
 Review comment:
   `NIH` most likely stands for `not invented here`. However, it does not help the reader
understand the paragraph, even though the abbreviation exists in the source code comment.
I would suggest just dropping the clause.
   
   `Bin log` is used only once in the entire document. How is it associated with FQL?
   
   `It's goal is` should be `Its goal is` or just `The goal is`.
   
   Perhaps use the following:
   
   > Full query logging is backed by `BinLog`. `BinLog` is a quick and dirty binary
log. Its goal is...
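
For readers unfamiliar with the mechanism the quoted paragraph describes (a weighted queue in front of a single log-writing thread, with producers either blocked or their records dropped when the queue is full), here is a minimal Python sketch. It is not Cassandra's `BinLog` code: `WeightedQueue`, `offer`, `put`, `take`, and `run_writer` are hypothetical names chosen for illustration, an in-memory list stands in for Chronicle Queue, and `None` is an assumed shutdown marker.

```python
import threading
from collections import deque


class WeightedQueue:
    """Bounded queue whose capacity is total record weight, not record count.

    Illustrative only; all names here are assumptions, not Cassandra's API.
    """

    def __init__(self, max_weight):
        self.max_weight = max_weight
        self._items = deque()
        self._weight = 0
        self._cond = threading.Condition()
        self.dropped = 0  # records rejected by offer() because the queue was full

    def offer(self, item, weight):
        """Non-blocking enqueue: drop the record if it does not fit."""
        with self._cond:
            if self._weight + weight > self.max_weight:
                self.dropped += 1
                return False
            self._items.append((item, weight))
            self._weight += weight
            self._cond.notify_all()
            return True

    def put(self, item, weight):
        """Blocking enqueue: wait until the writer has freed enough weight."""
        if weight > self.max_weight:
            raise ValueError("record is heavier than the whole queue")
        with self._cond:
            while self._weight + weight > self.max_weight:
                self._cond.wait()
            self._items.append((item, weight))
            self._weight += weight
            self._cond.notify_all()

    def take(self):
        """Dequeue the oldest record, blocking while the queue is empty."""
        with self._cond:
            while not self._items:
                self._cond.wait()
            item, weight = self._items.popleft()
            self._weight -= weight
            self._cond.notify_all()  # wake producers blocked in put()
            return item


def run_writer(queue, sink):
    """Single writer loop: drain records into sink until a None marker arrives."""
    while True:
        record = queue.take()
        if record is None:
            break
        sink.append(record)
```

A producer on the query path would call `offer` when dropping samples is acceptable and `put` when it should block instead; one thread started with `threading.Thread(target=run_writer, args=(queue, sink))` does all the writing, which keeps disk I/O off the producers.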

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org

