cassandra-commits mailing list archives

From "ASF GitHub Bot (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-14556) Optimize streaming path in Cassandra
Date Mon, 23 Jul 2018 15:59:00 GMT


ASF GitHub Bot commented on CASSANDRA-14556:

Github user iamaleksey commented on a diff in the pull request:
    --- Diff: src/java/org/apache/cassandra/db/streaming/ ---
    @@ -0,0 +1,184 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.cassandra.db.streaming;
    +import java.util.List;
    +import java.util.Set;
    +import org.slf4j.Logger;
    +import org.slf4j.LoggerFactory;
    +import org.apache.cassandra.db.ColumnFamilyStore;
    +import org.apache.cassandra.db.DecoratedKey;
    +import org.apache.cassandra.db.Directories;
    +import org.apache.cassandra.db.SerializationHeader;
    +import org.apache.cassandra.db.lifecycle.LifecycleTransaction;
    +import org.apache.cassandra.schema.TableId;
    +import org.apache.cassandra.streaming.ProgressInfo;
    +import org.apache.cassandra.streaming.StreamReceiver;
    +import org.apache.cassandra.streaming.StreamSession;
    +import org.apache.cassandra.streaming.messages.StreamMessageHeader;
    +import org.apache.cassandra.utils.Collectors3;
    +import org.apache.cassandra.utils.FBUtilities;
    +/**
    + * CassandraBlockStreamReader reads SSTable off the wire and writes it to disk.
    + */
    +public class CassandraBlockStreamReader implements IStreamReader
    +{
    +    private static final Logger logger = LoggerFactory.getLogger(CassandraBlockStreamReader.class);
    +    protected final TableId tableId;
    +    protected final StreamSession session;
    +    protected final int sstableLevel;
    --- End diff --
    It took me some time (and @krummas's help) to prove that this isn't a correctness issue, but at best this is confusing, misleading code.
    We extract `sstableLevel` from the header but don't use it anywhere. Instead, since we stream `StatsMetadata` directly, we also inherit the level from there, regardless of whether `CassandraOutgoingStream.keepSSTableLevel` is set to `true`. If the `LeveledManifest.canAddSSTable` check didn't exist, we'd be in trouble here. For clarity, I would probably look at that flag and explicitly reset the level to `L0` if `keepSSTableLevel` is set to `false`.
    P.S. What's the deal with all these `protected` fields?
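
The explicit level reset suggested in the comment above could look roughly like the following. This is a minimal, self-contained sketch and not the patch's code: the class `SSTableLevelSketch` and the method `effectiveLevel` are hypothetical names, and the two parameters merely stand in for the level carried by the streamed `StatsMetadata` and for the `keepSSTableLevel` flag.

```java
// Hypothetical sketch; none of these names are Cassandra APIs.
final class SSTableLevelSketch
{
    /**
     * Decide which level to record for a received sstable.
     *
     * @param streamedLevel    level inherited from the streamed metadata (assumed non-negative)
     * @param keepSSTableLevel stand-in for the sender's keepSSTableLevel flag
     * @return the streamed level if the flag is set, otherwise 0 (L0)
     */
    static int effectiveLevel(int streamedLevel, boolean keepSSTableLevel)
    {
        if (streamedLevel < 0)
            throw new IllegalArgumentException("negative sstable level: " + streamedLevel);

        // Make the decision explicit instead of relying on a later safety check.
        return keepSSTableLevel ? streamedLevel : 0;
    }

    public static void main(String[] args)
    {
        System.out.println(effectiveLevel(3, true));  // prints 3
        System.out.println(effectiveLevel(3, false)); // prints 0
    }
}
```

Making the choice in one place keeps the receiver's behaviour readable without having to know that a later check would reject a misplaced sstable anyway.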

> Optimize streaming path in Cassandra
> ------------------------------------
>                 Key: CASSANDRA-14556
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Streaming and Messaging
>            Reporter: Dinesh Joshi
>            Assignee: Dinesh Joshi
>            Priority: Major
>              Labels: Performance
>             Fix For: 4.x
> During streaming, Cassandra reifies the sstables into objects. This creates unnecessary garbage and slows down the whole streaming process, as some sstables can be transferred as a whole file rather than as individual partitions. The objective of the ticket is to detect when a whole sstable can be transferred and skip the object reification. We can also use a zero-copy path to avoid bringing data into user-space on both the sending and receiving sides.
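
To illustrate the whole-file, zero-copy idea in the description, here is a minimal sketch built on the JDK's `FileChannel.transferTo`. It is not the ticket's implementation: `ZeroCopySketch` and `transferWholeFile` are hypothetical names, the target is assumed to be a blocking channel, and whether the transfer actually bypasses user space depends on the operating system and the channel types involved.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch; not Cassandra code.
final class ZeroCopySketch
{
    /** Sends the whole file to the target channel without deserializing its contents. */
    static long transferWholeFile(Path source, WritableByteChannel target) throws IOException
    {
        try (FileChannel in = FileChannel.open(source, StandardOpenOption.READ))
        {
            long size = in.size();
            long position = 0;
            // transferTo may move fewer bytes than requested, so loop until done.
            // Assumes a blocking target; a non-blocking one could return 0 repeatedly.
            while (position < size)
                position += in.transferTo(position, size - position, target);
            return position;
        }
    }

    public static void main(String[] args) throws IOException
    {
        Path src = Files.createTempFile("sketch-src", ".db");
        Path dst = Files.createTempFile("sketch-dst", ".db");
        Files.write(src, new byte[] { 1, 2, 3, 4, 5 });

        try (FileChannel out = FileChannel.open(dst, StandardOpenOption.WRITE))
        {
            System.out.println("bytes transferred: " + transferWholeFile(src, out));
        }

        Files.delete(src);
        Files.delete(dst);
    }
}
```

The receiving side can use the mirror-image call, `FileChannel.transferFrom`, to write incoming bytes straight to a file.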
