cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Aleksey Yeschenko (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-9425) Make node-local schema fully immutable
Date Tue, 17 Jan 2017 18:32:26 GMT


Aleksey Yeschenko commented on CASSANDRA-9425:

Alright. Fixed the failing tests ({{TableId}} patch broke Paxos, and index metadata patch
altered custom index handling a bit - for the better - but broke {{CustomIndexTests}}), and
addressed some of the comments.

In addition got rid of {{DataResource}} in {{TableMetadata}}, and undid a tiny bit of of the
related CASSANDRA-10410 optimisation I hope you don’t mind.

bq. I'd rename {{TableMetadata.table}} to {{}} for consistency with other
classes ({{KeyspaceMetadata}} and {{IndexMetadata}} at least). While at it, I'd also rename
{{ksName}} to {{keyspace}} and {{viewName}} to {{name}} in {{ViewMetadata}}.

Both done.

bq. I'd also probably rename {{PartitionColumns}} to {{RegularAndStaticColumns}} now.


bq. Nit: In {{SSTableTxnWriter}}, it feels slightly inconsistent that {{createRangeAware}}
takes a {{TableMetadata}}, but create takes a {{TableMetadataRef}}.

I try to use {{TableMetadata}} directly whenever possible, only pass the Ref when unavoidable.
In this case it’s no big deal, though, so done.

bq. In {{Indexes.validate()}}, not sure it's worth doing the duplicate name check since we
do it at the keyspace level already (and only doing it within a single table is a false sense
of safety).

Problem is, in announce for {{CREATE INDEX}} we do not call {{KSM.validate()}}, yet. So for
the time being, until the next patch, where we always validate the entirety of the resulting
before applying anything, it has to stay here, even if imperfect. Removing it will fail some
tests. And some checks - for now - are still better than no checks. Verdict: left be until
the next patch.

bq. I'd rename {{addRegularColumn}} and {{alterColumnType}} in {{ViewMetadata}} as they are
not modifying the {{ViewMetadata}} but creating copies (say to {{withAddedRegularColumn()}}
and {{withAlteredColumnType}}).


bq. Is there a rational for notifying everything at the end, versus calling each notification
in the "appropriate" sub-method (notifying table alter in {{alterTable()}}, ....)? The later
would feels a tad more readable to me as it's slightly easier to check we haven't forgotten

I prefer it stylistically. But I also want to get schema and related DB objects ({{Keyspace}},
{{CFS}}, and everything downstream) to get to the consistent state as fast as possible. For
that reason
I’m delaying potentially blocking change event handlers until the very end. On that note,
we should probably eventually move their processing elsewhere, away from the sensitive path
- just enqueue them into some queue and have something poll and process the queue from time
to time.

bq. I'd maybe suggest adding a private {{handleDiff(mapDiff, Function onDropped, Function
onAdded, Function onUpdated)}} method to simplify a bit.

It’s a very minor case of duplication that doesn’t worry me, I’d rather not factor it
wouldn’t strictly speaking make things simpler. Also see the previous point.

bq. There is a TODO FIXME in {{SchemaKeyspace}} (and not 100% sure what it's about).

It’s an ugly pre-existing piece of code to avoid re-compiling the UDFs; It annoys me immensely
that a method in {{SchemaKeyspace}} is referencing {{Schema.instance}}, hence the TODO. I
want to get rid of it eventually, but it’s not important enough to spend my time on it atm.

bq. There is TODO: FIXME in {{ColumnFamilyStore.setCompressionParameters}}.

See CASSANDRA-12949. The method in its current implementation (1.1+) is broken and unsafe,
so I’ve disabled it for safety reasons. Hopefully someone will get around to dealing with
it some time, but I don’t have the time. Either way, committed a clarification for that

bq. {{Keyspaces has both a {{get(String)}} that return {{Optional}} and a {{getNullable(String)}}.
We should probably remove one.

Not just {{Keyspaces}}. I prefer to have both options available, as there are multiple things
that can tolerate null return. The alternative would be using {{get(name).orElse(null)}},
which I personally find unappealing.

bq. The patch changes some system table options (compared to current trunk)

I removed the explicit setting or {{readRepairChance}} to 0.0 as it was redundant. We made
{{readRepairChance}} default to be 0.0 a while ago. Only {{dcLocalReadRepairChance}} needs

{{system_distributed}} was switched to default gc gs because gc gs of 0 is a bug. {{view_build_status}}
table accepts deletes, which can be compacted away before propagation with 0 gc gs. See CASSANDRA-12954.
Other tables in that keyspace should be unaffected, as they don’t use TTL nor explicit deletes.

As for {{system_auth}}, it was never intended to have a special memtable flush period setting,
it was just an accidental reuse.

Should’ve probably commented on all of these 3 in patch comments, but all of these changes
are intentional. Sorry.

bq. I'm unclear on the changes to {{AlterTypeStatement}}. The patch removes the code that
propagate type changes to other table/view/types, but I neither see where that has moved,
nor why that wouldn't be needed anymore (as an aside, if it does is not needed anymore, than
we could remove the {{updateTypes()}} and {{updateWith()}} methods that are unused in the

They aren’t needed because since 3.0 we started using only the CQL type name in all system
tables. As such, we do not need to update individual UDTs, UDFs, UDAs, tables, and views.
Propagating the type change alone is sufficient. When the merge happens, the difference will
be picked up anyway, and the affected tables/types/etc. will also be reloaded. Should’ve
removed this in 3.0 schema rewrite.

Removed {{updateTypes()}} and {{updateWith()}} though, thanks for noticing (IDEA didn’t
as they call each other recursively).

Will address the remaining few points shortly, but this is ready for a minor follow up review.
Thank you.

> Make node-local schema fully immutable
> --------------------------------------
>                 Key: CASSANDRA-9425
>                 URL:
>             Project: Cassandra
>          Issue Type: Sub-task
>            Reporter: Aleksey Yeschenko
>            Assignee: Aleksey Yeschenko
>             Fix For: 4.0
> The way we handle schema changes currently is inherently racy.
> All of our {{SchemaAlteringStatement}} s perform validation on a schema state that's
won't necessarily be there when the statement gets executed and mutates schema.
> We should make all the *Metadata classes ({{KeyspaceMetadata, TableMetadata}}, {{ColumnMetadata}},
immutable, and local schema persistently snapshottable, with a single top-level {{AtomicReference}}
to the current snapshot. Have DDL statements perform validation and transformation on the
same state.
> In pseudo-code, think
> {code}
> public interface DDLStatement
> {
>     /**
>      * Validates that the DDL statement can be applied to the provided schema snapshot.
>      *
>      * @param schema snapshot of schema before executing CREATE KEYSPACE
>      */
>     void validate(SchemaSnapshot schema);
>     /**
>      * Applies the DDL statement to the provided schema snapshot.
>      * Implies that validate() has already been called on the provided snapshot.
>      *
>      * @param schema snapshot of schema before executing the statement
>      * @return snapshot of schema as it would be after executing the statement
>      */
>     SchemaSnapshot transform(SchemaSnapshot schema);
> }
> {code}

This message was sent by Atlassian JIRA

View raw message