cassandra-commits mailing list archives

From "Adam Holmberg (JIRA)" <>
Subject [jira] [Updated] (CASSANDRA-6535) Prepared Statement on Defunct CF Can Impact Cluster Availability
Date Tue, 31 Dec 2013 21:26:50 GMT


Adam Holmberg updated CASSANDRA-6535:

    Attachment: 6535.txt

6535.txt - a simple patch that adds CF validation to ClientState.hasColumnFamilyAccess. This
buttons up the error pathology I was observing, preventing cluster impact and returning meaningful
errors to the client.

> Prepared Statement on Defunct CF Can Impact Cluster Availability
> ----------------------------------------------------------------
>                 Key: CASSANDRA-6535
>                 URL:
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: Cassandra 1.2.12
> CentOS 6.4
>            Reporter: Adam Holmberg
>         Attachments: 6535.txt
> *Synopsis:* misbehaving clients can cause DoS on a cluster with a defunct prepared statement
> *Scenario:* 
> 1.) Create prepared INSERT statement on existing table X
> 2.) Table X is dropped
> 3.) Continue using prepared statement from (1)
> *Result:* 
> a.) on coordinator node: COMMIT-LOG-WRITER + MutationStage errors
> b.) on other nodes: "UnknownColumnFamilyException reading from socket; closing"  -->
leads to thrashing inter-node connections
> c.) Other clients of the cluster suffer from I/O timeouts, presumably a result of (b)
> *Other observations:*
> * On single-node clusters, clients return from insert without error because mutation
errors are swallowed.
> * On multiple-node clusters, clients receive a confounded 'read timeout' error because
the closed internode connections do not propagate the error back.
> * With prepared SELECT statements (as opposed to the INSERT described above), a NullPointerException
is thrown on the server, and no meaningful error is returned to the client.
> Besides the obvious "don't do that" to the integrator, it would be good if the cluster
could handle this error case more gracefully and avoid undue impact.

This message was sent by Atlassian JIRA
