james-server-dev mailing list archives

From "Benoit Tellier (Jira)" <server-...@james.apache.org>
Subject [jira] [Closed] (JAMES-3586) CL one option for the Cassandra blob store
Date Tue, 25 May 2021 06:50:00 GMT

     [ https://issues.apache.org/jira/browse/JAMES-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benoit Tellier closed JAMES-3586.
    Resolution: Fixed

Contributed via https://github.com/apache/james-project/pull/436

> CL one option for the Cassandra blob store
> ------------------------------------------
>                 Key: JAMES-3586
>                 URL: https://issues.apache.org/jira/browse/JAMES-3586
>             Project: James Server
>          Issue Type: Improvement
>            Reporter: René Cordier
>            Priority: Major
>             Fix For: 3.7.0
>          Time Spent: 2h 50m
>  Remaining Estimate: 0h
> h2. Context
> Some users store all message content in Cassandra and thus store huge amounts of data.
> We would like to reduce the performance cost of reading this large amount of data.
> Blobs being immutable, we have a guarantee that:
>  * If we read something, the value is up to date
>  * If we fail to read something, we have a guarantee that the content has not been replicated yet. A second read with a higher consistency level will read the data (and the read repair piggybacked on the higher consistency level will heal the data)
> Cassandra being very efficient at replicating things (think hinted handoff, direct asynchronous replication), we can expect that data is correctly duplicated before reads are attempted.
> h2. Decision
> Via a configuration option, allow optimizing blob access.
>  * If enabled, perform a first read at CL ONE and fall back, if needed, to a second read at the regular CL
>  * If disabled, only attempt a read at the regular CL
> A metric should be implemented to track the CL ONE hit rate, so that the effectiveness of this solution can be reviewed.
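
The decision above can be sketched as follows. This is a minimal illustration of the read path, not the code from the pull request: the `OptimisticBlobReader` class, the `read_at` callback and the consistency-level labels are all hypothetical names, and the "regular CL" is assumed to be a quorum-style level. The reader counts CL ONE hits and misses so that a hit rate can be derived.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical consistency-level labels; a real driver exposes its own constants.
CL_ONE = "ONE"
CL_REGULAR = "QUORUM"  # assumption: the "regular" CL is a quorum-style level


@dataclass
class OptimisticBlobReader:
    """Reads immutable blobs at CL ONE first, then falls back to the regular CL."""

    # read_at(blob_id, consistency_level) returns the blob bytes, or None if not found.
    read_at: Callable[[str, str], Optional[bytes]]
    optimistic_enabled: bool = False  # mirrors the option's default of false
    cl_one_hits: int = 0
    cl_one_misses: int = 0

    def read(self, blob_id: str) -> Optional[bytes]:
        if self.optimistic_enabled:
            blob = self.read_at(blob_id, CL_ONE)
            if blob is not None:
                # Blobs are immutable, so any value found at CL ONE is up to date.
                self.cl_one_hits += 1
                return blob
            # A miss at CL ONE may only mean the blob is not replicated yet.
            self.cl_one_misses += 1
        return self.read_at(blob_id, CL_REGULAR)

    def cl_one_hit_rate(self) -> float:
        """Fraction of CL ONE attempts that found the blob (0.0 if none were made)."""
        attempts = self.cl_one_hits + self.cl_one_misses
        return self.cl_one_hits / attempts if attempts else 0.0
```

With the option disabled, the CL ONE counters never move and every read goes straight to the regular CL, which matches the second bullet. A low hit rate would suggest that replication lags behind reads and that the extra CL ONE round trip is mostly wasted.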
> h2. Consequences
> In a multi-DC setup with RF=3 and DC=2, this implies a fourfold reduction in IO across the cluster, significantly lowering the read pressure on the Cassandra BlobStore.
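
One plausible reading of the factor 4, which the ticket does not spell out: assume the regular CL is a cluster-wide QUORUM and that each replica contacted costs roughly one unit of read IO. With RF=3 in each of 2 DCs there are 6 replicas, a quorum is 4 of them, and a CL ONE read touches only 1. The function names below are hypothetical, used only for this arithmetic.

```python
def replicas_contacted(rf_per_dc: int, dc_count: int, level: str) -> int:
    """Number of replicas a read touches, under the assumptions stated above."""
    total_replicas = rf_per_dc * dc_count
    if level == "ONE":
        return 1
    if level == "QUORUM":
        # Cluster-wide quorum: a strict majority of all replicas.
        return total_replicas // 2 + 1
    raise ValueError(f"unsupported consistency level: {level}")


def io_reduction_factor(rf_per_dc: int, dc_count: int) -> float:
    """Ratio of replicas read at QUORUM to replicas read at ONE."""
    return (replicas_contacted(rf_per_dc, dc_count, "QUORUM")
            / replicas_contacted(rf_per_dc, dc_count, "ONE"))


print(io_reduction_factor(3, 2))  # 6 replicas, quorum of 4, versus 1 at CL ONE
```

The ratio only holds for reads that succeed at CL ONE. A read that falls back pays for both attempts, so the real gain depends on the hit rate.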
> h2. Work to be conducted
> Add a configuration option in cassandra.properties:
> {code:bash}
> # Experimental configuration option. Defaults to false.
> # Enabling it results in reading strictly immutable (not deleted, not updated) data at CL ONE.
> # If the data is missing, we can be sure that it has not been replicated yet, and a second read
> # is performed with a higher consistency level.
> # This option still offers the same level of consistency (thanks to strict immutability)
> # but might result in higher resource usage in case of misbehaving replication.
> # Metrics can be used to measure the efficiency of this.
> optimistic.consistency.level.enabled=false
> {code}

This message was sent by Atlassian Jira

