james-server-dev mailing list archives

From "Tellier Benoit (JIRA)" <server-...@james.apache.org>
Subject [jira] [Created] (JAMES-2390) JMAP attachment performance issues
Date Sun, 06 May 2018 13:53:00 GMT
Tellier Benoit created JAMES-2390:
-------------------------------------

             Summary: JMAP attachment performance issues
                 Key: JAMES-2390
                 URL: https://issues.apache.org/jira/browse/JAMES-2390
             Project: James Server
          Issue Type: New Feature
          Components: cassandra, JMAP
    Affects Versions: master
            Reporter: Tellier Benoit
            Assignee: Antoine Duprat


Most of the Cassandra failures are related to attachment downloads, and more precisely to
attachment rights checking.

Looking at the attached screenshots:
 - A lot of warnings are generated by JMAP attachment downloads.
 - The failures happen when reading meta-data, in order to retrieve the list of referencing
messages to resolve rights.
 - Furthermore, the failure is systematic for some attachments.

I spent a bit of time this weekend analysing these (unexpected!) performance issues. I found
two straightforward performance improvements, as well as a more complex one.

 - 1. Upon checking whether a set of messages is accessible, the rights of the containing mailbox
were checked on a per-message basis. This is sub-optimal, as some messages might be in the same
mailbox, whose rights will then be needlessly checked several times.

This change fits smoothly into the codebase: the tooling for checking rights once per mailbox
is already implemented, just not used in this case.
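
As an illustration of the idea only, here is a minimal sketch of resolving rights once per distinct
mailbox before filtering messages. All names (RightsPerMailbox, accessibleMessageIds,
canReadMailbox) are hypothetical and are not taken from the James codebase, which already has its
own tooling for this.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Predicate;

// Hypothetical sketch: these names do not come from the James codebase.
final class RightsPerMailbox {

    // Returns the ids of the messages located in a readable mailbox.
    // The (potentially expensive) rights check runs once per distinct mailbox,
    // not once per message.
    static Set<String> accessibleMessageIds(Map<String, String> mailboxIdByMessageId,
                                            Predicate<String> canReadMailbox) {
        // Step 1: resolve rights for each distinct mailbox exactly once.
        Map<String, Boolean> readableByMailbox = new HashMap<>();
        for (String mailboxId : new HashSet<>(mailboxIdByMessageId.values())) {
            readableByMailbox.put(mailboxId, canReadMailbox.test(mailboxId));
        }
        // Step 2: filter the messages using the already resolved rights.
        Set<String> accessible = new TreeSet<>();
        for (Map.Entry<String, String> entry : mailboxIdByMessageId.entrySet()) {
            if (readableByMailbox.get(entry.getValue())) {
                accessible.add(entry.getKey());
            }
        }
        return accessible;
    }

    public static void main(String[] args) {
        Map<String, String> mailboxIdByMessageId = new HashMap<>();
        mailboxIdByMessageId.put("m1", "inbox");
        mailboxIdByMessageId.put("m2", "inbox");
        mailboxIdByMessageId.put("m3", "private");
        // The predicate stands in for the real rights lookup.
        System.out.println(accessibleMessageIds(mailboxIdByMessageId, "inbox"::equals));
    }
}
```

With three messages spread over two mailboxes, the predicate is evaluated twice instead of three
times; the saving grows with the number of messages sharing a mailbox.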

 - 2. Paging and asynchronous code don't combine well, as already proven by previous code.
The mantra is *join then collect*. If the operations are done in the reverse order and the number
of entries exceeds the paging size (~5000), the Cassandra driver throws an exception.

This explains the systematic failures for some specific attachments... The fix is trivial,
and I added a test demonstrating this.

 - 3. The given logs suggest that we have high cardinality rows in our database (i.e. an
attachment referenced by many messages), as the number of referencing messages exceeds 5000
(enough to trigger the paging issue).

Such a high cardinality has a massive read cost:
 - Reading such a row is a complex operation
 - Caching cannot help, as the cache size per primary key is exceeded
 - Rights would be resolved for each referencing message, generating an expensive read cascade.

Note that deduplication is done at the Attachment level. By looking at the attachment names
(cf. screenshots) we can notice that these "high cardinality" attachments look like inlined
images in signatures...

My position here is that deduplication is not a concern for attachments, but for blobs. We should
push this constraint further down the stack. That way, each blob would be deduplicated
(storage cost reduction, higher FS cache efficiency, etc...) while avoiding *wide rows*.

We should ensure each newly generated AttachmentId is unique, then generate BlobId from the
blob's content, to avoid wide rows while keeping deduplication in place.
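
A minimal sketch of that split, assuming a random UUID for the AttachmentId and a SHA-256 digest
of the content for the BlobId. Both choices are assumptions made for illustration: the issue does
not name an id format or a hash function, and the class and method names below are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.UUID;

// Hypothetical sketch: names and the SHA-256 choice are assumptions, not James code.
final class AttachmentIds {

    // A fresh id per attachment: no two messages share an attachment row,
    // so the list of referencing messages stays small.
    static String newAttachmentId() {
        return UUID.randomUUID().toString();
    }

    // An id derived from the content: identical blobs map to the same id,
    // so deduplication still happens at the storage level.
    static String blobIdFor(byte[] content) {
        try {
            byte[] hash = MessageDigest.getInstance("SHA-256").digest(content);
            StringBuilder hex = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 must be supported by every Java platform implementation.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] signatureImage = "same inlined image".getBytes(StandardCharsets.UTF_8);
        // Two messages carrying the same image: distinct attachments, one shared blob.
        System.out.println(newAttachmentId() + " -> " + blobIdFor(signatureImage));
        System.out.println(newAttachmentId() + " -> " + blobIdFor(signatureImage));
    }
}
```

Two messages carrying the same signature image then get two distinct attachment ids that both
point to a single blob id.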

Note that since this applies only to newly received messages, it can be done transparently,
without the need for a migration.




