james-server-dev mailing list archives

From "Tellier Benoit (JIRA)" <server-...@james.apache.org>
Subject [jira] [Commented] (JAMES-2555) Re-indexing big mailboxes fails
Date Mon, 15 Oct 2018 06:38:00 GMT

    [ https://issues.apache.org/jira/browse/JAMES-2555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16649788#comment-16649788
] 

Tellier Benoit commented on JAMES-2555:
---------------------------------------

https://github.com/linagora/james-project/pull/1799 wrapped reIndexing in a task

https://github.com/linagora/james-project/pull/1803 removed the event tracking that is no longer needed

Because we reindex a list of UIDs for a given mailbox:
 - A new message that arrives will be handled by the normal indexing process
 - A concurrently deleted message will be filtered out, as it cannot be retrieved from storage
 - For flags, we will read the latest applied value

Note that global event tracking is still needed for mailbox path renames.
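
To illustrate the reasoning above, here is a minimal sketch of reindexing from a UID snapshot. It is not the code from the pull requests: the class and every method name (UidSnapshotReindexSketch, store, delete, listUids, read, reindex, indexedFlags) are hypothetical, and storage and index are plain in-memory maps standing in for the real backends.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch; none of these names are James APIs.
final class UidSnapshotReindexSketch {
    // Stand-in for mailbox storage: uid -> current flags.
    private final Map<Long, String> storage = new ConcurrentHashMap<>();
    // Stand-in for the search index: uid -> indexed flags.
    private final Map<Long, String> index = new ConcurrentHashMap<>();

    // Adds a message, or overwrites the flags of an existing one.
    void store(long uid, String flags) {
        storage.put(uid, flags);
    }

    void delete(long uid) {
        storage.remove(uid);
    }

    // Cheap listing of the UIDs present right now, in ascending order.
    List<Long> listUids() {
        return new ArrayList<>(new TreeSet<>(storage.keySet()));
    }

    // Single read; empty when the message was deleted in the meantime.
    Optional<String> read(long uid) {
        return Optional.ofNullable(storage.get(uid));
    }

    // Reindexes exactly the UIDs of the snapshot, reading each one at
    // processing time, so the flags indexed are the latest stored ones.
    void reindex(List<Long> uidSnapshot) {
        for (long uid : uidSnapshot) {
            // A concurrently deleted message yields an empty Optional
            // and is simply skipped.
            read(uid).ifPresent(flags -> index.put(uid, flags));
        }
    }

    Map<Long, String> indexedFlags() {
        return new HashMap<>(index);
    }
}
```

In this sketch a message stored after listUids() was called is not touched by reindex(); the assumption is that the normal indexing path, which is not modelled here, takes care of it.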

https://github.com/linagora/james-project/pull/1804 added API support for per-user and per-message
reIndexing

> Re-indexing big mailboxes fails
> -------------------------------
>
>                 Key: JAMES-2555
>                 URL: https://issues.apache.org/jira/browse/JAMES-2555
>             Project: James Server
>          Issue Type: Bug
>          Components: CLI, mailbox
>    Affects Versions: 3.0.0, master, 3.0.1, 3.1.0
>            Reporter: Tellier Benoit
>            Priority: Major
>              Labels: perf
>             Fix For: 3.2.0
>
>
> Reported internally:
> {code:java}
> Doing a reindex via the CLI on a mailbox which contains ~24000 mails does not work.
>  -  heap out of memory issue
> Heap increased, then:
>  -  cassandra timeout when increasing heap
> Timeout increased, then:
>  -  heap out of memory issue
> Heap increased, then:
>  -  GC overhead limit exceeded
> {code}
> In ReIndexerImpl, instead of doing a FULL read to reindex the whole mailbox (AND all of
> the related data), we should first run a metadata query to load the UIDs, then perform
> a single read per message.
> That is the root issue: trying to load 11 GB of mails in memory.
> Furthermore, a single error should not abort the whole ReIndexing.
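
The approach suggested in the description (list the UIDs first, then read one message at a time, and keep going after a failure) could look roughly like the sketch below. It is an illustration under assumptions, not the ReIndexerImpl code: MessageSource, listUids, readFull and reindex are hypothetical names, and a message is reduced to a String.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch; none of these names are James APIs.
final class StreamingReindexSketch {
    // Assumed storage abstraction: a cheap UID listing (metadata only)
    // plus a full read of a single message.
    interface MessageSource {
        List<Long> listUids();

        String readFull(long uid) throws Exception;
    }

    // Reads and indexes one message at a time, so at most one full
    // message is held by this loop. Returns the UIDs that failed so the
    // caller can report or retry them.
    static List<Long> reindex(MessageSource source, Consumer<String> indexer) {
        List<Long> failedUids = new ArrayList<>();
        for (long uid : source.listUids()) {
            try {
                indexer.accept(source.readFull(uid));
            } catch (Exception e) {
                // One failing message must not stop the whole run.
                failedUids.add(uid);
            }
        }
        return failedUids;
    }
}
```

Returning the failed UIDs rather than throwing is a design choice of this sketch; it lets a caller retry only those messages instead of restarting the whole mailbox.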



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: server-dev-unsubscribe@james.apache.org
For additional commands, e-mail: server-dev-help@james.apache.org