spark-issues mailing list archives

From "Marcelo Vanzin (JIRA)" <>
Subject [jira] [Resolved] (SPARK-25035) Replicating disk-stored blocks should avoid memory mapping
Date Mon, 25 Feb 2019 19:45:00 GMT


Marcelo Vanzin resolved SPARK-25035.
       Resolution: Fixed
    Fix Version/s: 3.0.0

Issue resolved by pull request 23688

> Replicating disk-stored blocks should avoid memory mapping
> ----------------------------------------------------------
>                 Key: SPARK-25035
>                 URL:
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 2.3.1
>            Reporter: Imran Rashid
>            Assignee: Attila Zsolt Piros
>            Priority: Major
>              Labels: memory-analysis
>             Fix For: 3.0.0
> This is a follow-up to SPARK-24296.
> When replicating a disk-cached block, even if we fetch-to-disk, we still memory-map the file, just to copy it to another location.
> Ideally we'd just move the tmp file to the right location. But even without that, we could read the file as an input stream, instead of memory-mapping the whole thing. Memory-mapping is particularly a problem when running under YARN, as the OS may believe there is plenty of memory available, while YARN decides to kill the process for exceeding memory limits.
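The two alternatives the description mentions (move the temp file into place, or stream-copy it in bounded chunks) can be sketched with plain Java NIO. This is an illustrative sketch under assumed file paths, not Spark's actual replication code; the class and method names here are hypothetical:

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class BlockTransferSketch {

    // Streams the source file through a fixed-size heap buffer instead of
    // memory-mapping the whole file. The process's memory footprint stays
    // bounded at the buffer size, which matters under YARN's memory limits.
    public static void copyStreaming(Path src, Path dst) throws IOException {
        byte[] buf = new byte[64 * 1024];
        try (InputStream in = Files.newInputStream(src);
             OutputStream out = Files.newOutputStream(dst)) {
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        }
    }

    // The cheaper option from the issue: rename the fetched tmp file into
    // its final location, avoiding any copy at all.
    public static void moveIntoPlace(Path tmp, Path target) throws IOException {
        Files.move(tmp, target, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

By contrast, `FileChannel.map` would create a `MappedByteBuffer` covering the whole file, whose pages the OS accounts for lazily; YARN's resident-memory accounting can then exceed the container limit even though the JVM heap is small.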

This message was sent by Atlassian JIRA

