hadoop-common-issues mailing list archives

From "Mukund Thakur (Jira)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-17296) ABFS: Allow Random Reads to be of Buffer Size
Date Tue, 06 Oct 2020 09:27:00 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-17296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208604#comment-17208604 ]

Mukund Thakur commented on HADOOP-17296:

Won't setting fs.azure.readahead.range equal to the buffer size in this PR [https://github.com/apache/hadoop/pull/2307]
achieve the same thing?

> ABFS: Allow Random Reads to be of Buffer Size
> ---------------------------------------------
>                 Key: HADOOP-17296
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17296
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/azure
>    Affects Versions: 3.3.0
>            Reporter: Sneha Vijayarajan
>            Assignee: Sneha Vijayarajan
>            Priority: Major
>              Labels: abfsactive
> ADLS Gen2/ABFS driver is optimized to read only the bytes that are requested when the
> read pattern is random.
> It was observed in some Spark jobs that, though the reads are random, the next read doesn't
> skip by a lot and could be served by the earlier read if that read was done in buffer size. As a
> result, the jobs triggered a higher count of read calls, which resulted in higher job runtime.
> When these jobs were run against Gen1, which always reads in buffer size, the jobs fared well.
> In this Jira we try to provide a config control over whether a random read is of the requested
> size or the buffer size.
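
The effect described in the quoted issue can be illustrated with a toy model. The sketch below is not the ABFS driver's logic. It assumes a single in-memory buffer and a hypothetical helper, count_remote_reads, that counts how many remote calls a sequence of small reads would trigger when each miss fetches either only the requested bytes or a full buffer. The offsets, request length, and buffer size are made-up values chosen so that consecutive reads land close together.

```python
def count_remote_reads(offsets, request_len, buffer_size, read_full_buffer):
    """Count simulated remote read calls for a sequence of small reads.

    Toy model only (an assumption, not ABFS code): one buffer holds the
    byte range returned by the most recent remote call.
    """
    remote_calls = 0
    buf_start, buf_end = 0, 0  # half-open range [buf_start, buf_end) held in the buffer
    for offset in offsets:
        end = offset + request_len
        if buf_start <= offset and end <= buf_end:
            continue  # request is fully covered by the earlier read
        # On a miss, fetch either just the requested bytes or a whole buffer.
        fetch_len = max(buffer_size, request_len) if read_full_buffer else request_len
        buf_start, buf_end = offset, offset + fetch_len
        remote_calls += 1
    return remote_calls


# Hypothetical "random but nearby" access pattern: 1 KiB reads, 32 KiB buffer.
offsets = [0, 4096, 12288, 20480, 65536, 69632]
requested_size_calls = count_remote_reads(offsets, 1024, 32 * 1024, read_full_buffer=False)
buffer_size_calls = count_remote_reads(offsets, 1024, 32 * 1024, read_full_buffer=True)
print(requested_size_calls, buffer_size_calls)
```

Under these assumptions, reading only the requested bytes issues one remote call per read, while reading in buffer size lets later nearby reads be served from the earlier fetch. A truly scattered pattern would not benefit in the same way, which is why the issue proposes making the behavior configurable.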

This message was sent by Atlassian Jira
