spark-issues mailing list archives

From Paweł Wiejacha (JIRA) <>
Subject [jira] [Commented] (SPARK-25787) [K8S] Spark can't use data locality information
Date Thu, 25 Jul 2019 18:25:00 GMT


Paweł Wiejacha commented on SPARK-25787:

I *can* reproduce this issue. In the Spark UI, the Locality Level is always ANY instead of
NODE_LOCAL when reading data from HDFS.
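
The same observation can be checked outside the UI by tallying the locality level per task. What follows is only a minimal sketch: it assumes task records shaped like the entries in the task list of Spark's monitoring REST API (one `taskLocality` string per task). The helper name `locality_counts` and the sample data are hypothetical and are not output from the cluster in this report.

```python
from collections import Counter


def locality_counts(tasks):
    """Count tasks per locality level; records missing the field count as UNKNOWN."""
    return Counter(t.get("taskLocality", "UNKNOWN") for t in tasks)


# Hypothetical sample records (assumption: shaped like REST API task entries).
sample_tasks = [
    {"taskId": 0, "taskLocality": "ANY"},
    {"taskId": 1, "taskLocality": "ANY"},
    {"taskId": 2, "taskLocality": "ANY"},
]

counts = locality_counts(sample_tasks)
# True when every task ran at ANY, which is the symptom described above.
all_any = set(counts) == {"ANY"}
print(dict(counts), "all ANY:", all_any)
```

If any task had been scheduled on the node holding its HDFS block, a NODE_LOCAL key would appear in the counts.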

As Yinan Li said, it seems that:

> Support for data locality on k8s has not been ported to the upstream Spark repo yet.

I think that at least the pull request below should be ported and merged to support HDFS data
locality in Spark on Kubernetes.

Could you please reopen this issue?

> [K8S] Spark can't use data locality information
> -----------------------------------------------
>                 Key: SPARK-25787
>                 URL:
>             Project: Spark
>          Issue Type: Bug
>          Components: Kubernetes
>    Affects Versions: 2.4.0
>            Reporter: Maciej Bryński
>            Priority: Major
> I started experimenting with Spark based on this presentation:
> I'm using excellent
> charts to deploy HDFS.
> Unfortunately, reading from HDFS gives ANY locality for every task.
> Is data locality working on a Kubernetes cluster?
