[ https://issues.apache.org/jira/browse/SPARK-17556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15516490#comment-15516490 ]
Liang-Chi Hsieh edited comment on SPARK-17556 at 9/23/16 1:49 PM:
------------------------------------------------------------------
[~scwf] I quickly went through your design doc. It looks like you still need to collect the
content of the RDD to the driver. I don't think that is the executor-side broadcast described
in this JIRA issue. You can refer to the PR I submitted, with which we don't need to collect
the RDD back to the driver.
was (Author: viirya):
[~scwf] I quickly go through your design doc. Looks like you still need to collect the content
of RDD to the driver. I don't think it is executor side broadcast means in this jira's description.
You can refer to the PR I submitted with which we don't need to collect the RDD back to the
driver.
> Executor side broadcast for broadcast joins
> -------------------------------------------
>
> Key: SPARK-17556
> URL: https://issues.apache.org/jira/browse/SPARK-17556
> Project: Spark
> Issue Type: New Feature
> Components: Spark Core, SQL
> Reporter: Reynold Xin
> Attachments: executor broadcast.pdf
>
>
> Currently in Spark SQL, in order to perform a broadcast join, the driver must collect
> the result of an RDD and then broadcast it. This introduces some extra latency. It might be
> possible to broadcast directly from executors.
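
For readers unfamiliar with the pattern in the description, here is a minimal toy sketch of the driver-side flow (collect the small side, build a lookup table, then join each partition of the large side against it). It is only an illustration: plain Python lists stand in for RDD partitions, no Spark APIs are used, and the function name `driver_side_broadcast_join` is hypothetical rather than anything from Spark or from the PR mentioned above.

```python
# Toy model only: lists of (key, value) tuples stand in for RDD partitions.
# Nothing here is Spark code; all names are hypothetical.

def driver_side_broadcast_join(small_partitions, large_partitions):
    """Join two partitioned datasets by collecting the small one centrally."""
    # Step 1: the "driver" collects every partition of the small side.
    collected = [row for part in small_partitions for row in part]

    # Step 2: the driver builds a lookup table that would then be broadcast.
    lookup = {}
    for key, value in collected:
        lookup.setdefault(key, []).append(value)

    # Step 3: each large-side partition joins locally against the table.
    joined = []
    for part in large_partitions:
        for key, value in part:
            for small_value in lookup.get(key, []):
                joined.append((key, value, small_value))
    return joined


small = [[(1, "a")], [(2, "b")]]
large = [[(1, "x"), (3, "y")], [(2, "z"), (1, "w")]]
result = driver_side_broadcast_join(small, large)
```

Step 1 is the extra hop through the driver that the issue proposes to avoid by broadcasting directly from executors; how that would be done is not modeled here.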