spark-user mailing list archives

From Jianshi Huang <jianshi.hu...@gmail.com>
Subject Re: How to do broadcast join in SparkSQL
Date Wed, 08 Oct 2014 06:18:37 GMT
Looks like https://issues.apache.org/jira/browse/SPARK-1800 is not merged
into master?

I cannot find spark.sql.hints.broadcastTables in the latest master, but it
is in the following patch.


https://github.com/apache/spark/commit/76ca4341036b95f71763f631049fdae033990ab5
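
Until a hint like that is available, the only control I know of is the
size threshold mentioned in my earlier message quoted below. A minimal
sketch of setting it, assuming a Spark 1.x SQLContext; the app name, the
local master, and the 50 MB value are arbitrary choices for illustration,
not recommendations:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Hypothetical local setup, only so the sketch is self-contained.
val conf = new SparkConf().setAppName("broadcast-join-sketch").setMaster("local[*]")
val sc = new SparkContext(conf)
val sqlContext = new SQLContext(sc)

// The threshold is given in bytes. Tables whose estimated size is below it
// become candidates for being broadcast in a join.
val thresholdBytes = 50L * 1024 * 1024 // 50 MB, an arbitrary example value
sqlContext.setConf("spark.sql.autoBroadcastJoinThreshold", thresholdBytes.toString)

// The same setting can also be issued as a SQL statement.
sqlContext.sql(s"SET spark.sql.autoBroadcastJoinThreshold=$thresholdBytes")
```

This only raises the size limit; it does not let me name the tables to
broadcast, which is what I am actually after.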


Jianshi


On Mon, Sep 29, 2014 at 1:24 AM, Jianshi Huang <jianshi.huang@gmail.com>
wrote:

> Yes, looks like it can only be controlled by the
> parameter spark.sql.autoBroadcastJoinThreshold, which is a little bit weird
> to me.
>
> How am I supposed to know the exact size of a table in bytes? I think it
> would be preferable to let me specify the join algorithm.
>
> Jianshi
>
> On Sun, Sep 28, 2014 at 11:57 PM, Ted Yu <yuzhihong@gmail.com> wrote:
>
>> Have you looked at SPARK-1800 ?
>>
>> e.g. see sql/core/src/test/scala/org/apache/spark/sql/JoinSuite.scala
>> Cheers
>>
>> On Sun, Sep 28, 2014 at 1:55 AM, Jianshi Huang <jianshi.huang@gmail.com>
>> wrote:
>>
>>> I cannot find it in the documentation. And I have a dozen dimension
>>> tables to (left) join...
>>>
>>>
>>> Cheers,
>>> --
>>> Jianshi Huang
>>>
>>> LinkedIn: jianshi
>>> Twitter: @jshuang
>>> Github & Blog: http://huangjs.github.com/
>>>
>>
>>
>
>
> --
> Jianshi Huang
>
> LinkedIn: jianshi
> Twitter: @jshuang
> Github & Blog: http://huangjs.github.com/
>



-- 
Jianshi Huang

LinkedIn: jianshi
Twitter: @jshuang
Github & Blog: http://huangjs.github.com/
