spark-user mailing list archives

From Yin Huai <yh...@databricks.com>
Subject Re: Spark SQL and Skewed Joins
Date Wed, 17 Jun 2015 17:56:14 GMT
Hi John,

Did you also set spark.sql.planner.externalSort to true? With this conf you
will probably not see lost executors. For now, you could manually split
the query into two parts, one for the skewed keys and one for the other
records, and then union the results of the two parts.
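
As a rough sketch of the split-and-union idea, here is a small runnable
example. It uses Python's built-in sqlite3 module only as a stand-in so the
SQL can be executed anywhere; it is not Spark code. The tables `facts` and
`dims`, the column names, and the `SKEWED_KEYS` list are all hypothetical,
and the sketch assumes the skewed keys are already known (for example from
a count per key).

```python
import sqlite3

# Hypothetical tables: `facts` is heavily skewed on key 1, `dims` is small.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE facts (k INTEGER, v TEXT);
    CREATE TABLE dims  (k INTEGER, name TEXT);
""")
conn.executemany(
    "INSERT INTO facts VALUES (?, ?)",
    [(1, f"hot{i}") for i in range(1000)] + [(2, "a"), (3, "b"), (3, "c")],
)
conn.executemany(
    "INSERT INTO dims VALUES (?, ?)",
    [(1, "one"), (2, "two"), (3, "three")],
)

# Assumption: the skewed keys were identified beforehand.
SKEWED_KEYS = (1,)


def split_join_sql(skewed_keys):
    """Build the join as two parts (skewed keys, everything else) plus a union."""
    keys = ", ".join(str(int(k)) for k in skewed_keys)
    return f"""
        SELECT f.k, f.v, d.name FROM facts f JOIN dims d ON f.k = d.k
        WHERE f.k IN ({keys})
        UNION ALL
        SELECT f.k, f.v, d.name FROM facts f JOIN dims d ON f.k = d.k
        WHERE f.k NOT IN ({keys})
    """


# The original single join, for comparison.
full = conn.execute(
    "SELECT f.k, f.v, d.name FROM facts f JOIN dims d ON f.k = d.k"
).fetchall()

# The same join expressed as two parts whose results are unioned.
split = conn.execute(split_join_sql(SKEWED_KEYS)).fetchall()

# Both forms return the same rows (order may differ).
assert sorted(full) == sorted(split)
```

In Spark the two halves would presumably be run as separate queries, so that
the part with the skewed keys can be handled differently from the rest,
before the results are unioned.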

Thanks,

Yin

On Wed, Jun 17, 2015 at 9:53 AM, Koert Kuipers <koert@tresata.com> wrote:

> could it be composed maybe? a general version and then a sql version that
> exploits the additional info/abilities available there and uses the general
> version internally...
>
> i assume the sql version can benefit from the logical phase optimization
> to pick join details. or is there more?
>
> On Tue, Jun 16, 2015 at 7:37 PM, Michael Armbrust <michael@databricks.com>
> wrote:
>
>>> this would be a great addition to spark, and ideally it belongs in spark
>>> core not sql.
>>>
>>
>> I agree with the fact that this would be a great addition, but we would
>> likely want a specialized SQL implementation for performance reasons.
>>
>
>
