spark-dev mailing list archives

From Sean Owen <sro...@gmail.com>
Subject Re: [SPARK-29176][DISCUSS] Optimization should change join type to CROSS
Date Wed, 06 Nov 2019 14:53:23 GMT
You asked for an inner join, but it turned into a cross join. This
might be surprising, hence the error, which you can disable.
The query is not invalid in any case. The error just stops you from
doing something you may not have meant to do, and which may be expensive.
However, I think we've already changed the default to enable cross
joins in Spark 3 anyway.

On Wed, Nov 6, 2019 at 8:50 AM Enrico Minack <mail@enrico.minack.dev> wrote:
>
> Hi,
>
> I would like to discuss issue SPARK-29176 to see if this is considered a bug and if so,
> to sketch out a fix.
>
> In short, the issue is that a valid inner join with condition gets optimized so that
> no condition is left, but the type is still INNER. Then CheckCartesianProducts throws an exception.
> The type should have changed to CROSS when it gets optimized in that way.
>
> I understand that with spark.sql.crossJoin.enabled you can make Spark not throw this
> exception, but I think you should not need this work-around for a valid query.
>
> Please let me know what you think about this issue and how I could fix it. It might affect
> more rules than the two given in the Jira ticket.
>
> Thanks,
> Enrico
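
For readers following the thread, the behaviour under discussion can be sketched as a toy model in plain Python. This is not Spark code: the Join class, both function names, and the retag_as_cross parameter are hypothetical, and the condition is assumed to have already been simplified to a constant "true". It only illustrates the three outcomes mentioned above: the error, the flag-based work-around, and the proposed change of join type to CROSS.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Join:
    """Hypothetical stand-in for a logical join node; not Spark's class."""
    join_type: str            # "INNER" or "CROSS"
    condition: Optional[str]  # None means no join condition is left


def drop_trivial_condition(join: Join, retag_as_cross: bool = False) -> Join:
    """Toy optimizer rule: remove a condition that is always true.

    With retag_as_cross=True it also applies the change proposed in the
    thread and switches the join type to CROSS.
    """
    if join.condition != "true":
        return join
    new_type = "CROSS" if retag_as_cross else join.join_type
    return replace(join, join_type=new_type, condition=None)


def check_cartesian_products(join: Join, cross_join_enabled: bool = False) -> Join:
    """Toy check: reject a condition-less INNER join unless it is allowed."""
    if join.join_type == "INNER" and join.condition is None and not cross_join_enabled:
        raise ValueError("implicit cartesian product detected for INNER join")
    return join


# Assumption: the original condition has been simplified to a constant.
query = Join("INNER", "true")

# Behaviour described in the thread: the type stays INNER, so the check fails.
try:
    check_cartesian_products(drop_trivial_condition(query))
    outcome = "accepted"
except ValueError:
    outcome = "rejected"

# Work-around: a flag modelled on spark.sql.crossJoin.enabled disables the error.
with_flag = check_cartesian_products(
    drop_trivial_condition(query), cross_join_enabled=True
)

# Proposed fix: retagging the join as CROSS passes the check without the flag.
with_fix = check_cartesian_products(
    drop_trivial_condition(query, retag_as_cross=True)
)
```

In this sketch the plain optimization is rejected, while both the flag and the retagging let the same query through; the difference is that retagging needs no user-side configuration.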
