spark-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Jacek Laskowski <>
Subject Re: Inner join with the table itself
Date Mon, 15 Jan 2018 10:09:49 GMT
Hi Michael,

-dev +user

What's the query? How do you "fool spark"?

Jacek Laskowski
Mastering Spark SQL
Spark Structured Streaming
Mastering Kafka Streams
Follow me at

On Mon, Jan 15, 2018 at 10:23 AM, Michael Shtelma <>

> Hi all,
> If I try joining the table with itself using join columns, I am
> getting the following error:
> "Join condition is missing or trivial. Use the CROSS JOIN syntax to
> allow cartesian products between these relations.;"
> This is not true, and my join is not trivial and is not a real cross
> join. I am providing join condition and expect to get maybe a couple
> of joined rows for each row in the original table.
> There is a workaround for this, which implies renaming all the columns
> in source data frame and only afterwards proceed with the join. This
> allows us to fool spark.
> Now I am wondering if there is a way to get rid of this problem in a
> better way? I do not like the idea of renaming the columns because
> this makes it really difficult to keep track of the names in the
> columns in result data frames.
> Is it possible to deactivate this check?
> Thanks,
> Michael
> ---------------------------------------------------------------------
> To unsubscribe e-mail:

View raw message