spark-dev mailing list archives

From "vector" <>
Subject Filter the null before InnerJoin to solve the problem of data skew
Date Tue, 08 Dec 2015 13:58:51 GMT
When I join two tables, I find that one table has a data skew problem, and the skewed value
of the join field is null. So I want to filter out the nulls before the inner join, like this:

a.key is skewed, and the skewed value is null.


Original query:
"select * from a join b on a.key = b.key"


Query with the null filter:
"select * from a join b on a.key = b.key and a.key is not null"

Is this idea feasible?
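
As a rough check of the semantics only, here is a minimal sketch in Python. It uses SQLite as a stand-in for Spark SQL (an assumption for illustration, not the Spark engine itself), and the sample rows, the `val` column, the `fa` alias, and the `run_join_queries` helper are all hypothetical. It relies on the standard SQL rule that `NULL = NULL` is not true, so an inner join on `a.key = b.key` never emits rows whose key is null. The three query forms should therefore return the same rows. Whether the filter actually runs before the shuffle in Spark, and so relieves the skew, is not shown here.

```python
import sqlite3


def run_join_queries():
    """Run the plain join, the join with the null filter, and a pre-filtered join."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()
    cur.execute("CREATE TABLE a (key INTEGER, val TEXT)")
    cur.execute("CREATE TABLE b (key INTEGER, val TEXT)")

    # Hypothetical data: table a is skewed, most of its keys are NULL.
    a_rows = [(None, "null-%d" % i) for i in range(1000)] + [(1, "a1"), (2, "a2")]
    b_rows = [(1, "b1"), (2, "b2"), (None, "b-null")]
    cur.executemany("INSERT INTO a VALUES (?, ?)", a_rows)
    cur.executemany("INSERT INTO b VALUES (?, ?)", b_rows)

    # Original query from the message (columns listed explicitly for comparison).
    plain = cur.execute(
        "SELECT a.key, a.val, b.val FROM a JOIN b "
        "ON a.key = b.key ORDER BY a.key"
    ).fetchall()

    # Proposed query: the same join with the extra IS NOT NULL condition.
    filtered = cur.execute(
        "SELECT a.key, a.val, b.val FROM a JOIN b "
        "ON a.key = b.key AND a.key IS NOT NULL ORDER BY a.key"
    ).fetchall()

    # Variant that filters table a in a subquery before joining.
    prefiltered = cur.execute(
        "SELECT fa.key, fa.val, b.val "
        "FROM (SELECT * FROM a WHERE key IS NOT NULL) fa "
        "JOIN b ON fa.key = b.key ORDER BY fa.key"
    ).fetchall()

    conn.close()
    return plain, filtered, prefiltered


if __name__ == "__main__":
    plain, filtered, prefiltered = run_join_queries()
    print(plain == filtered == prefiltered)
```

In this sketch the null-keyed rows never appear in any of the results, so dropping them before the join does not change the answer; the open question is only where the engine applies the filter.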