spark-issues mailing list archives

From "Yin Huai (JIRA)" <>
Subject [jira] [Created] (SPARK-9862) Join: Handling data skew
Date Wed, 12 Aug 2015 04:39:47 GMT
Yin Huai created SPARK-9862:

             Summary: Join: Handling data skew
                 Key: SPARK-9862
             Project: Spark
          Issue Type: Sub-task
          Components: SQL
            Reporter: Yin Huai
            Assignee: Yin Huai

For a two-way shuffle join, if one or more groups are skewed in one table (say, the left table)
but have a relatively small number of rows in the other table (say, the right table), we can use
a broadcast join for these skewed groups and a shuffle join for the remaining groups.
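
The idea can be sketched in plain Python, without any Spark API. This is only an illustration under stated assumptions, not the implementation proposed for SPARK-9862: the names `hash_join` and `skew_aware_join`, the `skew_threshold` row-count test for deciding that a group is skewed, and the `num_partitions` parameter are all hypothetical. The sketch also takes for granted that the right-side rows of a skewed group are few enough to broadcast and does not check this.

```python
from collections import Counter, defaultdict


def hash_join(left, right):
    """Inner-join two lists of (key, value) pairs using a hash table built on `right`."""
    table = defaultdict(list)
    for key, value in right:
        table[key].append(value)
    return [(key, lv, rv) for key, lv in left for rv in table.get(key, [])]


def skew_aware_join(left, right, skew_threshold=3, num_partitions=4):
    """Inner join that treats skewed left-side groups separately (illustrative only)."""
    # Assumption: a group is "skewed" when it has at least skew_threshold left rows.
    counts = Counter(key for key, _ in left)
    skewed = {key for key, n in counts.items() if n >= skew_threshold}

    # Skewed groups: leave the left rows in place and "broadcast" the
    # matching right rows to them, so the large group is never shuffled.
    left_skewed = [row for row in left if row[0] in skewed]
    right_skewed = [row for row in right if row[0] in skewed]
    result = hash_join(left_skewed, right_skewed)

    # Remaining groups: simulate a shuffle by hash-partitioning both sides
    # on the join key, then join partition by partition.
    left_parts = defaultdict(list)
    right_parts = defaultdict(list)
    for row in left:
        if row[0] not in skewed:
            left_parts[hash(row[0]) % num_partitions].append(row)
    for row in right:
        if row[0] not in skewed:
            right_parts[hash(row[0]) % num_partitions].append(row)
    for partition, rows in left_parts.items():
        result.extend(hash_join(rows, right_parts.get(partition, [])))
    return result
```

Both paths produce inner-join rows, so their union should match what a single join over the full inputs would return; only the way each group's rows are brought together differs.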
