flink-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Rong Rong (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-10474) Don't translate IN to JOIN with VALUES for streaming queries
Date Tue, 02 Oct 2018 03:03:00 GMT

    [ https://issues.apache.org/jira/browse/FLINK-10474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16634867#comment-16634867

Rong Rong commented on FLINK-10474:

Voting for the first approach.

I was actually wondering. Is there benefit from translating the IN to JOIN operator (even
on batch side) if the operand after IN is statically defined.

> Don't translate IN to JOIN with VALUES for streaming queries
> ------------------------------------------------------------
>                 Key: FLINK-10474
>                 URL: https://issues.apache.org/jira/browse/FLINK-10474
>             Project: Flink
>          Issue Type: Improvement
>          Components: Table API &amp; SQL
>    Affects Versions: 1.6.1, 1.7.0
>            Reporter: Fabian Hueske
>            Assignee: Hequn Cheng
>            Priority: Major
> IN clauses are translated to JOIN with VALUES if the number of elements in the IN clause
exceeds a certain threshold. This should not be done, because a streaming join is very heavy
and materializes both inputs (which is fine for the VALUES) input but not for the other.
> There are two ways to solve this:
>  # don't translate IN to a JOIN at all
>  # translate it to a JOIN but have a special join strategy if one input is bound and
final (non-updating)
> Option 1. should be easy to do, option 2. requires much more effort.

This message was sent by Atlassian JIRA

View raw message