flink-issues mailing list archives

From "Kurt Young (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (FLINK-11714) Add cost model for both batch and streaming
Date Fri, 01 Mar 2019 01:37:00 GMT

     [ https://issues.apache.org/jira/browse/FLINK-11714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kurt Young updated FLINK-11714:
-------------------------------
    Component/s:     (was: API / Table SQL)
                 SQL / Planner

> Add cost model for both batch and streaming
> -------------------------------------------
>
>                 Key: FLINK-11714
>                 URL: https://issues.apache.org/jira/browse/FLINK-11714
>             Project: Flink
>          Issue Type: New Feature
>          Components: SQL / Planner
>            Reporter: godfrey he
>            Assignee: godfrey he
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.9.0
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> Calcite's default cost model contains only ROWS, IO and CPU, and does not take IO and CPU into account when costs are compared.
> There are two improvements:
> 1. Add NETWORK and MEMORY to represent distribution cost and memory usage.
> 2. The optimization goal is now to use minimal resources, so the factors are compared in this order:
>     (1). First compare CPU. Every operator uses CPU, so we consider it the most important factor.
>     (2). Then compare MEMORY, NETWORK and IO as a single normalized value. Their relative order is not easy to decide, so each is converted to CPU cost using a different ratio.
>     (3). Finally compare ROWS. ROWS has already been counted when calculating the other factors,
>          e.g. CPU of Sort = nLogN(ROWS) * number of sort keys, CPU of Filter = ROWS * condition cost on a row.
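
The comparison order described above can be sketched in Java roughly as follows. This is only an illustration under stated assumptions, not the planner's actual code: the class name CostSketch, its nested Cost type, the helper methods, and in particular the three conversion ratios are hypothetical, since the issue does not give concrete values. The natural logarithm stands in for "nLogN".

```java
import java.util.Comparator;

public class CostSketch {
    // Hypothetical ratios for converting each resource into CPU units.
    // The issue only says "different ratio"; these values are placeholders.
    public static final double MEMORY_TO_CPU = 1.0;
    public static final double NETWORK_TO_CPU = 4.0;
    public static final double IO_TO_CPU = 2.0;

    /** Five-factor cost: the three Calcite factors plus NETWORK and MEMORY. */
    public static final class Cost {
        public final double rows, cpu, io, network, memory;

        public Cost(double rows, double cpu, double io, double network, double memory) {
            this.rows = rows;
            this.cpu = cpu;
            this.io = io;
            this.network = network;
            this.memory = memory;
        }

        /** MEMORY, NETWORK and IO folded into one CPU-equivalent value. */
        public double normalizedResources() {
            return memory * MEMORY_TO_CPU + network * NETWORK_TO_CPU + io * IO_TO_CPU;
        }
    }

    // Step (1): CPU; step (2): normalized MEMORY/NETWORK/IO; step (3): ROWS.
    public static final Comparator<Cost> ORDER =
        Comparator.comparingDouble((Cost c) -> c.cpu)
            .thenComparingDouble(Cost::normalizedResources)
            .thenComparingDouble(c -> c.rows);

    /** CPU of Sort = n * log(n) * number of sort keys (natural log assumed). */
    public static double sortCpu(double rows, int sortKeys) {
        double n = Math.max(rows, 1.0); // keep log(n) non-negative for tiny inputs
        return n * Math.log(n) * sortKeys;
    }

    /** CPU of Filter = ROWS * cost of evaluating the condition on one row. */
    public static double filterCpu(double rows, double conditionCostPerRow) {
        return rows * conditionCostPerRow;
    }

    public static void main(String[] args) {
        Cost sortPlan = new Cost(1000, sortCpu(1000, 2), 0, 0, 64);
        Cost filterPlan = new Cost(1000, filterCpu(1000, 0.5), 0, 0, 0);
        // A negative result means the first plan is considered cheaper.
        System.out.println(ORDER.compare(filterPlan, sortPlan));
    }
}
```

Because ROWS already feeds into the CPU estimates, it only acts as a final tie-breaker here, which matches the reasoning given in step (3).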



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
