flink-issues mailing list archives

From "Kurt Young (Jira)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-14567) Aggregate query with more than two group fields can't be write into HBase sink
Date Sun, 17 Nov 2019 06:49:00 GMT

    [ https://issues.apache.org/jira/browse/FLINK-14567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16975930#comment-16975930 ]

Kurt Young commented on FLINK-14567:

Even if the framework can't derive primary key information of the query, we can still add
an operator to convert the stream into an upsert stream. 

> Aggregate query with more than two group fields can't be write into HBase sink
> ------------------------------------------------------------------------------
>                 Key: FLINK-14567
>                 URL: https://issues.apache.org/jira/browse/FLINK-14567
>             Project: Flink
>          Issue Type: Bug
>          Components: Connectors / HBase, Table SQL / Legacy Planner, Table SQL / Planner
>            Reporter: Jark Wu
>            Priority: Critical
>             Fix For: 1.10.0
> If we have an HBase table sink with a rowkey of varchar (also the primary key) and a column
> of bigint, we want to write the result of the following query into the sink using upsert mode.
> However, it fails the primary key check with the exception "UpsertStreamTableSink requires
> that Table has a full primary keys if it is updated."
> {code:sql}
> select concat(f0, '-', f1) as key, sum(f2)
> from T1
> group by f0, f1
> {code}
> This happens in both the blink planner and the old planner. If the query works
> in update mode, a primary key must exist so that it can be extracted and set via {{UpsertStreamTableSink#setKeyFields}}.

> That's why we wanted to derive the primary key for concat in FLINK-14539. However, we found
> that the primary key is not preserved after concatenation. For example, suppose we have a primary key
> (f0, f1, f2) whose fields are all of varchar type, and two unique records ('a', 'b', 'c') and
> ('ab', '', 'c'). The results of concat(f0, f1, f2) are the same for both, which means the concat
> result is no longer a primary key.
> So here comes the problem: how can we properly support the HBase sink or similar use cases?
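
The collision described in the quoted issue can be reproduced outside Flink with plain string concatenation. The following is only a minimal sketch: the class name `ConcatKeyCollision` and the helpers `concatKey` and `isUniqueAfterConcat` are hypothetical names introduced for illustration and are not part of Flink or of the issue. It assumes non-null varchar fields, so that SQL `concat` behaves like simple string joining.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ConcatKeyCollision {

    // Assumption: for non-null varchar fields, concat(f0, f1, ...) is plain string joining.
    static String concatKey(List<String> fields) {
        return String.join("", fields);
    }

    // Returns true only if every row maps to a distinct concatenated key.
    static boolean isUniqueAfterConcat(List<List<String>> rows) {
        Set<String> seen = new HashSet<>();
        for (List<String> row : rows) {
            if (!seen.add(concatKey(row))) {
                return false; // two different rows produced the same key
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // The two unique records from the issue description.
        List<List<String>> rows = List.of(
            List.of("a", "b", "c"),
            List.of("ab", "", "c"));
        System.out.println(concatKey(rows.get(0)));    // abc
        System.out.println(concatKey(rows.get(1)));    // abc
        System.out.println(isUniqueAfterConcat(rows)); // false
    }
}
```

A fixed separator, as in `concat(f0, '-', f1)`, does not by itself restore uniqueness when the fields may contain the separator: ('a-', 'b') and ('a', '-b') both yield 'a--b'.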

This message was sent by Atlassian Jira
