flink-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From GitBox <...@apache.org>
Subject [GitHub] [flink] wuchong commented on a change in pull request #8341: [FLINK-11633][docs-zh] Translate "Working with state" into Chinese
Date Sun, 05 May 2019 08:26:17 GMT
wuchong commented on a change in pull request #8341: [FLINK-11633][docs-zh] Translate "Working
with state" into Chinese
URL: https://github.com/apache/flink/pull/8341#discussion_r281008017
 
 

 ##########
 File path: docs/dev/stream/state/state.zh.md
 ##########
 @@ -22,122 +22,87 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-This document explains how to use Flink's state abstractions when developing an application.
-
-* ToC
+本文档主要介绍如何在 Flink 作业中使用状态
+* 目录
 {:toc}
 
 ## Keyed State and Operator State
 
-There are two basic kinds of state in Flink: `Keyed State` and `Operator State`.
+Flink 中有两种基本的状态:`Keyed State` 和 `Operator State`
 
 ### Keyed State
 
-*Keyed State* is always relative to keys and can only be used in functions and operators
on a `KeyedStream`.
+*Keyed State* 通常和键相关,仅可使用在 `KeyedStream` 的方法和算子中。
 
-You can think of Keyed State as Operator State that has been partitioned,
-or sharded, with exactly one state-partition per key.
-Each keyed-state is logically bound to a unique
-composite of <parallel-operator-instance, key>, and since each key
-"belongs" to exactly one parallel instance of a keyed operator, we can
-think of this simply as <operator, key>.
+你可以把 Keyed State 看作分区或者共享的 Operator State, 而且每个键仅出现在一个分区内。
+逻辑上每个 keyed-state 和 <算子并发实例, key> 相绑定,由于每个
key 仅"属于"
+算子的一个并发,因此简化为 <算子, key>。
 
-Keyed State is further organized into so-called *Key Groups*. Key Groups are the
-atomic unit by which Flink can redistribute Keyed State;
-there are exactly as many Key Groups as the defined maximum parallelism.
-During execution each parallel instance of a keyed operator works with the keys
-for one or more Key Groups.
+Keyed State 会按照 *Key Group* 进行管理。Key Group 是 Flink 分发 Keyed State 的最小单元;
+Key Groups 的数目等于作业的最大并发数。在执行过程中,每个 keyed operator
会对应到一个或多个 Key Group
 
 ### Operator State
 
-With *Operator State* (or *non-keyed state*), each operator state is
-bound to one parallel operator instance.
-The [Kafka Connector]({{ site.baseurl }}/dev/connectors/kafka.html) is a good motivating
example for the use of Operator State
-in Flink. Each parallel instance of the Kafka consumer maintains a map
-of topic partitions and offsets as its Operator State.
+对于 *Operator State* (或者 *non-keyed state*) 来说,每个 operator state 和一个并发实例进行绑定。
+[Kafka Connector]({{ site.baseurl }}/zh/dev/connectors/kafka.html) 是 Flink 中使用 operator
state 的一个很好的示例。
+每个 Kafka 消费者的并发在 Operator State 中维护一个 topic partition 到 offset
的映射关系。
 
-The Operator State interfaces support redistributing state among
-parallel operator instances when the parallelism is changed. There can be different schemes
for doing this redistribution.
+Operator State 在 Flink 作业的并发改变后,会重新分发状态,分发的策略和
Keyed State 不一样。
 
 ## Raw and Managed State
 
-*Keyed State* and *Operator State* exist in two forms: *managed* and *raw*.
+*Keyed State* 和 *Operator State* 分别有两种存在形式:*managed* and *raw*.
 
-*Managed State* is represented in data structures controlled by the Flink runtime, such as
internal hash tables, or RocksDB.
-Examples are "ValueState", "ListState", etc. Flink's runtime encodes
-the states and writes them into the checkpoints.
+*Managed State* 有 Flink runtime 中的数据结构所控制,比如内部的 hash table
或者 RocksDB。
+比如 "ValueState", "ListState" 等。Flink runtime 会对这些状态进行编码并写入
checkpoint。
 
-*Raw State* is state that operators keep in their own data structures. When checkpointed,
they only write a sequence of bytes into
-the checkpoint. Flink knows nothing about the state's data structures and sees only the raw
bytes.
+*Raw State* 则保存在算子自己的数据结构中。checkpoint 的时候,Flink 并不知晓具体的内容,仅仅写入一串字节序列到
checkpoint。
 
-All datastream functions can use managed state, but the raw state interfaces can only be
used when implementing operators.
-Using managed state (rather than raw state) is recommended, since with
-managed state Flink is able to automatically redistribute state when the parallelism is
-changed, and also do better memory management.
+所有 datastream 的方法都可以使用 managed state, 但是 raw state 则只能在实现算子的时候使用。
+由于 Flink 可以在修改并发是更好的重新分发状态数据,并且能够更好的管理内存,因此建议使用
managed state(而不是 raw state)。
 
 Review comment:
   ```suggestion
   因为使用 managed stae 的话,Flink 可以在修改并发时更好地重新分发状态数据,并且能够更好地管理内存,所以建议使用
managed state(而不是 raw state)。
   ```

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

Mime
View raw message