beam-commits mailing list archives

From dhalp...@apache.org
Subject [1/3] incubator-beam-site git commit: minor: remove duplicate words
Date Wed, 28 Sep 2016 20:37:32 GMT
Repository: incubator-beam-site
Updated Branches:
  refs/heads/asf-site ab1f700ca -> 976b0302a


minor: remove duplicate words


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/commit/6a5a0b3c
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/tree/6a5a0b3c
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/diff/6a5a0b3c

Branch: refs/heads/asf-site
Commit: 6a5a0b3cd77d89b81638bc4787bc635d0e10fda5
Parents: ab1f700
Author: terrencehan(韩亮) <terrencehan@tencent.com>
Authored: Wed Sep 28 17:56:30 2016 +0800
Committer: terrencehan(韩亮) <terrencehan@tencent.com>
Committed: Wed Sep 28 17:56:30 2016 +0800

----------------------------------------------------------------------
 learn/programming-guide.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/6a5a0b3c/learn/programming-guide.md
----------------------------------------------------------------------
diff --git a/learn/programming-guide.md b/learn/programming-guide.md
index ac18ba6..a7e5f12 100644
--- a/learn/programming-guide.md
+++ b/learn/programming-guide.md
@@ -158,7 +158,7 @@ A `PCollection` is a large, immutable "bag" of elements. There is no upper
limit
 
 A `PCollection` can be either **bounded** or **unbounded** in size. A **bounded** `PCollection`
represents a data set of a known, fixed size, while an **unbounded** `PCollection` represents
a data set of unlimited size. Whether a `PCollection` is bounded or unbounded depends on the
source of the data set that it represents. Reading from a batch data source, such as a file
or a database, creates a bounded `PCollection`. Reading from a streaming or continously-updating
data source, such as Pub/Sub or Kafka, creates an unbounded `PCollection` (unless you explicitly
tell it not to).
 
-The bounded (or unbounded) nature The bounded (or unbounded) nature of your `PCollection`
affects how Beam processes your data. A bounded `PCollection` can be processed using a batch
job, which might read the entire data set once, and perform processing in a job of finite
length. An unbounded `PCollection` must be processed using a streaming job that runs continuously,
as the entire collection can never be available for processing at any one time.
+The bounded (or unbounded) nature of your `PCollection` affects how Beam processes your data.
A bounded `PCollection` can be processed using a batch job, which might read the entire data
set once, and perform processing in a job of finite length. An unbounded `PCollection` must
be processed using a streaming job that runs continuously, as the entire collection can never
be available for processing at any one time.
 
 When performing an operation that groups elements in an unbounded `PCollection`, Beam requires
a concept called **Windowing** to divide a continuously updating data set into logical windows
of finite size.  Beam processes each window as a bundle, and processing continues as the data
set is generated. These logical windows are determined by some characteristic associated with
a data element, such as a **timestamp**.
 
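The windowing paragraph in the diff context above can be illustrated with a minimal sketch in plain Python. This is not the Beam API: the function name `assign_fixed_windows`, the `(timestamp, value)` event shape, and the window size of 5 are all assumptions made for illustration. It only shows the idea of dividing a stream into finite logical windows based on each element's timestamp.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def assign_fixed_windows(
    events: Iterable[Tuple[int, str]], window_size: int
) -> Dict[int, List[str]]:
    """Group (timestamp, value) pairs into fixed windows.

    Hypothetical helper, not part of Beam. Each window covers the
    half-open interval [start, start + window_size) and is keyed by
    its start time.
    """
    windows: Dict[int, List[str]] = defaultdict(list)
    for timestamp, value in events:
        # Round the timestamp down to the start of its window.
        start = (timestamp // window_size) * window_size
        windows[start].append(value)
    return dict(windows)


# Illustrative events; in a streaming job these would keep arriving.
events = [(0, "a"), (4, "b"), (5, "c"), (12, "d")]
windows = assign_fixed_windows(events, window_size=5)
print(windows)
```

In this sketch each window can be processed on its own once its elements are grouped, which is why a grouping operation over an unbounded data set becomes tractable.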

