beam-commits mailing list archives

From git-site-r...@apache.org
Subject [beam] branch asf-site updated: Publishing website 2019/08/02 17:58:12 at commit c6c3bce
Date Fri, 02 Aug 2019 17:58:23 GMT
This is an automated email from the ASF dual-hosted git repository.

git-site-role pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/beam.git


The following commit(s) were added to refs/heads/asf-site by this push:
     new 4d5e2ed  Publishing website 2019/08/02 17:58:12 at commit c6c3bce
4d5e2ed is described below

commit 4d5e2ed9040590b56ef7692bfac7f4524c6c2f00
Author: jenkins <builds@apache.org>
AuthorDate: Fri Aug 2 17:58:12 2019 +0000

    Publishing website 2019/08/02 17:58:12 at commit c6c3bce
---
 .../contribute/runner-guide/index.html             | 28 ++++++++--------
 .../documentation/io/built-in/index.html           |  4 +--
 .../patterns/file-processing-patterns/index.html   |  2 +-
 .../documentation/runners/jet/index.html           |  2 +-
 .../sdks/python-dependencies/index.html            | 39 ++++++++++++++++++++++
 .../transforms/python/other/reshuffle/index.html   |  2 +-
 .../transforms/python/overview/index.html          |  2 +-
 7 files changed, 59 insertions(+), 20 deletions(-)

diff --git a/website/generated-content/contribute/runner-guide/index.html b/website/generated-content/contribute/runner-guide/index.html
index db0e1d3..5ef955c 100644
--- a/website/generated-content/contribute/runner-guide/index.html
+++ b/website/generated-content/contribute/runner-guide/index.html
@@ -574,7 +574,7 @@ match across all SDKs.</p>
 
 <p>The <code class="highlighter-rouge">run(Pipeline)</code> method should
be asynchronous and results in a
 PipelineResult which generally will be a job descriptor for your data
-processing engine, provides methods for checking its status, canceling it, and
+processing engine, providing methods for checking its status, canceling it, and
 waiting for it to terminate.</p>
 
 <h2 id="implementing-the-beam-primitives">Implementing the Beam Primitives</h2>
@@ -588,7 +588,7 @@ provided.</p>
 <p>The primitives are designed for the benefit of pipeline authors, not runner
 authors. Each represents a different conceptual mode of operation (external IO,
 element-wise, grouping, windowing, union) rather than a specific implementation
-decision.  The same primitive may require very different implementation based
+decision.  The same primitive may require a very different implementation based
 on how the user instantiates it. For example, a <code class="highlighter-rouge">ParDo</code>
that uses state or
 timers may require key partitioning, a <code class="highlighter-rouge">GroupByKey</code>
with speculative triggering
 may require a more costly or complex implementation, and <code class="highlighter-rouge">Read</code>
is completely
@@ -657,7 +657,7 @@ initialization is almost always equivalent and more efficient, but this
hook
 remains for simplicity for users)</li>
   <li><em>ProcessElement</em> / <em>OnTimer</em> - called for
each element and timer activation</li>
   <li><em>FinishBundle</em> - essentially “flush”; required to be called
before
-considering elements actually processed</li>
+considering elements as actually processed</li>
   <li><em>Teardown</em> - release resources that were used across bundles;
calling this
 can be best effort due to failures</li>
 </ul>
@@ -712,7 +712,7 @@ via the <a href="#the-fn-api">Fn API</a> may manifest as another
implementation
 <p>A side input is a global view of a window of a <code class="highlighter-rouge">PCollection</code>.
This distinguishes
 it from the main input, which is processed one element at a time. The SDK/user
 prepares a <code class="highlighter-rouge">PCollection</code> adequately, the
runner materializes it, and then the
-runner feeds it to the <code class="highlighter-rouge">DoFn</code>. See the</p>
+runner feeds it to the <code class="highlighter-rouge">DoFn</code>.</p>
 
 <p>What you will need to implement is to inspect the materialization requested for
 the side input, and prepare it appropriately, and corresponding interactions
@@ -758,7 +758,7 @@ function. See
 
 <p><em>Main design document: <a href="https://s.apache.org/beam-state">https://s.apache.org/beam-state</a></em></p>
 
-<p>When <code class="highlighter-rouge">ParDo</code> includes state and
timers, its execution on your runner is usually
+<p>When a <code class="highlighter-rouge">ParDo</code> includes state and
timers, its execution on your runner is usually
 very different. See the full details beyond those covered here.</p>
 
 <p>State and timers are partitioned per key and window. You may need or want to
@@ -778,7 +778,7 @@ this to implement user-facing state.</p>
 <p><em>Main design document: <a href="https://s.apache.org/splittable-do-fn">https://s.apache.org/splittable-do-fn</a></em></p>
 
 <p>Splittable <code class="highlighter-rouge">DoFn</code> is a generalization
and combination of <code class="highlighter-rouge">ParDo</code> and <code class="highlighter-rouge">Read</code>.
It
-is per-element processing where each element the capabilities of being “split”
+is per-element processing where each element has the capability of being “split”
 in the same ways as a <code class="highlighter-rouge">BoundedSource</code> or
<code class="highlighter-rouge">UnboundedSource</code>. This enables better
 performance for use cases such as a <code class="highlighter-rouge">PCollection</code>
of names of large files where
 you want to read each of them. Previously they would have to be static data in
@@ -821,7 +821,7 @@ grouping.</p>
 <h4 id="implementing-via-groupbykeyonly--groupalsobywindow">Implementing via GroupByKeyOnly
+ GroupAlsoByWindow</h4>
 
 <p>The Java codebase includes support code for a particularly common way of
-implement the full <code class="highlighter-rouge">GroupByKey</code> operation:
first group the keys, and then group
+implementing the full <code class="highlighter-rouge">GroupByKey</code> operation:
first group the keys, and then group
 by window. For merging windows, this is essentially required, since merging is
 per key.</p>
 
@@ -868,7 +868,7 @@ inputs, or just ignore inputs and choose the end of the window.</p>
 <p>The window primitive applies a <code class="highlighter-rouge">WindowFn</code>
UDF to place each input element into
 one or more windows of its output PCollection. Note that the primitive also
 generally configures other aspects of the windowing strategy for a <code class="highlighter-rouge">PCollection</code>,
-but the fully constructed graph that your runner receive will already have a
+but the fully constructed graph that your runner receives will already have a
 complete windowing strategy for each <code class="highlighter-rouge">PCollection</code>.</p>
 
 <p>To implement this primitive, you need to invoke the provided WindowFn on each
@@ -906,14 +906,14 @@ it like a stream. The capabilities are:</p>
 
 <ul>
   <li><code class="highlighter-rouge">split(int)</code> - your runner should
call this to get the desired parallelism</li>
-  <li><code class="highlighter-rouge">createReader(...)</code> - call this
to start reading elements; it is an enhanced iterator that also vends:</li>
-  <li>watermark (for this source) which you should propagate downstream
-timestamps, which you should associate with elements read</li>
+  <li><code class="highlighter-rouge">createReader(...)</code> - call this
to start reading elements; it is an enhanced iterator that also provides:</li>
+  <li>watermark (for this source) which you should propagate downstream</li>
+  <li>timestamps, which you should associate with elements read</li>
   <li>record identifiers, so you can dedup downstream if needed</li>
   <li>progress indication of its backlog</li>
   <li>checkpointing</li>
   <li><code class="highlighter-rouge">requiresDeduping</code> - this indicates
that there is some chance that the source
-may emit dupes; your runner should do its best to dedupe based on the
+may emit duplicates; your runner should do its best to dedupe based on the
 identifier attached to emitted records</li>
 </ul>
 
@@ -927,7 +927,7 @@ collection of log files, or a database table. The capabilities are:</p>
 <ul>
   <li><code class="highlighter-rouge">split(int)</code> - your runner should
call this to get desired initial parallelism (but you can often steal work later)</li>
   <li><code class="highlighter-rouge">getEstimatedSizeBytes(...)</code>
- self explanatory</li>
-  <li><code class="highlighter-rouge">createReader(...)</code> - call this
to start reading elements; it is an enhanced iterator, with also:</li>
+  <li><code class="highlighter-rouge">createReader(...)</code> - call this
to start reading elements; it is an enhanced iterator that also provides:</li>
   <li>timestamps to associate with each element read</li>
   <li><code class="highlighter-rouge">splitAtFraction</code> for dynamic
splitting to enable work stealing, and other
 methods to support it - see the <a href="/blog/2016/05/18/splitAtFraction-method.html">Beam
blog post on dynamic work
@@ -1036,7 +1036,7 @@ scan the dependencies of the SDK for tests with the JUnit category
 </code></pre>
 </div>
 
-<p>Enable these tests in other languages is unexplored.</p>
+<p>Enabling these tests in other languages is unexplored.</p>
 
 <h2 id="integrating-your-runner-nicely-with-sdks">Integrating your runner nicely with
SDKs</h2>
 
diff --git a/website/generated-content/documentation/io/built-in/index.html b/website/generated-content/documentation/io/built-in/index.html
index ad2894b..431a3fe 100644
--- a/website/generated-content/documentation/io/built-in/index.html
+++ b/website/generated-content/documentation/io/built-in/index.html
@@ -479,8 +479,6 @@ limitations under the License.
     <p><a href="https://github.com/apache/beam/blob/master/sdks/java/io/xml/src/main/java/org/apache/beam/sdk/io/xml/XmlIO.java">XmlIO</a></p>
     <p><a href="https://github.com/apache/beam/blob/master/sdks/java/io/tika/src/main/java/org/apache/beam/sdk/io/tika/TikaIO.java">TikaIO</a></p>
     <p><a href="https://github.com/apache/beam/blob/master/sdks/java/io/parquet/src/main/java/org/apache/beam/sdk/io/parquet/ParquetIO.java">ParquetIO</a></p>
-    <p><a href="https://github.com/apache/beam/blob/master/sdks/java/io/rabbitmq/src/main/java/org/apache/beam/sdk/io/rabbitmq/RabbitMqIO.java">RabbitMqIO</a></p>
-    <p><a href="https://github.com/apache/beam/blob/master/sdks/java/io/amazon-web-services/src/main/java/org/apache/beam/sdk/io/aws/sqs/SqsIO.java">SqsIO</a></p>
   </td>
   <td>
     <p><a href="https://github.com/apache/beam/tree/master/sdks/java/io/kinesis">Amazon
Kinesis</a></p>
@@ -489,6 +487,8 @@ limitations under the License.
     <p><a href="https://github.com/apache/beam/tree/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/pubsub">Google
Cloud Pub/Sub</a></p>
     <p><a href="https://github.com/apache/beam/tree/master/sdks/java/io/jms">JMS</a></p>
     <p><a href="https://github.com/apache/beam/tree/master/sdks/java/io/mqtt">MQTT</a></p>
+    <p><a href="https://github.com/apache/beam/blob/master/sdks/java/io/rabbitmq/src/main/java/org/apache/beam/sdk/io/rabbitmq/RabbitMqIO.java">RabbitMqIO</a></p>
+    <p><a href="https://github.com/apache/beam/blob/master/sdks/java/io/amazon-web-services/src/main/java/org/apache/beam/sdk/io/aws/sqs/SqsIO.java">SqsIO</a></p>
   </td>
   <td>
     <p><a href="https://github.com/apache/beam/tree/master/sdks/java/io/cassandra">Apache
Cassandra</a></p>
diff --git a/website/generated-content/documentation/patterns/file-processing-patterns/index.html
b/website/generated-content/documentation/patterns/file-processing-patterns/index.html
index c0b981c..d500717 100644
--- a/website/generated-content/documentation/patterns/file-processing-patterns/index.html
+++ b/website/generated-content/documentation/patterns/file-processing-patterns/index.html
@@ -545,7 +545,7 @@ limitations under the License.
 
 <ol class="language-java">
   <li>Create a <code class="highlighter-rouge">ReadableFile</code> instance
with <code class="highlighter-rouge">FileIO</code>. <code class="highlighter-rouge">FileIO</code>
returns a <code class="highlighter-rouge">PCollection&lt;ReadableFile&gt;</code>
object. The <code class="highlighter-rouge">ReadableFile</code> class contains
the filename.</li>
-  <li>Call the <code class="highlighter-rouge">readFullyAsUTF9String()</code>
method to read the file into memory and return the filename as a <code class="highlighter-rouge">String</code>
object. If memory is limited, you can use utility classes like <a href="https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/FileSystems.html"><code
class="highlighter-rouge">FileSystems</code></a> to work directly with the
file.</li>
+  <li>Call the <code class="highlighter-rouge">readFullyAsUTF8String()</code>
method to read the file into memory and return the filename as a <code class="highlighter-rouge">String</code>
object. If memory is limited, you can use utility classes like <a href="https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/FileSystems.html"><code
class="highlighter-rouge">FileSystems</code></a> to work directly with the
file.</li>
 </ol>
 
 <p class="language-py">To read filenames in a pipeline job:</p>
diff --git a/website/generated-content/documentation/runners/jet/index.html b/website/generated-content/documentation/runners/jet/index.html
index e95d5f3..4781f5a 100644
--- a/website/generated-content/documentation/runners/jet/index.html
+++ b/website/generated-content/documentation/runners/jet/index.html
@@ -230,7 +230,7 @@ limitations under the License.
 
 <h2 id="overview">Overview</h2>
 
-<p>The Hazelcast Jet Runner can be used to execute Beam pipelines using <a href="https://jet.hazelcast.org/">Hazelcat
+<p>The Hazelcast Jet Runner can be used to execute Beam pipelines using <a href="https://jet.hazelcast.org/">Hazelcast
 Jet</a>.</p>
 
 <p>The Jet Runner and Jet are suitable for large scale continuous jobs and provide:</p>
diff --git a/website/generated-content/documentation/sdks/python-dependencies/index.html b/website/generated-content/documentation/sdks/python-dependencies/index.html
index 4c52fbc..43e54ba 100644
--- a/website/generated-content/documentation/sdks/python-dependencies/index.html
+++ b/website/generated-content/documentation/sdks/python-dependencies/index.html
@@ -290,6 +290,45 @@ the listed versions that will be in scope during execution.</p>
 <p>To see the compile and runtime dependencies for your Beam SDK version, expand
 the relevant section below.</p>
 
+<details><summary><b>2.14.0</b></summary>
+
+<p>Beam SDK for Python 2.14.0 has the following compile and
+  runtime dependencies.</p>
+<table class="table-bordered table-striped">
+  <tr><th>Package</th><th>Version</th></tr>
+  <tr><td>avro-python3</td><td>&gt;=1.8.1,&lt;2.0.0; python_version
&gt;= "3.0"</td></tr>
+  <tr><td>avro</td><td>&gt;=1.8.1,&lt;2.0.0; python_version
&lt; "3.0"</td></tr>
+  <tr><td>cachetools</td><td>&gt;=3.1.0,&lt;4</td></tr>
+  <tr><td>crcmod</td><td>&gt;=1.7,&lt;2.0</td></tr>
+  <tr><td>dill</td><td>&gt;=0.2.9,&lt;0.2.10</td></tr>
+  <tr><td>fastavro</td><td>&gt;=0.21.4,&lt;0.22</td></tr>
+  <tr><td>future</td><td>&gt;=0.16.0,&lt;1.0.0</td></tr>
+  <tr><td>futures</td><td>&gt;=3.2.0,&lt;4.0.0; python_version
&lt; "3.0"</td></tr>
+  <tr><td>google-apitools</td><td>&gt;=0.5.28,&lt;0.5.29</td></tr>
+  <tr><td>google-cloud-bigquery</td><td>&gt;=1.6.0,&lt;1.7.0</td></tr>
+  <tr><td>google-cloud-bigtable</td><td>&gt;=0.31.1,&lt;0.33.0</td></tr>
+  <tr><td>google-cloud-core</td><td>&gt;=0.28.1,&lt;0.30.0</td></tr>
+  <tr><td>google-cloud-datastore</td><td>&gt;=1.7.1,&lt;1.8.0</td></tr>
+  <tr><td>google-cloud-pubsub</td><td>&gt;=0.39.0,&lt;0.40.0</td></tr>
+  <tr><td>googledatastore</td><td>&gt;=7.0.1,&lt;7.1; python_version
&lt; "3.0"</td></tr>
+  <tr><td>grpcio</td><td>&gt;=1.8,&lt;2</td></tr>
+  <tr><td>hdfs</td><td>&gt;=2.1.0,&lt;3.0.0</td></tr>
+  <tr><td>httplib2</td><td>&gt;=0.8,&lt;=0.12.0</td></tr>
+  <tr><td>mock</td><td>&gt;=1.0.1,&lt;3.0.0</td></tr>
+  <tr><td>oauth2client</td><td>&gt;=2.0.1,&lt;4</td></tr>
+  <tr><td>proto-google-cloud-datastore-v1</td><td>&gt;=0.90.0,&lt;=0.90.4;
python_version &lt; "3.0"</td></tr>
+  <tr><td>protobuf</td><td>&gt;=3.5.0.post1,&lt;4</td></tr>
+  <tr><td>pyarrow</td><td>&gt;=0.11.1,&lt;0.15.0; python_version
&gt;= "3.0" or platform_system != "Windows"</td></tr>
+  <tr><td>pydot</td><td>&gt;=1.2.0,&lt;1.3</td></tr>
+  <tr><td>pymongo</td><td>&gt;=3.8.0,&lt;4.0.0</td></tr>
+  <tr><td>pytz</td><td>&gt;=2018.3</td></tr>
+  <tr><td>pyvcf</td><td>&gt;=0.6.8,&lt;0.7.0; python_version
&lt; "3.0"</td></tr>
+  <tr><td>pyyaml</td><td>&gt;=3.12,&lt;4.0.0</td></tr>
+  <tr><td>typing</td><td>&gt;=3.6.0,&lt;3.7.0; python_version
&lt; "3.5.0"</td></tr>
+</table>
+
+</details>
+
 <details><summary><b>2.13.0</b></summary>
 
 <p>Beam SDK for Python 2.13.0 has the following compile and
diff --git a/website/generated-content/documentation/transforms/python/other/reshuffle/index.html
b/website/generated-content/documentation/transforms/python/other/reshuffle/index.html
index d992f9a..8059d8b 100644
--- a/website/generated-content/documentation/transforms/python/other/reshuffle/index.html
+++ b/website/generated-content/documentation/transforms/python/other/reshuffle/index.html
@@ -470,7 +470,7 @@ limitations under the License.
  Adds a temporary random key to each element in a collection, reshuffles
  these keys, and removes the temporary key. This redistributes the
  elements between workers and returns a collection equivalent to its
- input collection.  This is most useful for adjusting paralellism or
+ input collection.  This is most useful for adjusting parallelism or
  preventing coupled failures.</p>
 
 <h2 id="examples">Examples</h2>
diff --git a/website/generated-content/documentation/transforms/python/overview/index.html
b/website/generated-content/documentation/transforms/python/overview/index.html
index 01403e9..30f3430 100644
--- a/website/generated-content/documentation/transforms/python/overview/index.html
+++ b/website/generated-content/documentation/transforms/python/overview/index.html
@@ -517,7 +517,7 @@ limitations under the License.
 </td></tr>
   <tr><td>PAssert</td><td>Not available.</td></tr>
   <tr><td><a href="/documentation/transforms/python/other/reshuffle">Reshuffle</a></td><td>Given
an input collection, redistributes the elements between workers. This is
-  most useful for adjusting paralellism or preventing coupled failures.</td></tr>
+  most useful for adjusting parallelism or preventing coupled failures.</td></tr>
   <tr><td>View</td><td>Not available.</td></tr>
   <tr><td><a href="/documentation/transforms/python/other/windowinto">WindowInto</a></td><td>Logically
divides up or groups the elements of a collection into finite
   windows according to a function.</td></tr>

