beam-commits mailing list archives

From git-site-r...@apache.org
Subject [beam] branch asf-site updated: Publishing website 2020/07/24 23:25:06 at commit b36441d
Date Fri, 24 Jul 2020 23:25:27 GMT
This is an automated email from the ASF dual-hosted git repository.

git-site-role pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/beam.git


The following commit(s) were added to refs/heads/asf-site by this push:
     new c35e319  Publishing website 2020/07/24 23:25:06 at commit b36441d
c35e319 is described below

commit c35e3190abd120c0b0adee3d91283666ff5dfb6a
Author: jenkins <users@infra.apache.org>
AuthorDate: Fri Jul 24 23:25:06 2020 +0000

    Publishing website 2020/07/24 23:25:06 at commit b36441d
---
 website/generated-content/documentation/runners/spark/index.html | 8 ++++----
 website/generated-content/sitemap.xml                            | 2 +-
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/website/generated-content/documentation/runners/spark/index.html b/website/generated-content/documentation/runners/spark/index.html
index 1ec59d1..8e16e2c 100644
--- a/website/generated-content/documentation/runners/spark/index.html
+++ b/website/generated-content/documentation/runners/spark/index.html
@@ -58,9 +58,9 @@ the portable Runner. For more information on portability, please visit the
   <span class=o>&lt;/</span><span class=n>executions</span><span
class=o>&gt;</span>
 <span class=o>&lt;/</span><span class=n>plugin</span><span
class=o>&gt;</span></code></pre></div></div><p class=language-java>After
running <code>mvn package</code>, run <code>ls target</code> and you
should see (assuming your artifactId is <code>beam-examples</code> and the version
is <code>1.0.0</code>):</p><div class=language-java><div class=highlight><pre
class=chroma><code class=language-java data-lang=java><span class=n>beam</span><span
class=o>-</span><span class=n>examples</span> [...]
 Apache Beam with Python you have to install the Apache Beam Python SDK: <code>pip install
apache_beam</code>. Please refer to the <a href=/documentation/sdks/python/>Python
documentation</a>
-on how to create a Python pipeline.</p><div class=language-py><div class=highlight><pre
class=chroma><code class=language-py data-lang=py><span class=n>pip</span>
<span class=n>install</span> <span class=n>apache_beam</span></code></pre></div></div><p
class=language-py>As of now you will need a copy of Apache Beam&rsquo;s source code.
You can
-download it on the <a href=/get-started/downloads/>Downloads page</a>. In the
future there will be pre-built Docker images
-available.</p><p class=language-py><ol><li>Start the JobService endpoint:
<code>./gradlew :runners:spark:job-server:runShadow</code></li></ol></p><p
class=language-py>The JobService is the central instance where you submit your Beam pipeline.
+on how to create a Python pipeline.</p><div class=language-py><div class=highlight><pre
class=chroma><code class=language-py data-lang=py><span class=n>pip</span>
<span class=n>install</span> <span class=n>apache_beam</span></code></pre></div></div><p
class=language-py>Starting from Beam 2.20.0, pre-built Spark Job Service Docker images
are available at
+<a href=https://hub.docker.com/r/apache/beam_spark_job_server>Docker Hub</a>.</p><p
class=language-py>For older Beam versions, you will need a copy of Apache Beam&rsquo;s
source code. You can
+download it on the <a href=/get-started/downloads/>Downloads page</a>.</p><p
class=language-py><ol><li>Start the JobService endpoint:<ul><li>with
Docker (preferred): <code>docker run --net=host apache/beam_spark_job_server:latest</code></li><li>or
from Beam source code: <code>./gradlew :runners:spark:job-server:runShadow</code></li></ul></li></ol></p><p
class=language-py>The JobService is the central instance where you submit your Beam pipeline.
 The JobService will create a Spark job for the pipeline and execute the
 job. To execute the job on a Spark cluster, the Beam JobService needs to be
 provided with the Spark master address.</p><p class=language-py><ol start=2><li>Submit
the Python pipeline to the above endpoint by using the <code>PortableRunner</code>,
<code>job_endpoint</code> set to <code>localhost:8099</code> (this
is the default address of the JobService), and <code>environment_type</code> set
to <code>LOOPBACK</code>. For example:</li></ol></p><div
class=language-py><div class=highlight><pre class=chroma><code class=language-py
data-lang=py><span class=kn>import< [...]
@@ -73,7 +73,7 @@ provided with the Spark master address.</p><p class=language-py><ol
start=2><li>
 <span class=p>])</span>
 <span class=k>with</span> <span class=n>beam</span><span class=o>.</span><span
class=n>Pipeline</span><span class=p>(</span><span class=n>options</span><span
class=p>)</span> <span class=k>as</span> <span class=n>p</span><span
class=p>:</span>
     <span class=o>...</span></code></pre></div></div><h3
id=running-on-a-pre-deployed-spark-cluster>Running on a pre-deployed Spark cluster</h3><p>Deploying
your Beam pipeline on a cluster that already has a Spark deployment (Spark classes are available
in container classpath) does not require any additional dependencies.
-For more details on the different deployment modes see: <a href=https://spark.apache.org/docs/latest/spark-standalone.html>Standalone</a>,
<a href=https://spark.apache.org/docs/latest/running-on-yarn.html>YARN</a>, or
<a href=https://spark.apache.org/docs/latest/running-on-mesos.html>Mesos</a>.</p><p
class=language-py><ol><li>Start a Spark cluster which exposes the master on
port 7077 by default.</li></ol></p><p class=language-py><ol start=2><li>Start
JobService that will connect with th [...]
+For more details on the different deployment modes see: <a href=https://spark.apache.org/docs/latest/spark-standalone.html>Standalone</a>,
<a href=https://spark.apache.org/docs/latest/running-on-yarn.html>YARN</a>, or
<a href=https://spark.apache.org/docs/latest/running-on-mesos.html>Mesos</a>.</p><p
class=language-py><ol><li>Start a Spark cluster which exposes the master on
port 7077 by default.</li></ol></p><p class=language-py><ol start=2><li>Start
JobService that will connect with th [...]
 Note however that <code>environment_type=LOOPBACK</code> is only intended for
local testing.
 See <a href=/roadmap/portability/#sdk-harness-config>here</a> for details.</li></ol></p><p
class=language-py>(Note that, depending on your cluster setup, you may need to change the
<code>environment_type</code> option.
 See <a href=/roadmap/portability/#sdk-harness-config>here</a> for details.)</p><h2
id=pipeline-options-for-the-spark-runner>Pipeline options for the Spark Runner</h2><p>When
executing your pipeline with the Spark Runner, you should consider the following pipeline
options.</p><p class=language-java><br><b>For RDD/DStream based runner:</b><br></p><table
class="language-java table table-bordered"><tr><th>Field</th><th>Description</th><th>Default
Value</th></tr><tr><td><code>runner</code></t [...]
diff --git a/website/generated-content/sitemap.xml b/website/generated-content/sitemap.xml
index 24a7642..598be8b 100644
--- a/website/generated-content/sitemap.xml
+++ b/website/generated-content/sitemap.xml
@@ -1 +1 @@
-<?xml version="1.0" encoding="utf-8" standalone="yes"?><urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:xhtml="http://www.w3.org/1999/xhtml"><url><loc>/blog/beam-2.22.0/</loc><lastmod>2020-06-08T14:13:37-07:00</lastmod></url><url><loc>/categories/blog/</loc><lastmod>2020-06-08T14:13:37-07:00</lastmod></url><url><loc>/blog/</loc><lastmod>2020-06-08T14:13:37-07:00</lastmod></url><url><loc>/categories/</loc><lastmod>2020-06-08T14:13:37-07:00</lastmod></url><url><loc>/blog/b
[...]
\ No newline at end of file
+<?xml version="1.0" encoding="utf-8" standalone="yes"?><urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:xhtml="http://www.w3.org/1999/xhtml"><url><loc>/blog/beam-2.22.0/</loc><lastmod>2020-06-08T14:13:37-07:00</lastmod></url><url><loc>/categories/blog/</loc><lastmod>2020-06-08T14:13:37-07:00</lastmod></url><url><loc>/blog/</loc><lastmod>2020-06-08T14:13:37-07:00</lastmod></url><url><loc>/categories/</loc><lastmod>2020-06-08T14:13:37-07:00</lastmod></url><url><loc>/blog/b
[...]
\ No newline at end of file
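
Note on the documentation change above: the updated Spark runner page says to start the JobService (preferably with `docker run --net=host apache/beam_spark_job_server:latest`) and then submit the Python pipeline with the `PortableRunner`, `job_endpoint` set to `localhost:8099`, and `environment_type` set to `LOOPBACK`. The page's own example is truncated in this diff, so what follows is only a minimal sketch of how those three options might be assembled. The helper name `portable_runner_args` is hypothetical and not part of Apache Beam. The commented lines assume the Beam Python SDK is installed.

```python
# Sketch only: assembles the pipeline flags named in the updated Spark runner page.
from typing import List


def portable_runner_args(job_endpoint: str = "localhost:8099",
                         environment_type: str = "LOOPBACK") -> List[str]:
    """Return command-line style flags for submitting to a Beam JobService.

    Hypothetical helper for illustration; it is not an Apache Beam API.
    The defaults mirror the values quoted in the documentation diff.
    """
    return [
        "--runner=PortableRunner",
        f"--job_endpoint={job_endpoint}",
        f"--environment_type={environment_type}",
    ]


if __name__ == "__main__":
    args = portable_runner_args()
    print(args)
    # With the Beam Python SDK installed (pip install apache_beam), the flags
    # can be passed to PipelineOptions, assuming a JobService is listening on
    # the endpoint:
    #   import apache_beam as beam
    #   from apache_beam.options.pipeline_options import PipelineOptions
    #   with beam.Pipeline(options=PipelineOptions(args)) as p:
    #       ...
```

As the diff itself notes, `environment_type=LOOPBACK` is only intended for local testing, so a cluster deployment would likely pass a different value as the second argument.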

