beam-commits mailing list archives

Subject incubator-beam git commit: Add a first file (at least to trigger the github mirroring)
Date Mon, 08 Feb 2016 07:44:05 GMT
Repository: incubator-beam
Updated Branches:
  refs/heads/master [created] 11e842717

Add a first file (at least to trigger the github mirroring)


Branch: refs/heads/master
Commit: 11e842717f70298a4ea8436363b3101117685f60
Author: Jean-Baptiste Onofré <>
Authored: Mon Feb 8 08:43:44 2016 +0100
Committer: Jean-Baptiste Onofré <>
Committed: Mon Feb 8 08:43:44 2016 +0100

---------------------------------------------------------------------- | 68 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 68 insertions(+)
diff --git a/ b/
new file mode 100644
index 0000000..f353381
--- /dev/null
+++ b/
@@ -0,0 +1,68 @@
+# Apache Beam
+[Apache Beam]( provides a simple, powerful
+programming model for building both batch and streaming parallel data processing
+pipelines. It also covers data integration processes.
+[General usage]( is
+a good starting point for Apache Beam.
+You can also take a look at the [Beam Examples](
+## Status [Build Status](
+## Overview
+The key concepts in this programming model are:
+* `PCollection`: represents a collection of data, which could be bounded or unbounded in size.
+* `PTransform`: represents a computation that transforms input PCollections into output PCollections.
+* `Pipeline`: manages a directed acyclic graph of PTransforms and PCollections that is ready for execution.
+* `PipelineRunner`: specifies where and how the pipeline should execute.
+We provide the following PipelineRunners:
+  1. The `DirectPipelineRunner` runs the pipeline on your local machine.
+  2. The `BlockingDataflowPipelineRunner` submits the pipeline to the Dataflow Service via the `DataflowPipelineRunner`
+and then prints messages about the job status until the execution is complete.
+  3. The `SparkPipelineRunner` runs the pipeline on an Apache Spark cluster.
+  4. The `FlinkPipelineRunner` runs the pipeline on an Apache Flink cluster.
+## Getting Started
+The following command will build both the `sdk` and `examples` modules and
+install them in your local Maven repository:
+    mvn clean install
+You can speed up the build and install process by using the following options:
+  1. To skip execution of the unit tests, run:
+        mvn install -DskipTests
+  2. While iterating on a specific module, use the following command to compile
+  and reinstall it. For example, to reinstall the `examples` module, run:
+        mvn install -pl examples
+  Be careful, however, as this command will use the most recently installed SDK
+  from the local repository (or Maven Central) even if you have changed it
+  locally.
+After building and installing, you can execute the `WordCount` and other
+example pipelines by following the instructions in this [README](
+## Contact Us
+You can subscribe to the mailing lists to discuss and get involved in Apache Beam:
+* [Subscribe]( to the [](
+* [Subscribe]( to the [](
+You can report issues on [Jira](
+## More Information
+* [Apache Beam](
+* [Apache Beam Documentation](
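
As a rough illustration of the key concepts listed in the README's Overview, here is a toy, single-machine model in Python. It is not the Beam SDK or its API: `ToyPCollection`, `ToyTransform`, `ToyPipeline`, and `toy_direct_runner` are hypothetical names made up for this sketch. It handles only bounded, in-memory data and a linear chain of transforms rather than a general directed acyclic graph.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ToyPCollection:
    """Bounded, in-memory stand-in for a PCollection (hypothetical)."""
    elements: List


@dataclass
class ToyTransform:
    """Named computation from one collection to another (hypothetical)."""
    name: str
    fn: Callable[[List], List]


@dataclass
class ToyPipeline:
    """Records transforms without running them; a linear chain only."""
    transforms: List[ToyTransform] = field(default_factory=list)

    def apply(self, transform: ToyTransform) -> "ToyPipeline":
        self.transforms.append(transform)
        return self


def toy_direct_runner(pipeline: ToyPipeline, source: ToyPCollection) -> ToyPCollection:
    """Decides where and how to execute: here, in order, on the local machine."""
    current = source
    for transform in pipeline.transforms:
        current = ToyPCollection(transform.fn(current.elements))
    return current


def count_words(lines: List[str]) -> List[tuple]:
    """Counts whitespace-separated words and returns sorted (word, count) pairs."""
    counts = {}
    for line in lines:
        for word in line.split():
            counts[word] = counts.get(word, 0) + 1
    return sorted(counts.items())


# Build the pipeline first, then hand it to a runner for execution.
pipeline = (
    ToyPipeline()
    .apply(ToyTransform("Lowercase", lambda lines: [line.lower() for line in lines]))
    .apply(ToyTransform("CountWords", count_words))
)
result = toy_direct_runner(pipeline, ToyPCollection(["Beam beam", "batch and streaming"]))
print(result.elements)
```

Keeping pipeline construction separate from execution is what would let the same pipeline definition be handed to a different runner, which is the role the README assigns to `PipelineRunner`.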
