beam-commits mailing list archives

From "ASF GitHub Bot (JIRA)" <>
Subject [jira] [Commented] (BEAM-165) Add Hadoop MapReduce runner
Date Wed, 09 Aug 2017 02:10:02 GMT


ASF GitHub Bot commented on BEAM-165:

GitHub user peihe opened a pull request:

    [BEAM-165] Initial implementation of the MapReduce runner.

    Follow this checklist to help us incorporate your contribution quickly and easily:
     - [ ] Make sure there is a [JIRA issue](
filed for the change (usually before you start working on it).  Trivial changes like typos
do not require a JIRA issue.  Your pull request should address just this issue, without pulling
in other changes.
     - [ ] Each commit in the pull request should have a meaningful subject line and body.
     - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`,
where you replace `BEAM-XXX` with the appropriate JIRA issue.
     - [ ] Write a pull request description that is detailed enough to understand what the
pull request does, how, and why.
     - [ ] Run `mvn clean verify` to make sure basic checks pass. A more thorough check will
be performed on your pull request automatically.
     - [ ] If this contribution is large, please file an Apache [Individual Contributor License

You can merge this pull request into a Git repository by running:

    $ git pull mr-runner

Alternatively you can review and apply these changes as the patch at:

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3705

commit 9fffd554f1e5fd6465989bb3568dfb6f2d854eeb
Author: Pei He <>
Date:   2017-07-06T02:22:27Z

    Initial commit for MapReduceRunner.

commit 3bacc3e6099718bbcb672ab738ad607204fa8487
Author: Pei He <>
Date:   2017-07-11T02:45:11Z

    MapReduceRunner: add Graph and its visitors.

commit b62238545c1ba95e9857710d91609431cd0a2f93
Author: Pei He <>
Date:   2017-07-13T06:09:10Z

    MapReduceRunner: add unit tests for GraphConverter and GraphPlanner.

commit 64548dc949d0251949efdd02df68eed6032a64f4
Author: Pei He <>
Date:   2017-07-21T05:46:36Z

    mr-runner: support BoundedSource with BeamInputFormat.

commit 3070fded4bc0dde8f08b63e53f94342d21d4bc53
Author: Pei He <>
Date:   2017-07-24T12:15:37Z

    mr-runner: add JobPrototype and translate it to a MR job.

commit 0e16c52463278c6c4f9db61253c6b8287c4718ff
Author: Pei He <>
Date:   2017-07-25T13:44:34Z

    mr-runner: add ParDoOperation and support ParDos chaining.

commit 72a50aa508726e34110475448e9bb52381711faf
Author: Pei He <>
Date:   2017-07-26T13:19:30Z

    mr-runner: add BeamReducer and support GroupByKey.

commit 1b449b0981ae2bb2e1b397113b48eec1df53a4b1
Author: Pei He <>
Date:   2017-07-27T07:01:22Z

    core-java: InMemoryTimerInternals expose getTimers() for timer firings in mr-runner.

commit 6d152a623550446b06bde91ad0c54df1f7e5c60b
Author: Pei He <>
Date:   2017-07-27T02:52:32Z

    mr-runner: support reduce side ParDos and WordCount.

commit 1ef0dec520ee301328007f99419c25b7a7b5b46f
Author: Pei He <>
Date:   2017-07-27T07:05:06Z

    mr-runner: add JarClassInstanceFactory to run ValidatesRunner tests.

commit 02c77375cc114a210f99079cf3efec3d2426941e
Author: Pei He <>
Date:   2017-07-28T08:31:41Z

    mr-runner: refactors and creates Graph data structures to handle general Beam pipelines.

commit bb3349e10c0cfacd81b610880ddfec030fedf34d
Author: Pei He <>
Date:   2017-08-02T11:19:14Z

    mr-runner: support graph visualization with dotfiles.

commit 0fd2f15847e1f9bdd42f4388f6de6e566f9b64ef
Author: Pei He <>
Date:   2017-08-02T13:59:21Z

    mr-runner: hack to get around that ViewAsXXX.expand() return wrong output PValue.

commit 5079322c2e2a092a85b9740d04a7ca9bd887460e
Author: Pei He <>
Date:   2017-08-08T03:30:29Z

    mr-runner: support PCollections materialization with multiple MR jobs.

commit ad4cd2d5ea2af795bba86319d6447e7f8c415bf2
Author: Pei He <>
Date:   2017-08-08T07:49:04Z

    mr-runner: support multiple SourceOperations by composing and partitioning.

commit de2859e1092bfc3fdd036c3becf9e79fbb8fc8fa
Author: Pei He <>
Date:   2017-08-08T09:38:58Z

    mr-runner: support side inputs by reading in all views contents.

commit 69ee0f92bf170f0628d788d5dabeb339e7f1ad0c
Author: Pei He <>
Date:   2017-08-08T14:07:12Z

    mr-runner: setup file paths for read and write sides of materialization.


> Add Hadoop MapReduce runner
> ---------------------------
>                 Key: BEAM-165
>                 URL:
>             Project: Beam
>          Issue Type: Wish
>          Components: runner-ideas
>            Reporter: Jean-Baptiste Onofré
>            Assignee: Jean-Baptiste Onofré
> I think a MapReduce runner could be a good addition to Beam. It would allow users to smoothly "migrate" from MapReduce to Spark or Flink.
> Of course, the MapReduce runner will run in batch mode (not stream).

This message was sent by Atlassian JIRA