beam-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From al...@apache.org
Subject [beam] branch master updated: [BEAM-7475] update wordcount example (#8803)
Date Tue, 25 Jun 2019 18:17:56 GMT
This is an automated email from the ASF dual-hosted git repository.

altay pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git


The following commit(s) were added to refs/heads/master by this push:
     new 8745d6e  [BEAM-7475] update wordcount example (#8803)
8745d6e is described below

commit 8745d6eb943b1d5c9d28e2c4c7b917b842122ffd
Author: Rakesh Kumar <rakeshkumar@lyft.com>
AuthorDate: Tue Jun 25 11:17:43 2019 -0700

    [BEAM-7475] update wordcount example (#8803)
    
    * BEAM-7475: Updated wordcount example with code snippet
---
 website/src/get-started/wordcount-example.md | 29 ++++++++++++++++++++++------
 1 file changed, 23 insertions(+), 6 deletions(-)

diff --git a/website/src/get-started/wordcount-example.md b/website/src/get-started/wordcount-example.md
index 910a626..fba69b3 100644
--- a/website/src/get-started/wordcount-example.md
+++ b/website/src/get-started/wordcount-example.md
@@ -1235,7 +1235,15 @@ public static void main(String[] args) throws IOException {
 ```
 
 ```py
-# This feature is not yet available in the Beam SDK for Python.
+def main(arvg=None):
+  parser = argparse.ArgumentParser()
+  parser.add_argument('--input-file',
+                      dest='input_file',
+                      default='/Users/home/words-example.txt')
+  known_args, pipeline_args = parser.parse_known_args(argv)
+  pipeline_options = PipelineOptions(pipeline_args)
+  p = beam.Pipeline(options=pipeline_options)
+  lines  = p | 'read' >> ReadFromText(known_args.input_file)
 ```
 
 ```go
@@ -1267,7 +1275,7 @@ each element in the `PCollection`.
 ```
 
 ```py
-# This feature is not yet available in the Beam SDK for Python.
+beam.Map(AddTimestampFn(timestamp_seconds))
 ```
 
 ```go
@@ -1308,7 +1316,16 @@ static class AddTimestampFn extends DoFn<String, String> {
 ```
 
 ```py
-# This feature is not yet available in the Beam SDK for Python.
+class AddTimestampFn(beam.DoFn):
+  
+  def __init__(self, min_timestamp, max_timestamp):
+     self.min_timestamp = min_timestamp
+     self.max_timestamp = max_timestamp
+
+  def process(self, element):
+    return window.TimestampedValue(
+       element,
+       random.randint(self.min_timestamp, self.max_timestamp))
 ```
 
 ```go
@@ -1344,7 +1361,7 @@ PCollection<String> windowedWords = input
 ```
 
 ```py
-# This feature is not yet available in the Beam SDK for Python.
+windowed_words = input | beam.WindowInto(window.FixedWindows(60 * window_size_minutes))
 ```
 
 ```go
@@ -1362,7 +1379,7 @@ PCollection<KV<String, Long>> wordCounts = windowedWords.apply(new
WordCount.Cou
 ```
 
 ```py
-# This feature is not yet available in the Beam SDK for Python.
+word_counts = windowed_words | CountWords()
 ```
 
 ```go
@@ -1499,4 +1516,4 @@ using [`beam.io.WriteStringsToPubSub`](https://beam.apache.org/releases/pydoc/{{
 * Dive in to some of our favorite [Videos and Podcasts]({{ site.baseurl }}/documentation/resources/videos-and-podcasts).
 * Join the Beam [users@]({{ site.baseurl }}/community/contact-us) mailing list.
 
-Please don't hesitate to [reach out]({{ site.baseurl }}/community/contact-us) if you encounter
any issues!
\ No newline at end of file
+Please don't hesitate to [reach out]({{ site.baseurl }}/community/contact-us) if you encounter
any issues!


Mime
View raw message