flink-issues mailing list archives

From: greghogan <...@git.apache.org>
Subject: [GitHub] flink pull request #5045: [hotfix][docs] Review of concepts docs for grammar...
Date: Wed, 22 Nov 2017 16:18:38 GMT

GitHub user greghogan commented on a diff in the pull request:

    https://github.com/apache/flink/pull/5045#discussion_r152608576
  
    --- Diff: docs/concepts/programming-model.md ---
    @@ -33,53 +33,52 @@ Flink offers different levels of abstraction to develop streaming/batch
applicat
     
     <img src="../fig/levels_of_abstraction.svg" alt="Programming levels of abstraction"
class="offset" width="80%" />
     
    -  - The lowest level abstraction simply offers **stateful streaming**. It is embedded
into the [DataStream API](../dev/datastream_api.html)
    -    via the [Process Function](../dev/stream/operators/process_function.html). It allows
users freely process events from one or more streams,
    -    and use consistent fault tolerant *state*. In addition, users can register event
time and processing time callbacks,
    +  - The lowest level abstraction offers **stateful streaming** and is embedded into the
[DataStream API](../dev/datastream_api.html)
    +    via the [Process Function](../dev/stream/operators/process_function.html). It allows
users to process events from one or more streams,
    +    and use consistent fault tolerant *state*. Users can register event time and processing
time callbacks,
         allowing programs to realize sophisticated computations.
     
    -  - In practice, most applications would not need the above described low level abstraction,
but would instead program against the
    +  - In practice, most applications would not need the low level abstraction describe
above, but would instead program against the
         **Core APIs** like the [DataStream API](../dev/datastream_api.html) (bounded/unbounded
streams) and the [DataSet API](../dev/batch/index.html)
    -    (bounded data sets). These fluent APIs offer the common building blocks for data
processing, like various forms of user-specified
    +    (bounded data sets). These fluent APIs offer the common building blocks for data
processing, like forms of user-specified
         transformations, joins, aggregations, windows, state, etc. Data types processed in
these APIs are represented as classes
    -    in the respective programming languages.
    +    in respective programming languages.
     
    -    The low level *Process Function* integrates with the *DataStream API*, making it
possible to go the lower level abstraction 
    -    for certain operations only. The *DataSet API* offers additional primitives on bounded
data sets, like loops/iterations.
    +    The low level *Process Function* integrates with the *DataStream API*, making it
possible to use the lower level abstraction
    +    for certain operations. The *DataSet API* offers additional primitives on bounded
data sets, like loops or iterations.
     
       - The **Table API** is a declarative DSL centered around *tables*, which may be dynamically
changing tables (when representing streams).
    -    The [Table API](../dev/table_api.html) follows the (extended) relational model: Tables
have a schema attached (similar to tables in relational databases)
    +    The [Table API](../dev/table_api.html) follows the (extended) relational model. Tables
have a schema attached (similar to tables in relational databases)
         and the API offers comparable operations, such as select, project, join, group-by,
aggregate, etc.
    -    Table API programs declaratively define *what logical operation should be done* rather
than specifying exactly
    -   *how the code for the operation looks*. Though the Table API is extensible by various
types of user-defined
    +    Table API programs declaratively define *what logical operation should to perform*
rather than specifying
    +   *how the code for the operation looks*. The Table API is extensible by various types
of user-defined
         functions, it is less expressive than the *Core APIs*, but more concise to use (less
code to write).
    -    In addition, Table API programs also go through an optimizer that applies optimization
rules before execution.
    +    Table API programs also go through an optimizer that applies optimization rules before
execution.
     
    -    One can seamlessly convert between tables and *DataStream*/*DataSet*, allowing programs
to mix *Table API* and with the *DataStream*
    +    You can seamlessly convert between tables and *DataStream*/*DataSet*, allowing programs
to mix *Table API* and with the *DataStream*
         and *DataSet* APIs.
     
       - The highest level abstraction offered by Flink is **SQL**. This abstraction is similar
to the *Table API* both in semantics and
         expressiveness, but represents programs as SQL query expressions.
    -    The [SQL](../dev/table_api.html#sql) abstraction closely interacts with the Table
API, and SQL queries can be executed over tables defined in the *Table API*.
    +    The [SQL](../dev/table_api.html#sql) abstraction closely interacts with the Table
API, and you can execute SQL queries over tables defined in the *Table API*.
     
     
     ## Programs and Dataflows
     
    -The basic building blocks of Flink programs are **streams** and **transformations**.
(Note that the
    -DataSets used in Flink's DataSet API are also streams internally -- more about that
    -later.) Conceptually a *stream* is a (potentially never-ending) flow of data records,
and a *transformation* is an
    +The basic building blocks of Flink programs are **streams** and **transformations**.
The
    +DataSets used in Flink's DataSet API are also streams internally, which this document
will cover later. Conceptually a *stream* is a (potentially never-ending) flow of data records,
and a *transformation* is an
    --- End diff --
    
    This added line needs a line break.


---
