beam-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Amit Sela (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (BEAM-744) UnboundedKafkaReader should return as soon as it can.
Date Tue, 18 Oct 2016 17:58:58 GMT

     [ https://issues.apache.org/jira/browse/BEAM-744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Amit Sela updated BEAM-744:
---------------------------
    Description: 
KafkaIO has two "wait" properties:

{{START_NEW_RECORDS_POLL_TIMEOUT}} - wait for new records inside {{start()}}, default: 5 seconds.
{{NEW_RECORDS_POLL_TIMEOUT}} - wait for new records inside {{start()}}, default: 10 msec.

Instead, they should return as soon as they can, [~rangadi] mentioned here: https://github.com/apache/incubator-beam/pull/1071
that they were both originally set to 10 milliseconds, so {{START_NEW_RECORDS_POLL_TIMEOUT}}
should probably be re-set to 10 milliseconds.

  was:
KafkaIO has two "wait" properties:

{{START_NEW_RECORDS_POLL_TIMEOUT}} - wait for new records inside {{start()}}, default: 5 seconds.
{{NEW_RECORDS_POLL_TIMEOUT}} - wait for new records inside {{start()}}, default: 10 msec.

[~rangadi] mentioned some of these were set to due to limitations of the DirectRunner, and
I can add that they are now limiting the Spark runner (which reads in defined time frames,
which may be smaller then the wait time and so never actually read).

This feels like defaults should be set for optimal read from Kafka, while a runner may override
those if it needs to.

[~rangadi] also mentioned that this could be set in {{PipelineOptions}} which may be passed
when creating the reader. 


> UnboundedKafkaReader should return as soon as it can.
> -----------------------------------------------------
>
>                 Key: BEAM-744
>                 URL: https://issues.apache.org/jira/browse/BEAM-744
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-java-extensions
>            Reporter: Amit Sela
>
> KafkaIO has two "wait" properties:
> {{START_NEW_RECORDS_POLL_TIMEOUT}} - wait for new records inside {{start()}}, default:
5 seconds.
> {{NEW_RECORDS_POLL_TIMEOUT}} - wait for new records inside {{start()}}, default: 10 msec.
> Instead, they should return as soon as they can, [~rangadi] mentioned here: https://github.com/apache/incubator-beam/pull/1071
that they were both originally set to 10 milliseconds, so {{START_NEW_RECORDS_POLL_TIMEOUT}}
should probably be re-set to 10 milliseconds.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message