flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-2740) Create data consumer for Apache NiFi
Date Thu, 01 Oct 2015 15:51:26 GMT

    [ https://issues.apache.org/jira/browse/FLINK-2740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14939998#comment-14939998 ]

ASF GitHub Bot commented on FLINK-2740:

Github user bbende commented on the pull request:

    I think the example data flow in NiFi is producing data fairly slowly, because the
scheduling tab of the GenerateFlowFile processor is set to 2 secs. To make it produce data
as fast as possible, stop the processor, right-click it, select Configure, and set this value to 0 sec.

    The SiteToSiteClientConfig has an option to set the batch count, but the example isn't setting
it, so it uses some default. The first instance of the source probably runs and pulls
all the available data, and when the second instance runs nothing is left.
    If you set the batch count lower in the example, like this:
        SiteToSiteClientConfig clientConfig = new SiteToSiteClient.Builder()
                .portName("Data for Flink")
                .requestBatchCount(5)
                .buildConfig();
    and then have more than 5 flow files queued in NiFi when you start the source topology,
I think the other instances will start pulling as well. This seemed to work for me using a parallelism
of 2, but of course let me know if you are seeing something different.
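
    The reasoning above can be illustrated with a toy model. This is only a sketch and does
not use the NiFi or Flink APIs: the class BatchPullSketch and the method simulate are
hypothetical names, the source instances are assumed to take turns pulling, and the batch
count of 100 is an arbitrary stand-in for "some default", not NiFi's actual default value.

```java
import java.util.Arrays;

public class BatchPullSketch {

    // Toy model (assumption): source instances take turns, and on each turn an
    // instance pulls at most batchCount flow files from a shared queue.
    // Returns how many flow files each instance ended up pulling.
    static int[] simulate(int queuedFlowFiles, int batchCount, int parallelism) {
        int[] pulled = new int[parallelism];
        int remaining = queuedFlowFiles;
        int instance = 0;
        while (remaining > 0) {
            int taken = Math.min(batchCount, remaining);
            pulled[instance] += taken;
            remaining -= taken;
            instance = (instance + 1) % parallelism;
        }
        return pulled;
    }

    public static void main(String[] args) {
        // Large batch count: the first instance drains the queue in one pull.
        System.out.println(Arrays.toString(simulate(12, 100, 2))); // prints [12, 0]
        // Batch count of 5 with more than 5 flow files queued: both instances get data.
        System.out.println(Arrays.toString(simulate(12, 5, 2)));   // prints [7, 5]
    }
}
```

    In this model the second instance only receives data once the number of queued flow
files exceeds the batch count, which matches the behavior described above.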

> Create data consumer for Apache NiFi
> ------------------------------------
>                 Key: FLINK-2740
>                 URL: https://issues.apache.org/jira/browse/FLINK-2740
>             Project: Flink
>          Issue Type: New Feature
>          Components: Streaming Connectors
>            Reporter: Kostas Tzoumas
>            Assignee: Joseph Witt
> Create a connector to Apache NiFi to create Flink DataStreams from NiFi flows

This message was sent by Atlassian JIRA
