nifi-users mailing list archives

From Joe Witt <joe.w...@gmail.com>
Subject Re: NiFI as Data Pipeline Orchestration Tool?
Date Fri, 11 Jan 2019 17:17:43 GMT
Jon

First things first - Sonos is awesome.

Now back to the matter at hand... NiFi is quite often used for various
forms of orchestration of other systems doing their thing.  However,
that isn't really its primary purpose, so for pure orchestration cases
it can leave you with a less than ideal user experience.

NiFi is more about managing the flow of data to and from systems and doing
the necessary
routing/splitting/forking/joining/merging/transforming/cajoling to make
that work well.  We're less about telling those other systems what to do
with the data or when to run.

Now, having said this, using NiFi for orchestration is pretty common.  We
have the Spark Livy integration, for example.  I'd recommend you give tools
that cater primarily to orchestration a first stab at this, and if you find
the problem looks more and more like what I describe, then NiFi is probably
appropriate.

Hope that helps a bit.  Talking at a terminology level is tough, as things
like ETL, orchestration, and transformation often mean wildly different
things to different people.

Thanks

On Fri, Jan 11, 2019 at 12:02 PM Jonathan Meran <jonathan.meran@sonos.com>
wrote:

> Hello,
>
> I am looking into the possibility of using NiFi as a Data Pipeline
> Orchestration Tool. I’m evaluating NiFi along with some other tools such as
> Airflow and AWS Step Functions/Lambdas.
>
>
>
> Has anyone used NiFi as an orchestration/scheduling tool for tasks such as
> submitting Spark jobs to an EMR cluster? These are some of the requirements
> we are considering while evaluating such a tool:
>
>
>
>    1. SSH capabilities to execute remote commands
>    2. Rich scheduling (CRON)
>    3. Ability to write custom routines and import custom libraries
>    4. Event-based triggering of a pipeline
>
>
>
> Any insight would be helpful. We have used NiFi for about a year now for
> data movement and are familiar with its capabilities. My biggest worry is
> the ability to coordinate with other machines using SSH.
>
>
>
> Thanks,
>
> Jon
>
