spark-user mailing list archives

From Jörn Franke <jornfra...@gmail.com>
Subject Re: Spark job workflow engine recommendations
Date Fri, 07 Aug 2015 17:00:12 GMT
Also check out Falcon in combination with Oozie.

On Fri, Aug 7, 2015 at 17:51, Hien Luu <hluu@linkedin.com.invalid> wrote:

> Looks like Oozie can satisfy most of your requirements.
>
>
>
> On Fri, Aug 7, 2015 at 8:43 AM, Vikram Kone <vikramkone@gmail.com> wrote:
>
>> Hi,
>> I'm looking for open source workflow tools/engines that allow us to
>> schedule Spark jobs on a DataStax Cassandra cluster. Since there are tonnes
>> of alternatives out there like Oozie, Azkaban, Luigi, Chronos, etc., I
>> wanted to check with people here to see what they are using today.
>>
>> Some of the requirements of the workflow engine that I'm looking for are:
>>
>> 1. First-class support for submitting Spark jobs on Cassandra, not some
>> wrapper Java code to submit tasks.
>> 2. Active open source community support and well tested at production
>> scale.
>> 3. Should be dead easy to write job dependencies using XML or a web
>> interface. E.g., job A depends on job B and job C, so run job A after B and
>> C are finished. Shouldn't need to write full-blown Java applications to
>> specify job parameters and dependencies. Should be very simple to use.
>> 4. Time-based recurrent scheduling. Run the Spark jobs at a given time
>> every hour or day or week or month.
>> 5. Job monitoring, alerting on failures, and email notifications on a daily
>> basis.
>>
>> I have looked at Ooyala's Spark Job Server, which seems to be geared
>> towards making Spark jobs run faster by sharing contexts between the jobs
>> but isn't a full-blown workflow engine per se. A combination of Spark Job
>> Server and a workflow engine would be ideal.
>>
>> Thanks for the inputs
>>
>
>
