flink-user mailing list archives

From Welly Tambunan <if05...@gmail.com>
Subject Re: Flink, Kappa and Lambda
Date Fri, 13 Nov 2015 08:52:57 GMT
Hi Nick,

I totally agree with your point.

My concern is about Kafka: is the author's concern really valid? Can anyone
comment on this?
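
To make the second claim concrete, here is a minimal toy sketch of how an
acknowledged message can be lost on failover. It is not Kafka's replication
protocol: the Replica and Partition classes and the wait_for_all_replicas
flag are hypothetical names invented for illustration. In Kafka itself the
corresponding knobs are, as far as I understand, the producer `acks` setting
together with `min.insync.replicas` on the topic or broker.

```python
# Toy model only: this is NOT Kafka's replication protocol, just an
# illustration of when an acknowledged message can be lost on failover.
# All class and parameter names are hypothetical.

class Replica:
    def __init__(self, name):
        self.name = name
        self.log = []


class Partition:
    def __init__(self, wait_for_all_replicas):
        self.leader = Replica("leader")
        self.followers = [Replica("follower-1"), Replica("follower-2")]
        self.wait_for_all_replicas = wait_for_all_replicas

    def produce(self, message):
        """Append to the leader and return True once the write is acknowledged."""
        self.leader.log.append(message)
        if self.wait_for_all_replicas:
            # Acknowledge only after every follower has copied the message.
            self.replicate()
        return True

    def replicate(self):
        """Copy the leader's log to every follower."""
        for follower in self.followers:
            follower.log = list(self.leader.log)

    def fail_leader(self):
        """Drop the current leader and promote the first follower."""
        self.leader = self.followers.pop(0)


def lost_after_failover(wait_for_all_replicas):
    """Return the acknowledged messages that are missing after a failover."""
    partition = Partition(wait_for_all_replicas)
    acknowledged = []
    for message in ["m1", "m2", "m3"]:
        if partition.produce(message):
            acknowledged.append(message)
    # The leader crashes before any background replication has run.
    partition.fail_leader()
    return [m for m in acknowledged if m not in partition.leader.log]


print(lost_after_failover(wait_for_all_replicas=False))  # ['m1', 'm2', 'm3']
print(lost_after_failover(wait_for_all_replicas=True))   # []
```

In this toy model the loss only happens when the acknowledgement is sent
before the followers have the data, so the question for a real broker is
which acknowledgement mode is configured and how it behaves under failure.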



On Thu, Nov 12, 2015 at 12:33 PM, Nick Dimiduk <ndimiduk@gmail.com> wrote:

> The first and 3rd points here aren't very fair -- they apply to all data
> systems. Systems downstream of your database can lose data in the same way;
> the database retention policy expires old data, downstream fails, and back
> to the tapes you must go. Likewise with 3, a bug in any ETL system can
> cause problems. Also not specific to streaming in general or Kafka/Flink
> specifically.
>
> I'm much more curious about the 2nd claim. The whole point of high
> availability in these systems is to not lose data during failure. The
> post's author is not specific on any of these points, but just like I look
> to a distributed database community to prove to me it doesn't lose data in
> these corner cases, so too do I expect Kafka to prove it is resilient. In
> the absence of software formally proven correct, I look to empirical
> evidence in the form of chaos monkey type tests.
>
>
> On Wednesday, November 11, 2015, Welly Tambunan <if05041@gmail.com> wrote:
>
>> Hi Stephan,
>>
>>
>> Thanks for your response.
>>
>>
>> We are trying to judge whether it's enough to use the Kappa Architecture
>> with Flink. This is more about resiliency, message loss, and similar issues.
>>
>> The article worries about message loss even if you are using Kafka:
>>
>> No matter the message queue or broker you rely on whether it be RabbitMQ,
>> JMS, ActiveMQ, Websphere, MSMQ and yes even Kafka you can lose messages in
>> any of the following ways:
>>
>>    - A downstream system from the broker can have data loss
>>    - All message queues today can lose already acknowledged messages
>>    during failover or leader election.
>>    - A bug can send the wrong messages to the wrong systems.
>>
>> Cheers
>>
>> On Wed, Nov 11, 2015 at 4:13 PM, Stephan Ewen <sewen@apache.org> wrote:
>>
>>> Hi!
>>>
>>> Can you explain a little more what you want to achieve? Maybe then we
>>> can give a few more comments...
>>>
>>> I briefly read through some of the articles you linked, but did not
>>> quite understand their train of thought.
>>> For example, letting Tomcat write to Cassandra directly, and to Kafka,
>>> might just be redundant. Why not let the streaming job that reads the Kafka
>>> queue move the data to Cassandra as one of its results? Furthermore, durably
>>> storing the sequence of events is exactly what Kafka does, but the article
>>> suggests using Cassandra for that, which I find very counterintuitive.
>>> It looks a bit like the suggested approach is only adopting streaming for
>>> half the task.
>>>
>>> Greetings,
>>> Stephan
>>>
>>>
>>> On Tue, Nov 10, 2015 at 7:49 AM, Welly Tambunan <if05041@gmail.com>
>>> wrote:
>>>
>>>> Hi All,
>>>>
>>>> I read a couple of articles about the Kappa and Lambda Architectures.
>>>>
>>>>
>>>> http://www.confluent.io/blog/real-time-stream-processing-the-next-step-for-apache-flink/
>>>>
>>>> I'm convinced that Flink will simplify this with streaming.
>>>>
>>>> However, I also stumbled upon this blog post, which makes a valid argument
>>>> for having a system-of-record storage (event sourcing), and in the end the
>>>> lambda architecture appears as the solution. Basically it writes twice, to
>>>> the queuing system and to C* (Cassandra), for safety. The system of record
>>>> here basically stores the events (deltas).
>>>>
>>>> [image: Inline image 1]
>>>>
>>>>
>>>> https://lostechies.com/ryansvihla/2015/09/17/event-sourcing-and-system-of-record-sane-distributed-development-in-the-modern-era-2/
>>>>
>>>> Another post uses the lambda architecture to maintain the correctness
>>>> of the system.
>>>>
>>>>
>>>> https://lostechies.com/ryansvihla/2015/09/17/real-time-analytics-with-spark-streaming-and-cassandra/
>>>>
>>>>
>>>> Given that he's using Spark as the streaming processor, do we have to
>>>> do the same thing with Apache Flink?
>>>>
>>>>
>>>>
>>>> Cheers
>>>> --
>>>> Welly Tambunan
>>>> Triplelands
>>>>
>>>> http://weltam.wordpress.com
>>>> http://www.triplelands.com <http://www.triplelands.com/blog/>
>>>>
>>>
>>>
>>
>>
>> --
>> Welly Tambunan
>> Triplelands
>>
>> http://weltam.wordpress.com
>> http://www.triplelands.com <http://www.triplelands.com/blog/>
>>
>
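
Stephan's suggestion quoted above (let the streaming job that reads the
Kafka queue write to Cassandra as one of its results) can be sketched with a
toy model. This is an assumption-laden illustration, not Flink or Kafka code:
a plain list stands in for the Kafka topic, a dict stands in for the
Cassandra table, and apply_event and run_job are hypothetical helper names.

```python
# Toy model only: a list stands in for a Kafka topic and a dict stands in
# for a Cassandra table. All names here are hypothetical.

def apply_event(table, event):
    """Fold one event (a delta) into the derived table."""
    key, delta = event
    table[key] = table.get(key, 0) + delta


def run_job(log, table, start_offset=0):
    """Consume the log from start_offset and return the next offset to read."""
    for offset in range(start_offset, len(log)):
        apply_event(table, log[offset])
    return len(log)


event_log = [("sensor-a", 5), ("sensor-b", 2), ("sensor-a", -1)]

# Normal operation: the job keeps the table up to date as events arrive.
table = {}
next_offset = run_job(event_log, table)
event_log.append(("sensor-b", 4))
next_offset = run_job(event_log, table, next_offset)

# The downstream table is lost; replaying the retained log rebuilds it.
rebuilt = {}
run_job(event_log, rebuilt)

print(table)             # {'sensor-a': 4, 'sensor-b': 6}
print(rebuilt == table)  # True
```

The replay only works while the log still retains the events, which is the
retention caveat Nick raises for any upstream system.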


-- 
Welly Tambunan
Triplelands

http://weltam.wordpress.com
http://www.triplelands.com <http://www.triplelands.com/blog/>
