storm-user mailing list archives

From Geoffrey Holmes <ghol...@pinsightmedia.com>
Subject Re: HDFS state and commits and only-once semantics
Date Thu, 03 Aug 2017 20:36:35 GMT
Thanks. That helps answer my question. What if a batch fails? Could records in that batch get
written to disk by one HDFS state but not another?

From: Bobby Evans <evans@yahoo-inc.com>
Reply-To: "user@storm.apache.org" <user@storm.apache.org>
Date: Thursday, August 3, 2017 at 1:27 PM
To: "user@storm.apache.org" <user@storm.apache.org>
Subject: Re: HDFS state and commits and only-once semantics

Writing to a state in Storm is not atomic.  Storm guarantees that once the batch completes,
the data has been written out to all of the states that expect to receive it.  The HDFS state
guarantees that the data will have been flushed out to the data nodes when a batch completes,
and if the topology keeps running the files will eventually be rotated and made available
for others to process, but there is no guarantee that the files will rotate at the same time
or anything like that.


- Bobby
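
The point about non-atomic writes can be illustrated with a toy model. This is only a sketch and does not use Storm's API: ToyState, BatchReplaySketch, and runBatch are hypothetical names invented for illustration. It assumes that a failed batch is replayed with the same transaction id and that each state can recognize and skip a batch it has already applied.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: hypothetical classes, not Storm's HdfsState API.
class ToyState {
    final String name;
    final List<String> written = new ArrayList<>();
    long lastCommittedTxid = -1;

    ToyState(String name) {
        this.name = name;
    }

    // Apply a batch. A replayed batch (txid already seen) is skipped,
    // so the records are not written a second time.
    void commit(long txid, List<String> batch) {
        if (txid <= lastCommittedTxid) {
            return;
        }
        written.addAll(batch);
        lastCommittedTxid = txid;
    }
}

class BatchReplaySketch {
    // Write one batch to both states, optionally failing between the two writes.
    // Returns true only if the batch completed for both states.
    static boolean runBatch(long txid, List<String> batch,
                            ToyState a, ToyState b, boolean failBetween) {
        a.commit(txid, batch);
        if (failBetween) {
            return false; // batch failed: a has the records, b does not
        }
        b.commit(txid, batch);
        return true;
    }

    public static void main(String[] args) {
        ToyState a = new ToyState("A");
        ToyState b = new ToyState("B");
        List<String> batch = Arrays.asList("r1", "r2");

        // First attempt fails part-way: the two states disagree.
        boolean ok = runBatch(1, batch, a, b, true);
        System.out.println("attempt ok=" + ok + " A=" + a.written + " B=" + b.written);

        // Replay with the same txid: A skips it, B catches up.
        ok = runBatch(1, batch, a, b, false);
        System.out.println("replay  ok=" + ok + " A=" + a.written + " B=" + b.written);
    }
}
```

In this model the two states disagree after the failed attempt, and they agree again only once the replayed batch completes. That matches the guarantee described above: it holds at batch completion, not at every intermediate moment.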



On Thursday, August 3, 2017, 1:17:39 PM CDT, Geoffrey Holmes <gholmes@pinsightmedia.com>
wrote:



I read STORM-837 (https://issues.apache.org/jira/browse/STORM-837) and have a question. How
does this work if I have more than one HDFS state in my Trident topology? Can I ensure that
a record ends up written to both HDFS states or to neither, but never to just one of them?



