samza-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Shanthoosh Venkataraman <santhoshvenkat1...@gmail.com>
Subject Re: Review Request 52476: Do not load task store which are older than delete tombstones.
Date Wed, 19 Oct 2016 22:05:00 GMT


> On Oct. 17, 2016, 11:08 p.m., Prateek Maheshwari wrote:
> > samza-core/src/main/scala/org/apache/samza/storage/TaskStorageManager.scala, line
156
> > <https://reviews.apache.org/r/52476/diff/3/?file=1539990#file1539990line156>
> >
> >     Unrelated, but let's make this info.

Done.


> On Oct. 17, 2016, 11:08 p.m., Prateek Maheshwari wrote:
> > samza-core/src/main/scala/org/apache/samza/config/StorageConfig.scala, line 33
> > <https://reviews.apache.org/r/52476/diff/3/?file=1539988#file1539988line33>
> >
> >     Usually more readable if you write this as a multiplication: 1 * 24 * 60 * 60
* 1000L

Done.


> On Oct. 17, 2016, 11:08 p.m., Prateek Maheshwari wrote:
> > samza-core/src/main/scala/org/apache/samza/container/SamzaContainer.scala, line
532
> > <https://reviews.apache.org/r/52476/diff/3/?file=1539989#file1539989line532>
> >
> >     Prefer passing the one config that we need explicitly instead of passing the
config object.

Moved to use changeLogDeleteRetentions map.


> On Oct. 17, 2016, 11:08 p.m., Prateek Maheshwari wrote:
> > samza-core/src/main/scala/org/apache/samza/storage/TaskStorageManager.scala, line
29
> > <https://reviews.apache.org/r/52476/diff/3/?file=1539990#file1539990line29>
> >
> >     Unrelated to RB but prefer explicit imports.

Done.


> On Oct. 17, 2016, 11:08 p.m., Prateek Maheshwari wrote:
> > samza-core/src/main/scala/org/apache/samza/storage/TaskStorageManager.scala, line
26
> > <https://reviews.apache.org/r/52476/diff/3/?file=1539990#file1539990line26>
> >
> >     Delete or import explicitly.

Done.


> On Oct. 17, 2016, 11:08 p.m., Prateek Maheshwari wrote:
> > samza-core/src/main/scala/org/apache/samza/storage/TaskStorageManager.scala, line
107
> > <https://reviews.apache.org/r/52476/diff/3/?file=1539990#file1539990line107>
> >
> >     Add method description.

Done.


> On Oct. 17, 2016, 11:08 p.m., Prateek Maheshwari wrote:
> > samza-core/src/main/scala/org/apache/samza/storage/TaskStorageManager.scala, line
114
> > <https://reviews.apache.org/r/52476/diff/3/?file=1539990#file1539990line114>
> >
> >     Another case we ran into on Friday - if the oldest offset in the changelog topic
is newer than the offset in the OFFSET file. Do you need to handle that here?
> >     
> >     Nitpick: would isStaleStore be clearer?

Discussed offline. This is a regular scenario that happens with compaction (message expiration
in general w.r.t topics). When the offset in the offset file is older than oldest offset in
changelog, it indicates that compaction has happened. To not miss messages from the topic
in the consumption, users have to consume from the oldest offset in the changelog, which is
controlled by the config parameter systems.name.consumer.auto.offset.reset.


> On Oct. 17, 2016, 11:08 p.m., Prateek Maheshwari wrote:
> > samza-core/src/main/scala/org/apache/samza/storage/TaskStorageManager.scala, line
122
> > <https://reviews.apache.org/r/52476/diff/3/?file=1539990#file1539990line122>
> >
> >     Mention somewhere in the message that this means that the store is stale.

Done.


- Shanthoosh


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52476/#review152976
-----------------------------------------------------------


On Oct. 19, 2016, 10:04 p.m., Shanthoosh Venkataraman wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52476/
> -----------------------------------------------------------
> 
> (Updated Oct. 19, 2016, 10:04 p.m.)
> 
> 
> Review request for samza.
> 
> 
> Repository: samza
> 
> 
> Description
> -------
> 
> Every local task store is backed up by a kafka changelog topic. Due to log compaction,
delete tombstones of the changelog topic have a ttl of delete.retention.ms. Replaying the
events from the changelog that has missing delete tombstones, would result in creation of
an inconsistent local store(due to the missing of some delete events). This patch deletes
the local stores in which difference between current time and last modified time of the offset
file is greater than delete.retention.ms during the container startup.
> 
> 
> Diffs
> -----
> 
>   samza-core/src/main/java/org/apache/samza/storage/StorageRecovery.java 9329edf7d724f3a0d9235354bb77936f713e3b5f

>   samza-core/src/main/scala/org/apache/samza/config/StorageConfig.scala be3f1068f7921c9c8f8697577efea6580672813e

>   samza-core/src/main/scala/org/apache/samza/container/SamzaContainer.scala 05a996c98075ea8ed3767af666b9beeb1933f2a6

>   samza-core/src/main/scala/org/apache/samza/storage/TaskStorageManager.scala 0b7bcdda1639eea8239a69c31bdf42558e9077d2

>   samza-core/src/test/scala/org/apache/samza/storage/TestTaskStorageManager.scala 4d40f520e54beb643acd8410c772b75e2f6a9162

>   samza-kafka/src/main/scala/org/apache/samza/config/KafkaConfig.scala 973ab8cfb3d248bec7efe5e338f5e667f097556d

> 
> Diff: https://reviews.apache.org/r/52476/diff/
> 
> 
> Testing
> -------
> 
> Unit testing and manual testing has been done to verify the functionality.
> 
> 
> Thanks,
> 
> Shanthoosh Venkataraman
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message