samza-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Navina Ramesh" <nram...@linkedin.com>
Subject Re: Review Request 39806: SAMZA-798 : Performance and stability issue after combining checkpoint and coordinator stream
Date Mon, 02 Nov 2015 05:46:19 GMT


> On Nov. 2, 2015, 4:44 a.m., Yi Pan (Data Infrastructure) wrote:
> > Overall looks good to me. Thanks a lot! Just one comment: we need to document how
to help internal customers who already took 200.10.0.30/31 to migration to the official release
version.

Yeah. Sure. I think we need to modify the coordinator stream writer to allow writing task-to-changelog
mapping. Let me look into it. Thanks for the reminder! 
Just curious: Do we have more than 1 prod users who are using the 200.10.0.30/31 versions?


> On Nov. 2, 2015, 4:44 a.m., Yi Pan (Data Infrastructure) wrote:
> > docs/learn/documentation/versioned/jobs/configuration-table.html, line 1480
> > <https://reviews.apache.org/r/39806/diff/3/?file=1113567#file1113567line1480>
> >
> >     What are other checkpoint factories that we support? I thought that we only
migrate Kafka checkpoint factory?

In the previous versions, we support FileSystemCheckpointManager . By "other", I also meant
any custom implementation a user might have. For example, I am sure the folks at netflix have
their own checkpoint management system.


> On Nov. 2, 2015, 4:44 a.m., Yi Pan (Data Infrastructure) wrote:
> > samza-core/src/main/scala/org/apache/samza/checkpoint/CheckpointTool.scala, line
147
> > <https://reviews.apache.org/r/39806/diff/3/?file=1113575#file1113575line147>
> >
> >     Remove the "TODO" comments. We can open JIRA tickets to track the refactoring
effort.

Ok. This was already there when I copied out the code. I will remove it. It is not necessary
in the code.


> On Nov. 2, 2015, 4:44 a.m., Yi Pan (Data Infrastructure) wrote:
> > samza-core/src/test/scala/org/apache/samza/coordinator/TestJobCoordinator.scala,
line 176
> > <https://reviews.apache.org/r/39806/diff/3/?file=1113588#file1113588line176>
> >
> >     Suggestion: remove "TODO" comments.

Yep. This is not needed, either.


> On Nov. 2, 2015, 4:44 a.m., Yi Pan (Data Infrastructure) wrote:
> > samza-kafka/src/main/scala/org/apache/samza/config/KafkaConfig.scala, line 39
> > <https://reviews.apache.org/r/39806/diff/3/?file=1113594#file1113594line39>
> >
> >     Not related to the JIRA but just noticed this. Why are these job ConfigRewriter
related config in KafkaConfig? We should probably consider moving it later.

Hmm.. Good question. Let's re-visit this after the release.


- Navina


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/39806/#review104673
-----------------------------------------------------------


On Nov. 2, 2015, 2:24 a.m., Navina Ramesh wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/39806/
> -----------------------------------------------------------
> 
> (Updated Nov. 2, 2015, 2:24 a.m.)
> 
> 
> Review request for samza, Chris Riccomini, Jake Maes, Jagadish Venkatraman, and Yi Pan
(Data Infrastructure).
> 
> 
> Bugs: SAMZA-798
>     https://issues.apache.org/jira/browse/SAMZA-798
> 
> 
> Repository: samza
> 
> 
> Description
> -------
> 
> Adding interfaces for CheckpointManager, CheckpointManagerFactory and moving Checkpoint
to api
> 
> 
> Adding KafkaCheckpointLogKey, KafkaCheckpointManager and KafkaCheckpointManagerFactory
back from 0.9.1
> 
> 
> Changed SamzaContainer and OffsetManager
> 
> 
> Removed checkpointmanager in JC and modified TaskModel to remove offsetMapping. Container
will continue to use offsetmanager for fetching offsets
> 
> 
> Fixed OffsetManager bugs
> 
> 
> Got rid of all compile errors during build with -x test
> 
> 
> Fixing Jackson object mapper for TaskModel
> 
> 
> Commented tests in checkpoint manager and fixed other failing tests
> 
> 
> Refactored KCM and moved generic functions like createTopic & validateTopic to kafkaUtil.scala
> 
> 
> KCM unit tests work
> 
> 
> Got rid of old migration code and its test. Got rid of redundant KCM
> 
> 
> Commented out migration related tests in jobrunner
> 
> 
> Moved migration code from old.checkpoint package
> 
> 
> Fixed 1 migration test
> 
> 
> Fixed checkpoint migration and its unit tests
> 
> 
> Removed migration related tests from TestKafkaCheckpointManager
> 
> 
> Removed some commented lines and fixed a test in TestJobCoordinator
> 
> 
> Deleted CheckpointManager and SetCheckpoint
> 
> 
> Diffs
> -----
> 
>   docs/learn/documentation/versioned/jobs/configuration-table.html 4adac09305cbdb07b0d2cd9f87777b189df1c290

>   samza-api/src/main/java/org/apache/samza/checkpoint/CheckpointManager.java PRE-CREATION

>   samza-api/src/main/java/org/apache/samza/checkpoint/CheckpointManagerFactory.java PRE-CREATION

>   samza-core/src/main/java/org/apache/samza/checkpoint/Checkpoint.java  
>   samza-core/src/main/java/org/apache/samza/checkpoint/CheckpointManager.java 0185751c28979e50b1bddc28c90339defd94200b

>   samza-core/src/main/java/org/apache/samza/coordinator/stream/messages/SetCheckpoint.java
21afa8569801150e81b4c14ee21a9077dfa1895f 
>   samza-core/src/main/java/org/apache/samza/job/model/TaskModel.java e00c49d5255c0af6d44e251aed4e8360cd3026c5

>   samza-core/src/main/java/org/apache/samza/serializers/model/JsonTaskModelMixIn.java
172358a5428c9789e0883fc0e5ad3e5c3398478a 
>   samza-core/src/main/scala/org/apache/samza/checkpoint/CheckpointTool.scala 2e3aeb8fd5a86aa39464adff9e75aca96622ebad

>   samza-core/src/main/scala/org/apache/samza/checkpoint/OffsetManager.scala 1464acc7ec6592a21c3cdf96f34847e094e9e5e3

>   samza-core/src/main/scala/org/apache/samza/checkpoint/file/FileSystemCheckpointManager.scala
PRE-CREATION 
>   samza-core/src/main/scala/org/apache/samza/container/SamzaContainer.scala 0b73403018b895879ed2c0538a5cd495813d2eae

>   samza-core/src/main/scala/org/apache/samza/coordinator/JobCoordinator.scala 03299cb7cb93d43165a74206113497462d8119e9

>   samza-core/src/main/scala/org/apache/samza/migration/JobRunnerMigration.scala 374e27e8233a27132019d429f6fa1f131db3fe15

>   samza-core/src/test/java/org/apache/samza/coordinator/stream/MockCoordinatorStreamWrappedConsumer.java
dd04d28e54e7afe0cc6d6c2aa508911a14e668bf 
>   samza-core/src/test/java/org/apache/samza/serializers/model/TestSamzaObjectMapper.java
ad1fbc597802078c1a1b7d8f1dbafbd5adf610ae 
>   samza-core/src/test/scala/org/apache/samza/checkpoint/TestCheckpointTool.scala 00b89773ad00b8f445bb1320121ab8af56870327

>   samza-core/src/test/scala/org/apache/samza/checkpoint/TestOffsetManager.scala c00ef91c13b96c8b1845822046343b652a33c1d5

>   samza-core/src/test/scala/org/apache/samza/checkpoint/file/TestFileSystemCheckpointManager.scala
PRE-CREATION 
>   samza-core/src/test/scala/org/apache/samza/container/TestSamzaContainer.scala a77ddc7640a8dbbdee391e65a5b432c477b0b67b

>   samza-core/src/test/scala/org/apache/samza/container/grouper/task/TestGroupByContainerCount.scala
ddf1fdef9265b4dbd0e24abe2bff63a3e1244733 
>   samza-core/src/test/scala/org/apache/samza/coordinator/TestJobCoordinator.scala 1393da84f145c81efd59baabc8a7d3d2132aa05f

>   samza-core/src/test/scala/org/apache/samza/job/local/TestProcessJob.scala a1efe6f2707dc59d2414ebcc0b38f0f95150da64

>   samza-kafka/src/main/scala/old/checkpoint/KafkaCheckpointLogKey.scala 958d07ce3e5d69b15ad74ff52f4572822e0bf09f

>   samza-kafka/src/main/scala/old/checkpoint/KafkaCheckpointManager.scala 627631aa7e3d77349b9e6896fc21737855b0e946

>   samza-kafka/src/main/scala/old/checkpoint/KafkaCheckpointManagerFactory.scala 189752a13f2363c632e3781c0e649a4aae65a9b4

>   samza-kafka/src/main/scala/old/checkpoint/KafkaCheckpointMigration.scala 32afe4c6832df4de0f54007d3e4ee0ce9be856f7

>   samza-kafka/src/main/scala/org/apache/samza/config/KafkaConfig.scala 798033c300a8e816589233a3dc7639ca88841b40

>   samza-kafka/src/main/scala/org/apache/samza/migration/KafkaCheckpointMigration.scala
PRE-CREATION 
>   samza-kafka/src/main/scala/org/apache/samza/util/KafkaUtil.scala a7a095b4d2f19be5ad6119d5bfc715bffaeb68af

>   samza-kafka/src/main/scala/org/apache/samza/util/KafkaUtilException.scala PRE-CREATION

>   samza-kafka/src/test/scala/old/checkpoint/TestKafkaCheckpointManager.scala 2c0304f98eb0de6c644f55d6a758a7a20ec98e0e

>   samza-kafka/src/test/scala/org/apache/samza/checkpoint/kafka/TeskKafkaCheckpointLogKey.scala
PRE-CREATION 
>   samza-kafka/src/test/scala/org/apache/samza/checkpoint/kafka/TestKafkaCheckpointManager.scala
PRE-CREATION 
>   samza-kafka/src/test/scala/org/apache/samza/migration/TestKafkaCheckpointMigration.scala
PRE-CREATION 
>   samza-yarn/src/test/java/org/apache/samza/job/yarn/TestContainerAllocator.java b20e3516190aa65c4393fe9a50d6c8b7e7eb7f0b

>   samza-yarn/src/test/java/org/apache/samza/job/yarn/TestHostAwareContainerAllocator.java
08e53aaf3aaebccf80e79313c3f38fec38359e81 
>   samza-yarn/src/test/java/org/apache/samza/job/yarn/TestSamzaTaskManager.java b12ae5c1eaaee8e94d6e62a925a98d2c952fdb72

>   samza-yarn/src/test/scala/org/apache/samza/job/yarn/TestSamzaAppMasterLifecycle.scala
ec5a8533c7a31b9790504e18e0528be28c77d496 
> 
> Diff: https://reviews.apache.org/r/39806/diff/
> 
> 
> Testing
> -------
> 
> ./gradlew clean build
> 
> Tests:
> 1. First time job deployment performs migration - DONE
> 2. Second time job deployment only performs migration check and doesn't actually migrate
anything - DONE
> 3. Checkpoint Tool works as expected - DONE
> 4. Broadcast Stream works as expected - DONE
> 5. FileSystemCheckpointManager works as expected - DONE
> 
> 
> Thanks,
> 
> Navina Ramesh
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message