samza-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Hai Lu <lhai...@gmail.com>
Subject Re: Review Request 51142: SAMZA-967: HDFS System Consumer
Date Wed, 07 Sep 2016 23:48:04 GMT


> On Sept. 1, 2016, 9:21 p.m., Navina Ramesh wrote:
> > @lhaiesp: Your patch looks awesome. Happy to review again once you have addressed
the comments. It will be great if you can add some unit test for HdfsSystemConsumer. Some
of the documentation that you will have to include for this feature will be:
> > * Add newly introduced configs to configuration-table.html
> > * Add newly introduced metrics to metrics-table.html (Pending SAMZA-702 commit)
> > * Add a webpage for describing the behavior of HDFS systemconsumer (or more generically,
consuming from Bounded input sources) and how to use the HDFS consumer
> > 
> > You can choose to keep the documentation as a part of this RB or create a follow-up
JIRA for documentation and assign it to your self. Ideally, we don't want to have a lot of
gap between code and documentation. 
> > 
> > Thanks for such a thorough work!

I do have a work item for documentation. I will take a look at the existing documentations.


> On Sept. 1, 2016, 9:21 p.m., Navina Ramesh wrote:
> > samza-hdfs/src/main/java/org/apache/samza/system/hdfs/HdfsSystemConsumer.java, line
149
> > <https://reviews.apache.org/r/51142/diff/4/?file=1487650#file1487650line149>
> >
> >     Isn't numTotalEventsCounter a sum of all counters in numEventsCounter in the
map ? Do we want to maintain a running sum?

It is. I think I was thinking about adding a config to enable/disable per partition metrics
eventually. In that case a total metrics would be necessary.


> On Sept. 1, 2016, 9:21 p.m., Navina Ramesh wrote:
> > samza-hdfs/src/main/java/org/apache/samza/system/hdfs/partitioner/DirectoryPartitioner.java,
line 171
> > <https://reviews.apache.org/r/51142/diff/4/?file=1487652#file1487652line171>
> >
> >     Question: Is the generateOldestOffset simply returning a string of "0" delimited
by a comma? The number of "0" matches the number of files in the group?

Yes.


- Hai


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51142/#review147596
-----------------------------------------------------------


On Sept. 7, 2016, 11:44 p.m., Hai Lu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51142/
> -----------------------------------------------------------
> 
> (Updated Sept. 7, 2016, 11:44 p.m.)
> 
> 
> Review request for samza, Chris Pettitt, Yi Pan (Data Infrastructure), and Navina Ramesh.
> 
> 
> Bugs: SAMZA-967
>     https://issues.apache.org/jira/browse/SAMZA-967
> 
> 
> Repository: samza
> 
> 
> Description
> -------
> 
> Add HDFS System Consumer: 
> 
> 1. System admin, partitioner
> 2. System consumer with metrics
> 
> 
> Diffs
> -----
> 
>   build.gradle 1d4eb74b1294318db8454631ddd0901596121ab2 
>   gradle/dependency-versions.gradle 47c71bfde027835682889407261d4798b629d214 
>   samza-hdfs/src/main/java/org/apache/samza/system/hdfs/HdfsSystemAdmin.java PRE-CREATION

>   samza-hdfs/src/main/java/org/apache/samza/system/hdfs/HdfsSystemConsumer.java PRE-CREATION

>   samza-hdfs/src/main/java/org/apache/samza/system/hdfs/PartitionDescriptionUtil.java
PRE-CREATION 
>   samza-hdfs/src/main/java/org/apache/samza/system/hdfs/partitioner/DirectoryPartitioner.java
PRE-CREATION 
>   samza-hdfs/src/main/java/org/apache/samza/system/hdfs/partitioner/FileSystemAdapter.java
PRE-CREATION 
>   samza-hdfs/src/main/java/org/apache/samza/system/hdfs/partitioner/HdfsFileSystemAdapter.java
PRE-CREATION 
>   samza-hdfs/src/main/java/org/apache/samza/system/hdfs/reader/AvroFileHdfsReader.java
PRE-CREATION 
>   samza-hdfs/src/main/java/org/apache/samza/system/hdfs/reader/HdfsReaderFactory.java
PRE-CREATION 
>   samza-hdfs/src/main/java/org/apache/samza/system/hdfs/reader/MultiFileHdfsReader.java
PRE-CREATION 
>   samza-hdfs/src/main/java/org/apache/samza/system/hdfs/reader/SingleFileHdfsReader.java
PRE-CREATION 
>   samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/HdfsConfig.scala 61b7570afae3219b618c8830905035063941bdd7

>   samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/HdfsSystemAdmin.scala 92eb4472533db67dca01f075cb460581b4bdac0d

>   samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/HdfsSystemFactory.scala ef3c20a097ddf2feecaf8b0ad4587ea4bf6570b7

>   samza-hdfs/src/test/java/org/apache/samza/system/hdfs/TestHdfsSystemConsumer.java PRE-CREATION

>   samza-hdfs/src/test/java/org/apache/samza/system/hdfs/TestPartitionDesctiptionUtil.java
PRE-CREATION 
>   samza-hdfs/src/test/java/org/apache/samza/system/hdfs/partitioner/TestDirectoryPartitioner.java
PRE-CREATION 
>   samza-hdfs/src/test/java/org/apache/samza/system/hdfs/partitioner/TestHdfsFileSystemAdapter.java
PRE-CREATION 
>   samza-hdfs/src/test/java/org/apache/samza/system/hdfs/reader/TestAvroFileHdfsReader.java
PRE-CREATION 
>   samza-hdfs/src/test/java/org/apache/samza/system/hdfs/reader/TestMultiFileHdfsReader.java
PRE-CREATION 
>   samza-hdfs/src/test/resources/integTest/emptyTestFile PRE-CREATION 
>   samza-hdfs/src/test/resources/partitioner/testfile01 PRE-CREATION 
>   samza-hdfs/src/test/resources/partitioner/testfile02 PRE-CREATION 
>   samza-hdfs/src/test/resources/reader/TestEvent.avsc PRE-CREATION 
>   samza-hdfs/src/test/scala/org/apache/samza/system/hdfs/TestHdfsSystemProducerTestSuite.scala
261310d03de204718621f601117f016da14841df 
>   samza-yarn/src/main/scala/org/apache/samza/job/yarn/YarnJobFactory.scala 4e328a5f8c2b496a71e36c106339b7af263c96c7

> 
> Diff: https://reviews.apache.org/r/51142/diff/
> 
> 
> Testing
> -------
> 
> unit tests pass.
> 
> tested by writing a real hdfs samza job and deploying to hadoop cluster.
> 
> 
> Thanks,
> 
> Hai Lu
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message