samza-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Eli Reisman" <initialcont...@gmail.com>
Subject Re: Review Request 35445: SAMZA-693: Very basic HDFS Producer service for Samza
Date Thu, 30 Jul 2015 15:33:01 GMT


> On July 29, 2015, 7:41 p.m., Navina Ramesh wrote:
> > docs/learn/documentation/versioned/hdfs/producer.md, line 24
> > <https://reviews.apache.org/r/35445/diff/4/?file=1023362#file1023362line24>
> >
> >     Can you please update the list of available configuration for this system at
http://samza.apache.org/learn/documentation/latest/jobs/configuration-table.html as well?

> >     
> >     Like we did with elastic search, adding an example job to hello-samza will be
very useful for adoption. If not in this JIRA, please consider opening a follow-up JIRA to
add this example.

Yeah I can do that. I think adding an example to hello-samza is a great idea, but I agree
I would make that a separate ticket and then I'd claim that ticket.


> On July 29, 2015, 7:41 p.m., Navina Ramesh wrote:
> > samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/writer/TextSequenceFileHdfsWriter.scala,
line 37
> > <https://reviews.apache.org/r/35445/diff/4/?file=1023373#file1023373line37>
> >
> >     What is the motivation for having 2 sequenceFileHdfsWriters?  Is the only difference
in the data type being written to the fs? 
> >     
> >     The difference is not clear from the class's javadoc or the example configuration
provided. The reason I ask is we already have a config overload in Samza. I would rather not
have the user configure a writer type, if we can safely default to a single type and only
override the relevant diverging properties. Functionally, I believe they are doing the same
thing.

I have worked on another Apache project where introducing Writable type params all over the
place has painted us into some corners over time, so I wanted to try it this way first. I
could definitely rework the patch so that the current base class SequenceFileHdfsWriter takes
[K, V] type params and handles data types that way if you like?


- Eli


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35445/#review93501
-----------------------------------------------------------


On July 28, 2015, 5:25 a.m., Eli Reisman wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35445/
> -----------------------------------------------------------
> 
> (Updated July 28, 2015, 5:25 a.m.)
> 
> 
> Review request for samza.
> 
> 
> Repository: samza
> 
> 
> Description
> -------
> 
> SAMZA-693: Very basic HDFS Producer service for Samza
> 
> 
> Diffs
> -----
> 
>   build.gradle 0852adc 
>   docs/learn/documentation/versioned/hdfs/producer.md PRE-CREATION 
>   samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/HdfsConfig.scala PRE-CREATION

>   samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/HdfsSystemAdmin.scala PRE-CREATION

>   samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/HdfsSystemFactory.scala PRE-CREATION

>   samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/HdfsSystemProducer.scala PRE-CREATION

>   samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/HdfsSystemProducerMetrics.scala
PRE-CREATION 
>   samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/writer/BinarySequenceFileHdfsWriter.scala
PRE-CREATION 
>   samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/writer/Bucketer.scala PRE-CREATION

>   samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/writer/HdfsWriter.scala PRE-CREATION

>   samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/writer/JobNameDateTimeBucketer.scala
PRE-CREATION 
>   samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/writer/SequenceFileHdfsWriter.scala
PRE-CREATION 
>   samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/writer/TextSequenceFileHdfsWriter.scala
PRE-CREATION 
>   samza-hdfs/src/test/resources/samza-hdfs-test-batch-job-text.properties PRE-CREATION

>   samza-hdfs/src/test/resources/samza-hdfs-test-batch-job.properties PRE-CREATION 
>   samza-hdfs/src/test/resources/samza-hdfs-test-job-text.properties PRE-CREATION 
>   samza-hdfs/src/test/resources/samza-hdfs-test-job.properties PRE-CREATION 
>   samza-hdfs/src/test/scala/org/apache/samza/system/hdfs/TestHdfsSystemProducerTestSuite.scala
PRE-CREATION 
>   settings.gradle 19bff97 
> 
> Diff: https://reviews.apache.org/r/35445/diff/
> 
> 
> Testing
> -------
> 
> Updated: See JIRA SAMZA-693 for details, this latest update (693-4) addresses post-review
issues and adds more pluggable design, several default writer implementations, and more (and
more thorough) unit tests.
> 
> Passes 'gradle clean test'.
> 
> 
> Thanks,
> 
> Eli Reisman
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message