kafka-users mailing list archives

From Andrew Otto <o...@wikimedia.org>
Subject Re: Kafka/Hadoop consumers and producers
Date Tue, 13 Aug 2013 02:00:44 GMT
We've done a bit of work over at Wikimedia to debianize Kafka and make it behave like a regular
service.

https://github.com/wikimedia/operations-debs-kafka/blob/debian/debian

Most relevant, Ken, is an init script for Kafka:
  https://github.com/wikimedia/operations-debs-kafka/blob/debian/debian/kafka.init

And a bin/kafka shell wrapper for the kafka/bin/*.sh scripts:
  https://github.com/wikimedia/operations-debs-kafka/blob/debian/debian/bin/kafka
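
For anyone who has not looked at the wrapper, the general idea can be sketched roughly as below. This is an illustrative Python sketch only, not the actual bin/kafka script (which is a shell script). The function names, the default install path, and the assumption that every tool is named kafka-<subcommand>.sh are hypothetical and chosen purely for illustration.

```python
import os


def build_command(subcommand, args, bin_dir="/usr/share/kafka/bin"):
    """Map `kafka <subcommand> [args...]` to `<bin_dir>/kafka-<subcommand>.sh [args...]`.

    bin_dir and the kafka-<subcommand>.sh naming scheme are assumptions
    made for this sketch, not taken from the real wrapper.
    """
    # Reject names that could point outside bin_dir, e.g. "../something".
    if not subcommand or "/" in subcommand or subcommand.startswith("."):
        raise ValueError("invalid subcommand: %r" % (subcommand,))
    script = os.path.join(bin_dir, "kafka-%s.sh" % subcommand)
    return [script] + list(args)


def run(argv):
    """Dispatch argv (as in sys.argv) to the matching script."""
    if len(argv) < 2:
        raise SystemExit("usage: kafka <subcommand> [args...]")
    command = build_command(argv[1], argv[2:])
    # Replace the current process with the target script so that
    # signals and the exit status pass through unchanged.
    os.execv(command[0], command)
```

With this sketch, calling run(["kafka", "console-producer", "--topic", "test"]) would exec kafka-console-producer.sh from the assumed bin directory with the remaining arguments.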

I'm about to add an init script for MirrorMaker as well, so mirroring can be daemonized and
run as a service.


On Aug 12, 2013, at 8:16 PM, Kam Kasravi <kamkasravi@yahoo.com> wrote:

> I would like to do this refactoring, since I did a high-level consumer a while ago.
> A few weeks ago I opened KAFKA-949 (Kafka on Yarn), which I was also hoping to contribute.
> It's almost done. KAFKA-949 is paired with BIGTOP-989, which adds Kafka 0.8 to the Bigtop
> distribution.
> KAFKA-949 basically allows Kafka brokers to be started up using sysvinit services and
> would ease some of the startup/configuration issues that newbies have when getting started
> with Kafka. Ideally I would like to fold a number of kafka/bin/* commands into the kafka
> service. Andrew, please let me know if you would like to pick this up instead. Thanks!
> 
> Kam
> 
> From: Jay Kreps <jay.kreps@gmail.com>
> To: Ken Goodhope <kengoodhope@gmail.com> 
> Cc: Andrew Psaltis <psaltis.andrew@gmail.com>; dibyendu.bhattacharya@pearson.com;
> "camus_etl@googlegroups.com" <camus_etl@googlegroups.com>; "aotto@wikimedia.org" <aotto@wikimedia.org>;
> Felix GV <felix@mate1inc.com>; Cosmin Lehene <clehene@adobe.com>; "dev@kafka.apache.org"
> <dev@kafka.apache.org>; "users@kafka.apache.org" <users@kafka.apache.org>
> Sent: Saturday, August 10, 2013 3:30 PM
> Subject: Re: Kafka/Hadoop consumers and producers
> 
> So guys, just to throw my 2 cents in:
> 
> 1. We aren't deprecating anything. I just noticed that the Hadoop contrib
> package wasn't getting as much attention as it should.
> 
> 2. Andrew or anyone--if there is anyone using the contrib package who would
> be willing to volunteer to kind of adopt it that would be great. I am happy
> to help in whatever way I can. The practical issue is that most of the
> committers are either using Camus or not using Hadoop at all so we just
> haven't been doing a good job of documenting, bug fixing, and supporting
> the contrib packages.
> 
> 3. Ken, if you could document how to use Camus that would likely make it a
> lot more useful to people. I think most people would want a full-fledged
> ETL solution and would likely prefer Camus, but very few people are using
> Avro.
> 
> -Jay
> 
> 
> On Fri, Aug 9, 2013 at 12:27 PM, Ken Goodhope <kengoodhope@gmail.com> wrote:
> 
> > I just checked and that patch is in the 0.8 branch.  Thanks for working on
> > backporting it, Andrew.  We'd be happy to commit that work to master.
> >
> > As for the kafka contrib project vs Camus, they are similar but not quite
> > identical.  Camus is intended to be a high throughput ETL for bulk
> > ingestion of Kafka data into HDFS, whereas what we have in contrib is
> > more of a simple KafkaInputFormat.  Neither can really replace the other.
> > If you had a complex Hadoop workflow and wanted to introduce some Kafka
> > data into that workflow, using Camus would be gigantic overkill and a
> > pain to set up.  On the flip side, if what you want is frequent, reliable
> > ingest of Kafka data into HDFS, a simple InputFormat doesn't provide you
> > with that.
> >
> > I think it would be preferable to simplify the existing contrib
> > Input/OutputFormats by refactoring them to use the more stable higher level
> > Kafka APIs.  Currently they use the lower level APIs.  This should make
> > them easier to maintain, and user friendly enough to avoid the need for
> > extensive documentation.
> >
> > Ken
> >
> >
> > On Fri, Aug 9, 2013 at 8:52 AM, Andrew Psaltis <psaltis.andrew@gmail.com> wrote:
> >
> >> Dibyendu,
> >> According to the pull request (https://github.com/linkedin/camus/pull/15), it was
> >> merged into the camus-kafka-0.8 branch. I have not checked whether the code was
> >> subsequently removed; however, at least one of the important files from this patch
> >> (camus-api/src/main/java/com/linkedin/camus/etl/RecordWriterProvider.java)
> >> is still present.
> >>
> >> Thanks,
> >> Andrew
> >>
> >>
> >>  On Fri, Aug 9, 2013 at 9:39 AM, <dibyendu.bhattacharya@pearson.com> wrote:
> >>
> >>>  Hi Ken,
> >>>
> >>> I am also working on making Camus suitable for non-Avro messages for our
> >>> requirements.
> >>>
> >>> I see you mentioned this patch (
> >>> https://github.com/linkedin/camus/commit/87917a2aea46da9d21c8f67129f6463af52f7aa8),
> >>> which supports a custom data writer for Camus. But this patch has not been pulled
> >>> into the camus-kafka-0.8 branch. Is there any plan to do so?
> >>>
> >>> Regards,
> >>> Dibyendu
> >>>
> >>> --
> >>> You received this message because you are subscribed to a topic in the
> >>> Google Groups "Camus - Kafka ETL for Hadoop" group.
> >>> To unsubscribe from this topic, visit
> >>> https://groups.google.com/d/topic/camus_etl/KKS6t5-O-Ng/unsubscribe.
> >>> To unsubscribe from this group and all its topics, send an email to
> >>> camus_etl+unsubscribe@googlegroups.com.
> >>> For more options, visit https://groups.google.com/groups/opt_out.
> >>>
> >>
> >>  --
> >> You received this message because you are subscribed to the Google Groups
> >> "Camus - Kafka ETL for Hadoop" group.
> >> To unsubscribe from this group and stop receiving emails from it, send an
> >> email to camus_etl+unsubscribe@googlegroups.com.
> >> For more options, visit https://groups.google.com/groups/opt_out.
> >>
> >>
> >>
> >

