kafka-users mailing list archives

From Andrew Otto <o...@wikimedia.org>
Subject Re: Kafka/Hadoop consumers and producers
Date Tue, 13 Aug 2013 23:17:01 GMT
> I may merge a bit of your work into bigtop-989 if that's ok with you.
Merge away!  Happy to help. :)

> I'll ask on bigtop regarding the .deb requirement - it seems they don't abide by this.

Yeah, there seems to be a constant struggle between the 'java way' of doing things, e.g. Maven
downloading the internet, and the 'debian way', e.g. be paranoid about everything, make sure
the build process is 100% repeatable.

Bigtop should definitely do whatever Bigtop thinks is best.  This Makefile technique works
for us now, but probably will require a lot of manual maintenance as Kafka grows.

On Aug 13, 2013, at 6:03 PM, Kam Kasravi <kamkasravi@yahoo.com> wrote:

> Thanks - I'll ask on bigtop regarding the .deb requirement - it seems they don't abide by this.
> I may merge a bit of your work into bigtop-989 if that's ok with you. I do know the bigtop folks
> would like to see sbt support.
> 
> From: Andrew Otto <otto@wikimedia.org>
> To: Kam Kasravi <kamkasravi@yahoo.com> 
> Cc: "dev@kafka.apache.org" <dev@kafka.apache.org>; Ken Goodhope <kengoodhope@gmail.com>;
> Andrew Psaltis <psaltis.andrew@gmail.com>; "dibyendu.bhattacharya@pearson.com" <dibyendu.bhattacharya@pearson.com>;
> "camus_etl@googlegroups.com" <camus_etl@googlegroups.com>; "aotto@wikimedia.org" <aotto@wikimedia.org>;
> Felix GV <felix@mate1inc.com>; Cosmin Lehene <clehene@adobe.com>; "users@kafka.apache.org"
> <users@kafka.apache.org>
> Sent: Tuesday, August 13, 2013 1:03 PM
> Subject: Re: Kafka/Hadoop consumers and producers
> 
> > What installs all the kafka dependencies under /usr/share/java?
> 
> 
> The debian/ work was done mostly by another WMF staffer.  We tried and tried to make
> sbt behave with debian standards, most importantly the one that requires that .debs can be
> created without needing to connect to the internet, aside from official apt repositories.
>
> Many of the /usr/share/java dependencies are handled by apt.  Any that aren't available
> in an official apt somewhere have been manually added to the ext/ directory.
> 
> The sbt build system has been replaced with Make:
> https://github.com/wikimedia/operations-debs-kafka/blob/debian/debian/patches/our-own-build-system.patch
> 
> You should be able to build a .deb by checking out the debian branch and running:
> 
>   git-buildpackage -uc -us
> 
> -Ao
> 
> 
> 
> 
> 
> 
> On Aug 13, 2013, at 1:34 PM, Kam Kasravi <kamkasravi@yahoo.com> wrote:
> 
> > Thanks Andrew - I like the shell wrapper - very clean and simple. 
> > What installs all the kafka dependencies under /usr/share/java?
> > 
> > From: Andrew Otto <otto@wikimedia.org>
> > To: Kam Kasravi <kamkasravi@yahoo.com> 
> > Cc: "dev@kafka.apache.org" <dev@kafka.apache.org>; Ken Goodhope <kengoodhope@gmail.com>;
> > Andrew Psaltis <psaltis.andrew@gmail.com>; "dibyendu.bhattacharya@pearson.com" <dibyendu.bhattacharya@pearson.com>;
> > "camus_etl@googlegroups.com" <camus_etl@googlegroups.com>; "aotto@wikimedia.org" <aotto@wikimedia.org>;
> > Felix GV <felix@mate1inc.com>; Cosmin Lehene <clehene@adobe.com>; "users@kafka.apache.org"
> > <users@kafka.apache.org>
> > Sent: Monday, August 12, 2013 7:00 PM
> > Subject: Re: Kafka/Hadoop consumers and producers
> > 
> > We've done a bit of work over at Wikimedia to debianize Kafka and make it behave
> > like a regular service.
> > 
> > https://github.com/wikimedia/operations-debs-kafka/blob/debian/debian
> > 
> > Most relevant, Ken, is an init script for Kafka:
> >  https://github.com/wikimedia/operations-debs-kafka/blob/debian/debian/kafka.init
> > 
> > And a bin/kafka shell wrapper for the kafka/bin/*.sh scripts:
> >  https://github.com/wikimedia/operations-debs-kafka/blob/debian/debian/bin/kafka
> > 
> > I'm about to add an init script for MirrorMaker as well, so mirroring can be daemonized
> > and run as a service.
> > 
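A wrapper of this kind might look roughly like the sketch below. It is not the actual Wikimedia debian/bin/kafka script: the install directory KAFKA_BIN, the kafka-<command>.sh naming scheme, and the function names kafka_script_path and kafka_dispatch are all assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical sketch of a "kafka <command> [args...]" dispatcher.
# KAFKA_BIN and the kafka-<command>.sh naming are assumptions, not taken
# from the real debian/bin/kafka wrapper.
KAFKA_BIN="${KAFKA_BIN:-/usr/share/kafka/bin}"

# Map a subcommand such as "console-producer" to the path of its script.
kafka_script_path() {
    printf '%s/kafka-%s.sh\n' "$KAFKA_BIN" "$1"
}

# Run the script for the given subcommand, passing the remaining
# arguments through unchanged.
kafka_dispatch() {
    if [ "$#" -lt 1 ]; then
        echo "usage: kafka <command> [args...]" >&2
        return 64
    fi
    script=$(kafka_script_path "$1")
    shift
    if [ ! -x "$script" ]; then
        echo "kafka: no executable script at $script" >&2
        return 127
    fi
    "$script" "$@"
}
```

With this in place, calling kafka_dispatch console-producer followed by that script's usual arguments would run kafka-console-producer.sh from KAFKA_BIN, so users only need to remember one entry point.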
> > 
> > On Aug 12, 2013, at 8:16 PM, Kam Kasravi <kamkasravi@yahoo.com> wrote:
> > 
> > > I would like to do this refactoring since I did a high level consumer a while ago.
> > > A few weeks ago I had opened KAFKA-949 Kafka on Yarn which I was also hoping to contribute.
> > > It's almost done. KAFKA-949 is paired with BIGTOP-989 which adds kafka 0.8 to the bigtop distribution.
> > > KAFKA-949 basically allows kafka brokers to be started up using sysvinit services and would ease some of the
> > > startup/configuration issues that newbies have when getting started with kafka. Ideally I would like to
> > > fold a number of kafka/bin/* commands into the kafka service. Andrew please let me know if you would like to
> > > pick this up instead. Thanks!
> > > 
> > > Kam
> > > 
> > > From: Jay Kreps <jay.kreps@gmail.com>
> > > To: Ken Goodhope <kengoodhope@gmail.com> 
> > > Cc: Andrew Psaltis <psaltis.andrew@gmail.com>; dibyendu.bhattacharya@pearson.com;
> > > "camus_etl@googlegroups.com" <camus_etl@googlegroups.com>; "aotto@wikimedia.org" <aotto@wikimedia.org>;
> > > Felix GV <felix@mate1inc.com>; Cosmin Lehene <clehene@adobe.com>; "dev@kafka.apache.org"
> > > <dev@kafka.apache.org>; "users@kafka.apache.org" <users@kafka.apache.org>
> > > Sent: Saturday, August 10, 2013 3:30 PM
> > > Subject: Re: Kafka/Hadoop consumers and producers
> > > 
> > > So guys, just to throw my 2 cents in:
> > > 
> > > 1. We aren't deprecating anything. I just noticed that the Hadoop contrib
> > > package wasn't getting as much attention as it should.
> > > 
> > > 2. Andrew or anyone--if there is anyone using the contrib package who would
> > > be willing to volunteer to kind of adopt it that would be great. I am happy
> > > to help in whatever way I can. The practical issue is that most of the
> > > committers are either using Camus or not using Hadoop at all so we just
> > > haven't been doing a good job of documenting, bug fixing, and supporting
> > > the contrib packages.
> > > 
> > > 3. Ken, if you could document how to use Camus that would likely make it a
> > > lot more useful to people. I think most people would want a full-fledged
> > > ETL solution and would likely prefer Camus, but very few people are using
> > > Avro.
> > > 
> > > -Jay
> > > 
> > > 
> > > On Fri, Aug 9, 2013 at 12:27 PM, Ken Goodhope <kengoodhope@gmail.com> wrote:
> > > 
> > > > I just checked and that patch is in .8 branch.  Thanks for working on
> > > > back porting it Andrew.  We'd be happy to commit that work to master.
> > > >
> > > > > As for the kafka contrib project vs Camus, they are similar but not quite
> > > > > identical.  Camus is intended to be a high throughput ETL for bulk
> > > > > ingestion of Kafka data into HDFS, whereas what we have in contrib is
> > > > > more of a simple KafkaInputFormat.  Neither can really replace the other.
> > > > > If you had a complex hadoop workflow and wanted to introduce some Kafka
> > > > > data into that workflow, using Camus would be gigantic overkill and a
> > > > > pain to set up.  On the flip side, if what you want is frequent, reliable
> > > > > ingest of Kafka data into HDFS, a simple InputFormat doesn't provide you
> > > > > with that.
> > > >
> > > > > I think it would be preferable to simplify the existing contrib
> > > > > Input/OutputFormats by refactoring them to use the more stable higher-level
> > > > > Kafka APIs.  Currently they use the lower-level APIs.  This should make
> > > > them easier to maintain, and user friendly enough to avoid the need for
> > > > extensive documentation.
> > > >
> > > > Ken
> > > >
> > > >
> > > > > On Fri, Aug 9, 2013 at 8:52 AM, Andrew Psaltis <psaltis.andrew@gmail.com> wrote:
> > > >
> > > >> Dibyendu,
> > > > >> According to the pull request https://github.com/linkedin/camus/pull/15 it
> > > > >> was merged into the camus-kafka-0.8
> > > > >> branch. I have not checked whether the code was subsequently removed; however,
> > > > >> at least one of the important files from this patch (camus-api/src/main/java/com/linkedin/camus/etl/RecordWriterProvider.java)
> > > > >> is still present.
> > > >>
> > > >> Thanks,
> > > >> Andrew
> > > >>
> > > >>
> > > > >> On Fri, Aug 9, 2013 at 9:39 AM, <dibyendu.bhattacharya@pearson.com> wrote:
> > > >>
> > > > >>> Hi Ken,
> > > > >>>
> > > > >>> I am also working on making Camus handle non-Avro messages for our
> > > > >>> requirements.
> > > > >>>
> > > > >>> I see you mentioned this patch
> > > > >>> (https://github.com/linkedin/camus/commit/87917a2aea46da9d21c8f67129f6463af52f7aa8),
> > > > >>> which supports a custom data writer for Camus. But this patch has not been pulled
> > > > >>> into the camus-kafka-0.8 branch. Is there any plan to do so?
> > > >>>
> > > >>> Regards,
> > > >>> Dibyendu
> > > >>>
> > > >>> --
> > > > >>> You received this message because you are subscribed to a topic in the
> > > > >>> Google Groups "Camus - Kafka ETL for Hadoop" group.
> > > > >>> To unsubscribe from this topic, visit
> > > > >>> https://groups.google.com/d/topic/camus_etl/KKS6t5-O-Ng/unsubscribe.
> > > > >>> To unsubscribe from this group and all its topics, send an email to
> > > > >>> camus_etl+unsubscribe@googlegroups.com.
> > > >>> For more options, visit https://groups.google.com/groups/opt_out.
> > > >>>
> > > >>
> > > >>  --
> > > >> You received this message because you are subscribed to the Google
Groups
> > > >> "Camus - Kafka ETL for Hadoop" group.
> > > >> To unsubscribe from this group and stop receiving emails from it,
send an
> > > >> email to camus_etl+unsubscribe@googlegroups.com.
> > > >> For more options, visit https://groups.google.com/groups/opt_out.
> > > >>
> > > >>
> > > >>
> > > >
> > > >  --
> > > > You received this message because you are subscribed to the Google Groups
> > > > "Camus - Kafka ETL for Hadoop" group.
> > > > To unsubscribe from this group and stop receiving emails from it, send
an
> > > > email to camus_etl+unsubscribe@googlegroups.com.
> > > > For more options, visit https://groups.google.com/groups/opt_out.
> > > >
> > > 
> > > 
> > > 
> > > -- 
> > > You received this message because you are subscribed to the Google Groups "Camus
- Kafka ETL for Hadoop" group.
> > > To unsubscribe from this group and stop receiving emails from it, send an email
to camus_etl+unsubscribe@googlegroups.com.
> > > For more options, visit https://groups.google.com/groups/opt_out.
> > >  
> > >  
> > 
> > 
> 
> 

