samza-dev mailing list archives

From Yan Fang <yanfang...@gmail.com>
Subject Re: Keeping topic names out of the code?
Date Mon, 08 Sep 2014 22:43:18 GMT
Yes, when I use Samza I actually add this config in the profile to
make my life easier. It is also true that Samza's config system is already
very complicated :( -- I feel the same way... It is now very
convenient for experienced users but may scare new users away...

Fang, Yan
yanfang724@gmail.com
+1 (206) 849-4108

On Mon, Sep 8, 2014 at 3:27 PM, Chris Riccomini <
criccomini@linkedin.com.invalid> wrote:

> Hey Roger,
>
> Funnily enough, we actually used to have this feature in Samza 0.6.0,
> before it was open sourced. We called them "logical streams". The main
> reason that we removed them was really about usability. Samza's configs
> are already overly complicated (at least, I feel that way), and adding an
> extra level of indirection was leading to a lot of confusion from
> developers.
>
> You can still proxy this by using job-specific config, and a variable:
>
>   val edits = config.getString("edits")
>
> And then define:
>
>   edits=kafka.edit-stream-name
>
> In config. That seems a bit clunky, though.
>
> Taking a step back, I think Samza's config system is due for a revisit.
> David Chen was actually just discussing this with me today, and I expect a
> JIRA on it to pop up sometime soon. If we simplified things, it might be
> possible to add this feature back in.
>
> Cheers,
> Chris
>
> On 9/8/14 2:51 PM, "Roger Hoover" <roger.hoover@gmail.com> wrote:
>
> >"It might be a huge deal."  I mean "it might *not* be a huge deal."
> >
> >On Mon, Sep 8, 2014 at 2:50 PM, Roger Hoover <roger.hoover@gmail.com>
> >wrote:
> >
> >> Hi,
> >>
> >> Wondering if people with more experience with Samza think it would be a
> >> good idea to keep topic names out of the code.  You might want to be able
> >> to change topics by editing the config instead of having to recompile the
> >> job.
> >>
> >> Maybe introduce an indirection so that output streams have names?
> >>
> >> Config:
> >> #Define an input named "raw" which maps to Kafka topic "wikipedia-raw"
> >> task.inputs.kafka.raw=wikipedia-raw
> >> #Use raw as an input
> >> task.inputs=kafka.raw
> >> #Define an output named "edits" which maps to Kafka topic "wikipedia-edits"
> >> task.outputs.kafka.edits=wikipedia-edits
> >>
> >> Task code:
> >>
> >> //Input stream would be called "raw" here instead of "wikipedia-raw"
> >> String stream =
> >> envelope.getSystemStreamPartition().getSystemStream().getStream();
> >> if (stream.equals("raw")) {
> >>   processRawMsg(envelope, collector, coordinator);
> >> }
> >>
> >> //Send messages to locally named topic "edits"
> >> collector.send(new OutgoingMessageEnvelope(new SystemStream("kafka",
> >> "edits"), parsedJsonObject));
> >>
> >> Thoughts?  It might be a huge deal.  I just found myself copying and
> >> pasting names a lot across config and code files while writing some test jobs.
> >>
> >> Cheers,
> >>
> >> Roger
> >>
>
>
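
For reference, the job-specific config workaround Chris describes in the quoted message could be sketched in Java roughly as follows. This is only an illustration under stated assumptions: the config is modeled as a plain Map<String, String> rather than Samza's own Config class, and LogicalStreams, StreamRef and resolve are hypothetical names that do not exist in Samza. It looks up a key such as "edits" and splits a value like "kafka.wikipedia-edits" into a system name and a stream name.

```java
import java.util.Map;

/** Hypothetical helper: resolves a job-specific config key to a system/stream pair. */
final class LogicalStreams {

    /** Minimal stand-in for a (system, stream) pair; not Samza's SystemStream class. */
    static final class StreamRef {
        final String system;
        final String stream;

        StreamRef(String system, String stream) {
            this.system = system;
            this.stream = stream;
        }
    }

    /**
     * Reads config[key], expected in the form "system.stream", and splits it
     * on the first dot. Throws if the key is missing or the value is malformed.
     */
    static StreamRef resolve(Map<String, String> config, String key) {
        String value = config.get(key);
        if (value == null) {
            throw new IllegalArgumentException("Missing config key: " + key);
        }
        int dot = value.indexOf('.');
        if (dot <= 0 || dot == value.length() - 1) {
            throw new IllegalArgumentException(
                "Expected 'system.stream' for key " + key + ", got: " + value);
        }
        return new StreamRef(value.substring(0, dot), value.substring(dot + 1));
    }
}
```

With edits=kafka.wikipedia-edits in the job config, a task would call LogicalStreams.resolve(config, "edits") once at startup and build its outgoing stream from the returned system ("kafka") and stream ("wikipedia-edits"), so the topic can be changed without recompiling the job.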
