samza-dev mailing list archives

From Dan <danharve...@gmail.com>
Subject Re: Store changelog
Date Thu, 02 Apr 2015 18:13:26 GMT
Ah, that makes sense now, thanks Chris. Re-reading that page, it is clear. I
think what confused me was this section from the configuration documentation
for `stores.store-name.changelog`:

"... Any output stream can be used as changelog, but you must ensure that
only one job ever writes to a given changelog stream (each instance of a
job and each store needs its own changelog stream)."
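
To spell out the model Chris describes below, here is a minimal Python
sketch. It is not Samza code: the topic name `my-store-changelog` and the
helper `changelog_assignments` are made up for illustration. It only shows
the idea of one changelog topic per store, with each task owning exactly one
partition of that topic.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ChangelogPartition:
    """One partition of a store's changelog topic (illustrative model only)."""
    topic: str
    partition: int


def changelog_assignments(topic: str, task_count: int) -> Dict[int, ChangelogPartition]:
    """Map each task index to its own partition of a single changelog topic.

    Assumption for this sketch: the changelog topic has exactly as many
    partitions as the job has tasks, and task i uses partition i.
    """
    return {task: ChangelogPartition(topic, task) for task in range(task_count)}


# A job with two tasks and one store: one topic, two partitions.
assignments = changelog_assignments("my-store-changelog", 2)
for task, cp in assignments.items():
    print(f"task {task} -> {cp.topic}[{cp.partition}]")
```

So the "own changelog stream" requirement applies per job and per store, not
per task; the tasks are kept apart by partition within the one topic.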

 - Dan


On Thu, 2 Apr 2015 at 18:46 Chinmay Soman <chinmay.cerebro@gmail.com> wrote:

> Also documented here:
> http://samza.apache.org/learn/documentation/0.9/container/state-management.html
>
> Check the "Local state in Samza" section - the diagram (and the
> description) explains this clearly.
>
> On Thu, Apr 2, 2015 at 10:36 AM, Chris Riccomini <criccomini@apache.org>
> wrote:
>
> > Hey Dan,
> >
> > I think you might have a misunderstanding in how changelogs work with
> > Samza. Suppose you have a job with two tasks, and a single kv-store is
> > configured with a changelog attached. The changelog, in Kafka, will have
> > two partitions. Each task will use one partition of the changelog topic.
> > You only need one topic per-changelog (and no prefix) because there are
> > multiple partitions per changelog, and there's a 1:1 mapping between a
> > task and its changelog partition.
> >
> > Cheers,
> > Chris
> >
> > On Thu, Apr 2, 2015 at 10:30 AM, Dan <danharvey42@gmail.com> wrote:
> >
> > > Hi all,
> > >
> > > We're just starting out using Samza to process streams we've already
> > > got in Kafka. Some of the jobs we've written are using the per task KV
> > > store which are being persisted to a changelog topic in Kafka. As you
> > > need a different changelog topic per task we are wondering how people
> > > are dealing with ensuring that each task's store has a separate
> > > changelog.
> > >
> > > I think we could define multiple stores in the properties file, then
> > > pick the correct one for each task index. But that seems quite a
> > > verbose way to go about that?
> > >
> > > If Samza could use a prefix in the properties file then generate a
> > > topic name for each task it would simplify using that. Maybe there's
> > > something I'm missing from this?
> > >
> > > Thanks,
> > > Dan
> > >
> >
>
>
>
> --
> Thanks and regards
>
> Chinmay Soman
>
