samza-dev mailing list archives

From Ken Krugler <kkrugler_li...@transpac.com>
Subject RE: Required vs. optional methods for KeyValueStore
Date Wed, 29 Jul 2015 17:58:13 GMT
Hi Navina,

Thanks for confirming that putAll(list) is a required method, for supporting the changelog
functionality.

I'm hoping you or others can confirm that range() and all() are _not_ called by the Samza
system itself, i.e. that they are only called (as needed) by task code.

And if the above is true, then adding some Javadoc notes about which methods are required
(used by the Samza system) for changelog support vs. optional (only used by task-specific
code as needed) would be very helpful.
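
To make the suggestion concrete, here is a rough sketch of the kind of Javadoc notes I have in mind. It uses a hypothetical, simplified interface (SketchKeyValueStore) and a toy in-memory implementation (InMemorySketchStore); neither is the actual Samza KeyValueStore API. It also assumes that range() and all() turn out to be optional, which is exactly what still needs confirming.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical, simplified store contract used only to illustrate the
 * proposed Javadoc notes. This is NOT the real Samza interface.
 */
interface SketchKeyValueStore<K, V> {
    V get(K key);

    void put(K key, V value);

    /**
     * REQUIRED for changelog support: per this thread, the system calls
     * putAll(list) when restoring state from the changelog.
     */
    void putAll(List<Map.Entry<K, V>> entries);

    /**
     * ASSUMED OPTIONAL (unconfirmed): only called by task-specific code.
     * Stores that cannot iterate cheaply may leave this unsupported.
     */
    default Iterator<Map.Entry<K, V>> range(K from, K to) {
        throw new UnsupportedOperationException("range() is not supported by this store");
    }

    /** ASSUMED OPTIONAL (unconfirmed): only called by task-specific code. */
    default Iterator<Map.Entry<K, V>> all() {
        throw new UnsupportedOperationException("all() is not supported by this store");
    }
}

/** Toy in-memory store that implements only the required methods. */
class InMemorySketchStore implements SketchKeyValueStore<String, String> {
    private final Map<String, String> data = new HashMap<>();

    @Override
    public String get(String key) {
        return data.get(key);
    }

    @Override
    public void put(String key, String value) {
        data.put(key, value);
    }

    @Override
    public void putAll(List<Map.Entry<String, String>> entries) {
        // A restore would arrive here as one or more batches of entries.
        for (Map.Entry<String, String> e : entries) {
            data.put(e.getKey(), e.getValue());
        }
    }
}

class RestoreSketchDemo {
    public static void main(String[] args) {
        InMemorySketchStore store = new InMemorySketchStore();
        List<Map.Entry<String, String>> batch = new ArrayList<>();
        batch.add(new AbstractMap.SimpleEntry<>("doc1", "first"));
        batch.add(new AbstractMap.SimpleEntry<>("doc2", "second"));
        store.putAll(batch); // simulates one restore batch
        System.out.println(store.get("doc1"));
    }
}
```

With notes like these, a store backed by something like embedded Solr could implement get/put/putAll and leave the iterator methods unsupported, provided the system really never calls them.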

Thanks!

-- Ken

> From: Navina Ramesh
> Sent: July 29, 2015 10:38:45am PDT
> To: dev@samza.apache.org
> Subject: Re: Required vs. optional methods for KeyValueStore
> 
> Hi Ken,
> 
> We use putAll(list) when restoring from changelog. So, unless your store
> does not need changelog support, the implementation is
> required.
> 
> I only have a high-level overview of what Solr is. Perhaps others on the
> mailing list have experience with Solr and can provide more useful
> information.
> 
> Thanks!
> Navina
> 
> On Tue, Jul 28, 2015 at 5:30 PM, Ken Krugler <kkrugler_lists@transpac.com>
> wrote:
> 
>> Hi all,
>> 
>> I'm looking at using embedded Solr as the KeyValueStore, as that lets me
>> extract ranked results from the state to publish as part of the task's
>> operation.
>> 
>> Some of the methods defined by KeyValueStore are problematic, though -
>> specifically the range() and all() methods that return iterators.
>> 
>> Iterating over lots of results in Solr, while more feasible with newer
>> paging support, is still an abuse of its architecture :)
>> 
>> So I'm wondering whether I need to support those methods, or are they only
>> called internally by tasks (e.g. my task) and thus can be optional.
>> 
>> I'm assuming that when state is being automatically restored from a
>> changelog, the Samza system is calling putAll(list) repeatedly, but I
>> haven't dug into those details. So that would be an example of a required
>> method.
>> 
>> Thanks,
>> 
>> -- Ken


--------------------------
Ken Krugler
+1 530-210-6378
http://www.scaleunlimited.com
custom big data solutions & training
Hadoop, Cascading, Cassandra & Solr





