drill-dev mailing list archives

From Ian Holsman <...@holsman.com.au>
Subject Re: Jeff Dean on fast response in an unreliable world
Date Wed, 12 Sep 2012 02:34:46 GMT
The paper mentions how they selectively replicate different subsets of the data. They use 'china
queries' or somesuch as their example.

My understanding is that there is some kind of query/subset monitor that detects hot spots
and then increases their replication count across the farm. It must also be responsible
for decreasing the count as the hot spots cool down again.
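
A rough sketch of what such a monitor might look like, purely as an illustration of the idea above and not what the paper actually describes. The class name, the per-window hit counter, and the thresholds are all made up for the example:

```python
from collections import Counter


class ReplicationMonitor:
    """Hypothetical hot-spot monitor; names and thresholds are illustrative."""

    def __init__(self, base=1, max_replicas=5, hot_threshold=100):
        self.base = base                    # replicas for a cold subset
        self.max_replicas = max_replicas    # upper bound on replicas
        self.hot_threshold = hot_threshold  # hits per extra replica
        self.hits = Counter()               # hits seen in the current window
        self.known = set()                  # every subset seen so far

    def record(self, subset):
        """Count one query against a data subset."""
        self.hits[subset] += 1

    def rebalance(self):
        """Return the target replica count per known subset, then start a
        new window. A subset with no hits in the window drops back to base."""
        self.known.update(self.hits)
        targets = {
            s: min(self.base + self.hits[s] // self.hot_threshold,
                   self.max_replicas)
            for s in self.known
        }
        self.hits.clear()
        return targets
```

Calling rebalance() once per time window raises the count for subsets that were hit often and lets it fall back to the base value once they go quiet, which is the increase/decrease behaviour I would expect from the real thing.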

On Sep 12, 2012, at 12:31 PM, Ted Dunning <ted.dunning@gmail.com> wrote:

> What do you mean by selective replication?
> On Tue, Sep 11, 2012 at 7:23 PM, Worthy LaFollette <worthyl@gmail.com> wrote:
>> Very good paper. I am now curious about strategies for selective
>> replication, which, if done right, looks like it would make query
>> generation more efficient.  Do you know of any papers on that subject?
>> On Tue, Sep 11, 2012 at 1:37 PM, Ted Dunning <ted.dunning@gmail.com>
>> wrote:
>>> Headed into Thursday's meetup, this paper by Jeff Dean provides a very
>>> good description of strategies for getting fast response times with
>>> variable-quality infrastructure.
>>> http://research.google.com/people/jeff/latency.html
>>> The key point here is that it is very important to have asynchronous
>>> queries with a cancel.  Above that level, there needs to be a simple
>>> strategy for pushing second versions of queries out to the workers and
>>> canceling defunct or redundant queries.
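
To make the "asynchronous queries with a cancel" point above concrete, here is a minimal sketch of a backup (hedged) request using Python's asyncio. It is only an illustration under assumptions: hedged_query, the hedge_after delay, and the idea that each replica is an async callable are invented for the example, not taken from the paper or from Drill.

```python
import asyncio


async def hedged_query(replicas, query, hedge_after=0.05):
    """Send `query` to the first replica; if nothing has answered within
    `hedge_after` seconds, send a second copy to the next replica, and so
    on. Return the first result and cancel whatever is still outstanding.

    `replicas` is a non-empty list of async callables (an assumption).
    """
    tasks = []
    try:
        for replica in replicas:
            tasks.append(asyncio.ensure_future(replica(query)))
            # Wait briefly for any outstanding copy before hedging again.
            done, _ = await asyncio.wait(
                tasks, timeout=hedge_after,
                return_when=asyncio.FIRST_COMPLETED)
            if done:
                return done.pop().result()
        # Every replica has been asked; wait for the first to finish.
        done, _ = await asyncio.wait(
            tasks, return_when=asyncio.FIRST_COMPLETED)
        return done.pop().result()
    finally:
        # Cancel the now-redundant copies of the query.
        for task in tasks:
            if not task.done():
                task.cancel()
```

The cancel in the finally block is what keeps the redundant copies from continuing to consume worker time after one answer has arrived.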

Ian Holsman
PH: +61-400-988-964 Skype:iholsman
