hbase-user mailing list archives

From Andre Reiter <a.rei...@web.de>
Subject Re: full table scan
Date Sat, 11 Jun 2011 08:36:00 GMT
Jean-Daniel Cryans wrote:
> You expect a MapReduce job to be faster than a Scan on small data,
> your expectation is wrong.

i never expected an MR job to be faster in every context

> There's a minimal cost to every MR job, which is of a few seconds, and
> you can't go around it.

for sure there is an overhead for an MR job, and a few seconds are OK, but not a whole minute...

so what time can be expected for processing a full scan of e.g. 1.000.000.000 rows in an hbase
cluster with e.g. 3 region servers?
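
(a rough way to frame the question is a back-of-envelope estimate: total rows divided by the aggregate scan rate of all region servers, plus the fixed job overhead. the sketch below is only that arithmetic; the function name and the per-server throughput are hypothetical placeholders, not measured HBase numbers, and it assumes the regions are spread evenly so that all servers scan in parallel.)

```python
def estimate_scan_seconds(total_rows, region_servers,
                          rows_per_sec_per_server, job_overhead_sec=0.0):
    """Estimate wall-clock seconds for a full table scan.

    Assumes rows are evenly distributed and every region server scans
    in parallel at the same rate. rows_per_sec_per_server is an
    assumption the caller must measure on their own cluster.
    """
    if region_servers <= 0 or rows_per_sec_per_server <= 0:
        raise ValueError("region_servers and rows_per_sec_per_server must be positive")
    aggregate_rate = region_servers * rows_per_sec_per_server
    return job_overhead_sec + total_rows / aggregate_rate


if __name__ == "__main__":
    # 100_000 rows/s per server is a placeholder, not a benchmark result.
    seconds = estimate_scan_seconds(1_000_000_000, 3, 100_000, job_overhead_sec=60)
    print(f"estimated full scan: {seconds / 60:.1f} minutes")
```

(plugging in a scan rate measured on a small sample of the real table would show whether on-demand processing is realistic or whether a daily batch run is the better fit.)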

i'm just wondering if it's worth running the full scan only once a day and persisting the
results
i had hoped to be able to process it on demand, but if it takes too much time, that's not acceptable

andre

