hbase-user mailing list archives

From Amandeep Khurana <ama...@gmail.com>
Subject Re: getSplits() in TableInputFormatBase
Date Sun, 11 Apr 2010 08:03:52 GMT
The number of splits is equal to the number of regions...
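
To illustrate the point above, here is a minimal sketch in plain Java with no HBase dependency. It is not the actual TableInputFormatBase code. The class RegionSplitSketch, the KeyRange type, and the sample start keys are hypothetical names made up for this example. It assumes the region start keys are already sorted and that an empty string marks the open start or end of the table. In the old mapred API the framework passes the job's configured map-task hint (JobConf.getNumMapTasks()) as numSplits, which would explain why the value differs between jobs; the sketch simply ignores it and emits one key range per region.

```java
import java.util.ArrayList;
import java.util.List;

public class RegionSplitSketch {

    /** Hypothetical stand-in for a table split: the key range [startRow, endRow). */
    static final class KeyRange {
        final String startRow;
        final String endRow; // "" means "up to the end of the table"

        KeyRange(String startRow, String endRow) {
            this.startRow = startRow;
            this.endRow = endRow;
        }

        @Override
        public String toString() {
            return "[" + startRow + ", " + endRow + ")";
        }
    }

    /**
     * Builds one split per region. The numSplits hint is accepted only to
     * mirror the shape of getSplits(job, numSplits); it is deliberately ignored.
     */
    static List<KeyRange> splitsPerRegion(String[] sortedRegionStartKeys, int numSplits) {
        List<KeyRange> splits = new ArrayList<>();
        for (int i = 0; i < sortedRegionStartKeys.length; i++) {
            // A region ends where the next one starts; the last region is open-ended.
            String endRow = (i + 1 < sortedRegionStartKeys.length)
                    ? sortedRegionStartKeys[i + 1]
                    : "";
            splits.add(new KeyRange(sortedRegionStartKeys[i], endRow));
        }
        return splits;
    }

    public static void main(String[] args) {
        // Assumed example: a table with three regions starting at "", "g" and "p".
        String[] startKeys = {"", "g", "p"};
        for (KeyRange range : splitsPerRegion(startKeys, 10)) {
            System.out.println(range);
        }
    }
}
```

Because each range lines up with exactly one region, a map task reading it can be scheduled on the server hosting that region, which is the data-locality benefit raised in the question below.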



On Sun, Apr 11, 2010 at 12:54 AM, john smith <js1987.smith@gmail.com> wrote:

> Hi,
>
> In the method "public org.apache.hadoop.mapred.InputSplit[] getSplits(
> org.apache.hadoop.mapred.JobConf job, int numSplits)"
>
> how is "numSplits" decided? I've seen different values of
> numSplits for different MR jobs. Any reason for this?
>
> Also, what if I ignore numSplits and always split at region
> boundaries? I guess that splitting at region boundaries makes more
> sense and somewhat improves data locality.
>
> Any comments on the above statement?
>
> Thanks
>
> j.S
>
