hbase-user mailing list archives

From Xine Jar <xineja...@googlemail.com>
Subject Re: number of tablets and hbase table size
Date Wed, 02 Sep 2009 09:51:49 GMT
Hello,
The theoretical concept of the table is clear to me. I am aware that the
writes are kept in memory in a buffer called the memtable, and whenever this
buffer reaches a threshold, the memtable is automatically flushed to disk.

Now I have tried to flush the table by executing the following:

hbase(main):001:0> flush 'myTable'
0 row(s) in 0.2019 seconds

hbase(main):002:0> describe 'myTable'
{NAME => 'myTable', FAMILIES => [{NAME => 'cf', COMPRESSION => 'NONE',
VERSIONS => '3', LENGTH => '2147483647', TTL => '-1', IN_MEMORY => 'false',
BLOCKCACHE => 'false'}]}
Q1 - Does the output "0 row(s) in 0.2019 seconds" mean that it did not flush
anything?

Q2 - Does IN_MEMORY => 'false' mean that the table is not in memory, so it is
on disk? If so, I still cannot see it in the DFS when executing
"bin/hadoop dfs -ls".


Thank you for taking a look at this.

Regards,
CJ

On Tue, Sep 1, 2009 at 7:13 PM, Jean-Daniel Cryans <jdcryans@apache.org> wrote:

> Inline.
>
> J-D
>
> > On Tue, Sep 1, 2009 at 1:05 PM, Xine Jar <xinejar22@googlemail.com> wrote:
> > Thank you,
> >
> > While the answers to Q3 and Q4 were clear enough, I still have some
> > problems with the first two questions.
>
> Good
>
> >
> > - Which entry in hbase-default.xml allows me to check the size of a
> > tablet?
>
> Those are configuration parameters, not commands. A region will split
> when a family reaches that size. See
> http://wiki.apache.org/hadoop/Hbase/HbaseArchitecture#hregion for more
> info on splitting.
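>
> If you want to check the value from code rather than by opening the XML,
> here is a minimal sketch against the client Configuration API (the 256 MB
> fallback below is only an illustration, not necessarily your default):
>
> import org.apache.hadoop.hbase.HBaseConfiguration;
>
> public class ShowMaxFilesize {
>   public static void main(String[] args) {
>     // Loads hbase-default.xml and hbase-site.xml from the classpath.
>     HBaseConfiguration conf = new HBaseConfiguration();
>     // The second argument is only a fallback if the property is missing.
>     long max = conf.getLong("hbase.hregion.max.filesize", 268435456L);
>     System.out.println("hbase.hregion.max.filesize = " + max);
>   }
> }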
>
> >
> > - In Hadoop, I used to copy a file to the DFS with "bin/hadoop dfs
> > -copyFromLocal filesource fileDFS". Having this file in the DFS, I could
> > list it with "bin/hadoop dfs -ls" and check its size with "bin/hadoop
> > dfs -du fileDFS". But when I create an hbase table, the table does not
> > appear in the DFS, so the latter command gives an error that it cannot
> > find the table! So how can I point to the folder of the table?
>
> Just make sure the table is flushed to disk; the writes are kept in
> memory, as described in the link I pasted for the previous question.
> You can force a flush by going into the shell and issuing "flush 'table'",
> where 'table' is replaced with the name of your table.
>
> >
> > Regards,
> > CJ
> >
> >
> > On Tue, Sep 1, 2009 at 5:00 PM, Jean-Daniel Cryans <jdcryans@apache.org> wrote:
> >
> >> Answers inline.
> >>
> >> J-D
> >>
> >> On Tue, Sep 1, 2009 at 10:53 AM, Xine Jar <xinejar22@googlemail.com> wrote:
> >> > Hello,
> >> > I have a cluster of 6 nodes running hadoop 0.19.3 and hbase 0.19.1. I
> >> > have managed to write small programs to test the settings, and
> >> > everything seems to be fine.
> >> >
> >> > I wrote a mapreduce program reading a small hbase table (100 rows, one
> >> > column family, 6 columns) and summing some values. In my opinion the
> >> > job is slow: it takes 19 seconds. I would like to look more closely at
> >> > what is going on, for example whether the table is split into tablets
> >> > or not. I would therefore appreciate it if someone could answer the
> >> > following questions:
> >>
> >> With that size, that's expected. You would be better off scanning your
> >> table directly instead; MapReduce has a startup cost, and 19 seconds
> >> isn't that much.
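> >>
> >> For comparison, here is a rough sketch of a plain client-side scan that
> >> sums one column, written against the 0.19-style API (the column name
> >> cf:value and the assumption that the values were stored as 8-byte longs
> >> are mine, so adjust them to your schema):
> >>
> >> import org.apache.hadoop.hbase.HBaseConfiguration;
> >> import org.apache.hadoop.hbase.client.HTable;
> >> import org.apache.hadoop.hbase.client.Scanner;
> >> import org.apache.hadoop.hbase.io.Cell;
> >> import org.apache.hadoop.hbase.io.RowResult;
> >> import org.apache.hadoop.hbase.util.Bytes;
> >>
> >> public class SumColumn {
> >>   public static void main(String[] args) throws Exception {
> >>     HTable table = new HTable(new HBaseConfiguration(), "myTable");
> >>     // Only ask the scanner for the single column we need.
> >>     Scanner scanner = table.getScanner(new String[] { "cf:value" });
> >>     long sum = 0;
> >>     try {
> >>       for (RowResult row : scanner) {
> >>         Cell cell = row.get(Bytes.toBytes("cf:value"));
> >>         if (cell != null) {
> >>           // Assumes the values were written with Bytes.toBytes(long).
> >>           sum += Bytes.toLong(cell.getValue());
> >>         }
> >>       }
> >>     } finally {
> >>       scanner.close();
> >>     }
> >>     System.out.println("sum = " + sum);
> >>   }
> >> }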
> >>
> >> >
> >> >
> >> > Q1 - Does the value of "hbase.hregion.max.filesize" in
> >> > hbase-default.xml indicate the maximum size of a tablet in bytes?
> >>
> >> It's the maximum size of a family (in a region) in bytes.
> >>
> >> >
> >> > Q2 - How can I know the size of the hbase table I have created? (I
> >> > guess the "describe" command from the shell does not provide it.)
> >>
> >> Size as in disk space? You could use the hadoop dfs -du command on
> >> your table's folder.
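> >>
> >> If you prefer doing that from code instead of the command line, here is
> >> a small sketch with the Hadoop FileSystem API. The /hbase/myTable path
> >> is an assumption: it supposes hbase.rootdir points at /hbase on your
> >> HDFS, so substitute your own rootdir and table name:
> >>
> >> import org.apache.hadoop.conf.Configuration;
> >> import org.apache.hadoop.fs.FileSystem;
> >> import org.apache.hadoop.fs.Path;
> >>
> >> public class TableDiskUsage {
> >>   public static void main(String[] args) throws Exception {
> >>     // Picks up fs.default.name from hadoop-site.xml on the classpath.
> >>     FileSystem fs = FileSystem.get(new Configuration());
> >>     // Same idea as "bin/hadoop dfs -du /hbase/myTable".
> >>     Path tableDir = new Path("/hbase/myTable");
> >>     long bytes = fs.getContentSummary(tableDir).getLength();
> >>     System.out.println(tableDir + " uses " + bytes + " bytes");
> >>   }
> >> }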
> >>
> >> >
> >> > Q3 - Is there a way to know the real number of tablets constituting my
> >> > table?
> >>
> >> In the Master's web UI, click on the name of your table. If you want
> >> to do that programmatically, you can indirectly do it by calling
> >> HTable.getEndKeys() and the size of that array is the number of
> >> regions.
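> >>
> >> A minimal sketch of that, with the caveat that the method names assume
> >> the 0.19-era client API:
> >>
> >> import org.apache.hadoop.hbase.HBaseConfiguration;
> >> import org.apache.hadoop.hbase.client.HTable;
> >>
> >> public class CountRegions {
> >>   public static void main(String[] args) throws Exception {
> >>     HTable table = new HTable(new HBaseConfiguration(), "myTable");
> >>     // One end key per region, so the array length is the region count.
> >>     byte[][] endKeys = table.getEndKeys();
> >>     System.out.println("myTable has " + endKeys.length + " region(s)");
> >>   }
> >> }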
> >>
> >> >
> >> > Q4 - Is there a way to get more information on the tablets handled by
> >> > each regionserver? (their number, the rows constituting each tablet)
> >>
> >> In the Master's web UI, click on the region server you want info for.
> >> Getting the number of rows inside a region, for the moment, can't be
> >> done directly (requires doing a scan between the start and end keys of
> >> a region and counting the number of rows you see).
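> >>
> >> If you do end up needing a per-region row count, here is a rough sketch
> >> of that scan-and-count idea (again 0.19-style names from memory, and
> >> "cf:" is the assumed family of your table):
> >>
> >> import org.apache.hadoop.hbase.HBaseConfiguration;
> >> import org.apache.hadoop.hbase.client.HTable;
> >> import org.apache.hadoop.hbase.client.Scanner;
> >> import org.apache.hadoop.hbase.io.RowResult;
> >> import org.apache.hadoop.hbase.util.Bytes;
> >>
> >> public class CountRowsPerRegion {
> >>   public static void main(String[] args) throws Exception {
> >>     HTable table = new HTable(new HBaseConfiguration(), "myTable");
> >>     byte[][] startKeys = table.getStartKeys();
> >>     byte[][] endKeys = table.getEndKeys();
> >>     for (int i = 0; i < startKeys.length; i++) {
> >>       // Scan the whole 'cf' family from this region's first row onward.
> >>       Scanner scanner = table.getScanner(
> >>           new byte[][] { Bytes.toBytes("cf:") }, startKeys[i]);
> >>       long rows = 0;
> >>       try {
> >>         for (RowResult row : scanner) {
> >>           // Stop once we cross into the next region (an empty end key
> >>           // means this is the last region).
> >>           if (endKeys[i].length > 0
> >>               && Bytes.compareTo(row.getRow(), endKeys[i]) >= 0) {
> >>             break;
> >>           }
> >>           rows++;
> >>         }
> >>       } finally {
> >>         scanner.close();
> >>       }
> >>       System.out.println("region " + i + ": " + rows + " row(s)");
> >>     }
> >>   }
> >> }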
> >>
> >> >
> >> > Thank you for your help,
> >> > CJ
> >> >
> >>
> >
>
