hbase-user mailing list archives

From 陈加俊 <cjjvict...@gmail.com>
Subject Re: the files of one table is so big?
Date Fri, 01 Apr 2011 06:58:58 GMT
I want to copy the files of one table from one cluster (A) to another (B),
so I do it in three steps:

1. bin/hadoop fs -copyToLocal on cluster A
2. scp the files from A to B
3. bin/hadoop fs -copyFromLocal on cluster B

I scanned the table and saved the data to a file in a format I defined the
day before yesterday, and that file is only 18M.
Strangely, the files copied with -copyFromLocal total 99G.

Why?
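For context on J-D's point below about per-cell storage: HBase's KeyValue
layout stores each value together with its full key (row, family, qualifier,
timestamp, plus length fields). A rough size estimate, with illustrative
lengths that are my assumptions, not numbers from this thread:

```shell
# Rough per-cell on-disk size under HBase's KeyValue layout (a sketch).
ROW=8         # bytes for the row key (e.g. a Long)
FAMILY=3      # bytes for the family name
QUALIFIER=10  # bytes for the column qualifier
VALUE=100     # bytes for the value itself
# Fixed framing per KeyValue: 4B key length + 4B value length +
# 2B row length + 1B family length + 8B timestamp + 1B key type.
FRAMING=$((4 + 4 + 2 + 1 + 8 + 1))
TOTAL=$((FRAMING + ROW + FAMILY + QUALIFIER + VALUE))
echo "$TOTAL bytes per cell before compression"   # 141 in this example
```

With small values, the key parts alone can dominate the on-disk size.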

2011/4/1 Jean-Daniel Cryans <jdcryans@apache.org>

> Depends on what you're trying to do. Like I said, you didn't give us a lot
> of information, so we're pretty much in the dark regarding what you're
> trying to achieve.
>
> At first you asked why the files were so big, I don't see the relation
> with the log files.
>
> Also, I'm not sure why you referred to the number of versions; unless
> you are overwriting your data, it's irrelevant to on-disk size. Again,
> not enough information about what you're trying to do.
>
> J-D
>
> On Thu, Mar 31, 2011 at 12:27 AM, 陈加俊 <cjjvictory@gmail.com> wrote:
> > Can I skip the log files?
> >
> > On Thu, Mar 31, 2011 at 2:17 PM, 陈加俊 <cjjvictory@gmail.com> wrote:
> >>
> >> I found there are so many log files under the table folder, and they are
> >> very big!
> >>
> >>> On Thu, Mar 31, 2011 at 1:37 PM, 陈加俊 <cjjvictory@gmail.com> wrote:
> >>>>
> >>>> Thank you, J-D.
> >>>> The type of the key is Long, and the family's number of versions is 5.
> >>>>
> >>>>
> >>>> On Thu, Mar 31, 2011 at 12:42 PM, Jean-Daniel Cryans
> >>>> <jdcryans@apache.org> wrote:
> >>>>>
> >>>>> (Trying to answer with the very little information you gave us.)
> >>>>>
> >>>>> In HBase, every cell is stored along with its row key, family name,
> >>>>> qualifier, and timestamp (plus the length of each). Depending on how
> >>>>> big your keys are, this can grow your total dataset considerably. So
> >>>>> it's not just a function of value sizes.
> >>>>>
> >>>>> J-D
> >>>>>
> >>>>> On Wed, Mar 30, 2011 at 9:34 PM, 陈加俊 <cjjvictory@gmail.com> wrote:
> >>>>> > I scanned the table: it has just 29,000 rows, and each row is under
> >>>>> > 1K. I saved it to a file, which is 18M.
> >>>>> >
> >>>>> > But when I used /app/cloud/hadoop/bin/hadoop fs -copyFromLocal, the
> >>>>> > files total 99G.
> >>>>> >
> >>>>> > Why?
> >>>>> > --
> >>>>> > Thanks & Best regards
> >>>>> > jiajun
> >>>>> >



-- 
Thanks & Best regards
jiajun
