hbase-user mailing list archives

From Jean-Daniel Cryans <jdcry...@apache.org>
Subject Re: the files of one table is so big?
Date Fri, 01 Apr 2011 16:51:46 GMT
I just tried to copy a local file to HDFS and it's the same size.
Maybe you are reading it wrong?

J-D
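
One way to double-check the numbers, as a rough sketch rather than anything HBase-specific: total the bytes under the local copy (the directory written by -copyToLocal) per top-level entry, then compare against what HDFS reports for the same path, for instance via hadoop fs -du. The names dir_sizes and human are made up for this illustration, and the code only reads the local filesystem.

```python
import os


def dir_sizes(root):
    """Return {top-level entry name: total bytes} for all files under root.

    Files sitting directly in root are grouped under ".".
    """
    totals = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        top = rel.split(os.sep)[0]
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # do not count symlinks
            totals[top] = totals.get(top, 0) + os.path.getsize(path)
    return totals


def human(nbytes):
    """Format a byte count with binary units (1K = 1024 bytes)."""
    size = float(nbytes)
    for unit in ("B", "K", "M", "G", "T"):
        if size < 1024 or unit == "T":
            return f"{size:.1f}{unit}"
        size /= 1024


def report(root):
    """Return one line per top-level entry, largest first, plus a total."""
    sizes = dir_sizes(root)
    ordered = sorted(sizes.items(), key=lambda kv: kv[1], reverse=True)
    lines = [f"{human(n):>10}  {name}" for name, n in ordered]
    lines.append(f"{human(sum(sizes.values())):>10}  total")
    return lines
```

Printing report() for the directory that was copied to B shows which subdirectory accounts for most of the bytes before -copyFromLocal is run.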

On Thu, Mar 31, 2011 at 11:58 PM, 陈加俊 <cjjvictory@gmail.com> wrote:
> I want to copy the files of one table from one cluster to another.
> I do it in three steps:
> 1. bin/hadoop fs -copyToLocal at A
> 2. scp the files from A to B
> 3. bin/hadoop fs -copyFromLocal at B
> The day before yesterday I scanned the table and saved the data to a file,
> in a format I defined myself, and found the file is only 18M.
> It is strange that after -copyFromLocal the files take 99G.
> Why?
> 2011/4/1 Jean-Daniel Cryans <jdcryans@apache.org>
>>
>> Depends what you're trying to do? Like I said you didn't give us a lot
>> of information so we're pretty much in the dark regarding what you're
>> trying to achieve.
>>
>> At first you asked why the files were so big, I don't see the relation
>> with the log files.
>>
>> Also I'm not sure why you referred to the number of versions, unless
>> you are overwriting your data it's irrelevant to on-disk size. Again
>> not enough information about what you're trying to do.
>>
>> J-D
>>
>> On Thu, Mar 31, 2011 at 12:27 AM, 陈加俊 <cjjvictory@gmail.com> wrote:
>> > Can I skip the log files?
>> >
>> > On Thu, Mar 31, 2011 at 2:17 PM, 陈加俊 <cjjvictory@gmail.com> wrote:
>> >>
>> >> I found there are so many log files under the table folder, and they
>> >> are very big!
>> >>
>> >> On Thu, Mar 31, 2011 at 2:16 PM, 陈加俊 <cjjvictory@gmail.com> wrote:
>> >>>
>> >>> I found there are so many log files under the table folder, and they
>> >>> are very big!
>> >>>
>> >>>
>> >>>
>> >>> On Thu, Mar 31, 2011 at 1:37 PM, 陈加俊 <cjjvictory@gmail.com> wrote:
>> >>>>
>> >>>> Thank you, J-D.
>> >>>> The key type is Long, and the family's VERSIONS is set to 5.
>> >>>>
>> >>>>
>> >>>> On Thu, Mar 31, 2011 at 12:42 PM, Jean-Daniel Cryans
>> >>>> <jdcryans@apache.org> wrote:
>> >>>>>
>> >>>>> (Trying to answer with the very little information you gave us)
>> >>>>>
>> >>>>> So in HBase every cell is stored along with its row key, family
>> >>>>> name, qualifier and timestamp (plus the length of each). Depending
>> >>>>> on how big your keys are, this can grow your total dataset. So
>> >>>>> on-disk size is not just a function of value sizes.
>> >>>>>
>> >>>>> J-D
>> >>>>>
>> >>>>> On Wed, Mar 30, 2011 at 9:34 PM, 陈加俊 <cjjvictory@gmail.com> wrote:
>> >>>>> > I scanned the table. It has just 29000 rows, and each row is
>> >>>>> > less than 1 KB. I saved it to files totalling 18M.
>> >>>>> >
>> >>>>> > But after I used /app/cloud/hadoop/bin/hadoop fs -copyFromLocal,
>> >>>>> > it takes 99G.
>> >>>>> >
>> >>>>> > Why?
>> >>>>> > --
>> >>>>> > Thanks & Best regards
>> >>>>> > jiajun
>> >>>>> >
> --
> Thanks & Best regards
> jiajun
>
>
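
To put rough numbers on the earlier point in this thread, that every cell is stored with its row key, family name, qualifier and timestamp, here is a back-of-the-envelope sketch. The 29000 rows, the 8-byte Long key and the 5 versions come from the thread; the cells per row, the family and qualifier lengths and the value size are assumptions picked only so that the raw values add up to about 18M. The 20-byte fixed overhead assumes the classic KeyValue layout, and the estimate ignores compression, HFile indexes and HDFS replication.

```python
# Fixed per-cell bytes, assuming the classic KeyValue layout:
# 4 (key length) + 4 (value length) + 2 (row length) + 1 (family length)
# + 8 (timestamp) + 1 (key type)
FIXED_OVERHEAD = 4 + 4 + 2 + 1 + 8 + 1


def cell_size(row_len, family_len, qualifier_len, value_len):
    """Approximate stored bytes for one cell, key parts included."""
    return FIXED_OVERHEAD + row_len + family_len + qualifier_len + value_len


def table_bytes(rows, cells_per_row, versions,
                row_len, family_len, qualifier_len, value_len):
    """Approximate bytes for a table where every cell has the same shape."""
    per_cell = cell_size(row_len, family_len, qualifier_len, value_len)
    return rows * cells_per_row * versions * per_cell


# From the thread: 29000 rows, Long (8-byte) row key, up to 5 versions.
ROWS = 29000
ROW_LEN = 8
MAX_VERSIONS = 5

# Hypothetical values, not from the thread.
CELLS_PER_ROW = 10
FAMILY_LEN = 2
QUALIFIER_LEN = 8
VALUE_LEN = 64

raw_values = ROWS * CELLS_PER_ROW * VALUE_LEN
one_version = table_bytes(ROWS, CELLS_PER_ROW, 1,
                          ROW_LEN, FAMILY_LEN, QUALIFIER_LEN, VALUE_LEN)
all_versions = table_bytes(ROWS, CELLS_PER_ROW, MAX_VERSIONS,
                           ROW_LEN, FAMILY_LEN, QUALIFIER_LEN, VALUE_LEN)

print("raw values only:      ", raw_values, "bytes")
print("with keys, 1 version: ", one_version, "bytes")
print("with keys, 5 versions:", all_versions, "bytes")
```

With these assumed shapes the stored size comes out at roughly 30 MB for one version and roughly 148 MB if all five versions were kept, so the result depends heavily on how long the keys and qualifiers really are.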
