hbase-user mailing list archives

From Jean-Daniel Cryans <jdcry...@apache.org>
Subject Re: Hbase inserts very slow
Date Wed, 16 Feb 2011 23:53:33 GMT
I don't understand... is having the same qualifier a hard requirement?
Worst case you could have a prefix.
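
For example (just a sketch; "f" and the "A:"/"B:" qualifier prefixes are
made-up names), instead of three families you could keep one family and
prefix the qualifiers:

  Put put = new Put(rowKey);
  put.add(Bytes.toBytes("f"), Bytes.toBytes("A:" + qualifier), valueA);
  put.add(Bytes.toBytes("f"), Bytes.toBytes("B:" + qualifier), valueB);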

J-D

On Wed, Feb 16, 2011 at 3:29 PM, Vishal Kapoor
<vishal.kapoor.in@gmail.com> wrote:
> J-D,
> I also should mention that my data distribution across the three families
> is 1:1:1.
> I have three families so that I can have the same qualifiers in each of
> them, and the data in those families is LIVE:MasterA:MasterB.
>
> Vishal
>
> On Wed, Feb 16, 2011 at 6:22 PM, Jean-Daniel Cryans <jdcryans@apache.org> wrote:
>
>> Very often there's no need for more than 1 family; I would suggest you
>> explore that possibility first.
>>
>> J-D
>>
>> On Wed, Feb 16, 2011 at 3:13 PM, Vishal Kapoor
>> <vishal.kapoor.in@gmail.com> wrote:
>> > Does that mean I am only left with the choice of writing to the three
>> > families in three different map jobs?
>> > Or can I do it any other way?
>> > thanks,
>> > Vishal
>> >
>> > On Wed, Feb 16, 2011 at 12:56 PM, Jean-Daniel Cryans
>> > <jdcryans@apache.org> wrote:
>> >>
>> >> First, loading into 3 families is currently a bad idea and is bound
>> >> to be inefficient; here's why:
>> >> https://issues.apache.org/jira/browse/HBASE-3149
>> >>
>> >> Those log lines mean that your scanning of the first table is
>> >> generating a lot of block cache churn. When setting up the Map, call
>> >> setCacheBlocks(false) on your Scan before passing it to
>> >> TableMapReduceUtil.initTableMapperJob.
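>> >>
>> >> For example (untested sketch; the mapper and output classes are
>> >> placeholders for yours):
>> >>
>> >>   Scan scan = new Scan();
>> >>   scan.setCacheBlocks(false); // don't churn the block cache from MR
>> >>   scan.setCaching(500);       // and fetch rows in bigger batches
>> >>   TableMapReduceUtil.initTableMapperJob("LIVE_RAW_TABLE", scan,
>> >>       YourMapper.class, ImmutableBytesWritable.class, Put.class, job);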
>> >>
>> >> Finally, you may want to give more memory to the region server.
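>> >>
>> >> For example, in conf/hbase-env.sh (value is in MB; pick what your
>> >> box can actually spare):
>> >>
>> >>   export HBASE_HEAPSIZE=2000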
>> >>
>> >> J-D
>> >>
>> >> On Wed, Feb 16, 2011 at 7:35 AM, Vishal Kapoor
>> >> <vishal.kapoor.in@gmail.com> wrote:
>> >> > Lars,
>> >> >
>> >> > I am still running in pseudo-distributed mode, on
>> >> > hadoop-0.20.2+737/
>> >> > and hbase-0.90.0, with the hadoop jar from the hadoop install.
>> >> >
>> >> > I have a LIVE_RAW_TABLE table, which gets values from a live system.
>> >> > I go through each row of that table and get the row ids of two
>> >> > reference tables, TABLE_A and TABLE_B, from it, then I explode this
>> >> > into a new table, LIVE_TABLE.
>> >> > I use
>> >> > TableMapReduceUtil.initTableReducerJob("LIVE_TABLE", null, job);
>> >> >
>> >> > LIVE_TABLE has three families, LIVE, A and B, and the row id is a
>> >> > composite key:
>> >> > reverseTimeStamp/rowidA/rowIdB
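>> >> > For illustration, I build that key roughly like this (variable
>> >> > names made up here):
>> >> >
>> >> >   long reverseTs = Long.MAX_VALUE - System.currentTimeMillis();
>> >> >   byte[] row = Bytes.toBytes(reverseTs + "/" + rowIdA + "/" + rowIdB);
>> >> >   Put put = new Put(row);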
>> >> > After that I run a bunch of map reduce jobs to consolidate the data;
>> >> > to start with, I have around 15000 rows in LIVE_RAW_TABLE.
>> >> >
>> >> > When I start my job, I see it running quite well till I am almost
>> >> > done with 5000 rows;
>> >> > then it starts printing the messages in the logs which I did not
>> >> > use to see before.
>> >> > The job used to run for around 900 sec (I have a lot of data
>> >> > parsing while exploding);
>> >> > 15000 rows from LIVE_RAW_TABLE explode to around 500,000 rows in
>> >> > LIVE_TABLE.
>> >> >
>> >> > Since those debug messages appeared, the job runs for around 2500 sec.
>> >> > I have not changed anything, including the table design.
>> >> >
>> >> > Here is my table description:
>> >> >
>> >> > {NAME => 'LIVE_TABLE', FAMILIES => [
>> >> >   {NAME => 'LIVE', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0',
>> >> >    VERSIONS => '1', COMPRESSION => 'NONE', TTL => '2147483647',
>> >> >    BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'},
>> >> >   {NAME => 'A', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0',
>> >> >    VERSIONS => '1', COMPRESSION => 'NONE', TTL => '2147483647',
>> >> >    BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'},
>> >> >   {NAME => 'B', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0',
>> >> >    VERSIONS => '1', COMPRESSION => 'NONE', TTL => '2147483647',
>> >> >    BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}
>> >> >
>> >> > thanks for all your help.
>> >> >
>> >> > Vishal
>> >> >
>> >> > On Wed, Feb 16, 2011 at 4:26 AM, Lars George <lars.george@gmail.com>
>> >> > wrote:
>> >> >
>> >> >> Hi Vishal,
>> >> >>
>> >> >> These are DEBUG-level messages from the block cache; there is
>> >> >> nothing wrong with that. Can you explain more about what you do
>> >> >> and see?
>> >> >>
>> >> >> Lars
>> >> >>
>> >> >> On Wed, Feb 16, 2011 at 4:24 AM, Vishal Kapoor
>> >> >> <vishal.kapoor.in@gmail.com> wrote:
>> >> >> > All was working fine, and suddenly I see a lot of logs like the
>> >> >> > ones below:
>> >> >> >
>> >> >> > 2011-02-15 22:19:04,023 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache:
>> >> >> > Block cache LRU eviction started; Attempting to free 19.88 MB of total=168.64 MB
>> >> >> > 2011-02-15 22:19:04,025 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache:
>> >> >> > Block cache LRU eviction completed; freed=19.91 MB, total=148.73 MB,
>> >> >> > single=74.47 MB, multi=92.37 MB, memory=166.09 KB
>> >> >> > 2011-02-15 22:19:11,207 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache:
>> >> >> > Block cache LRU eviction started; Attempting to free 19.88 MB of total=168.64 MB
>> >> >> > 2011-02-15 22:19:11,444 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache:
>> >> >> > Block cache LRU eviction completed; freed=19.93 MB, total=149.09 MB,
>> >> >> > single=73.91 MB, multi=93.32 MB, memory=166.09 KB
>> >> >> > 2011-02-15 22:19:21,494 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache:
>> >> >> > Block cache LRU eviction started; Attempting to free 19.87 MB of total=168.62 MB
>> >> >> > 2011-02-15 22:19:21,760 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache:
>> >> >> > Block cache LRU eviction completed; freed=19.91 MB, total=148.84 MB,
>> >> >> > single=74.22 MB, multi=92.73 MB, memory=166.09 KB
>> >> >> > 2011-02-15 22:19:39,838 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache:
>> >> >> > Block cache LRU eviction started; Attempting to free 19.87 MB of total=168.62 MB
>> >> >> > 2011-02-15 22:19:39,852 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache:
>> >> >> > Block cache LRU eviction completed; freed=19.91 MB, total=148.71 MB,
>> >> >> > single=75.35 MB, multi=91.48 MB, memory=166.09 KB
>> >> >> > 2011-02-15 22:19:49,768 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache:
>> >> >> > Block cache LRU eviction started; Attempting to free 19.87 MB of total=168.62 MB
>> >> >> > 2011-02-15 22:19:49,770 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache:
>> >> >> > Block cache LRU eviction completed; freed=19.91 MB, total=148.71 MB,
>> >> >> > single=76.48 MB, multi=90.35 MB, memory=166.09 KB
>> >> >> >
>> >> >> >
>> >> >> > I haven't changed anything, including the table definitions.
>> >> >> > Please let me know where to look...
>> >> >> >
>> >> >> > thanks,
>> >> >> > Vishal Kapoor
>> >> >> >
>> >> >>
>> >> >
>> >
>> >
>>
>
