hbase-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Anoop John <anoop.hb...@gmail.com>
Subject Re: All memstores flushed to be quite small files
Date Tue, 28 Mar 2017 03:32:11 GMT
You have to keep in mind one thing.  The memstore size 128 MB said  is
the memstore's heap size.  Not the sum of all cell's key+ value size.
 When a Cell is added into memstore there will be big overhead also.
(~100 bytes per cell java overhead).  Ur cell size is so small that
more than half of the total heap size per cell comes from overhead.
There itself ur effective data size flushed getting half.  And then on
top of that u have compression.   Does this sounds reasonable answer?
 (And u said u have only 1 CF so there is no split of 128 MB across
diff CFs)

-Anoop-

On Tue, Mar 28, 2017 at 7:37 AM, Hef <hef.online@gmail.com> wrote:
> We have only one column family.
> Table properties:
>  {NAME => 't', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW',
> REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION => 'SNAPPY',
> MIN_VERSIONS => '0',KEEP_DELETED_CELLS => 'FALSE', BLOCKSIZE => '6216900',
> IN_MEMORY => 'false', BLOCKCACHE => 'true',TTL=> 5184000}
>
> Each cell is around 90bytes, with rowkey taking 60bytes, column qualifier
> taking 2 bytes, and value 4bytes. There may be 3600 columns at max per row,
> but generally most rows have no more than 1000 columns.
>
> Though SNAPPY is used, I don't think the compression ratio can be so high
> to be 18x.
>
> HBase Release:
> CDH5.9 HBase1.2
>
> Here are more logs:
>
> 2017-03-28 01:53:14,868 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> Flushing 1/1 column families, memstore=128.02 MB
> 2017-03-28 01:53:15,024 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> writing data to region
> mytable,6A1066441DF045C6960D836EA5B82B16\x00\x00\x00\x00\x00\x00\x00\x19\x00\x06P\xC6,1490340809164.e9c7ee2a526aa7b696545dccb3f3feda.
> with WAL disabled. Data may be lost in the event of a crash.
> 2017-03-28 01:53:16,002 INFO
> org.apache.hadoop.hbase.regionserver.wal.FSHLog: Rolled WAL
> /hbase/WALs/ip-10-1-1-107,60020,1490588175688/ip-10-1-1-107%2C60020%2C1490588175688.null1.1490665962353
> with entries=107241, filesize=122.08 MB; new WAL
> /hbase/WALs/ip-10-1-1-107,60020,1490588175688/ip-10-1-1-107%2C60020%2C1490588175688.null1.1490665995822
> 2017-03-28 01:53:16,026 INFO
> org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher: Flushed,
> sequenceid=14159831, memsize=128.0 M, hasBloomFilter=true, into tmp file
> hdfs://nameservice1/hbase/data/default/mytable/e9c7ee2a526aa7b696545dccb3f3feda/.tmp/225732a8f58842ddb10008e1dc6a872c
> 2017-03-28 01:53:16,288 INFO org.apache.hadoop.hbase.regionserver.HStore:
> Added
> hdfs://nameservice1/hbase/data/default/mytable/e9c7ee2a526aa7b696545dccb3f3feda/t/225732a8f58842ddb10008e1dc6a872c,
> entries=546485, sequenceid=14159831, *filesize=7.5 M*
> 2017-03-28 01:53:16,290 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> Finished memstore flush of ~128.02 MB/134235854, currentsize=428.55
> KB/438840 for region
> mytable,6A1066441DF045C6960D836EA5B82B16\x00\x00\x00\x00\x00\x00\x00\x19\x00\x06P\xC6,1490340809164.e9c7ee2a526aa7b696545dccb3f3feda.
> in 1422ms, sequenceid=14159831, compaction requested=true
> 2017-03-28 01:53:16,334 INFO
> org.apache.hadoop.hbase.regionserver.wal.FSHLog: Slow sync cost: 283 ms,
> current pipeline:
> [DatanodeInfoWithStorage[10.1.1.107:50010,DS-07fc4de8-62bd-4aa3-8e18-c2ce56381113,DISK],
> DatanodeInfoWithStorage[10.1.1.104:50010,DS-c9749aaf-7213-41af-87ec-2214cf166830,DISK],
> DatanodeInfoWithStorage[10.1.1.102:50010
> ,DS-59b137d0-30cb-404f-8b7f-40ee8c18f72b,DISK]]
> 2017-03-28 01:53:16,386 INFO
> org.apache.hadoop.hbase.regionserver.wal.FSHLog: Rolled WAL
> /hbase/WALs/ip-10-1-1-107,60020,1490588175688/ip-10-1-1-107%2C60020%2C1490588175688.null0.1490665962438
> with entries=75434, filesize=90.34 MB; new WAL
> /hbase/WALs/ip-10-1-1-107,60020,1490588175688/ip-10-1-1-107%2C60020%2C1490588175688.null0.1490665996002
> 2017-03-28 01:53:16,387 INFO
> org.apache.hadoop.hbase.regionserver.wal.FSHLog: Archiving
> hdfs://nameservice1/hbase/WALs/ip-10-1-1-107,60020,1490588175688/ip-10-1-1-107%2C60020%2C1490588175688.null0.1490664817220
> to
> hdfs://nameservice1/hbase/oldWALs/ip-10-1-1-107%2C60020%2C1490588175688.null0.1490664817220
> 2017-03-28 01:53:16,435 INFO
> org.apache.hadoop.hbase.regionserver.wal.FSHLog: Archiving
> hdfs://nameservice1/hbase/WALs/ip-10-1-1-107,60020,1490588175688/ip-10-1-1-107%2C60020%2C1490588175688.null0.1490664854680
> to
> hdfs://nameservice1/hbase/oldWALs/ip-10-1-1-107%2C60020%2C1490588175688.null0.1490664854680
> 2017-03-28 01:53:16,485 INFO
> org.apache.hadoop.hbase.regionserver.wal.FSHLog: Archiving
> hdfs://nameservice1/hbase/WALs/ip-10-1-1-107,60020,1490588175688/ip-10-1-1-107%2C60020%2C1490588175688.null0.1490664886354
> to
> hdfs://nameservice1/hbase/oldWALs/ip-10-1-1-107%2C60020%2C1490588175688.null0.1490664886354
> 2017-03-28 01:53:16,584 INFO org.apache.hadoop.hbase.regionserver.HStore:
> Completed compaction of 10 file(s) in t of
> voqa_ts,\x00\x06\x00\x00\x00\x00\xA8Y\x93\x1A\x00\x06:\xE9\x00\x04\x06\xC8\xFAm\xA1\xCDb\xD8\x8D\xEDj\x00\x87E\xC7\xBC,1479467163762.522552668b70080a9032e7acf3173aab.
> into 6650c711a6ef47738b3cd08a0f1e8bcb(size=63.4 M), total size for store is
> 1.3 G. This selection was in queue for 0sec, and took 6sec to execute.
> 2017-03-28 01:53:16,584 INFO
> org.apache.hadoop.hbase.regionserver.CompactSplitThread: Completed
> compaction: Request =
> regionName=voqa_ts,\x00\x06\x00\x00\x00\x00\xA8Y\x93\x1A\x00\x06:\xE9\x00\x04\x06\xC8\xFAm\xA1\xCDb\xD8\x8D\xEDj\x00\x87E\xC7\xBC,1479467163762.522552668b70080a9032e7acf3173aab.,
> storeName=t, fileCount=10, fileSize=61.7 M, priority=289,
> time=19792873029623870; duration=6sec
> 2017-03-28 01:53:20,284 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> Flushing 1/1 column families, memstore=128.00 MB
> 2017-03-28 01:53:20,520 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> writing data to region
> mytable,F,1490340846182.d2cc89e3d6be47a90afc2d4965b876ab. with WAL
> disabled. Data may be lost in the event of a crash.
> 2017-03-28 01:53:21,475 INFO
> org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher: Flushed,
> sequenceid=14186673, memsize=128.0 M, hasBloomFilter=true, into tmp file
> hdfs://nameservice1/hbase/data/default/mytable/d2cc89e3d6be47a90afc2d4965b876ab/.tmp/eb763a36a4684feabcc884f7afe34070
> 2017-03-28 01:53:21,511 INFO org.apache.hadoop.hbase.regionserver.HStore:
> Added
> hdfs://nameservice1/hbase/data/default/mytable/d2cc89e3d6be47a90afc2d4965b876ab/t/eb763a36a4684feabcc884f7afe34070,
> entries=539522, sequenceid=14186673, *filesize=7.4 M*
> 2017-03-28 01:53:21,512 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> Finished memstore flush of ~128.00 MB/134222903, currentsize=388.75
> KB/398080 for region
> mytable,F,1490340846182.d2cc89e3d6be47a90afc2d4965b876ab. in 1228ms,
> sequenceid=14186673, compaction requested=true
> 2017-03-28 01:53:22,533 INFO
> org.apache.hadoop.hbase.regionserver.wal.FSHLog: Slow sync cost: 171 ms,
> current pipeline:
> [DatanodeInfoWithStorage[10.1.1.107:50010,DS-07fc4de8-62bd-4aa3-8e18-c2ce56381113,DISK],
> DatanodeInfoWithStorage[10.1.1.104:50010,DS-c9749aaf-7213-41af-87ec-2214cf166830,DISK],
> DatanodeInfoWithStorage[10.1.1.105:50010
> ,DS-b7e613a5-adc7-4b71-8ea1-28982dcac15b,DISK]]
> 2017-03-28 01:53:22,533 INFO
> org.apache.hadoop.hbase.regionserver.wal.FSHLog: Slow sync cost: 171 ms,
> current pipeline:
> [DatanodeInfoWithStorage[10.1.1.107:50010,DS-07fc4de8-62bd-4aa3-8e18-c2ce56381113,DISK],
> DatanodeInfoWithStorage[10.1.1.104:50010,DS-c9749aaf-7213-41af-87ec-2214cf166830,DISK],
> DatanodeInfoWithStorage[10.1.1.105:50010
> ,DS-b7e613a5-adc7-4b71-8ea1-28982dcac15b,DISK]]
> 2017-03-28 01:53:22,533 INFO
> org.apache.hadoop.hbase.regionserver.wal.FSHLog: Slow sync cost: 171 ms,
> current pipeline:
> [DatanodeInfoWithStorage[10.1.1.107:50010,DS-07fc4de8-62bd-4aa3-8e18-c2ce56381113,DISK],
> DatanodeInfoWithStorage[10.1.1.104:50010,DS-c9749aaf-7213-41af-87ec-2214cf166830,DISK],
> DatanodeInfoWithStorage[10.1.1.105:50010
> ,DS-b7e613a5-adc7-4b71-8ea1-28982dcac15b,DISK]]
> 2017-03-28 01:53:22,533 INFO
> org.apache.hadoop.hbase.regionserver.wal.FSHLog: Slow sync cost: 171 ms,
> current pipeline:
> [DatanodeInfoWithStorage[10.1.1.107:50010,DS-07fc4de8-62bd-4aa3-8e18-c2ce56381113,DISK],
> DatanodeInfoWithStorage[10.1.1.104:50010,DS-c9749aaf-7213-41af-87ec-2214cf166830,DISK],
> DatanodeInfoWithStorage[10.1.1.105:50010
> ,DS-b7e613a5-adc7-4b71-8ea1-28982dcac15b,DISK]]
> 2017-03-28 01:53:22,533 INFO
> org.apache.hadoop.hbase.regionserver.wal.FSHLog: Slow sync cost: 171 ms,
> current pipeline:
> [DatanodeInfoWithStorage[10.1.1.107:50010,DS-07fc4de8-62bd-4aa3-8e18-c2ce56381113,DISK],
> DatanodeInfoWithStorage[10.1.1.104:50010,DS-c9749aaf-7213-41af-87ec-2214cf166830,DISK],
> DatanodeInfoWithStorage[10.1.1.105:50010
> ,DS-b7e613a5-adc7-4b71-8ea1-28982dcac15b,DISK]]
> 2017-03-28 01:53:23,254 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> Flushing 1/1 column families, memstore=128.01 MB
> 2017-03-28 01:53:23,442 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> writing data to region
> mytable,51FEE138DF654EC0B8B0677D0B7D31F7\x00\x00\x00\x00\x00\x00\x00\x9B\x00\x06Q\x0A\x1DX\xBC\x0B\x90\xC4\xDB\xBC\x16\xEA\xAB(d%L\xBC,1490340802704.70be161c4c6c5cdfbb5b468a89d678b8.
> with WAL disabled. Data may be lost in the event of a crash.
> 2017-03-28 01:53:24,374 INFO
> org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher: Flushed,
> sequenceid=14235013, memsize=128.0 M, hasBloomFilter=true, into tmp file
> hdfs://nameservice1/hbase/data/default/mytable/70be161c4c6c5cdfbb5b468a89d678b8/.tmp/b14cbbe5a2b243a8b66e0a187a2f91b6
> 2017-03-28 01:53:24,427 INFO org.apache.hadoop.hbase.regionserver.HStore:
> Added
> hdfs://nameservice1/hbase/data/default/mytable/70be161c4c6c5cdfbb5b468a89d678b8/t/b14cbbe5a2b243a8b66e0a187a2f91b6,
> entries=530828, sequenceid=14235013,* filesize=7.3 M*
> 2017-03-28 01:53:24,429 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> Finished memstore flush of ~128.01 MB/134224915, currentsize=181.56
> KB/185920 for region
> mytable,51FEE138DF654EC0B8B0677D0B7D31F7\x00\x00\x00\x00\x00\x00\x00\x9B\x00\x06Q\x0A\x1DX\xBC\x0B\x90\xC4\xDB\xBC\x16\xEA\xAB(d%L\xBC,1490340802704.70be161c4c6c5cdfbb5b468a89d678b8.
> in 1174ms, sequenceid=14235013, compaction requested=true
>
>
>
>
>
>
> On Mon, Mar 27, 2017 at 10:47 PM, Dave Latham <latham@davelink.net> wrote:
>
>> Do you have compression enabled, and is your data highly compressible?
>>
>> On Mon, Mar 27, 2017 at 6:26 AM, Hef <hef.online@gmail.com> wrote:
>>
>> > Hi,
>> > Does anyone have an idea why most of my 128MB memstore flushed files are
>> > only several MBs?
>> >
>> > There are a lot of logs look as below:
>> >
>> > 2017-03-27 13:10:25,064 INFO
>> > org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher: Flushed,
>> > sequenceid=13042496, memsize=128.0 M, hasBloomFilter=true, into tmp file
>> > hdfs://xxxxx
>> > 2017-03-27 13:10:25,093 INFO org.apache.hadoop.hbase.
>> regionserver.HStore:
>> > Added
>> > hdfs://nameservice1/hbase/data/default/mytable/
>> > 82f4e7d6931a57edf9654f32625f9fb7/t/0f59c07ef24644a38dce577d7fb6ac1b,
>> > entries=545226, sequenceid=13042496, filesize=7.3 M
>> > 2017-03-27 13:10:25,094 INFO org.apache.hadoop.hbase.
>> regionserver.HRegion:
>> > Finished memstore flush of ~128.02 MB/134235405, currentsize=352.34
>> > KB/360800 for region
>> > mytable,3808555269964B038A7BC1EC6CC19B03\x00\x00\x00\x00\x00\x00\
>> > x00\xA7\x00\x06P\xC9])\x9A\xC9\x01\xA9Q<\xBDX\x88\x07\
>> > x82\xEAR\xD7,1490340797381.82f4e7d6931a57edf9654f32625f9fb7.
>> > in 1052ms, sequenceid=13042496, compaction requested=false
>> >
>> > A 128MB memstore flushed to be only 7.3MB hfile, is that normal?
>> > In what case  could this happen?
>> >
>> > Thanks
>> > Hef
>> >
>>

Mime
View raw message