kudu-user mailing list archives

From Sergejs Andrejevs <S.Andrej...@intrum.com>
Subject Slow queries after massive deletions. Is it due to compaction?
Date Thu, 22 Nov 2018 15:57:45 GMT
Hi,

Is there a way to manually trigger MajorDeltaCompactionOp for a table/tablet/rowset?

We've run into an issue:

0.  A Kudu table is created.
1.  Data is inserted into it.
2.  A select query is run - it is fast (a matter of a few seconds).
3.  All data is deleted from the table (the table itself is not dropped).
4.  The same select query is run - it is slow (4-6 minutes).

After investigating and reading the Kudu documentation, I suspect that delete operations
are only applied logically: physically the table still contains the written data, and the
deletes are applied on top of it on every scan.
I looked at the Kudu tablet and found quite large "redo" blocks (see the example rowset
below).
We also thought that compression and encoding might play a role (reducing the chances of
compaction running), but removing them (keeping the column defaults) did not help either.
We run the tservers with:

-          maintenance_manager_num_threads=10 (increased compared to the default)

-          tablet_delta_store_major_compact_min_ratio=0.10000000149011612 (default value)

-          kudu 1.7.0-cdh5.15.0

In the documentation and in the code comments I found this description of tablet_delta_store_major_compact_min_ratio:
"Minimum ratio of sizeof(deltas) to sizeof(base data) before a major compaction."
And "Major compactions: the score will be the result of sizeof(deltas)/sizeof(base data),
unless it is smaller than tablet_delta_store_major_compact_min_ratio or if the delta files
are only composed of deletes, in which case the score is brought down to zero."
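If I read this correctly, our delta files contain only deletes, so the score is zero and the
major compaction is never scheduled. The table has stayed in this state for more than a day.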
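
To make the quoted rule concrete, here is a minimal Python sketch of the score as I understand it from the description above. It is a paraphrase of the quoted text only, not Kudu's actual implementation; the function name, the parameter names and the sample sizes are made up for illustration.

```python
def major_delta_compaction_score(delta_bytes, base_bytes,
                                 min_ratio=0.1, deletes_only=False):
    """Score per the quoted description (hypothetical helper, not Kudu code)."""
    if base_bytes <= 0:
        # Nothing to compare against; treat as "no compaction needed".
        return 0.0
    ratio = delta_bytes / base_bytes
    # The score drops to zero when the ratio is below the threshold
    # or when the delta files contain only deletes.
    if deletes_only or ratio < min_ratio:
        return 0.0
    return ratio


# Illustrative sizes only (not taken from a real tablet).
print(major_delta_compaction_score(50, 100))                     # updates present
print(major_delta_compaction_score(5, 100))                      # below min_ratio
print(major_delta_compaction_score(50, 100, deletes_only=True))  # deletes only
```

Under this reading, a rowset whose deltas consist purely of deletes always scores zero, no matter how large the redo file grows.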

While the majority of our tables will mostly serve scans, there will be a couple of large
tables with a large number of deletions (though not of all the data).
Could you advise how to improve scan performance after large deletions?

block-id | block-kind  | column | cfile-size | cfile-data-type | cfile-encoding  | cfile-compression
---------+-------------+--------+------------+-----------------+-----------------+------------------
24693586 | column      | var1   | 2.80M      | int64           | BIT_SHUFFLE     | NO_COMPRESSION
24693587 | column      | var2   | 100.7K     | int64           | BIT_SHUFFLE     | NO_COMPRESSION
24693588 | column      | var3   | 4.95M      | int64           | BIT_SHUFFLE     | NO_COMPRESSION
24693589 | column      | var4   | 1.58M      | string          | DICT_ENCODING   | LZ4
24693590 | column      | var5   | 8.82M      | string          | PLAIN_ENCODING  | LZ4
24693591 | column      | var6   | 2.7K       | string          | DICT_ENCODING   | LZ4
24700691 | redo        |        | 14.04M     | binary          | PLAIN_ENCODING  | LZ4
24693592 | bloom       |        | 5.04M      | binary          | PLAIN_ENCODING  | NO_COMPRESSION
24693593 | adhoc-index |        | 8.94M      | binary          | PREFIX_ENCODING | LZ4

cfile-delta-stats (empty for every block except the redo block 24700691):
ts range=[6319363930065100800, 6319364908129189926], delete_count=[2190649], reinsert_count=[0], update_counts_by_col_id=[]
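
For reference, a rough Python sketch that puts numbers on the rowset above: it sums the column cfile sizes, compares them with the redo block, and checks whether the delta stats show deletes only. Several things here are my assumptions rather than anything Kudu guarantees: that K and M are binary units, that only the "column" blocks count as base data, and that "deletes only" means no reinserts and no column updates. The helper names are hypothetical.

```python
import re

# Assumption: the size suffixes in the listing are binary units.
UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_size(text):
    """Convert a size such as '2.80M' or '100.7K' to bytes."""
    match = re.fullmatch(r"([\d.]+)([KMG]?)", text.strip())
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    number, unit = match.groups()
    return float(number) * UNITS.get(unit, 1)


def is_deletes_only(delta_stats):
    """True if the stats show deletes but no reinserts or column updates."""
    deletes = int(re.search(r"delete_count=\[(\d+)\]", delta_stats).group(1))
    reinserts = int(re.search(r"reinsert_count=\[(\d+)\]", delta_stats).group(1))
    updates = re.search(r"update_counts_by_col_id=\[(.*?)\]", delta_stats).group(1)
    return deletes > 0 and reinserts == 0 and updates.strip() == ""


# Values copied from the rowset listing above.
column_sizes = ["2.80M", "100.7K", "4.95M", "1.58M", "8.82M", "2.7K"]
redo_size = "14.04M"
redo_stats = ("ts range=[6319363930065100800, 6319364908129189926], "
              "delete_count=[2190649], reinsert_count=[0], "
              "update_counts_by_col_id=[]")

base_bytes = sum(parse_size(size) for size in column_sizes)
ratio = parse_size(redo_size) / base_bytes

print(f"redo / base ratio: {ratio:.2f}")
print(f"deletes only: {is_deletes_only(redo_stats)}")
```

With these assumptions the ratio comes out at roughly 0.77, well above the 0.1 threshold, so the deletes-only condition rather than the ratio would be what keeps the score at zero for this rowset.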

Kind Regards,
Sergejs Andrejevs

