[ https://issues.apache.org/jira/browse/PHOENIX4925?page=com.atlassian.jira.plugin.system.issuetabpanels:alltabpanel
]
Bin Shi updated PHOENIX4925:

Description:
As reported, Query compilation (for the sample queries showed below), especially deriving
estimation and generating parallel scans from guide posts, becomes much slower after we introduced
Phoenix Stats.
a. SELECT f1__c FROM MyCustomBigObject__b ORDER BY Pk1__c
b. SELECT f1__c FROM MyCustomBigObject__b WHERE nonpk1__c = ‘x’ ORDER BY Pk1__c
c. SELECT f1__c FROM MyCustomBigObject__b WHERE pk2__c = ‘x’ ORDER BY pk1__c,pk2__c
d. SELECT f1__c FROM MyCustomBigObject__b WHERE pk1__c = ‘x’ AND nonpk1__c ORDER BY pk1__c,pk2__c
e. SELECT f1__c FROM MyCustomBigObject__b WHERE pk__c >= 'd' AND pk__c <= 'm' OR pk__c
>= 'o' AND pk__c <= 'x' ORDER BY pk__c // pk__c is the only column to make the primary
key.
By using prefix encoding for guide post info, we have to decode and traverse guide posts
sequentially, which causes time complexity in BaseResultIterators.getParallelScan(...) to
be O( n ) , where n is the total count of guide posts.
According to PHOENIX2417, to reduce footprint in client cache and over transmition, the prefix
encoding is used as inmemory and overthewire encoding for guide post info.
We can use Segment Tree to address both memory and performance concerns. The guide posts are
partitioned to k chunks (k=1024?), each chunk is encoded by prefix encoding and the encoded
data is a leaf node of the tree. The inner node contains summary info (the count of rows,
the data size) of the sub tree rooted at the inner node.
With this tree like data structure, compared to the current data structure, the increased
size (mainly coming from the n/k1 inner nodes) is ignorable. The time complexity for queries
a, b, c can be reduced to O(m) where m is the total count of regions; the time complexity
for "EXPLAN" queries a, b, c can be reduced to O(m) too, and if we support "EXPLAIN (ESTIMATE
ONLY)", it can even be reduced to O(1). For queries d and e, the time complexity to find the
start of target scan ranges can be reduced to O(log(n/k)).
The tree can also integrate AVL and B+ characteristics to support partial load/unload when
interacting with stats client cache.
was:
As reported, Query compilation (for the sample queries showed below), especially deriving
estimation and generating parallel scans from guide posts, becomes much slower after we introduced
Phoenix Stats.
a. SELECT f1__c FROM MyCustomBigObject__b ORDER BY Pk1__c
b. SELECT f1__c FROM MyCustomBigObject__b WHERE nonpk1__c = ‘x’ ORDER BY Pk1__c
c. SELECT f1__c FROM MyCustomBigObject__b WHERE pk2__c = ‘x’ ORDER BY pk1__c,pk2__c
d. SELECT f1__c FROM MyCustomBigObject__b WHERE pk1__c = ‘x’ AND nonpk1__c ORDER BY pk1__c,pk2__c
e. SELECT f1__c FROM MyCustomBigObject__b WHERE pk__c >= 'd' AND pk__c <= 'm' OR pk__c
>= 'o' AND pk__c <= 'x' ORDER BY pk__c // pk__c is the only column to make the primary
key.
By using prefix encoding for guide post info, we have to decode and traverse guide posts
sequentially, which causes time complexity in BaseResultIterators.getParallelScan(...) to
be O(n) , where n is the total count of guide posts.
According to PHOENIX2417, to reduce footprint in client cache and over transmition, the prefix
encoding is used as inmemory and overthewire encoding for guide post info.
We can use Segment Tree to address both memory and performance concerns. The guide posts are
partitioned to k chunks (k=1024?), each chunk is encoded by prefix encoding and the encoded
data is a leaf node of the tree. The inner node contains summary info (the count of rows,
the data size) of the sub tree rooted at the inner node.
With this tree like data structure, compared to the current data structure, the increased
size (mainly coming from the n/k1 inner nodes) is ignorable. The time complexity for queries
a, b, c can be reduced to O(m) where m is the total count of regions; the time complexity
for "EXPLAN" queries a, b, c can be reduced to O(m) too, and if we support "EXPLAIN (ESTIMATE
ONLY)", it can even be reduced to O(1). For queries d and e, the time complexity to find the
start of target scan ranges can be reduced to O(log(n/k)).
The tree can also integrate AVL and B+ characteristics to support partial load/unload when
interacting with stats client cache.
> Use Segment tree to organize Guide Post Info
> 
>
> Key: PHOENIX4925
> URL: https://issues.apache.org/jira/browse/PHOENIX4925
> Project: Phoenix
> Issue Type: Improvement
> Reporter: Bin Shi
> Assignee: Bin Shi
> Priority: Major
>
> As reported, Query compilation (for the sample queries showed below), especially deriving
estimation and generating parallel scans from guide posts, becomes much slower after we introduced
Phoenix Stats.
> a. SELECT f1__c FROM MyCustomBigObject__b ORDER BY Pk1__c
> b. SELECT f1__c FROM MyCustomBigObject__b WHERE nonpk1__c = ‘x’ ORDER BY Pk1__c
> c. SELECT f1__c FROM MyCustomBigObject__b WHERE pk2__c = ‘x’ ORDER BY pk1__c,pk2__c
> d. SELECT f1__c FROM MyCustomBigObject__b WHERE pk1__c = ‘x’ AND nonpk1__c ORDER
BY pk1__c,pk2__c
> e. SELECT f1__c FROM MyCustomBigObject__b WHERE pk__c >= 'd' AND pk__c <= 'm'
OR pk__c >= 'o' AND pk__c <= 'x' ORDER BY pk__c // pk__c is the only column to make
the primary key.
>
> By using prefix encoding for guide post info, we have to decode and traverse guide posts
sequentially, which causes time complexity in BaseResultIterators.getParallelScan(...) to
be O( n ) , where n is the total count of guide posts.
> According to PHOENIX2417, to reduce footprint in client cache and over transmition,
the prefix encoding is used as inmemory and overthewire encoding for guide post info.
> We can use Segment Tree to address both memory and performance concerns. The guide posts
are partitioned to k chunks (k=1024?), each chunk is encoded by prefix encoding and the
encoded data is a leaf node of the tree. The inner node contains summary info (the count of
rows, the data size) of the sub tree rooted at the inner node.
> With this tree like data structure, compared to the current data structure, the increased
size (mainly coming from the n/k1 inner nodes) is ignorable. The time complexity for queries
a, b, c can be reduced to O(m) where m is the total count of regions; the time complexity
for "EXPLAN" queries a, b, c can be reduced to O(m) too, and if we support "EXPLAIN (ESTIMATE
ONLY)", it can even be reduced to O(1). For queries d and e, the time complexity to find the
start of target scan ranges can be reduced to O(log(n/k)).
> The tree can also integrate AVL and B+ characteristics to support partial load/unload
when interacting with stats client cache.
>

This message was sent by Atlassian JIRA
(v7.6.3#76005)
