Hi Qiao,

Glad to know that the DDL chan= ge helped.

I think you can also play with the foll= owing for better performance.=C2=A0
1. CQD=C2=A0parallel_num= _esps.=C2=A0 This CQD sets the degree of parallelism to a fixed value for a= ll parallel layers (like the layer composed of operator 1,2 and 3 in the qu= ery plan).=C2=A0 It is OK to use the CQD during query optimization. In prod= uction however, you may consider use a different way which is my second poi= nt below (item 2). The benefit of not using CQD=C2=A0parallel_num_es= ps is that this will allow the compiler to choose the parallelism based on = data size per layer. Your system resource could be better controlled this w= ay.
2. CQD=C2=A0MAX_ESPS_PER_CPU_PER_OP controls the # of ESPs per nod= e. We =C2=A0normally allocate 2 ESPs per node (that is the reason there are= 6 =3D 3 * 2 in the plan). You can alter the CQD=C2=A0MAX_ESPS_PER_CPU_PER_= OP to a value so that you want maximally X number of ESPs per node (say 4).= =C2=A0 The formula for the CQD is X / #coresPerNode =3D X/8. So to boost th= e degree of parallelism from 6 to 12 (or from 2 to 4 ESPs per node), you ne= ed to set the CQD to 4/8=3D0.5.
3. From the stats, UID (with total # o= f distinct value of=C2=A08319955)=C2=A0is more 'unique' than VID (5= 78912). It will be a good idea to switch the position of UID and VID in pri= mary key to favor queries with the search condition SID=3D<u> and V_D= ATE =3D<v> and VID=3D<w>.=C2=A0
4. The SALT clause=C2=A0(S= ID, V_DATE, UID, VID) could be reduced to=C2=A0(SID, V_DATE, VID) if the ab= ove search condition (in item 3) is issued often and the expected number of= matching rows is small (say less than 200). In this case, the compiler may= choose a serial plan fragment which is perfect to use one ESP to read from= one partition where all the matching rows reside. =C2=A0=C2=A0
Thanks --Qifan
=C2=A0
LC =C2=A0 RC =C2=A0 OP =C2=A0 OPERATOR =C2=A0 =C2=A0= =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0OPT =C2=A0 =C2=A0 =C2=A0 DESCRIPTION =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 CARD =C2=A0=C2=A0
---- ---- ---- ----------= ---------- =C2=A0-------- =C2=A0-------------------- =C2=A0---------=
=C2=A0=
4 =C2=A0 =C2=A0. =C2=A0 =C2=A05 =C2=A0 =C2=A0root =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A01.00E+000
3 =C2=A0 =C2=A0. =C2=A0 =C2=A04 =C2=A0 =C2=A0sort_partial_aggr= _ro =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A01.00E+000
2 =C2=A0 =C2=A0. =C2=A0 =C2=A03 =C2=A0 =C2=A0esp_e= xchange =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A01:6(hash2) =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A01.00E+000
1 =C2=A0 =C2=A0. =C2=A0 =C2=A02 =C2=A0 =C2= =A0sort_partial_aggr_le =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A01.00E+= 000
. =C2=A0 =C2=A0. =C2=A0 =C2= =A01 =C2=A0 =C2=A0trafodion_scan =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0VISIT_FROM_HIVE =C2=A0 =C2=A0 =C2=A0 8.03E+006

--- SQL = operation complete.=C2=A0

On Tue, Sep 20, 2016= at 12:28 AM, Eric Owhadi wrote:=

I also see that you should not be = using DIVISION BY, since your V_DATE are already on day boundary. Using DIV= ISION_BY is counter-productive here.

=C2=A0

And sorry I opened wrong file, I had old and new ope= ned at the same time and looked wrong window J, you di= d change the charset and char length, my bad,

Eric

= Sent: Tuesday, September 20, 2016 12:10 AM
To: '
=E4=B9=94=E5=BD=A6=E5=85=8B' <qyanke@gmail.com>; 'user@trafodi= on.incubator.apache.org' <user@trafodion.incubator.apach= e.org>
Cc: 'dev' <dev@trafodion.incubator.apa= che.org>; Qifan Chen <qifan.chen@esgyn.com>
Subject: RE: trafodio= n query optimization

=C2=A0

Hi Qiao

=C2=A0

Optimizer picked 6 as DOP= . You can force it higher:

=C2=A0<= /p>

Use

CQD paral= lel_num_esps =E2=80=9812=E2=80=99;

Then p= repare and run the query.

=C2=A0

You may experiment with re-generating you table w= ith SALT 15 and CQD parallel_num_esps =E2=80=9815=E2=80=99.

=C2=A0

you can even dr= aw a curve

SALT 18, parallel_num_esp =E2= =80=9818=E2=80=99

SALT =C2=A021, paralle= l_num_esps =E2=80=9821=E2=80=99

Etc 3 by= 3 until you see that performance stop getting better, and actually drops.<= /span>

=C2=A0

<= span style=3D"font-size:11.0pt;font-family:"Calibri",sans-serif">= I see you did not change VARCHAR to CHAR and did not use Charset ISO88591: = this is important to optimize row size. UTF8 is very greedy specially for P= K columns. When I see your sample sid, I wonder if you really need UTF8.

=C2=A0

Sa= me principle apply for the other columns, but less important than for key.<= /span>

=C2=A0

<= span style=3D"font-size:11.0pt;font-family:"Calibri",sans-serif">= Ultimately you can change the default value of HBASE_OPTIONS BLOCKSI= ZE

Default is 65565, you can try doubling or quad= rupling it. This will improve scan rate but crease any query that does rand= om access.

=C2=A0

Also = I checked in an experimental feature to allow scanning in parallel without = using ESP (using multithreading instead), in case memory resource consumed = by ESP becomes a bottleneck, but I would not recommend it yet as it is =E2= =80=9Cexperimental=E2=80=9D still=E2=80=A6

Except= if you start seeing memory pressure=E2=80=A6

Hop= e this helps,
Eric

=C2=A0

<= p class=3D"MsoNormal">=C2=A0

=C2=A0

=C2=A0

From: =E4=B9=94=E5=BD=A6=E5=85=8B [mailto:qyanke@g= mail.com]
Sent: Monday, September 19, 2016 = 11:38 PM
To: user@trafodion.incubator.apache.org
Subject: Re: trafodio= n query optimization

=C2=A0

Many thanks =C2=A0to Eric and Qifan again.

=

I upload a new log about the = new ddl and the query plan.

@Eric, Following= your advice I modified the ddl, the sum query now takes 25seconds while th= e old ddl will take 90seconds.(there are 3 nodes in my cluster, 8cores(16 v= -cores) and 16GRAM), Is there any other optimize options to make the sum qu= ery more fast (better less than 10seconds)?

@Qifan, I log the output of showstats command in the attachment, look= ing forward more suggestions.

= =C2=A0

Thanks again.

Qiao

=C2=A0

Qifan Chen <qifan.chen@esgyn.com>=E4=BA=8E2016=E5=B9=B49=E6=9C=8819=E6=97=A5=E5=91=A8=E4=B8=80 =E4=B8=8B=E5=8D=889:23=E5=86=99=E9=81=93=EF=BC=9A

=

Hi Qiao,

=C2=A0

Thank you for = the data. It is very helpful.=C2=A0

= =C2=A0

There are several things notice= d.

• The key columns = are:=C2=A0_SALT_, _DIVISION_1_, SID, V_DATE, UID, VID
• The salt column is built from column SID only, which means all rows = with identical SID values V will be stored in the same partition.
• From the query plan, the compiler assigns 6 executor pro= cesses (we call ESP) to read the salted table of 12 partitions: 1 ESP reads= 2 salt partitions.
• The frequency of V is high = when sid=3D'6b2a0957' (~8million rows), all these relevant rows are= handled by one execution process out of 6. That probably is the reason of = not much parallelism observed.=C2=A0
• You can us= e SQL command showstats=C2=A0with detail option to check the frequen= cy on column SID.=C2=A0
• showstats for table=C2=A0TRAFODION.SEABASE.VISIT_F= ROM_HIVE2 on SID detail;
• If high frequency per unique value on column SID is confirmed, we pro= bably should consider our next step of action. For example, we could add so= me columns from the primary key to the SALT clause to help spread V of SID = to all 12 partitions.

Could you please= send us the output of the showstats command above, and the showstats comma= nd below for all columns in the table? =C2=A0

=C2=A0

showstats for ta= ble TRAFODION.SEABASE.VISIT_FROM_HIVE2 on every column;

=

=C2=A0

Thanks -= -Qifan

=C2=A0

On Mon, Sep 19, 2016 at 3:15 AM, =E4=B9=94=E5=BD=A6=E5=85=8B <qyanke@gmail.c= om> wrote:

Thanks Eric = and Qifan. I am sorry to reply after so long a time because I'm on the = Chinese mid-autumn festival holiday.

Accordi= ng to Qifan's advice, I upload a log which contains the DDL and the que= ry plan hope to get more advice.

=C2= =A0

and= to Eric, I summit a jira about the block-encoding and the compression, https://issues.apache.org/jira/browse/TRAFODION-2195, so I on= ly use the hbase compression.

Qifan Che= n <qifan.chen@= esgyn.com>=E4=BA= =8E2016=E5=B9=B49=E6=9C=8812=E6=97=A5=E5=91=A8=E4= =B8=80 =E4=B8=8B= =E5=8D=8810:43=E5= =86=99=E9=81=93=EF=BC=9A

Hi Qiao,

=C2=A0

<= div>

You can also send us the DDL and the query plan = to help with the tuning.=C2=A0

=C2=A0<= /p>

To generate a query plan, do the follo= wing from sqlci, and the plan is in text file mylog.=C2=A0

=C2=A0

1. log mylog clear;
2. prepare xx f= rom <your query>;
3. explain xx;
4. explain options 'f' xx;
5. exit;

=C2=A0

Thanks --Qifan

=C2=A0

On Mon, Sep 12, 2016 at 8:21 AM, E= ric Owhadi <e= ric.owhadi@esgyn.com> wrote:

Hello Qiao,
When you say whatever the table ddl it limits things a lot,= as table ddl
will help define several things that will drastically impr= ove the degree of
parallelism and the table size.

The DOP (degree= of parallelism) of the scan operator is constrained by the
number of re= gions your table uses. So if you want to increase DOP, you need
to parti= tion your table using the syntax like:
create table customer_demographic= s_salt
(
=C2=A0cd_demo_sk int not null
=C2=A0, cd_gender char(1)=C2=A0, cd_marital_status char(1)
=C2=A0, cd_education_status char(20)=
=C2=A0, cd_purchase_estimate int
=C2=A0, cd_credit_rating char(10)=C2=A0, cd_dep_count int
=C2=A0, cd_dep_employed_count int
=C2=A0, = cd_dep_college_count int
=C2=A0, primary key (cd_demo_sk)
)
salt u= sing 12 partitions
ATTRIBUTES ALIGNED FORMAT
HBASE_OPTIONS
=C2=A0 = (
=C2=A0 =C2=A0 DATA_BLOCK_ENCODING =3D 'FAST_DIFF',
=C2=A0 = =C2=A0 COMPRESSION =3D 'SNAPPY'
=C2=A0 );

you can experim= ent with different values of number of partitions (but pick a
multiple o= f 3 since you have 3 nodes).

Then the optimizer will pick the DOP wi= th a compromise of resource usage vs
gain in speed.
If you want to fo= rce higher DOP than what optimizer selected, you can use :
CQD parallel_= num_esps '12';
To force it to one ESP per partition (assuming yo= u picked 12 partitions).
You can verify what optimizer picked as DOP by = doing an explain on the
query.

Other important factors plays in p= erformance:
- use of aligned format (see above example)
- careful cho= ice of the primary key (bad idea to use a varchar with big max
size)
= - It is good idea to use compression and encoding (see the most common
o= ptions we use above)
- you can also increase the HBASE_OPTIONS BLOCKSIZE= parameter, but there is
a drawback: increasing it will increase perform= ance of SCAN but decrease
performance of keyed access. That is why I did= not include it in the example
above.

Hope this helps,
Regards= ,
Eric

-----Original Message-----
From: =E4=B9=94=E5=BD=A6=E5= =85=8B [mailto:qyanke@gmail.com]
Sent: Monday, September 12, 2016 1:22 AM
To: <= a href=3D"mailto:user@trafodion.incubator.apache.org" target=3D"_blank">use= r@trafodion.incubator.apache.org; dev@trafodion.incubator.apache= .org
Subject: trafodion query optimization

Hi all,
=C2=A0 = =C2=A0 =C2=A0I executed the sum and count query on my table where the clust= er has
three nodes. I found that the sum query is not well parallel exec= uted(not
all the three nodes get high load when executing the sum query)= and the cpu
load is very high while the memory load is very low(the mac= hines have 16
cores and 16GB memory). My sum query on the 12 million dat= a sets takes about
2 minutes and a half time.
=C2=A0 =C2=A0 So my que= stion is that is there any optimization advice that I can use
to improve= the query performance and maximize the usage of my machines, what
ever = the configuration or the table ddl.
=C2=A0 =C2=A0 Any replies is appreci= ated.

Thanks,
Qiao

<= br>

=C2=A0

=

--

Regards, --Qifan

=C2=A0

=C2=A0

--

Regards, --Qifan

=C2=A0

<= /div>

--
Regards, --Qifan

--001a113fb460197d81053cf06d8b--