From user-return-704-apmail-trafodion-user-archive=trafodion.apache.org@trafodion.incubator.apache.org Tue Sep 20 13:30:07 2016 Return-Path: X-Original-To: apmail-trafodion-user-archive@minotaur.apache.org Delivered-To: apmail-trafodion-user-archive@minotaur.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id F0A10197FC for ; Tue, 20 Sep 2016 13:30:07 +0000 (UTC) Received: (qmail 13433 invoked by uid 500); 20 Sep 2016 13:30:07 -0000 Delivered-To: apmail-trafodion-user-archive@trafodion.apache.org Received: (qmail 13393 invoked by uid 500); 20 Sep 2016 13:30:07 -0000 Mailing-List: contact user-help@trafodion.incubator.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: user@trafodion.incubator.apache.org Delivered-To: mailing list user@trafodion.incubator.apache.org Received: (qmail 13382 invoked by uid 99); 20 Sep 2016 13:30:07 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd1-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 20 Sep 2016 13:30:07 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd1-us-west.apache.org (ASF Mail Server at spamd1-us-west.apache.org) with ESMTP id 690A5C3B40 for ; Tue, 20 Sep 2016 13:30:07 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd1-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: 4.777 X-Spam-Level: **** X-Spam-Status: No, score=4.777 tagged_above=-999 required=6.31 tests=[DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HTML_MESSAGE=2, KAM_LINEPADDING=1.2, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H3=-0.01, RCVD_IN_MSPIKE_WL=-0.01, RCVD_IN_SORBS_SPAM=2.397] autolearn=disabled Authentication-Results: spamd1-us-west.apache.org (amavisd-new); dkim=pass (1024-bit key) header.d=esgyn.com Received: from mx1-lw-eu.apache.org ([10.40.0.8]) by localhost (spamd1-us-west.apache.org [10.40.0.7]) (amavisd-new, port 10024) with ESMTP id vc0Ky0RWUiIK for ; Tue, 20 Sep 2016 13:30:05 +0000 (UTC) Received: from mail-lf0-f54.google.com (mail-lf0-f54.google.com [209.85.215.54]) by mx1-lw-eu.apache.org (ASF Mail Server at mx1-lw-eu.apache.org) with ESMTPS id 7C42260D24 for ; Tue, 20 Sep 2016 13:30:04 +0000 (UTC) Received: by mail-lf0-f54.google.com with SMTP id l131so13975107lfl.2 for ; Tue, 20 Sep 2016 06:30:04 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=esgyn.com; s=google; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=g+jhoUUVGxpR8Ya7LQ6FS+Yi3oIP/5B+sM+crxOps3M=; b=NmW3sy51AWZZffC2Q9vwNVwNrgj/g9nQCq6aUsqxdCtTZlprs2Zvtih+iiP4abcdAX pegqNcKqRjUN5EPmshrI/B6LRRflgyNx/VmUvAHF6OvK2vlhtNgXcB7cFBskENQl16NZ oSWjusB9l08DnWFsvTpVkWDwc1XJzdWfdOAFw= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=g+jhoUUVGxpR8Ya7LQ6FS+Yi3oIP/5B+sM+crxOps3M=; b=QZZDeAW0k82ikutxO5ml67AsrSbeV799LTn/1FK3yThQvwMxDTf6afH6fEyORkuVxg N6m/u5O8eirvHt/S54adkD+8l7cRhn6iJhbR9ezRFJogMTGtfq48aUqc0obfkmuW9rlk Z4MltAyczh2YH+3GD3oiXBL8dLYq3gFgatd2BeUXyIwHhiiGUi7LI4/Ud2wMW8T9PAeS bmUIFCZW67vxxRtzgmvZgUEMRaz5SI7V3JlJeeGqNT6sNe1kv9ggdckcqBuSZ2PUkMmE TrMi24J9ADUdeW5efto1O3hq3qQeyqgxyYZHMRMkdGsqrCut3527r4sK0aqL8nlQsSxw 3qSg== X-Gm-Message-State: AE9vXwPvl8HHX0WmazfdT59AIecn0pbIuBjvALyKEfSyjjsNObkTs96jy4B05yC2bjvar+oWlFC7GuHC4vQfuvI7 X-Received: by 10.25.18.23 with SMTP id h23mr6119736lfi.5.1474378203748; Tue, 20 Sep 2016 06:30:03 -0700 (PDT) MIME-Version: 1.0 Received: by 10.25.195.137 with HTTP; Tue, 20 Sep 2016 06:29:22 -0700 (PDT) In-Reply-To: <9cea05b7905c765b3129892257d62164@mail.gmail.com> References: <83570e3e9547c40b8cd01a197d0c907e@mail.gmail.com> <9cea05b7905c765b3129892257d62164@mail.gmail.com> From: Qifan Chen Date: Tue, 20 Sep 2016 08:29:22 -0500 Message-ID: Subject: Re: trafodion query optimization To: Eric Owhadi Cc: =?UTF-8?B?5LmU5b2m5YWL?= , user@trafodion.incubator.apache.org, dev Content-Type: multipart/alternative; boundary=001a113fb460197d81053cf06d8b --001a113fb460197d81053cf06d8b Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Hi Qiao, Glad to know that the DDL change helped. I think you can also play with the following for better performance. 1. CQD parallel_num_esps. This CQD sets the degree of parallelism to a fixed value for all parallel layers (like the layer composed of operator 1,2 and 3 in the query plan). It is OK to use the CQD during query optimization. In production however, you may consider use a different wa= y which is my second point below (item 2). The benefit of *not* using CQD parallel_num_esps is that this will allow the compiler to choose the parallelism based on data size per layer. Your system resource could be better controlled this way. 2. CQD MAX_ESPS_PER_CPU_PER_OP controls the # of ESPs per node. We normally allocate 2 ESPs per node (that is the reason there are 6 =3D 3= * 2 in the plan). You can alter the CQD MAX_ESPS_PER_CPU_PER_OP to a value s= o that you want maximally X number of ESPs per node (say 4). The formula = for the CQD is X / #coresPerNode =3D X/8. So to boost the degree of parallel= ism from 6 to 12 (or from 2 to 4 ESPs per node), you need to set the CQD to 4/8=3D0.5. 3. From the stats, UID (with total # of distinct value of 8319955) is more 'unique' than VID (578912). It will be a good idea to switch the position of UID and VID in primary key to favor queries with the search condition SID=3D and V_DATE =3D and VID=3D. 4. The SALT clause (SID, V_DATE, UID, VID) could be reduced to (SID, V_DATE, VID) if the above search condition (in item 3) is issued often a= nd the expected number of matching rows is small (say less than 200). In th= is case, the compiler may choose a serial plan fragment which is perfect to use one ESP to read from one partition where all the matching rows resid= e. Thanks --Qifan LC RC OP OPERATOR OPT DESCRIPTION CARD ---- ---- ---- -------------------- -------- -------------------- --------- 4 . 5 root 1.00E+000 3 . 4 sort_partial_aggr_ro 1.00E+000 2 . 3 esp_exchange 1:6(hash2) 1.00E+000 1 . 2 sort_partial_aggr_le 1.00E+000 . . 1 trafodion_scan VISIT_FROM_HIVE 8.03E+006 --- SQL operation complete. On Tue, Sep 20, 2016 at 12:28 AM, Eric Owhadi wrote= : > I also see that you should not be using DIVISION BY, since your V_DATE ar= e > already on day boundary. Using DIVISION_BY is counter-productive here. > > > > And sorry I opened wrong file, I had old and new opened at the same time > and looked wrong window J, you did change the charset and char length, my > bad, > > Eric > > *From:* Eric Owhadi [mailto:eric.owhadi@esgyn.com] > *Sent:* Tuesday, September 20, 2016 12:10 AM > *To:* '=E4=B9=94=E5=BD=A6=E5=85=8B' ; 'user@trafodion.i= ncubator.apache.org' < > user@trafodion.incubator.apache.org> > *Cc:* 'dev' ; Qifan Chen < > qifan.chen@esgyn.com> > *Subject:* RE: trafodion query optimization > > > > Hi Qiao > > > > Optimizer picked 6 as DOP. You can force it higher: > > > > Use > > CQD parallel_num_esps =E2=80=9812=E2=80=99; > > Then prepare and run the query. > > > > You may experiment with re-generating you table with SALT 15 and CQD > parallel_num_esps =E2=80=9815=E2=80=99. > > > > you can even draw a curve > > SALT 18, parallel_num_esp =E2=80=9818=E2=80=99 > > SALT 21, parallel_num_esps =E2=80=9821=E2=80=99 > > Etc 3 by 3 until you see that performance stop getting better, and > actually drops. > > > > I see you did not change VARCHAR to CHAR and did not use Charset ISO88591= : > this is important to optimize row size. UTF8 is very greedy specially for > PK columns. When I see your sample sid, I wonder if you really need UTF8. > > > > Same principle apply for the other columns, but less important than for > key. > > > > Ultimately you can change the default value of HBASE_OPTIONS BLOCKSIZE > > Default is 65565, you can try doubling or quadrupling it. This will > improve scan rate but crease any query that does random access. > > > > Also I checked in an experimental feature to allow scanning in parallel > without using ESP (using multithreading instead), in case memory resource > consumed by ESP becomes a bottleneck, but I would not recommend it yet as > it is =E2=80=9Cexperimental=E2=80=9D still=E2=80=A6 > > Except if you start seeing memory pressure=E2=80=A6 > > Hope this helps, > Eric > > > > > > > > > > *From:* =E4=B9=94=E5=BD=A6=E5=85=8B [mailto:qyanke@gmail.com ] > *Sent:* Monday, September 19, 2016 11:38 PM > *To:* user@trafodion.incubator.apache.org > *Cc:* dev ; Eric Owhadi < > eric.owhadi@esgyn.com>; Qifan Chen > *Subject:* Re: trafodion query optimization > > > > Many thanks to Eric and Qifan again. > > I upload a new log about the new ddl and the query plan. > > @Eric, Following your advice I modified the ddl, the sum query now takes > 25seconds while the old ddl will take 90seconds.(there are 3 nodes in my > cluster, 8cores(16 v-cores) and 16GRAM), Is there any other optimize > options to make the sum query more fast (better less than 10seconds)? > > @Qifan, I log the output of showstats command in the attachment, looking > forward more suggestions. > > > > Thanks again. > > Qiao > > > > Qifan Chen =E4=BA=8E2016=E5=B9=B49=E6=9C=8819=E6=97= =A5=E5=91=A8=E4=B8=80 =E4=B8=8B=E5=8D=889:23=E5=86=99=E9=81=93=EF=BC=9A > > Hi Qiao, > > > > Thank you for the data. It is very helpful. > > > > There are several things noticed. > > - The key columns are: _SALT_, _DIVISION_1_, SID, V_DATE, UID, VID > - The salt column is built from column SID only, which means all rows > with identical SID values V will be stored in the same partition. > - From the query plan, the compiler assigns 6 executor processes (we > call ESP) to read the salted table of 12 partitions: 1 ESP reads 2 sal= t > partitions. > - The frequency of V is high when sid=3D'6b2a0957' (~8million rows), a= ll > these relevant rows are handled by one execution process out of 6. Tha= t > probably is the reason of not much parallelism observed. > - You can use SQL command *showstats with detail* option to check the > frequency on column SID. > > > - showstats for table TRAFODION.SEABASE.VISIT_FROM_HIVE2 on SID detail= ; > > > - If high frequency per unique value on column SID is confirmed, we > probably should consider our next step of action. For example, we coul= d add > some columns from the primary key to the SALT clause to help spread V = of > SID to all 12 partitions. > > Could you please send us the output of the showstats command above, and > the showstats command below for all columns in the table? > > > > showstats for table TRAFODION.SEABASE.VISIT_FROM_HIVE2 on every column; > > > > Thanks --Qifan > > > > On Mon, Sep 19, 2016 at 3:15 AM, =E4=B9=94=E5=BD=A6=E5=85=8B wrote: > > Thanks Eric and Qifan. I am sorry to reply after so long a time because > I'm on the Chinese mid-autumn festival holiday. > > According to Qifan's advice, I upload a log which contains the DDL and th= e > query plan hope to get more advice. > > > > and to Eric, I summit a jira about the block-encoding and the compression= , > https://issues.apache.org/jira/browse/TRAFODION-2195, so I only use the > hbase compression. > > Qifan Chen =E4=BA=8E2016=E5=B9=B49=E6=9C=8812=E6=97= =A5=E5=91=A8=E4=B8=80 =E4=B8=8B=E5=8D=8810:43=E5=86=99=E9=81=93=EF=BC=9A > > Hi Qiao, > > > > You can also send us the DDL and the query plan to help with the tuning. > > > > To generate a query plan, do the following from sqlci, and the plan is in > text file mylog. > > > > 1. log mylog clear; > 2. prepare xx from ; > 3. explain xx; > 4. explain options 'f' xx; > 5. exit; > > > > Thanks --Qifan > > > > On Mon, Sep 12, 2016 at 8:21 AM, Eric Owhadi > wrote: > > Hello Qiao, > When you say whatever the table ddl it limits things a lot, as table ddl > will help define several things that will drastically improve the degree = of > parallelism and the table size. > > The DOP (degree of parallelism) of the scan operator is constrained by th= e > number of regions your table uses. So if you want to increase DOP, you ne= ed > to partition your table using the syntax like: > create table customer_demographics_salt > ( > cd_demo_sk int not null > , cd_gender char(1) > , cd_marital_status char(1) > , cd_education_status char(20) > , cd_purchase_estimate int > , cd_credit_rating char(10) > , cd_dep_count int > , cd_dep_employed_count int > , cd_dep_college_count int > , primary key (cd_demo_sk) > ) > salt using 12 partitions > ATTRIBUTES ALIGNED FORMAT > HBASE_OPTIONS > ( > DATA_BLOCK_ENCODING =3D 'FAST_DIFF', > COMPRESSION =3D 'SNAPPY' > ); > > you can experiment with different values of number of partitions (but pic= k > a > multiple of 3 since you have 3 nodes). > > Then the optimizer will pick the DOP with a compromise of resource usage = vs > gain in speed. > If you want to force higher DOP than what optimizer selected, you can use= : > CQD parallel_num_esps '12'; > To force it to one ESP per partition (assuming you picked 12 partitions). > You can verify what optimizer picked as DOP by doing an explain on the > query. > > Other important factors plays in performance: > - use of aligned format (see above example) > - careful choice of the primary key (bad idea to use a varchar with big m= ax > size) > - It is good idea to use compression and encoding (see the most common > options we use above) > - you can also increase the HBASE_OPTIONS BLOCKSIZE parameter, but there = is > a drawback: increasing it will increase performance of SCAN but decrease > performance of keyed access. That is why I did not include it in the > example > above. > > Hope this helps, > Regards, > Eric > > > -----Original Message----- > From: =E4=B9=94=E5=BD=A6=E5=85=8B [mailto:qyanke@gmail.com] > Sent: Monday, September 12, 2016 1:22 AM > To: user@trafodion.incubator.apache.org; dev@trafodion.incubator. > apache.org > Subject: trafodion query optimization > > Hi all, > I executed the sum and count query on my table where the cluster has > three nodes. I found that the sum query is not well parallel executed(not > all the three nodes get high load when executing the sum query) and the c= pu > load is very high while the memory load is very low(the machines have 16 > cores and 16GB memory). My sum query on the 12 million data sets takes > about > 2 minutes and a half time. > So my question is that is there any optimization advice that I can us= e > to improve the query performance and maximize the usage of my machines, > what > ever the configuration or the table ddl. > Any replies is appreciated. > > Thanks, > Qiao > > > > > > -- > > Regards, --Qifan > > > > > > > > -- > > Regards, --Qifan > > > > --=20 Regards, --Qifan --001a113fb460197d81053cf06d8b Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable
Hi Qiao,

Glad to know that the DDL chan= ge helped.

I think you can also play with the foll= owing for better performance.=C2=A0
  1. CQD=C2=A0parallel_num= _esps.=C2=A0 This CQD sets the degree of parallelism to a fixed value for a= ll parallel layers (like the layer composed of operator 1,2 and 3 in the qu= ery plan).=C2=A0 It is OK to use the CQD during query optimization. In prod= uction however, you may consider use a different way which is my second poi= nt below (item 2). The benefit of not using CQD=C2=A0parallel_num_es= ps is that this will allow the compiler to choose the parallelism based on = data size per layer. Your system resource could be better controlled this w= ay.
  2. CQD=C2=A0MAX_ESPS_PER_CPU_PER_OP controls the # of ESPs per nod= e. We =C2=A0normally allocate 2 ESPs per node (that is the reason there are= 6 =3D 3 * 2 in the plan). You can alter the CQD=C2=A0MAX_ESPS_PER_CPU_PER_= OP to a value so that you want maximally X number of ESPs per node (say 4).= =C2=A0 The formula for the CQD is X / #coresPerNode =3D X/8. So to boost th= e degree of parallelism from 6 to 12 (or from 2 to 4 ESPs per node), you ne= ed to set the CQD to 4/8=3D0.5.
  3. From the stats, UID (with total # o= f distinct value of=C2=A08319955)=C2=A0is more 'unique' than VID (5= 78912). It will be a good idea to switch the position of UID and VID in pri= mary key to favor queries with the search condition SID=3D<u> and V_D= ATE =3D<v> and VID=3D<w>.=C2=A0
  4. The SALT clause=C2=A0(S= ID, V_DATE, UID, VID) could be reduced to=C2=A0(SID, V_DATE, VID) if the ab= ove search condition (in item 3) is issued often and the expected number of= matching rows is small (say less than 200). In this case, the compiler may= choose a serial plan fragment which is perfect to use one ESP to read from= one partition where all the matching rows reside. =C2=A0=C2=A0
Thanks --Qifan
=C2=A0
LC =C2=A0 RC =C2=A0 OP =C2=A0 OPERATOR =C2=A0 =C2=A0= =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0OPT =C2=A0 =C2=A0 =C2=A0 DESCRIPTION =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 CARD =C2=A0=C2=A0
---- ---- ---- ----------= ---------- =C2=A0-------- =C2=A0-------------------- =C2=A0---------=
=C2=A0=
4 =C2=A0 =C2=A0. =C2=A0 =C2=A05 =C2=A0 =C2=A0root =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A01.00E+000
3 =C2=A0 =C2=A0. =C2=A0 =C2=A04 =C2=A0 =C2=A0sort_partial_aggr= _ro =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A01.00E+000
2 =C2=A0 =C2=A0. =C2=A0 =C2=A03 =C2=A0 =C2=A0esp_e= xchange =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A01:6(hash2) =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A01.00E+000
1 =C2=A0 =C2=A0. =C2=A0 =C2=A02 =C2=A0 =C2= =A0sort_partial_aggr_le =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A01.00E+= 000
. =C2=A0 =C2=A0. =C2=A0 =C2= =A01 =C2=A0 =C2=A0trafodion_scan =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0VISIT_FROM_HIVE =C2=A0 =C2=A0 =C2=A0 8.03E+006

--- SQL = operation complete.=C2=A0


On Tue, Sep 20, 2016= at 12:28 AM, Eric Owhadi <eric.owhadi@esgyn.com> wrote:=

I also see that you should not be = using DIVISION BY, since your V_DATE are already on day boundary. Using DIV= ISION_BY is counter-productive here.

=C2=A0

And sorry I opened wrong file, I had old and new ope= ned at the same time and looked wrong window J, you di= d change the charset and char length, my bad,

Eric

From: Eric Owhadi [mailto:eric.owhadi@esgyn.com]
= Sent: Tuesday, September 20, 2016 12:10 AM
To: '
=E4=B9=94=E5=BD=A6=E5=85=8B' <qyanke@gmail.com>; 'user@trafodi= on.incubator.apache.org' <user@trafodion.incubator.apach= e.org>
Cc: 'dev' <dev@trafodion.incubator.apa= che.org>; Qifan Chen <qifan.chen@esgyn.com>
Subject: RE: trafodio= n query optimization

=C2=A0

Hi Qiao

=C2=A0

Optimizer picked 6 as DOP= . You can force it higher:

=C2=A0<= /p>

Use

CQD paral= lel_num_esps =E2=80=9812=E2=80=99;

Then p= repare and run the query.

=C2=A0

You may experiment with re-generating you table w= ith SALT 15 and CQD parallel_num_esps =E2=80=9815=E2=80=99.

=C2=A0

you can even dr= aw a curve

SALT 18, parallel_num_esp =E2= =80=9818=E2=80=99

SALT =C2=A021, paralle= l_num_esps =E2=80=9821=E2=80=99

Etc 3 by= 3 until you see that performance stop getting better, and actually drops.<= /span>

=C2=A0

<= span style=3D"font-size:11.0pt;font-family:"Calibri",sans-serif">= I see you did not change VARCHAR to CHAR and did not use Charset ISO88591: = this is important to optimize row size. UTF8 is very greedy specially for P= K columns. When I see your sample sid, I wonder if you really need UTF8.

=C2=A0

Sa= me principle apply for the other columns, but less important than for key.<= /span>

=C2=A0

<= span style=3D"font-size:11.0pt;font-family:"Calibri",sans-serif">= Ultimately you can change the default value of HBASE_OPTIONS BLOCKSI= ZE

Default is 65565, you can try doubling or quad= rupling it. This will improve scan rate but crease any query that does rand= om access.

=C2=A0

Also = I checked in an experimental feature to allow scanning in parallel without = using ESP (using multithreading instead), in case memory resource consumed = by ESP becomes a bottleneck, but I would not recommend it yet as it is =E2= =80=9Cexperimental=E2=80=9D still=E2=80=A6

Except= if you start seeing memory pressure=E2=80=A6

Hop= e this helps,
Eric

=C2=A0

<= p class=3D"MsoNormal">=C2=A0

=C2=A0

=C2=A0

From: =E4=B9=94=E5=BD=A6=E5=85=8B [mailto:qyanke@g= mail.com]
Sent: Monday, September 19, 2016 = 11:38 PM
To: user@trafodion.incubator.apache.org
Cc:= dev <dev@trafodion.incubator.apache.org>; Eric Owhadi &l= t;eric.owhadi@es= gyn.com>; Qifan Chen <qifan.chen@esgyn.com>
Subject: Re: trafodio= n query optimization

=C2=A0

Many thanks =C2=A0to Eric and Qifan again.

=

I upload a new log about the = new ddl and the query plan.

@Eric, Following= your advice I modified the ddl, the sum query now takes 25seconds while th= e old ddl will take 90seconds.(there are 3 nodes in my cluster, 8cores(16 v= -cores) and 16GRAM), Is there any other optimize options to make the sum qu= ery more fast (better less than 10seconds)?

@Qifan, I log the output of showstats command in the attachment, look= ing forward more suggestions.

= =C2=A0

Thanks again.

Qiao

=C2=A0

Qifan Chen <qifan.chen@esgyn.com>=E4=BA=8E2016=E5=B9=B49=E6=9C=8819=E6=97=A5=E5=91=A8=E4=B8=80 =E4=B8=8B=E5=8D=889:23=E5=86=99=E9=81=93=EF=BC=9A

=

Hi Qiao,

=C2=A0

Thank you for = the data. It is very helpful.=C2=A0

= =C2=A0

There are several things notice= d.

  • The key columns = are:=C2=A0_SALT_, _DIVISION_1_, SID, V_DATE, UID, VID
  • The salt column is built from column SID only, which means all rows = with identical SID values V will be stored in the same partition.
  • From the query plan, the compiler assigns 6 executor pro= cesses (we call ESP) to read the salted table of 12 partitions: 1 ESP reads= 2 salt partitions.
  • The frequency of V is high = when sid=3D'6b2a0957' (~8million rows), all these relevant rows are= handled by one execution process out of 6. That probably is the reason of = not much parallelism observed.=C2=A0
  • You can us= e SQL command showstats=C2=A0with detail option to check the frequen= cy on column SID.=C2=A0
    • showstats for table=C2=A0TRAFODION.SEABASE.VISIT_F= ROM_HIVE2 on SID detail;
  • If high frequency per unique value on column SID is confirmed, we pro= bably should consider our next step of action. For example, we could add so= me columns from the primary key to the SALT clause to help spread V of SID = to all 12 partitions.

Could you please= send us the output of the showstats command above, and the showstats comma= nd below for all columns in the table? =C2=A0

=C2=A0

showstats for ta= ble TRAFODION.SEABASE.VISIT_FROM_HIVE2 on every column;

=

=C2=A0

Thanks -= -Qifan

=C2=A0

On Mon, Sep 19, 2016 at 3:15 AM, =E4=B9=94=E5=BD=A6=E5=85=8B <qyanke@gmail.c= om> wrote:

Thanks Eric = and Qifan. I am sorry to reply after so long a time because I'm on the = Chinese mid-autumn festival holiday.

Accordi= ng to Qifan's advice, I upload a log which contains the DDL and the que= ry plan hope to get more advice.

=C2= =A0

and= to Eric, I summit a jira about the block-encoding and the compression, https://issues.apache.org/jira/browse/TRAFODION-2195, so I on= ly use the hbase compression.

Qifan Che= n <qifan.chen@= esgyn.com>=E4=BA= =8E2016=E5=B9=B49=E6=9C=8812=E6=97=A5=E5=91=A8=E4= =B8=80 =E4=B8=8B= =E5=8D=8810:43=E5= =86=99=E9=81=93=EF=BC=9A

Hi Qiao,

=C2=A0

<= div>

You can also send us the DDL and the query plan = to help with the tuning.=C2=A0

=C2=A0<= /p>

To generate a query plan, do the follo= wing from sqlci, and the plan is in text file mylog.=C2=A0

=C2=A0

  1. log mylog clear;
  2. prepare xx f= rom <your query>;
  3. explain xx;
  4. explain options 'f' xx;
  5. exit;

=C2=A0

Thanks --Qifan

=C2=A0

On Mon, Sep 12, 2016 at 8:21 AM, E= ric Owhadi <e= ric.owhadi@esgyn.com> wrote:

Hello Qiao,
When you say whatever the table ddl it limits things a lot,= as table ddl
will help define several things that will drastically impr= ove the degree of
parallelism and the table size.

The DOP (degree= of parallelism) of the scan operator is constrained by the
number of re= gions your table uses. So if you want to increase DOP, you need
to parti= tion your table using the syntax like:
create table customer_demographic= s_salt
(
=C2=A0cd_demo_sk int not null
=C2=A0, cd_gender char(1)=C2=A0, cd_marital_status char(1)
=C2=A0, cd_education_status char(20)=
=C2=A0, cd_purchase_estimate int
=C2=A0, cd_credit_rating char(10)=C2=A0, cd_dep_count int
=C2=A0, cd_dep_employed_count int
=C2=A0, = cd_dep_college_count int
=C2=A0, primary key (cd_demo_sk)
)
salt u= sing 12 partitions
ATTRIBUTES ALIGNED FORMAT
HBASE_OPTIONS
=C2=A0 = (
=C2=A0 =C2=A0 DATA_BLOCK_ENCODING =3D 'FAST_DIFF',
=C2=A0 = =C2=A0 COMPRESSION =3D 'SNAPPY'
=C2=A0 );

you can experim= ent with different values of number of partitions (but pick a
multiple o= f 3 since you have 3 nodes).

Then the optimizer will pick the DOP wi= th a compromise of resource usage vs
gain in speed.
If you want to fo= rce higher DOP than what optimizer selected, you can use :
CQD parallel_= num_esps '12';
To force it to one ESP per partition (assuming yo= u picked 12 partitions).
You can verify what optimizer picked as DOP by = doing an explain on the
query.

Other important factors plays in p= erformance:
- use of aligned format (see above example)
- careful cho= ice of the primary key (bad idea to use a varchar with big max
size)
= - It is good idea to use compression and encoding (see the most common
o= ptions we use above)
- you can also increase the HBASE_OPTIONS BLOCKSIZE= parameter, but there is
a drawback: increasing it will increase perform= ance of SCAN but decrease
performance of keyed access. That is why I did= not include it in the example
above.

Hope this helps,
Regards= ,
Eric


-----Original Message-----
From: =E4=B9=94=E5=BD=A6=E5= =85=8B [mailto:qyanke@gmail.com]
Sent: Monday, September 12, 2016 1:22 AM
To: <= a href=3D"mailto:user@trafodion.incubator.apache.org" target=3D"_blank">use= r@trafodion.incubator.apache.org; dev@trafodion.incubator.apache= .org
Subject: trafodion query optimization

Hi all,
=C2=A0 = =C2=A0 =C2=A0I executed the sum and count query on my table where the clust= er has
three nodes. I found that the sum query is not well parallel exec= uted(not
all the three nodes get high load when executing the sum query)= and the cpu
load is very high while the memory load is very low(the mac= hines have 16
cores and 16GB memory). My sum query on the 12 million dat= a sets takes about
2 minutes and a half time.
=C2=A0 =C2=A0 So my que= stion is that is there any optimization advice that I can use
to improve= the query performance and maximize the usage of my machines, what
ever = the configuration or the table ddl.
=C2=A0 =C2=A0 Any replies is appreci= ated.

Thanks,
Qiao

<= br>

=C2=A0

=

--

Regards, --Qifan

=C2=A0



=C2=A0

--

Regards, --Qifan

=C2=A0

<= /div>


--
Regards, --Qifan

--001a113fb460197d81053cf06d8b--