trafodion-dev mailing list archives

From "Liu, Yao-Hua (Joshua)" <yaohua....@esgyn.cn>
Subject Re: how would esp do when it was launched?
Date Tue, 02 Jan 2018 03:02:12 GMT
Hi Dave,

	Thanks for your suggestion!
	Actually, the table is a Trafodion table; it just has _HIVE in its name. Results for your 3 steps:
	1. prepare
	  It took 2 seconds.
	2. execute the first time
	  It took 78 seconds. Of that, starting all the ESPs took less than 1 second.
	3. execute the second time
	  It took 3 seconds.

	So I am wondering what the ESPs do when they are launched.

Thanks
Joshua

-----Original Message-----
From: Dave Birdsall [mailto:dave.birdsall@esgyn.com]
Sent: December 31, 2017 8:42
To: dev@trafodion.apache.org; dev@trafodion.incubator.apache.org
Subject: RE: how would esp do when it was launched?

Hi,

How big is the table? How many esps are we creating? Perhaps we are creating the esps serially;
maybe that is what is taking the time.

Another factor to look at is compile time. You can separate that out as follows:

Step 1: using trafci, do a "prepare" of your query. See how long that takes.
Step 2: then execute the query. How long does that take?
Step 3: re-execute the query. How long does that take?
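
A minimal trafci sketch of these three steps, using the query from the original message. The statement name s1 is made up for illustration, and the sketch assumes your trafci version supports SET TIMING ON for reporting elapsed time:

```sql
-- Assumed trafci setting: print the elapsed time of each command.
SET TIMING ON;

-- Step 1: compile only; the elapsed time here is compile time.
PREPARE s1 FROM select count(*) from CELL_INDICATOR_HIVE where starttime=20170801000000000;

-- Step 2: first execution; any one-time startup cost shows up here.
EXECUTE s1;

-- Step 3: re-execution of the same prepared statement.
EXECUTE s1;
```

Comparing the three elapsed times separates compile time from first-execution cost and from steady-state execution cost.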

I expect part of that 80 seconds will be consumed in step 1 as compile time. Would be interesting
to know if compile time is, say, 2 seconds or 78 seconds. If the latter, perhaps the issue
is how we read statistics for a Hive table with many partitions.

Dave

-----Original Message-----
From: Liu, Yao-Hua (Joshua) [mailto:yaohua.liu@esgyn.cn] 
Sent: Friday, December 22, 2017 1:01 AM
To: dev@trafodion.incubator.apache.org
Subject: how would esp do when it was launched?

Hi all,

       Suresh and I found something interesting when running some queries.

       Step 1:
       Using trafci, run the query: select count(*) from CELL_INDICATOR_HIVE where starttime=20170801000000000;
       // CELL_INDICATOR_HIVE has 100 billion rows, and each starttime has 4346483 rows. Starttime is the first column in the store by key.
       This takes about 1 minute and 20 seconds to finish.
       Step 2:
       Run the same SQL again; this time it takes 3 seconds to finish.
       With 80s vs 3s, one might guess the difference is due to ESP start time or caching. But we checked:

1.     Starting all the ESPs takes less than 1 second.

2.     To check whether it is due to caching, we ran a test against another table:
       Step 3:
       Run another query: select count(*) from SERVERIP_INDICATOR_BAK where starttime=20170801000000000;
       // SERVERIP_INDICATOR_BAK has 64 billion rows, and each starttime has 2.8 million rows. Starttime is also the first column in the store by key.
       This takes 2 seconds to finish.

       By the way, if we start another trafci session (not connected to the same mxosrvr as above) and run the same
select count(*) from SERVERIP_INDICATOR_BAK where starttime=20170801000000000, it also
takes 1 minute or more.

       So we are wondering what an ESP does when it is started. Why does the first scan of one table
by the ESPs take so much time, while the second scan, against another table, is
much faster?

Thanks
Joshua