trafodion-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Song, Hao-Lin" <>
Subject 答复: Question about trafodion insert
Date Sat, 28 Apr 2018 05:02:18 GMT
Hi Anoop,

Thanks very much for your explain!

宋昊霖 (Haolin(Leo) Song)

发件人: Anoop Sharma <> 
发送时间: 2018年4月28日 12:46
主题: RE: Question about trafodion insert

For each insert statement, sql engine need to:
- compile the query
- begin transaction
- send data to hbase server
- commit transaction.

For simple queries like an insert, most of the cost is in the 3rd step where values are sent
to hbase server.
This is a process hop from mxosrvr to hbase region server.

When the insert query has multiple values, then all those values are buffered up and sent
to hbase.
That will reduce the number of process sends that have to be done between mxosrvr and region

This is the same improvement that is seen if values are sent from client to server using rowsets.
In that mode, all rows that are included in the rowset are shipped in one call from client
to server.
That will make it run much faster compared to executing the query with one row.
(there are limits of rowset size and buffer size sent to hbase. Values are internally buffered
up to the max buffer/rowset size).

Compile time query cache will cache the insert statement.
 Which means that the statement ' insert into song values(i)' when issued multiple times will
use the cached plan.
But that will not reduce the buffer being sent from mxosrvr to region server (step #3 above).
That will be done whenever an insert stmt is issued.

In addition to use of rowsets or multiple values in one insert, there are other clauses that
can be done to make inserts run faster. Those are use of upsert to avoid duplicate detection,
use of 'upsert using load' to eliminate transaction being started/committed. Some of them
depend on application need and its usage.

another note: using multiple values in one insert statement (insert ... values (i), (i+1)...)
is ok if the number of values are 'small' but it is better to use rowsets if large number
of values are to be inserted.


-----Original Message-----
From: Song, Hao-Lin <>
Sent: Friday, April 27, 2018 7:50 PM
Subject: Question about trafodion insert

Hi all

I have a question about trafodion insert that if trafodion executes query ‘insert into table
values(i),(i+1)…,(i+100)’ N/100 times faster than executing query ‘insert into table
values(i) N times. So I made a simple test:
I used jdbcT4 to execute query ‘insert into song values(i)’ 10K times (i from 1 to 10K,
10k rows)and it cost about 89 seconds. Then I executed query ‘insert into song values(i),(I
+1),…,(i+100)’ 100 times(total of 10,000 rows) and it cost about 8.6 seconds.
I am not sure if it only because simple 1 needs to send buffer to mxosrvr 10k times. But I
am more concerned about if query ‘insert into table values(i),(),(),(I + N)’ will be executed
faster in sql engine than executing ‘insert into table values(i)’ N times.

Besides, how will the engine handle query ‘insert into table values(a),(b),(c)’? Will
engine use execute plan cache?

宋昊霖 (Haolin(Leo) Song)

View raw message