hawq-dev mailing list archives

From "Kuien Liu (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HAWQ-1550) Query on hawq_toolkit.hawq_log_master_concise is slower as time going
Date Tue, 14 Nov 2017 03:15:00 GMT
Kuien Liu created HAWQ-1550:

             Summary: Query on hawq_toolkit.hawq_log_master_concise is slower as time going
                 Key: HAWQ-1550
                 URL: https://issues.apache.org/jira/browse/HAWQ-1550
             Project: Apache HAWQ
          Issue Type: Improvement
            Reporter: Kuien Liu
            Assignee: Radar Lei

Over time, the log file size on the master grows linearly, and queries on
hawq_toolkit.hawq_log_master_concise get slower and slower.

I have collected a set of performance data (on a daily-build machine) with the following SQL:

select count(*) from hawq_toolkit.hawq_log_master_concise;

||log size||tuples||time||
|5.0M|17381 rows|291.866 ms|
|10.0M|32650 rows|522.552 ms|
|20.0M|5939 rows|938.230 ms|
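
To make the trend in the table above easier to read, here is a minimal sketch that computes the per-MB query cost and fits a straight line to the three measurements. The linear model and the 1-second budget used for the estimate are assumptions for illustration; three points from a single daily-build machine are only a rough guide.

```python
# Measurements copied from the table above: (log size in MB, query time in ms).
samples = [(5.0, 291.866), (10.0, 522.552), (20.0, 938.230)]

# Query cost per MB of log at each measured size.
ms_per_mb = [time_ms / size_mb for size_mb, time_ms in samples]

# Ordinary least-squares fit: time_ms ~= slope * size_mb + intercept.
n = len(samples)
mean_size = sum(size_mb for size_mb, _ in samples) / n
mean_time = sum(time_ms for _, time_ms in samples) / n
s_xy = sum((size_mb - mean_size) * (time_ms - mean_time) for size_mb, time_ms in samples)
s_xx = sum((size_mb - mean_size) ** 2 for size_mb, _ in samples)
slope = s_xy / s_xx
intercept = mean_time - slope * mean_size

# Largest log size whose predicted query time stays within the budget.
budget_ms = 1000.0  # assumed monitoring budget of 1 second
max_size_mb = (budget_ms - intercept) / slope

print("ms per MB:", [round(v, 1) for v in ms_per_mb])
print("fit: %.1f ms/MB + %.1f ms" % (slope, intercept))
print("estimated max log size for %.0f ms: %.1f MB" % (budget_ms, max_size_mb))
```

Under these assumptions the fitted line puts the 1-second limit slightly above 20 MB of log, which is consistent with the last row of the table.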

That means:
1. If we want to monitor hawq_log_master_concise within 1 second, in order to raise a warning
on ERROR, FATAL or PANIC, the log size must be constrained.
2. We also need a way to focus on the latest HOT log rotation files.

Besides, over time the log files become a burden for the master node, and it is somewhat expensive
for users to keep COLD logs on cloud instances; they may choose a public log service to handle
them instead.
We are considering adding two GUCs (log_max_size and log_max_age) to constrain log file size, and
introducing a new view (e.g., hawq_log_master_concise_hot) on fresh log files. What do you think?
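
One possible reading of the two proposed GUCs, sketched in Python purely for discussion: walk the rotated log files from newest to oldest and treat a file as HOT only while both limits still hold. The `LogFile` type, the `split_hot_cold` helper, the file names, and the units (bytes for log_max_size, seconds for log_max_age) are all hypothetical; the actual semantics of the GUCs are still open.

```python
from dataclasses import dataclass


@dataclass
class LogFile:
    name: str          # hypothetical file name
    size_bytes: int
    mtime: float       # last-modified time, seconds since the epoch


def split_hot_cold(files, log_max_size, log_max_age, now):
    """Split files into (hot, cold), newest first.

    A file is hot while it is younger than log_max_age (seconds, assumed)
    and the running total stays within log_max_size (bytes, assumed).
    Once one file is cold, every older file is cold as well.
    """
    hot, cold = [], []
    total = 0
    for f in sorted(files, key=lambda f: f.mtime, reverse=True):
        within_age = now - f.mtime <= log_max_age
        within_size = total + f.size_bytes <= log_max_size
        if not cold and within_age and within_size:
            hot.append(f)
            total += f.size_bytes
        else:
            cold.append(f)
    return hot, cold


MB = 1024 * 1024
DAY = 24 * 3600
now = 1_510_628_400.0  # arbitrary fixed timestamp for the example

files = [
    LogFile("master-log-a.csv", 8 * MB, now - 2 * DAY),
    LogFile("master-log-b.csv", 6 * MB, now - 1 * DAY),
    LogFile("master-log-c.csv", 5 * MB, now - 3600),
]

hot, cold = split_hot_cold(files, log_max_size=12 * MB, log_max_age=7 * DAY, now=now)
print("hot: ", [f.name for f in hot])
print("cold:", [f.name for f in cold])
```

In this reading, a view such as hawq_log_master_concise_hot would scan only the hot files, while the cold files could be archived or shipped to an external log service.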

Any comments and suggestions are welcome.

This message was sent by Atlassian JIRA
