Date: Thu, 5 Mar 2015 16:36:52 +0800
Subject: Re: Tajo query scheduler and performance question
From: Azuryy Yu <azuryyyu@gmail.com>
To: dev@tajo.apache.org

Thanks, Jihoon. It seems performance is good using your SQL.
1) dfs-dir-aware is set to true

default> select count(*) from (select dvc_id from tds_did_user_targ_day where dt='20150228' and platform='pc' group by dvc_id) t;
Progress: 0%, response time: 0.681 sec
Progress: 0%, response time: 0.683 sec
Progress: 0%, response time: 1.085 sec
Progress: 0%, response time: 1.886 sec
Progress: 25%, response time: 2.887 sec
Progress: 27%, response time: 3.889 sec
Progress: 32%, response time: 4.89 sec
Progress: 33%, response time: 5.892 sec
Progress: 33%, response time: 6.894 sec
Progress: 50%, response time: 7.895 sec
Progress: 50%, response time: 8.896 sec
Progress: 58%, response time: 9.897 sec
Progress: 58%, response time: 11.379 sec
Progress: 56%, response time: 12.38 sec
Progress: 65%, response time: 13.382 sec
Progress: 100%, response time: 14.135 sec
?count
-------------------------------
35620159
(1 rows, 14.135 sec, 9 B selected)

2) dfs-dir-aware is commented out

default> select count(*) from (select dvc_id from tds_did_user_targ_day where dt='20150228' and platform='pc' group by dvc_id) t;
Progress: 0%, response time: 0.94 sec
Progress: 0%, response time: 0.941 sec
Progress: 0%, response time: 1.342 sec
Progress: 0%, response time: 2.144 sec
Progress: 0%, response time: 3.145 sec
Progress: 5%, response time: 4.147 sec
Progress: 9%, response time: 5.148 sec
Progress: 12%, response time: 6.15 sec
Progress: 16%, response time: 7.152 sec
Progress: 21%, response time: 8.153 sec
Progress: 23%, response time: 9.154 sec
Progress: 29%, response time: 10.156 sec
Progress: 30%, response time: 11.157 sec
Progress: 32%, response time: 12.158 sec
Progress: 32%, response time: 13.159 sec
Progress: 33%, response time: 14.161 sec
Progress: 33%, response time: 15.164 sec
Progress: 40%, response time: 16.165 sec
Progress: 48%, response time: 17.166 sec
Progress: 54%, response time: 18.167 sec
Progress: 56%, response time: 19.169 sec
Progress: 60%, response time: 20.17 sec
Progress: 61%, response time: 21.171 sec
Progress: 58%, response time: 22.173 sec
Progress: 60%, response time: 23.174 sec
Progress: 66%, response time: 24.175 sec
Progress: 100%, response time: 24.383 sec
?count
-------------------------------
35620159
(1 rows, 24.383 sec, 9 B selected)

So, yes, dfs-dir-aware is helpful, and your SQL uses a different logical plan.
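
Enabling the option Jihoon mentions comes down to a single property in tajo-site.xml. A minimal sketch, assuming the Hadoop-style configuration format of the tajo-site shared later in this thread (the enclosing <configuration> element and exact placement are assumptions):

  <!-- enable DFS data-directory (disk) awareness in the worker's task scheduling,
       as suggested by Jihoon; commented out by default in the shared tajo-site -->
  <property>
    <name>tajo.worker.resource.dfs-dir-aware</name>
    <value>true</value>
  </property>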

On Thu, Mar 5, 2015 at 4:24 PM, Jihoon Son wrote:

> Thanks for sharing.
>
> It seems that the distinct query has some problems. Basically, distinct
> aggregation without a group-by clause should be performed with a
> multiple-level aggregation algorithm, but it does not seem to work that
> way.
>
> Would you test with the following equivalent query again?
>
> select count(*) from (select dvc_id from tds_did_user_targ_day where
> dt='20150228' and platform='pc' group by dvc_id) t;
>
> In addition, the performance will be improved if you enable the
> 'tajo.worker.resource.dfs-dir-aware' option, which is currently commented
> out.
>
> Best regards,
> Jihoon
>
> On Thu, Mar 5, 2015 at 4:51 PM Azuryy Yu wrote:
>
> > Thanks Jihoon.
> >
> > My test dataset has 20 files under one partition in RCFile format (96
> > columns); the first column is deviceID.
> > I only tested this one partition; if I use all partitions, count(distinct)
> > is even slower.
> >
> > I've set HDFS replication to 9 (I have 9 datanodes), the HDFS block size
> > is 64MB, and dfs.datanode.hdfs-blocks-metadata.enabled=true.
> >
> > The following are some test results:
> >
> > 1.
> > default> select count(*) from tds_did_user_targ_day where dt='20150228'
> > and platform='pc';
> > Progress: 0%, response time: 0.306 sec
> > Progress: 0%, response time: 0.307 sec
> > Progress: 3%, response time: 0.711 sec
> > Progress: 21%, response time: 1.513 sec
> > Progress: 43%, response time: 2.514 sec
> > Progress: 100%, response time: 3.139 sec
> > ?count
> > -------------------------------
> > 35743711
> > (1 rows, 3.139 sec, 9 B selected)
> >
> > 2.
> > default> select sum(cast(day_movie_vv as bigint)), sum(cast(day_movie_cv
> > as bigint)), sum(cast(day_movie_pt as bigint)) from tds_did_user_targ_day
> > where dt='20150228' and platform='pc';
> > Progress: 0%, response time: 0.299 sec
> > Progress: 0%, response time: 0.299 sec
> > Progress: 0%, response time: 0.7 sec
> > Progress: 6%, response time: 1.502 sec
> > Progress: 21%, response time: 2.503 sec
> > Progress: 32%, response time: 3.504 sec
> > Progress: 44%, response time: 4.505 sec
> > Progress: 50%, response time: 5.506 sec
> > Progress: 100%, response time: 5.568 sec
> > ?sum_3, ?sum_4, ?sum_5
> > -------------------------------
> > 7302934, 6453007, 6504000842
> > (1 rows, 5.568 sec, 27 B selected)
> >
> > 3)
> > default> select count(distinct dvc_id) from tds_did_user_targ_day where
> > dt='20150228' and platform='pc';
> > Progress: 0%, response time: 0.3 sec
> > Progress: 0%, response time: 0.301 sec
> > Progress: 0%, response time: 0.702 sec
> > Progress: 2%, response time: 1.503 sec
> > Progress: 3%, response time: 2.504 sec
> > Progress: 4%, response time: 3.506 sec
> > Progress: 10%, response time: 4.507 sec
> > Progress: 14%, response time: 5.508 sec
> > Progress: 17%, response time: 6.509 sec
> > Progress: 21%, response time: 7.51 sec
> > Progress: 25%, response time: 8.511 sec
> > Progress: 27%, response time: 9.512 sec
> > Progress: 28%, response time: 10.513 sec
> > Progress: 33%, response time: 11.514 sec
> > Progress: 33%, response time: 12.516 sec
> > Progress: 33%, response time: 13.52 sec
> > Progress: 50%, response time: 14.523 sec
> > Progress: 50%, response time: 15.525 sec
> > Progress: 50%, response time: 16.527 sec
> > Progress: 50%, response time: 17.529 sec
> > Progress: 50%, response time: 18.53 sec
> > Progress: 50%, response time: 19.531 sec
> > Progress: 50%, response time: 20.533 sec
> > Progress: 51%, response time: 21.534 sec
> > Progress: 51%, response time: 22.535 sec
> > Progress: 51%, response time: 23.536 sec
> > Progress: 52%, response time: 24.538 sec
> > Progress: 52%, response time: 25.539 sec
> > Progress: 54%, response time: 26.54 sec
> > Progress: 54%, response time: 27.542 sec
> > Progress: 54%, response time: 28.543 sec
> > Progress: 54%, response time: 29.545 sec
> > Progress: 55%, response time: 30.546 sec
> > Progress: 56%, response time: 31.547 sec
> > Progress: 57%, response time: 32.549 sec
> > Progress: 60%, response time: 33.55 sec
> > Progress: 60%, response time: 34.551 sec
> > Progress: 63%, response time: 35.552 sec
> > Progress: 64%, response time: 36.554 sec
> > Progress: 65%, response time: 37.555 sec
> > Progress: 66%, response time: 38.556 sec
> > Progress: 66%, response time: 39.559 sec
> > Progress: 66%, response time: 40.563 sec
> > Progress: 66%, response time: 41.567 sec
> > Progress: 66%, response time: 42.571 sec
> > Progress: 66%, response time: 43.575 sec
> > Progress: 66%, response time: 44.579 sec
> > Progress: 66%, response time: 45.584 sec
> > Progress: 66%, response time: 46.588 sec
> > Progress: 66%, response time: 47.592 sec
> > Progress: 66%, response time: 48.596 sec
> > Progress: 66%, response time: 49.6 sec
> > Progress: 66%, response time: 50.601 sec
> > Progress: 83%, response time: 51.602 sec
> > Progress: 83%, response time: 52.603 sec
> > Progress: 83%, response time: 53.604 sec
> > Progress: 83%, response time: 54.605 sec
> > Progress: 84%, response time: 55.606 sec
> > Progress: 84%, response time: 56.607 sec
> > Progress: 84%, response time: 57.608 sec
> > Progress: 84%, response time: 58.609 sec
> > Progress: 84%, response time: 59.61 sec
> > Progress: 85%, response time: 60.612 sec
> > Progress: 85%, response time: 61.613 sec
> > Progress: 85%, response time: 62.614 sec
> > Progress: 86%, response time: 63.615 sec
> > Progress: 86%, response time: 64.616 sec
> > Progress: 86%, response time: 65.617 sec
> > Progress: 88%, response time: 66.618 sec
> > Progress: 88%, response time: 67.619 sec
> > Progress: 88%, response time: 68.62 sec
> > Progress: 89%, response time: 69.621 sec
> > Progress: 89%, response time: 70.622 sec
> > Progress: 89%, response time: 71.623 sec
> > Progress: 89%, response time: 72.624 sec
> > Progress: 90%, response time: 73.625 sec
> > Progress: 90%, response time: 74.627 sec
> > Progress: 90%, response time: 75.628 sec
> > Progress: 90%, response time: 76.629 sec
> > Progress: 90%, response time: 77.63 sec
> > Progress: 90%, response time: 78.632 sec
> > Progress: 90%, response time: 79.633 sec
> > Progress: 90%, response time: 80.634 sec
> > Progress: 90%, response time: 81.636 sec
> > Progress: 86%, response time: 82.637 sec
> > Progress: 86%, response time: 83.638 sec
> > Progress: 86%, response time: 84.64 sec
> > Progress: 86%, response time: 85.641 sec
> > Progress: 86%, response time: 86.642 sec
> > Progress: 86%, response time: 87.643 sec
> > Progress: 88%, response time: 88.645 sec
> > Progress: 88%, response time: 89.646 sec
> > Progress: 88%, response time: 90.647 sec
> > Progress: 92%, response time: 91.648 sec
> > Progress: 92%, response time: 92.649 sec
> > Progress: 92%, response time: 93.65 sec
> > Progress: 92%, response time: 94.651 sec
> > Progress: 93%, response time: 95.652 sec
> > Progress: 93%, response time: 96.653 sec
> > Progress: 94%, response time: 97.654 sec
> > Progress: 94%, response time: 98.655 sec
> > Progress: 95%, response time: 99.656 sec
> > Progress: 95%, response time: 100.657 sec
> > Progress: 95%, response time: 101.658 sec
> > Progress: 95%, response time: 102.659 sec
> > Progress: 96%, response time: 103.66 sec
> > Progress: 96%, response time: 104.661 sec
> > Progress: 97%, response time: 105.662 sec
> > Progress: 97%, response time: 106.663 sec
> > Progress: 99%, response time: 107.665 sec
> > Progress: 99%, response time: 108.666 sec
> > Progress: 100%, response time: 109.074 sec
> > ?count
> > -------------------------------
> > 35620158
> > (1 rows, 109.074 sec, 9 B selected)
> >
> > For the last query, the logical plan is:
> >
> > Logical Plan
> >
> > -----------------------------
> > Query Block Graph
> > -----------------------------
> > |-#ROOT
> > -----------------------------
> > Optimization Log:
> > [LogicalPlan]
> > > ProjectionNode is eliminated.
> > > PartitionTableRewriter chooses 1 of partitions
> > -----------------------------
> >
> > GROUP_BY(3)()
> > => exprs: (count( distinct default.tds_did_user_targ_day.dvc_id (TEXT)))
> > => target list: ?count (INT8)
> > => out schema:{(1) ?count (INT8)}
> > => in schema:{(1) default.tds_did_user_targ_day.dvc_id (TEXT)}
> > PARTITIONS_SCAN(5) on default.tds_did_user_targ_day
> > => target list: default.tds_did_user_targ_day.dvc_id (TEXT)
> > => num of filtered paths: 1
> > => out schema: {(1) default.tds_did_user_targ_day.dvc_id (TEXT)}
> > => in schema: {(91) default.tds_did_user_targ_day.dvc_id (TEXT),
> >    default.tds_did_user_targ_day.user_id (TEXT),
> >    default.tds_did_user_targ_day.p1 (TEXT),
> >    default.tds_did_user_targ_day.p2 (TEXT),
> >    default.tds_did_user_targ_day.p3 (TEXT),
> >    default.tds_did_user_targ_day.prod_code (TEXT),
> >    default.tds_did_user_targ_day.login_ip (TEXT),
> >    default.tds_did_user_targ_day.cntry_name (TEXT),
> >    default.tds_did_user_targ_day.area_name (TEXT),
> >    default.tds_did_user_targ_day.prov_name (TEXT),
> >    default.tds_did_user_targ_day.city_name (TEXT),
> >    default.tds_did_user_targ_day.chnl_type (TEXT),
> >    default.tds_did_user_targ_day.chnl_type_name (TEXT),
> >    default.tds_did_user_targ_day.chnl_code (TEXT),
> >    default.tds_did_user_targ_day.chnl_name (TEXT),
> >    default.tds_did_user_targ_day.login_ref (TEXT),
> >    default.tds_did_user_targ_day.net_type (TEXT),
> >    default.tds_did_user_targ_day.oper_sys (TEXT),
> >    default.tds_did_user_targ_day.oper_sys_ver (TEXT),
> >    default.tds_did_user_targ_day.dvc_brand (TEXT),
> >    default.tds_did_user_targ_day.dvc_model (TEXT),
> >    default.tds_did_user_targ_day.dvc_type (TEXT),
> >    default.tds_did_user_targ_day.dev_dpi (TEXT),
> >    default.tds_did_user_targ_day.brows_name (TEXT),
> >    default.tds_did_user_targ_day.login_ts (TEXT),
> >    default.tds_did_user_targ_day.first_login_date (TEXT),
> >    default.tds_did_user_targ_day.first_login_ver (TEXT),
> >    default.tds_did_user_targ_day.last_login_date (TEXT),
> >    default.tds_did_user_targ_day.last_app_ver (TEXT),
> >    default.tds_did_user_targ_day.evil_ip (TEXT),
> >    default.tds_did_user_targ_day.day_pv (TEXT),
> >    default.tds_did_user_targ_day.day_input_pv (TEXT),
> >    default.tds_did_user_targ_day.day_ins_pv (TEXT),
> >    default.tds_did_user_targ_day.day_qry_pv (TEXT),
> >    default.tds_did_user_targ_day.day_outs_pv (TEXT),
> >    default.tds_did_user_targ_day.day_coop_pv (TEXT),
> >    default.tds_did_user_targ_day.day_vv (TEXT),
> >    default.tds_did_user_targ_day.day_cv (TEXT),
> >    default.tds_did_user_targ_day.day_pt (TEXT),
> >    default.tds_did_user_targ_day.day_vod_vv (TEXT),
> >    default.tds_did_user_targ_day.day_vod_cv (TEXT),
> >    default.tds_did_user_targ_day.day_vod_pt (TEXT),
> >    default.tds_did_user_targ_day.day_live_vv (TEXT),
> >    default.tds_did_user_targ_day.day_live_cv (TEXT),
> >    default.tds_did_user_targ_day.day_live_pt (TEXT),
> >    default.tds_did_user_targ_day.day_ca_vv (TEXT),
> >    default.tds_did_user_targ_day.day_ca_cv (TEXT),
> >    default.tds_did_user_targ_day.day_ca_pt (TEXT),
> >    default.tds_did_user_targ_day.day_try_vv (TEXT),
> >    default.tds_did_user_targ_day.day_try_cv (TEXT),
> >    default.tds_did_user_targ_day.day_try_pt (TEXT),
> >    default.tds_did_user_targ_day.day_pay_vv (TEXT),
> >    default.tds_did_user_targ_day.day_pay_cv (TEXT),
> >    default.tds_did_user_targ_day.day_pay_pt (TEXT),
> >    default.tds_did_user_targ_day.day_off_vv (TEXT),
> >    default.tds_did_user_targ_day.day_off_cv (TEXT),
> >    default.tds_did_user_targ_day.day_off_pt (TEXT),
> >    default.tds_did_user_targ_day.block_ts (TEXT),
> >    default.tds_did_user_targ_day.day_drag_ts (TEXT),
> >    default.tds_did_user_targ_day.day_drag_ahd_ts (TEXT),
> >    default.tds_did_user_targ_day.day_drag_bwd_ts (TEXT),
> >    default.tds_did_user_targ_day.day_click_ts (TEXT),
> >    default.tds_did_user_targ_day.day_instl_ts (TEXT),
> >    default.tds_did_user_targ_day.day_stup_ts (TEXT),
> >    default.tds_did_user_targ_day.day_movie_vv (TEXT),
> >    default.tds_did_user_targ_day.day_movie_cv (TEXT),
> >    default.tds_did_user_targ_day.day_movie_pt (TEXT),
> >    default.tds_did_user_targ_day.day_tvp_vv (TEXT),
> >    default.tds_did_user_targ_day.day_tvp_cv (TEXT),
> >    default.tds_did_user_targ_day.day_tvp_pt (TEXT),
> >    default.tds_did_user_targ_day.day_cartn_vv (TEXT),
> >    default.tds_did_user_targ_day.day_cartn_cv (TEXT),
> >    default.tds_did_user_targ_day.day_cartn_pt (TEXT),
> >    default.tds_did_user_targ_day.day_var_vv (TEXT),
> >    default.tds_did_user_targ_day.day_var_cv (TEXT),
> >    default.tds_did_user_targ_day.day_var_pt (TEXT),
> >    default.tds_did_user_targ_day.day_amuse_vv (TEXT),
> >    default.tds_did_user_targ_day.day_amuse_cv (TEXT),
> >    default.tds_did_user_targ_day.day_amuse_pt (TEXT),
> >    default.tds_did_user_targ_day.day_sport_vv (TEXT),
> >    default.tds_did_user_targ_day.day_sport_cv (TEXT),
> >    default.tds_did_user_targ_day.day_sport_pt (TEXT),
> >    default.tds_did_user_targ_day.day_music_vv (TEXT),
> >    default.tds_did_user_targ_day.day_music_cv (TEXT),
> >    default.tds_did_user_targ_day.day_music_pt (TEXT),
> >    default.tds_did_user_targ_day.day_fin_vv (TEXT),
> >    default.tds_did_user_targ_day.day_fin_cv (TEXT),
> >    default.tds_did_user_targ_day.day_fin_pt (TEXT),
> >    default.tds_did_user_targ_day.day_hot_vv (TEXT),
> >    default.tds_did_user_targ_day.day_hot_cv (TEXT),
> >    default.tds_did_user_targ_day.day_hot_pt (TEXT)}
> > => 0: hdfs://realtime-cluster/data/basetable/tds_did_user_targ_day/dt=20150228/platform=pc
> >
> > On Thu, Mar 5, 2015 at 3:17 PM, Jihoon Son wrote:
> >
> > > Hi Azuryy,
> > > truly sorry for the late response.
> > > I left some comments below.
> > >
> > > Sincerely,
> > > Jihoon
> > >
> > > On Wed, Mar 4, 2015 at 7:15 PM Azuryy Yu wrote:
> > >
> > > > Hi,
> > > >
> > > > I read the Tajo 0.9.0 source code and found that Tajo uses a simple
> > > > FIFO scheduler. I accept this at the current stage, but when Tajo
> > > > picks a query from the scheduler queue and then allocates workers for
> > > > it, the allocator only considers available resources on a random
> > > > worker list and then picks a set of workers.
> > > >
> > > > 1)
> > > > So my question is: why don't we consider HDFS locality? Otherwise the
> > > > network will be the bottleneck.
> > > >
> > > > I understand that Tajo currently doesn't use YARN as a scheduler and
> > > > has written a temporary, simple FIFO scheduler. I have also looked at
> > > > https://issues.apache.org/jira/browse/TAJO-540 , and I hope the new
> > > > Tajo scheduler will be similar to Sparrow.
> > > >
> > > It seems that there are some misunderstandings about our resource
> > > scheduling. The FIFO scheduler plays the role of the query scheduler.
> > > That is, given a list of submitted queries, it reserves the resources
> > > required to execute the queries consecutively. The Sparrow-like
> > > scheduler can be used for the concurrent execution of multiple queries.
> > >
> > > Once a query is started, the *task scheduler* is responsible for
> > > allocating tasks to workers. As you said, tasks are allocated to
> > > workers if they have enough resources.
> > > However, when allocating tasks, our task scheduler considers the
> > > physical disk on which the data is stored, as well as the node
> > > containing the data. For example, with your cluster, each worker can be
> > > assigned 12 tasks, each of which processes data stored on a different
> > > one of its 12 disks. Since a worker is generally equipped with multiple
> > > disks, this approach can utilize the disk bandwidth efficiently.
> > >
> > > You can see the locality information in Tajo's query master log. Here
> > > is an example.
> > > ...
> > > 2015-03-05 15:14:12,662 INFO
> > > org.apache.tajo.querymaster.DefaultTaskScheduler: Assigned
> > > Local/Rack/Total: (9264/1555/10819), Locality: 85.63%, Rack host:
> > > xxx.xxx.xxx.xxx
> > > ...
> > >
> > > > 2) Performance related.
> > > > I set up a 10-node cluster (1 master, 9 workers):
> > > > 64GB memory, 24 CPUs, 12*4TB HDDs, 1.6GB test data (160 million records).
> > > >
> > > > It works well for some aggregation SQL tests, except count(distinct).
> > > > count(distinct) is very slow - ten minutes.
> > > >
> > > This result looks strange, and it is difficult to tell what makes the
> > > query execution slow.
> > > Would you mind sharing some logs and additional information about the
> > > input data (# of files, the distribution of data on HDFS)?
> > > In addition, it would be great if you could share the evaluation
> > > results of other queries for which you think the response time is
> > > sufficiently short.
> > >
> > > > Who can give me a simple explanation of how Tajo handles
> > > > count(distinct)? I share my tajo-site here:
> > > >
> > > > <configuration>
> > > >   <property><name>tajo.rootdir</name><value>hdfs://realtime-cluster/tajo</value></property>
> > > >
> > > >   <property><name>tajo.master.umbilical-rpc.address</name><value>xx:26001</value></property>
> > > >   <property><name>tajo.master.client-rpc.address</name><value>xx:26002</value></property>
> > > >   <property><name>tajo.master.info-http.address</name><value>xx:26080</value></property>
> > > >   <property><name>tajo.resource-tracker.rpc.address</name><value>xx:26003</value></property>
> > > >   <property><name>tajo.catalog.client-rpc.address</name><value>xx:26005</value></property>
> > > >
> > > >   <property><name>tajo.worker.tmpdir.locations</name><value>file:///data/hadoop/data1/tajo,file:///data/hadoop/data2/tajo,file:///data/hadoop/data3/tajo,file:///data/hadoop/data4/tajo,file:///data/hadoop/data5/tajo,file:///data/hadoop/data6/tajo,file:///data/hadoop/data7/tajo,file:///data/hadoop/data8/tajo,file:///data/hadoop/data9/tajo,file:///data/hadoop/data10/tajo,file:///data/hadoop/data11/tajo,file:///data/hadoop/data12/tajo</value></property>
> > > >   <property><name>tajo.worker.tmpdir.cleanup-at-startup</name><value>true</value></property>
> > > >   <property><name>tajo.worker.history.expire-interval-minutes</name><value>60</value></property>
> > > >   <property><name>tajo.worker.resource.tajo.worker.resource.cpu-cores</name><value>24</value></property>
> > > >   <property><name>tajo.worker.resource.memory-mb</name><value>60512</value></property>
> > > >   <property><name>tajo.task.memory-slot-mb.default</name><value>3000</value></property>
> > > >   <property><name>tajo.task.disk-slot.default</name><value>1.0f</value></property>
> > > >   <property><name>tajo.shuffle.fetcher.parallel-execution.max-num</name><value>5</value></property>
> > > >   <property><name>tajo.executor.external-sort.thread-num</name><value>2</value></property>
> > > >
> > > >   <property><name>tajo.rpc.client.worker-thread-num</name><value>4</value></property>
> > > >   <property><name>tajo.cli.print.pause</name><value>false</value></property>
> > > > </configuration>
> > > >
> > > > tajo-env:
> > > > export TAJO_WORKER_HEAPSIZE=60000
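
Azuryy mentions having set dfs.datanode.hdfs-blocks-metadata.enabled=true on his cluster; that HDFS-side setting exposes block-to-volume (disk) metadata to clients, which disk-aware task scheduling of the kind Jihoon describes typically depends on. A minimal hdfs-site.xml sketch, assuming the standard Hadoop configuration format; the property name and value are taken from Azuryy's message above:

  <!-- allow clients to query which DataNode volume (disk) stores each block;
       reported by Azuryy as already enabled on this cluster -->
  <property>
    <name>dfs.datanode.hdfs-blocks-metadata.enabled</name>
    <value>true</value>
  </property>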