From issues-return-198030-apmail-hive-issues-archive=hive.apache.org@hive.apache.org Tue Aug 25 10:53:02 2020
Return-Path:
X-Original-To: apmail-hive-issues-archive@locus.apache.org
Delivered-To: apmail-hive-issues-archive@locus.apache.org
Received: from mailroute1-lw-us.apache.org (mailroute1-lw-us.apache.org [207.244.88.153]) by minotaur.apache.org (Postfix) with ESMTP id 2CE4719E19 for ; Tue, 25 Aug 2020 10:53:02 +0000 (UTC)
Received: from mail.apache.org (localhost [127.0.0.1]) by mailroute1-lw-us.apache.org (ASF Mail Server at mailroute1-lw-us.apache.org) with SMTP id D10B2124803 for ; Tue, 25 Aug 2020 10:53:01 +0000 (UTC)
Received: (qmail 75165 invoked by uid 500); 25 Aug 2020 10:53:01 -0000
Delivered-To: apmail-hive-issues-archive@hive.apache.org
Received: (qmail 75132 invoked by uid 500); 25 Aug 2020 10:53:01 -0000
Mailing-List: contact issues-help@hive.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Reply-To: dev@hive.apache.org
Delivered-To: mailing list issues@hive.apache.org
Received: (qmail 75118 invoked by uid 99); 25 Aug 2020 10:53:01 -0000
Received: from mailrelay1-us-west.apache.org (HELO mailrelay1-us-west.apache.org) (209.188.14.139) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 25 Aug 2020 10:53:01 +0000
Received: from jira-he-de.apache.org (static.172.67.40.188.clients.your-server.de [188.40.67.172]) by mailrelay1-us-west.apache.org (ASF Mail Server at mailrelay1-us-west.apache.org) with ESMTP id B50174078C for ; Tue, 25 Aug 2020 10:53:00 +0000 (UTC)
Received: from jira-he-de.apache.org (localhost.localdomain [127.0.0.1]) by jira-he-de.apache.org (ASF Mail Server at jira-he-de.apache.org) with ESMTP id 356DF780259 for ; Tue, 25 Aug 2020 10:53:00 +0000 (UTC)
Date: Tue, 25 Aug 2020 10:53:00 +0000 (UTC)
From: "ASF GitHub Bot (Jira)"
To: issues@hive.apache.org
Message-ID:
In-Reply-To:
References:
Subject: [jira] [Work logged] (HIVE-24061) Improve llap task scheduling for better cache hit rate
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394

     [ https://issues.apache.org/jira/browse/HIVE-24061?focusedWorklogId=474236&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-474236 ]

ASF GitHub Bot logged work on HIVE-24061:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 25/Aug/20 10:52
            Start Date: 25/Aug/20 10:52
    Worklog Time Spent: 10m
      Work Description: rbalamohan opened a new pull request #1431:
URL: https://github.com/apache/hive/pull/1431

   https://issues.apache.org/jira/browse/HIVE-24061

   Changes:
   1. Adjust locality delay when the task is getting scheduled.
   2. Reset locality delay when all nodes in the cluster are busy and wouldn't be able to schedule tasks.
   3. Optimize schedulePendingTasks to exit early, when all nodes are busy. This helps in reducing lock contention as well.

   Patch was tested on a medium scale cluster and observed good improvement in runtime of queries.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

Issue Time Tracking
-------------------

    Worklog Id:     (was: 474236)
    Remaining Estimate: 0h
            Time Spent: 10m

> Improve llap task scheduling for better cache hit rate
> -------------------------------------------------------
>
>                 Key: HIVE-24061
>                 URL: https://issues.apache.org/jira/browse/HIVE-24061
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Rajesh Balamohan
>            Priority: Major
>              Labels: perfomance
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> TaskInfo is initialized with the "requestTime and locality delay". When lots of vertices are in the same level, "taskInfo" details would be available upfront.
> By the time it gets to scheduling, "requestTime + localityDelay" won't be higher than the current time. Due to this, the locality delay is effectively skipped and the scheduler ends up choosing a random node. This ends up missing cache hits and reads data from remote storage.
> E.g., observed this pattern in Q75 of tpcds.
> Related lines of interest in scheduler: [https://github.com/apache/hive/blob/master/llap-tez/src/java/org/apache/hadoop/hive/llap/tezplugins/LlapTaskSchedulerService.java |https://github.com/apache/hive/blob/master/llap-tez/src/java/org/apache/hadoop/hive/llap/tezplugins/LlapTaskSchedulerService.java]
> {code:java}
>    boolean shouldDelayForLocality = request.shouldDelayForLocality(schedulerAttemptTime);
> ..
> ..
>     boolean shouldDelayForLocality(long schedulerAttemptTime) {
>       return localityDelayTimeout > schedulerAttemptTime;
>     }
> {code}
>
> Ideally, "localityDelayTimeout" should be adjusted based on its first scheduling opportunity.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
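
The adjustment proposed in the description can be illustrated with a minimal, self-contained sketch. It is not the code from pull request #1431 and not the actual LlapTaskSchedulerService implementation. The class LocalityDelaySketch, the nested TaskRequest class, the field localityDelayMs and the method adjustLocalityDelay are hypothetical names chosen for illustration; only the comparison inside shouldDelayForLocality is taken from the excerpt quoted above. The sketch assumes the fix is to restart the locality window at the task's first scheduling attempt instead of at request time.

```java
// Hypothetical illustration only; names other than shouldDelayForLocality
// and localityDelayTimeout are assumptions, not Hive APIs.
final class LocalityDelaySketch {

    /** Stand-in for the scheduler's per-task state (not Hive's TaskInfo). */
    static final class TaskRequest {
        private final long localityDelayMs;   // configured locality delay
        private long localityDelayTimeout;    // absolute time the delay expires
        private boolean adjusted;             // true once the window was restarted

        TaskRequest(long requestTime, long localityDelayMs) {
            this.localityDelayMs = localityDelayMs;
            // Behaviour described in the issue: the timeout is fixed at request time.
            this.localityDelayTimeout = requestTime + localityDelayMs;
        }

        /** Same comparison as the excerpt quoted in the issue. */
        boolean shouldDelayForLocality(long schedulerAttemptTime) {
            return localityDelayTimeout > schedulerAttemptTime;
        }

        /**
         * Assumed fix: restart the locality window the first time the
         * scheduler actually looks at this task. Later calls are no-ops.
         */
        void adjustLocalityDelay(long firstAttemptTime) {
            if (!adjusted) {
                localityDelayTimeout = firstAttemptTime + localityDelayMs;
                adjusted = true;
            }
        }
    }

    public static void main(String[] args) {
        long requestTime = 1_000L;   // task requested early
        long delayMs = 200L;         // locality delay
        long firstAttempt = 5_000L;  // scheduler reaches the task much later

        TaskRequest stale = new TaskRequest(requestTime, delayMs);
        // Window expired before the first attempt, so locality is not honoured.
        System.out.println(stale.shouldDelayForLocality(firstAttempt));

        TaskRequest fresh = new TaskRequest(requestTime, delayMs);
        fresh.adjustLocalityDelay(firstAttempt);
        // Window now runs from the first attempt, so the task can wait for its preferred node.
        System.out.println(fresh.shouldDelayForLocality(firstAttempt));
        // After the delay has elapsed, the task may fall back to any node.
        System.out.println(fresh.shouldDelayForLocality(firstAttempt + 300L));
    }
}
```

With these illustrative numbers the program prints false, true, false: the unadjusted request has already lost its locality window by the first attempt, while the adjusted one keeps it for the configured delay and then gives it up.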