From: Nitin Pawar
Date: Wed, 8 Jan 2020 00:30:17 +0530
Subject: Re: Question about foreman restart
To: user@drill.apache.org
Cc: dev@drill.apache.org

Hi Paul,

Thanks for the response.
I will attach OOM logs to tickets once I get them from my OPS team.

Thanks,
Nitin

On Wed, Jan 8, 2020 at 12:27 AM Paul Rogers wrote:

> Hi Nitin,
>
> Thanks for letting us know about the OOM issues. These are serious and
> we should focus on finding the cause and fixing them. In general, it is
> the goal of the Drill project that Drill suffer no OOM errors on a
> cluster configured properly for your target workload.
>
> Thank you for filing a JIRA ticket. The stack trace in that ticket
> describes a connection shut down. Your e-mail mentioned an OOM error.
> Can you attach a stack trace or log entry that led you to believe you
> were getting an OOM error? How many queries are running at the time of
> the error?
>
> As you know, Drill uses two kinds of memory: heap and off-heap (AKA
> "direct" or "unsafe"). Generally, you want much more off-heap than heap
> memory. But, until we know which kind is being exhausted, it is hard to
> say what to adjust.
>
> If a Drillbit fails, all queries anywhere on the cluster will fail. The
> reason is simple: all queries are distributed across all nodes. This is
> why we must find and fix the underlying OOM error.
>
> On a 64 GB machine, if you are running only Drill, you can give most of
> the memory to Drill itself. Determine how much your OS and other
> processes need. Then, split the rest between heap and off-heap. It is
> very likely you have already customized the Drill memory settings: it
> is the first thing everyone does when deploying. [1] Check your
> settings.
>
> Until we know if you are running out of heap vs. off-heap, it is hard
> to suggest which setting to adjust. If it is heap memory that is
> affected, then you can increase the heap memory setting to see what
> effect that has on Drillbit lifetime.
>
> Thanks,
> - Paul
>
> [1] http://drill.apache.org/docs/configuring-drill-memory/
>
>
> On Tuesday, January 7, 2020, 08:45:46 AM PST, Nitin Pawar
> <nitinpawar432@gmail.com> wrote:
>
> Hello Team,
>
> We have recently upgraded to drill-1.16 from drill-1.13, and we have
> started to notice lots of OOM issues. It's the same setup with changed
> binaries. Until we figure out what’s the issue, we wanted to keep
> restarting drillbits with cronjobs.
>
> My question is: *If a drill is restarted, would the queries with this
> node as foreman be resubmitted automatically?*
>
> Also, we have 64GB RAM machines. Can someone recommend memory settings
> for this environment?
>
> --
> Nitin Pawar

--
Nitin Pawar
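
The sizing approach Paul describes for a 64 GB machine (reserve what the OS and other processes need, then split the rest between heap and off-heap, favoring off-heap) can be sketched as below. This is only an illustration, not a recommendation from the thread: the 8 GB reservation and the 25% heap share are assumed placeholder values, and `split_drill_memory` is a hypothetical helper. The `DRILL_HEAP` and `DRILL_MAX_DIRECT_MEMORY` names are the drill-env.sh settings covered in [1]; verify them against your Drill version before use.

```python
def split_drill_memory(total_gb, reserved_gb, heap_fraction=0.25):
    """Split the memory left after the OS reservation into (heap, direct) GB.

    heap_fraction is an assumed starting point, chosen only so that
    off-heap (direct) memory ends up much larger than heap.
    """
    if not 0 < heap_fraction < 1:
        raise ValueError("heap_fraction must be between 0 and 1")
    available_gb = total_gb - reserved_gb
    if available_gb <= 1:
        raise ValueError("not enough memory left for Drill after the reservation")
    # Give heap its share (at least 1 GB); everything else goes to direct memory.
    heap_gb = max(1, int(available_gb * heap_fraction))
    direct_gb = available_gb - heap_gb
    return heap_gb, direct_gb


# Assumed example: 64 GB machine, 8 GB kept back for the OS and other processes.
heap_gb, direct_gb = split_drill_memory(total_gb=64, reserved_gb=8)

# Emit lines in the style of drill-env.sh overrides.
print(f'export DRILL_HEAP="{heap_gb}G"')
print(f'export DRILL_MAX_DIRECT_MEMORY="{direct_gb}G"')
```

Which of the two values to raise still depends on whether the logs show heap or direct memory being exhausted, as noted in the reply above.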