From: Raz Baluchi
Date: Thu, 1 Jun 2017 16:56:35 -0400
Subject: Re: Parquet on S3 - timeouts
To: user@drill.apache.org

I noticed that if I precede the query with a select count(*) with the same
filters, I no longer experience timeouts. By 'priming' the query in this way,
the second query is also faster. This seems to be an acceptable workaround, as
it seems to allow me to essentially include all partitions in the filter and
still get results pretty quickly.

I am still curious why this occurs.

On Thu, Jun 1, 2017 at 4:08 PM, Abhishek Girish wrote:

> Can you take a look at [1] and let us know if that helps resolve your
> issue?
>
> [1] https://drill.apache.org/docs/s3-storage-plugin/#quering-parquet-format-files-on-s3
>
> On Thu, Jun 1, 2017 at 12:55 PM, Raz Baluchi wrote:
>
> > Now that I have Drill working with parquet files on dfs, the next step was
> > to move the parquet files to S3.
> >
> > I get pretty good performance - I can query for events by date range
> > within 10 seconds. ( out of a total of ~ 800M events across 25 years)
> > However, there seems to be some threshold beyond which queries start
> > timing out.
> >
> > SYSTEM ERROR: ConnectionPoolTimeoutException: Timeout waiting for
> > connection from pool
> >
> > My first question is, is there a default timeout value to queries against
> > S3? Anything that takes longer than ~ 150 seconds seems to hit the timeout
> > error.
> >
> > The second question has to do with the possible conditions that trigger the
> > prolonged query time. It seems that if I increase the filters beyond a
> > certain number - it doesn't take much - the query times out.
> >
> > For example the query:
> >
> > select * from events where YEAR in (2012, 2013) works fine - however,
> > select * from events where YEAR in (2012, 2013, 2014) fails with a timeout.
> >
> > To make it worse, I can't use the first query either until I restart
> > drill...
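
The 'priming' workaround described at the top of this message can be sketched as a small script: issue a select count(*) with the same filters first, discard its result, then run the real query. This is only an illustration under stated assumptions, not a confirmed fix, and it does not explain why priming helps. The table name events and the YEAR filter come from the thread; the function names and the localhost URL are hypothetical, and the script assumes Drill's REST endpoint (/query.json) is reachable on the default web port 8047.

```python
import json
import urllib.request

# Assumption: a Drillbit is running locally with the web/REST port at its default.
DRILL_URL = "http://localhost:8047/query.json"


def build_queries(table, filter_clause):
    """Return (priming_query, main_query) that share the same WHERE clause."""
    priming = f"select count(*) from {table} where {filter_clause}"
    main = f"select * from {table} where {filter_clause}"
    return priming, main


def run_query(sql, url=DRILL_URL):
    """POST one SQL statement to Drill's REST API and return the parsed JSON reply."""
    payload = json.dumps({"queryType": "SQL", "query": sql}).encode("utf-8")
    request = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def primed_select(table, filter_clause, run=run_query):
    """Run the count(*) priming query first, then the real query with the same filters."""
    priming, main = build_queries(table, filter_clause)
    run(priming)  # result is discarded; the query is issued only to 'prime'
    return run(main)
```

For the failing case from the thread, the call would be primed_select("events", "YEAR in (2012, 2013, 2014)"). The run parameter exists only so the ordering of the two queries can be checked without a live Drill instance.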