From: Charles Givre
To: user@drill.apache.org
Subject: Re: Drill fails to query pcap files
Date: Sun, 10 Feb 2019 10:59:01 -0500

If I can get some more examples of corrupted files I’ll test more thoroughly. Also, we’ll need to apply the same methodology to PCAP-NG, so I’ll need some examples there as well. My strategy is going to be to get as much data as possible out of the corrupt packet.
-- C

> On Feb 10, 2019, at 10:54, Ted Dunning wrote:
>
> I think that accessing fields in corrupted packets will also cause
> exceptions. But this is a great start. Conditionalizing field access on
> !is_corrupt() might be sufficient for the next step.
>
> On Sun, Feb 10, 2019 at 4:58 AM Charles Givre wrote:
>
>> All,
>> I posted the following PR for this issue:
>> https://github.com/apache/drill/pull/1637
>>
>> Basically this PR does two things:
>> 1. It creates a boolean column called is_corrupt, and
>> 2. If the PCAP file has a corrupt row, it marks that row as corrupt by
>> setting is_corrupt to true and keeps going.
>>
>> With the example from Giovanni, I was able to find 590 or so corrupt rows
>> out of 7000 in that PCAP file. It was late and I don’t know if that was
>> what it was supposed to find, but it worked and I was able to query that file.
>> If you guys could send a few more examples, I’d like to test this on other
>> files to make sure it works with them. I assume we’re also going to have to
>> do the same thing for the PCAP-NG format.
>>
>>> On Feb 10, 2019, at 03:07, Ted Dunning wrote:
>>>
>>> On Sat, Feb 9, 2019 at 2:25 PM Bob Rudis wrote:
>>>
>>>> ...
>>>> And, I did indeed find a few and am just waiting for a formal review so
>>>> I can submit them for the Drill dev & tests.
>>>>
>>>
>>> Awesome!
>>
>>
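
For readers following the thread, the approach discussed above (flag a corrupt row, keep reading, and only touch packet fields when the row is not flagged) can be sketched roughly as follows. This is an illustrative Python sketch, not the Java code from the PR: only the column name is_corrupt comes from the thread. The row type, the other field names, the helper functions, and the "shorter than an Ethernet header" corruption check are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional

ETHERNET_HEADER_LEN = 14  # dst MAC (6) + src MAC (6) + EtherType (2)


@dataclass
class PacketRow:
    """One output row; every field name except is_corrupt is hypothetical."""
    is_corrupt: bool
    packet_length: int
    src_mac: Optional[str] = None
    dst_mac: Optional[str] = None
    ether_type: Optional[int] = None


def format_mac(raw: bytes) -> str:
    """Render six raw bytes as a colon-separated MAC address."""
    return ":".join(f"{b:02x}" for b in raw)


def decode_rows(frames: Iterable[bytes]) -> Iterator[PacketRow]:
    """Yield one row per frame, flagging undecodable frames instead of raising."""
    for frame in frames:
        # Assumed corruption check: the frame is too short to hold a header.
        row = PacketRow(is_corrupt=len(frame) < ETHERNET_HEADER_LEN,
                        packet_length=len(frame))
        # Field access is conditional on the row not being corrupt.
        if not row.is_corrupt:
            row.dst_mac = format_mac(frame[0:6])
            row.src_mac = format_mac(frame[6:12])
            row.ether_type = int.from_bytes(frame[12:14], "big")
        yield row


# A well-formed frame, and a copy cut off in the middle of the source MAC.
good = bytes.fromhex("ffffffffffff" "001122334455" "0800") + b"\x00" * 20
truncated = good[:9]

rows = list(decode_rows([good, truncated, good]))
corrupt_count = sum(row.is_corrupt for row in rows)
```

The truncated frame still produces a row (with its length recorded and its header fields left empty), so reading continues past it and a query can filter or count on is_corrupt afterwards.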