drill-user mailing list archives

From Daniel McQuillen <daniel.mcquil...@gmail.com>
Subject Re: S3 with mixed files
Date Fri, 20 Oct 2017 21:27:32 GMT
Hi Arjun,

Yes! Thanks. I didn't have the "log" format in my storage plugin defined
correctly (it was missing the "extensions" key set to value "log").

However, when I try to query a file like abc.log.gz:

select * from ibios3.root.`/tracking/abc.log.gz`;


I get a different error:

org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR:
IllegalStateException: You tried to start when you are using a ValueWriter
of type NullableVarCharWriterImpl. Fragment 0:0 [Error Id:
33dedb5f-2e3d-4e54-a918-0ad3553436ce on
ip-10-0-0-24.us-west-1.compute.internal:31010]
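
An error of this kind can come up when a JSON field holds a plain string in some records and a nested object in others. That is an assumption about this data, not something confirmed in the thread. A minimal Python sketch for checking a local copy of a log file, assuming one JSON object per line; the function names are hypothetical:

```python
import gzip
import json
from collections import defaultdict


def field_types(path):
    """Map each top-level field name to the set of JSON types seen for it."""
    seen = defaultdict(set)
    # Read gzipped and plain files the same way, based on the file suffix.
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            record = json.loads(line)
            if not isinstance(record, dict):
                continue
            for name, value in record.items():
                if value is None:
                    continue  # nulls are ignored, they do not count as a type
                seen[name].add(type(value).__name__)
    return dict(seen)


def mixed_fields(path):
    """Return only the fields that appear with more than one type."""
    return {name: kinds for name, kinds in field_types(path).items() if len(kinds) > 1}
```

Calling `mixed_fields()` on a downloaded copy of the file lists any field whose type changes between records. An empty result would suggest the cause lies elsewhere.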

I've followed the docs and have my storage plugin defined as:

    "log": {
      "type": "json",
      "extensions": [
        "gz"
      ]
    },

I also tried (thinking maybe I'm misreading the docs and .gz support is
built in)...

    "log": {
      "type": "json",
      "extensions": [
        "log"
      ]
    },

and

    "log": {
      "type": "json",
      "extensions": [
        "log", "gz"
      ]
    },

with no luck.
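
For comparing the three variants above, here is a rough Python model of extension matching. It assumes Drill strips a known compression suffix such as ".gz" and then matches what remains against each format's "extensions" list. That is my understanding of the behaviour, not something confirmed in this thread, and the names and the suffix list below are hypothetical:

```python
# Assumed list of compression suffixes, for illustration only.
COMPRESSION_SUFFIXES = (".gz", ".bz2")


def match_format(file_name, formats):
    """Return the name of the first format whose extensions match, else None.

    `formats` maps a format name to its list of extensions, mirroring the
    "extensions" key in the storage plugin configuration.
    """
    # Try the full name first, then the name with a compression suffix removed.
    candidates = [file_name]
    for suffix in COMPRESSION_SUFFIXES:
        if file_name.endswith(suffix):
            candidates.append(file_name[: -len(suffix)])
    for format_name, extensions in formats.items():
        for ext in extensions:
            if any(name.endswith("." + ext) for name in candidates):
                return format_name
    return None
```

Under this model, `"extensions": ["log"]` covers both `tracking.log` and `abc.log.gz`, while `"extensions": ["gz"]` alone would leave `tracking.log` unmatched.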

Thanks for any further direction you can provide!

Best Regards,

Daniel





On Fri, Oct 20, 2017 at 6:52 PM, Arjun kr <arjun.kr@outlook.com> wrote:

> Hi Daniel,
>
> This error may occur if you don't have a format defined in the S3 storage
> plugin that handles the ".log" extension.
>
> For example:
>
> -- I have file input.csv and have csv format defined in s3 storage plugin.
>
> 2 rows selected (1.233 seconds)
> 0: jdbc:drill:schema=dfs> select * from s3.root.`test-dir/input.csv`;
> +--------------------------------------------------+
> |                     columns                      |
> +--------------------------------------------------+
> | ["\"Pespsi,Pepsi\",\"Pespsi,Pepsi [100.00]",""]  |
> | ["Pespsi,Pepsi\",\"Pespsi,Pepsi [100.00]",""]    |
> | ["Pespsi,Pepsi","Pespsi,Pepsi [100.00]"]         |
> +--------------------------------------------------+
> 3 rows selected (3.418 seconds)
>
> -- Renamed S3 file input.csv to input.log
>
> 0: jdbc:drill:schema=dfs> select * from s3.root.`test-dir/input.log`;
> Error: VALIDATION ERROR: From line 1, column 15 to line 1, column 16:
> Table 's3.root.test-dir/input.log' not found
>
> SQL Query null
>
> [Error Id: 5996db7d-c886-45a8-bddf-99f11159db66 on arjun-lab-73:31010]
> (state=,code=0)
> 0: jdbc:drill:schema=dfs>
>
> Thanks,
>
> Arjun
>
>
> ________________________________
> From: Divya Gehlot <divya.htconex@gmail.com>
> Sent: Friday, October 20, 2017 12:50 PM
> To: user@drill.apache.org
> Subject: Re: S3 with mixed files
>
> Hi Daniel,
> Can you try select * from ibios3.root.`./tracking/tracking.log`;
> instead of
> select * from ibios3.root.`tracking/tracking.log`;
>
> Thanks,
> Divya
>
>
> On 20 October 2017 at 13:13, Daniel McQuillen <daniel.mcquillen@gmail.com>
> wrote:
>
> > Thanks for your help, Padma!
> >
> > Just tried the following, per your suggestion:
> >
> > select * from ibios3.root.`tracking/tracking.log`;
> >
> > Still getting an error (although as I mentioned before I can do a 'show
> > files;' ok so the credentials must be working):
> >
> > org.apache.drill.common.exceptions.UserRemoteException: VALIDATION ERROR:
> > From line 1, column 15 to line 1, column 20: Table
> > 'ibios3.root.tracking/tracking.log' not found SQL Query null [Error Id:
> > fbd59cf8-d6ec-4022-b682-9b51d33f8302 on
> > ip-10-0-0-24.us-west-1.compute.internal:31010]
> >
> >
> > I tried from both the embedded command line and the web interface. Do you
> > have any other suggestions? Thanks in advance.
> >
> > Best Regards,
> >
> > Daniel
> >
> >
> >
> > On Fri, Oct 20, 2017 at 12:25 PM, Padma Penumarthy <ppenumarthy@mapr.com>
> > wrote:
> >
> > > From your error log, it seems like you may be specifying the table
> > > incorrectly.
> > > > Instead of 'ibios3.root.tracking/tracking.log', can you try
> > > ibios3.root.`tracking/tracking.log`
> > >
> > > i.e. for example, select * from ibios3.root.`tracking/tracking.log`
> > >
> > > Thanks
> > > Padma
> > >
> > >
> > > > On Oct 18, 2017, at 7:15 PM, Daniel McQuillen <daniel.mcquillen@gmail.com> wrote:
> > > >
> > > > Hi,
> > > >
> > > > > Attempting to use Apache Drill to parse Open edX tracking log files I have
> > > > > stored on S3.
> > > >
> > > > > I've successfully set up an S3 connection and I can see my different
> > > > > directories in the target S3 bucket when I type `show files;` in embedded
> > > > > drill. Hooray!
> > > >
> > > > > However, I can't seem to do a query. I keep getting a "not found" error
> > > > >
> > > > > SEVERE: org.apache.calcite.runtime.CalciteContextException: From line 1,
> > > > > column 15 to line 1, column 20: Table 'ibios3.root.tracking/tracking.log'
> > > > > not found
> > > >
> > > > > The "tracking" subdirectory has a most recent `tracking.log` file as well
> > > > > as a bunch of gzipped older files, e.g. `tracking-log-20170518-1234.gz`
> > > > > ... could this be confusing Drill? I've tried querying an individual file
> > > > > (tracking.log) as well as the directory itself, but no luck.
> > > >
> > > > Thanks for any thoughts!
> > > >
> > > >
> > > > - Daniel
> > >
> > >
> >
>
