drill-user mailing list archives

From "Updike, Clark" <Clark.Upd...@jhuapl.edu>
Subject Re: Re: HDFS file is listable but not queryable (object not found)
Date Thu, 23 Jul 2020 15:48:29 GMT
Sorry if my use of hdfs as the name caused any confusion.  I simply copied the dfs plugin to
hdfs to make it clear what it was; otherwise it is essentially the same as dfs, with
just the tweaks for HDFS, viz:

{
  "type": "file",
  "connection": "hdfs://nn01:8020",
  "config": null,
  "workspaces": {
    "root": {
      "location": "/",
      "writable": false,
      "defaultInputFormat": null,
      "allowAccessOutsideWorkspace": false
    }
  },
  "formats": {
....

I was thinking, perhaps naively, that because SHOW FILES lists the file, plugin
misconfiguration is ruled out as the cause...
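
To make the "essentially the same" comparison concrete, here is a minimal Python sketch that diffs the two plugin configs pasted in this thread. It uses only the fields visible in both messages: the "formats" section is elided in the hdfs config, so "formats" and "enabled" are left out of both sides rather than guessed. The helper names flatten and diff_configs are made up for illustration and are not part of Drill.

```python
import json

# Fields copied from the hdfs plugin pasted in this thread. The "formats"
# section was elided ("....") in the message, so it is deliberately omitted.
HDFS_PLUGIN = json.loads("""
{
  "type": "file",
  "connection": "hdfs://nn01:8020",
  "config": null,
  "workspaces": {
    "root": {"location": "/", "writable": false,
             "defaultInputFormat": null,
             "allowAccessOutsideWorkspace": false}
  }
}
""")

# The same top-level fields from the 1.17 dfs config quoted in the reply.
DFS_PLUGIN = json.loads("""
{
  "type": "file",
  "connection": "file:///",
  "config": null,
  "workspaces": {
    "tmp": {"location": "/tmp", "writable": true,
            "defaultInputFormat": null,
            "allowAccessOutsideWorkspace": false},
    "root": {"location": "/", "writable": false,
             "defaultInputFormat": null,
             "allowAccessOutsideWorkspace": false}
  }
}
""")

ABSENT = "<absent>"  # marker for a path that exists on one side only


def flatten(node, prefix=""):
    """Flatten nested dicts into {"a.b.c": value} pairs."""
    if not isinstance(node, dict):
        return {prefix: node}
    flat = {}
    for key, value in node.items():
        path = f"{prefix}.{key}" if prefix else key
        flat.update(flatten(value, path))
    return flat


def diff_configs(left, right):
    """Return {path: (left_value, right_value)} for every path that differs."""
    a, b = flatten(left), flatten(right)
    return {
        path: (a.get(path, ABSENT), b.get(path, ABSENT))
        for path in sorted(set(a) | set(b))
        if a.get(path, ABSENT) != b.get(path, ABSENT)
    }


if __name__ == "__main__":
    for path, (hdfs_value, dfs_value) in diff_configs(HDFS_PLUGIN, DFS_PLUGIN).items():
        print(f"{path}: hdfs={hdfs_value!r} dfs={dfs_value!r}")
```

On the fields shown, the only differences are the connection string and the extra tmp workspace in dfs. That is consistent with the configs being essentially the same, though it says nothing about the elided "formats" section.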

Thanks,
Clark

On 7/23/20, 11:43 AM, "Rafael Jaimes III" <rafjaimes@gmail.com> wrote:

    Right, but do you need the rest of the config at the top of the dfs default
    config? Here's what I assume to be the full config taken from my 1.17 dfs
    config (with other formats deleted):
    
    {
      "type": "file",
      "connection": "file:///",
      "config": null,
      "workspaces": {
        "tmp": {
          "location": "/tmp",
          "writable": true,
          "defaultInputFormat": null,
          "allowAccessOutsideWorkspace": false
        },
        "root": {
          "location": "/",
          "writable": false,
          "defaultInputFormat": null,
          "allowAccessOutsideWorkspace": false
        }
      },
      "formats": {
        "json": {
          "type": "json",
          "extensions": [
            "json"
          ]
        }
      },
      "enabled": true
    }
    
    - Rafael
    
    On Thu, Jul 23, 2020 at 11:37 AM Charles Givre <cgivre@gmail.com> wrote:
    
    > Rafael,
    > Clark is using the filesystem plugin to query a Hadoop cluster.  It seems
    > weird that you can enumerate the files in a directory but when you try to
    > query that file, it breaks...
    > -- C
    >
    >
    >
    > > On Jul 23, 2020, at 11:35 AM, Rafael Jaimes III <rafjaimes@gmail.com> wrote:
    > >
    > > Hi all,
    > >
    > > It looks like the file is 644 already which should be good.
    > > I'm confused why the schema is called hdfs. dfs is a pre-built schema for
    > > HDFS and querying against flat files such as .json as you're trying to do.
    > > The default config for dfs also has a lot more content than what you
    > > pasted. Can you use the default and try again?
    > >
    > > Hope this helps,
    > > Rafael
    > >
    > >
    > > On Thu, Jul 23, 2020 at 11:30 AM Charles Givre <cgivre@gmail.com> wrote:
    > >
    > >> Hi Clark,
    > >> That's strange.  My initial thought is that this could be a permission
    > >> issue.  However, it might also be that Drill isn't finding the file for
    > >> some reason.
    > >>
    > >> Could you try:
    > >>
    > >> SELECT *
    > >> FROM hdfs.`<full hdfs path to file>`
    > >>
    > >> Best,
    > >> --- C
    > >>
    > >>
    > >>> On Jul 23, 2020, at 11:23 AM, Updike, Clark <Clark.Updike@jhuapl.edu> wrote:
    > >>>
    > >>> This is in 1.17.  I can use SHOW FILES to list the file I'm targeting, but I cannot query it:
    > >>>
    > >>> apache drill> show files in hdfs.root.`/tmp/employee.json`;
    > >>> +---------------+-------------+--------+--------+----------+------------+-------------+-------------------------+-------------------------+
    > >>> |     name      | isDirectory | isFile | length |  owner   |   group    | permissions |       accessTime        |    modificationTime     |
    > >>> +---------------+-------------+--------+--------+----------+------------+-------------+-------------------------+-------------------------+
    > >>> | employee.json | false       | true   | 474630 | me       | supergroup | rw-r--r--   | 2020-07-23 10:53:15.055 | 2020-07-23 10:53:15.387 |
    > >>> +---------------+-------------+--------+--------+----------+------------+-------------+-------------------------+-------------------------+
    > >>> 1 row selected (3.039 seconds)
    > >>>
    > >>>
    > >>> apache drill> select * from hdfs.root.`/tmp/employee.json`;
    > >>> Error: VALIDATION ERROR: From line 1, column 15 to line 1, column 18: Object '/tmp/employee.json' not found within 'hdfs.root'
    > >>> [Error Id: 3b833622-4fac-4ecc-becd-118291cd8560 ] (state=,code=0)
    > >>>
    > >>> The storage plugin uses the standard json config:
    > >>>
    > >>>   "json": {
    > >>>     "type": "json",
    > >>>     "extensions": [
    > >>>       "json"
    > >>>     ]
    > >>>   },
    > >>>
    > >>> I can't see any problems on the HDFS side.  Full stack trace is below.
    > >>>
    > >>> Any ideas what could be causing this behavior?
    > >>>
    > >>> Thanks, Clark
    > >>>
    > >>>
    > >>>
    > >>> FULL STACKTRACE:
    > >>>
    > >>> apache drill> select * from hdfs.root.`/tmp/employee.json`;
    > >>> Error: VALIDATION ERROR: From line 1, column 15 to line 1, column 18: Object '/tmp/employee.json' not found within 'hdfs.root'
    > >>>
    > >>>
    > >>> [Error Id: 69c8ffc0-4933-4008-a786-85ad623578ea ]
    > >>>
    > >>> (org.apache.calcite.runtime.CalciteContextException) From line 1, column 15 to line 1, column 18: Object '/tmp/employee.json' not found within 'hdfs.root'
    > >>>   sun.reflect.NativeConstructorAccessorImpl.newInstance0():-2
    > >>>   sun.reflect.NativeConstructorAccessorImpl.newInstance():62
    > >>>   sun.reflect.DelegatingConstructorAccessorImpl.newInstance():45
    > >>>   java.lang.reflect.Constructor.newInstance():423
    > >>>   org.apache.calcite.runtime.Resources$ExInstWithCause.ex():463
    > >>>   org.apache.calcite.sql.SqlUtil.newContextException():824
    > >>>   org.apache.calcite.sql.SqlUtil.newContextException():809
    > >>>   org.apache.calcite.sql.validate.SqlValidatorImpl.newValidationError():4805
    > >>>   org.apache.calcite.sql.validate.IdentifierNamespace.resolveImpl():127
    > >>>   org.apache.calcite.sql.validate.IdentifierNamespace.validateImpl():177
    > >>>   org.apache.calcite.sql.validate.AbstractNamespace.validate():84
    > >>>   org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace():995
    > >>>   org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery():955
    > >>>   org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom():3109
    > >>>   org.apache.drill.exec.planner.sql.SqlConverter$DrillValidator.validateFrom():298
    > >>>   org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom():3091
    > >>>   org.apache.drill.exec.planner.sql.SqlConverter$DrillValidator.validateFrom():298
    > >>>   org.apache.calcite.sql.validate.SqlValidatorImpl.validateSelect():3363
    > >>>   org.apache.calcite.sql.validate.SelectNamespace.validateImpl():60
    > >>>   org.apache.calcite.sql.validate.AbstractNamespace.validate():84
    > >>>   org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace():995
    > >>>   org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery():955
    > >>>   org.apache.calcite.sql.SqlSelect.validate():216
    > >>>   org.apache.calcite.sql.validate.SqlValidatorImpl.validateScopedExpression():930
    > >>>   org.apache.calcite.sql.validate.SqlValidatorImpl.validate():637
    > >>>   org.apache.drill.exec.planner.sql.SqlConverter.validate():218
    > >>>   org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateNode():665
    > >>>   org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateAndConvert():199
    > >>>   org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan():172
    > >>>   org.apache.drill.exec.planner.sql.DrillSqlWorker.getQueryPlan():282
    > >>>   org.apache.drill.exec.planner.sql.DrillSqlWorker.getPhysicalPlan():162
    > >>>   org.apache.drill.exec.planner.sql.DrillSqlWorker.convertPlan():127
    > >>>   org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan():92
    > >>>   org.apache.drill.exec.work.foreman.Foreman.runSQL():590
    > >>>   org.apache.drill.exec.work.foreman.Foreman.run():275
    > >>>   java.util.concurrent.ThreadPoolExecutor.runWorker():1142
    > >>>   java.util.concurrent.ThreadPoolExecutor$Worker.run():617
    > >>>   java.lang.Thread.run():745
    > >>> Caused By (org.apache.calcite.sql.validate.SqlValidatorException) Object '/tmp/employee.json' not found within 'hdfs.root'
    > >>>   sun.reflect.NativeConstructorAccessorImpl.newInstance0():-2
    > >>>   sun.reflect.NativeConstructorAccessorImpl.newInstance():62
    > >>>   sun.reflect.DelegatingConstructorAccessorImpl.newInstance():45
    > >>>   java.lang.reflect.Constructor.newInstance():423
    > >>>   org.apache.calcite.runtime.Resources$ExInstWithCause.ex():463
    > >>>   org.apache.calcite.runtime.Resources$ExInst.ex():572
    > >>>   org.apache.calcite.sql.SqlUtil.newContextException():824
    > >>>   org.apache.calcite.sql.SqlUtil.newContextException():809
    > >>>   org.apache.calcite.sql.validate.SqlValidatorImpl.newValidationError():4805
    > >>>   org.apache.calcite.sql.validate.IdentifierNamespace.resolveImpl():127
    > >>>   org.apache.calcite.sql.validate.IdentifierNamespace.validateImpl():177
    > >>>   org.apache.calcite.sql.validate.AbstractNamespace.validate():84
    > >>>   org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace():995
    > >>>   org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery():955
    > >>>   org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom():3109
    > >>>   org.apache.drill.exec.planner.sql.SqlConverter$DrillValidator.validateFrom():298
    > >>>   org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom():3091
    > >>>   org.apache.drill.exec.planner.sql.SqlConverter$DrillValidator.validateFrom():298
    > >>>   org.apache.calcite.sql.validate.SqlValidatorImpl.validateSelect():3363
    > >>>   org.apache.calcite.sql.validate.SelectNamespace.validateImpl():60
    > >>>   org.apache.calcite.sql.validate.AbstractNamespace.validate():84
    > >>>   org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace():995
    > >>>   org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery():955
    > >>>   org.apache.calcite.sql.SqlSelect.validate():216
    > >>>   org.apache.calcite.sql.validate.SqlValidatorImpl.validateScopedExpression():930
    > >>>   org.apache.calcite.sql.validate.SqlValidatorImpl.validate():637
    > >>>   org.apache.drill.exec.planner.sql.SqlConverter.validate():218
    > >>>   org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateNode():665
    > >>>   org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateAndConvert():199
    > >>>   org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan():172
    > >>>   org.apache.drill.exec.planner.sql.DrillSqlWorker.getQueryPlan():282
    > >>>   org.apache.drill.exec.planner.sql.DrillSqlWorker.getPhysicalPlan():162
    > >>>   org.apache.drill.exec.planner.sql.DrillSqlWorker.convertPlan():127
    > >>>   org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan():92
    > >>>   org.apache.drill.exec.work.foreman.Foreman.runSQL():590
    > >>>   org.apache.drill.exec.work.foreman.Foreman.run():275
    > >>>   java.util.concurrent.ThreadPoolExecutor.runWorker():1142
    > >>>   java.util.concurrent.ThreadPoolExecutor$Worker.run():617
    > >>>   java.lang.Thread.run():745 (state=,code=0)
    > >>
    > >>
    >
    >
    
