drill-user mailing list archives

From Rafael Jaimes III <rafjai...@gmail.com>
Subject Re: HDFS file is listable but not queryable (object not found)
Date Thu, 23 Jul 2020 15:43:00 GMT
Right, but do you also need the rest of the settings that sit at the top of
the default dfs config? Here's what I assume to be the full config, taken from
my 1.17 dfs config (with the other formats deleted):

{
  "type": "file",
  "connection": "file:///",
  "config": null,
  "workspaces": {
    "tmp": {
      "location": "/tmp",
      "writable": true,
      "defaultInputFormat": null,
      "allowAccessOutsideWorkspace": false
    },
    "root": {
      "location": "/",
      "writable": false,
      "defaultInputFormat": null,
      "allowAccessOutsideWorkspace": false
    }
  },
  "formats": {
    "json": {
      "type": "json",
      "extensions": [
        "json"
      ]
    }
  },
  "enabled": true
}
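
One way to compare a trimmed plugin definition against the config above is a
small script. This is only a rough Python sketch and not anything Drill ships:
check_plugin, EXPECTED_KEYS, and the rules they apply are assumptions made for
illustration, based on the keys that appear in the config above.

```python
import json
from pathlib import PurePosixPath

# The config from this message, with "formats" trimmed to json as above.
CONFIG = json.loads("""
{
  "type": "file",
  "connection": "file:///",
  "config": null,
  "workspaces": {
    "tmp": {"location": "/tmp", "writable": true,
            "defaultInputFormat": null, "allowAccessOutsideWorkspace": false},
    "root": {"location": "/", "writable": false,
             "defaultInputFormat": null, "allowAccessOutsideWorkspace": false}
  },
  "formats": {"json": {"type": "json", "extensions": ["json"]}},
  "enabled": true
}
""")

# Assumption: the top-level keys one would expect to be filled in.
EXPECTED_KEYS = ("type", "connection", "workspaces", "formats")


def check_plugin(config, path):
    """Return a list of likely problems; an empty list means none were spotted."""
    problems = [f"missing or empty '{key}'" for key in EXPECTED_KEYS
                if not config.get(key)]
    if not config.get("enabled", False):
        problems.append("plugin is not enabled")

    # Collect every file extension that some format entry claims.
    formats = config.get("formats") or {}
    known = {ext for fmt in formats.values()
             for ext in (fmt.get("extensions") or [])}

    # Assumption: a workspace-level defaultInputFormat can stand in for an
    # extension that no format entry lists.
    workspaces = config.get("workspaces") or {}
    has_default = any(ws.get("defaultInputFormat") for ws in workspaces.values())

    ext = PurePosixPath(path).suffix.lstrip(".")
    if ext not in known and not has_default:
        problems.append(f"no format lists extension '{ext}'")
    return problems


# The full config above, then a fragment holding only the "formats" section.
print(check_plugin(CONFIG, "/tmp/employee.json"))
print(check_plugin({"formats": CONFIG["formats"]}, "/tmp/employee.json"))
```

The checks are deliberately shallow: they only flag missing top-level keys and
unmapped file extensions, and say nothing about whether the connection string
actually points at the intended filesystem.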

- Rafael

On Thu, Jul 23, 2020 at 11:37 AM Charles Givre <cgivre@gmail.com> wrote:

> Rafael,
> Clark is using the filesystem plugin to query a Hadoop cluster.  It seems
> weird that you can enumerate the files in a directory but when you try to
> query that file, it breaks...
> -- C
>
>
>
> > On Jul 23, 2020, at 11:35 AM, Rafael Jaimes III <rafjaimes@gmail.com> wrote:
> >
> > Hi all,
> >
> > It looks like the file is already 644, which should be fine.
> > I'm confused about why the schema is called hdfs. dfs is a pre-built schema
> > for HDFS and for querying flat files such as .json, as you're trying to do.
> > The default config for dfs also has a lot more content than what you
> > pasted. Can you use the default and try again?
> >
> > Hope this helps,
> > Rafael
> >
> >
> > On Thu, Jul 23, 2020 at 11:30 AM Charles Givre <cgivre@gmail.com> wrote:
> >
> >> Hi Clark,
> >> That's strange.  My initial thought is that this could be a permission
> >> issue.  However, it might also be that Drill isn't finding the file for
> >> some reason.
> >>
> >> Could you try:
> >>
> >> SELECT *
> >> FROM hdfs.`<full hdfs path to file>`
> >>
> >> Best,
> >> --- C
> >>
> >>
> >>> On Jul 23, 2020, at 11:23 AM, Updike, Clark <Clark.Updike@jhuapl.edu> wrote:
> >>>
> >>> This is in 1.17.  I can use SHOW FILES to list the file I'm targeting, but I cannot query it:
> >>>
> >>> apache drill> show files in hdfs.root.`/tmp/employee.json`;
> >>> +---------------+-------------+--------+--------+----------+------------+-------------+-------------------------+-------------------------+
> >>> |     name      | isDirectory | isFile | length |  owner   |   group    | permissions |       accessTime        |    modificationTime     |
> >>> +---------------+-------------+--------+--------+----------+------------+-------------+-------------------------+-------------------------+
> >>> | employee.json | false       | true   | 474630 | me       | supergroup | rw-r--r--   | 2020-07-23 10:53:15.055 | 2020-07-23 10:53:15.387 |
> >>> +---------------+-------------+--------+--------+----------+------------+-------------+-------------------------+-------------------------+
> >>> 1 row selected (3.039 seconds)
> >>>
> >>>
> >>> apache drill> select * from hdfs.root.`/tmp/employee.json`;
> >>> Error: VALIDATION ERROR: From line 1, column 15 to line 1, column 18: Object '/tmp/employee.json' not found within 'hdfs.root'
> >>> [Error Id: 3b833622-4fac-4ecc-becd-118291cd8560 ] (state=,code=0)
> >>>
> >>> The storage plugin uses the standard json config:
> >>>
> >>>   "json": {
> >>>     "type": "json",
> >>>     "extensions": [
> >>>       "json"
> >>>     ]
> >>>   },
> >>>
> >>> I can't see any problems on the HDFS side.  Full stack trace is below.
> >>>
> >>> Any ideas what could be causing this behavior?
> >>>
> >>> Thanks, Clark
> >>>
> >>>
> >>>
> >>> FULL STACKTRACE:
> >>>
> >>> apache drill> select * from hdfs.root.`/tmp/employee.json`;
> >>> Error: VALIDATION ERROR: From line 1, column 15 to line 1, column 18: Object '/tmp/employee.json' not found within 'hdfs.root'
> >>>
> >>> [Error Id: 69c8ffc0-4933-4008-a786-85ad623578ea ]
> >>>
> >>> (org.apache.calcite.runtime.CalciteContextException) From line 1, column 15 to line 1, column 18: Object '/tmp/employee.json' not found within 'hdfs.root'
> >>>   sun.reflect.NativeConstructorAccessorImpl.newInstance0():-2
> >>>   sun.reflect.NativeConstructorAccessorImpl.newInstance():62
> >>>   sun.reflect.DelegatingConstructorAccessorImpl.newInstance():45
> >>>   java.lang.reflect.Constructor.newInstance():423
> >>>   org.apache.calcite.runtime.Resources$ExInstWithCause.ex():463
> >>>   org.apache.calcite.sql.SqlUtil.newContextException():824
> >>>   org.apache.calcite.sql.SqlUtil.newContextException():809
> >>>   org.apache.calcite.sql.validate.SqlValidatorImpl.newValidationError():4805
> >>>   org.apache.calcite.sql.validate.IdentifierNamespace.resolveImpl():127
> >>>   org.apache.calcite.sql.validate.IdentifierNamespace.validateImpl():177
> >>>   org.apache.calcite.sql.validate.AbstractNamespace.validate():84
> >>>   org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace():995
> >>>   org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery():955
> >>>   org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom():3109
> >>>   org.apache.drill.exec.planner.sql.SqlConverter$DrillValidator.validateFrom():298
> >>>   org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom():3091
> >>>   org.apache.drill.exec.planner.sql.SqlConverter$DrillValidator.validateFrom():298
> >>>   org.apache.calcite.sql.validate.SqlValidatorImpl.validateSelect():3363
> >>>   org.apache.calcite.sql.validate.SelectNamespace.validateImpl():60
> >>>   org.apache.calcite.sql.validate.AbstractNamespace.validate():84
> >>>   org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace():995
> >>>   org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery():955
> >>>   org.apache.calcite.sql.SqlSelect.validate():216
> >>>   org.apache.calcite.sql.validate.SqlValidatorImpl.validateScopedExpression():930
> >>>   org.apache.calcite.sql.validate.SqlValidatorImpl.validate():637
> >>>   org.apache.drill.exec.planner.sql.SqlConverter.validate():218
> >>>   org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateNode():665
> >>>   org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateAndConvert():199
> >>>   org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan():172
> >>>   org.apache.drill.exec.planner.sql.DrillSqlWorker.getQueryPlan():282
> >>>   org.apache.drill.exec.planner.sql.DrillSqlWorker.getPhysicalPlan():162
> >>>   org.apache.drill.exec.planner.sql.DrillSqlWorker.convertPlan():127
> >>>   org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan():92
> >>>   org.apache.drill.exec.work.foreman.Foreman.runSQL():590
> >>>   org.apache.drill.exec.work.foreman.Foreman.run():275
> >>>   java.util.concurrent.ThreadPoolExecutor.runWorker():1142
> >>>   java.util.concurrent.ThreadPoolExecutor$Worker.run():617
> >>>   java.lang.Thread.run():745
> >>> Caused By (org.apache.calcite.sql.validate.SqlValidatorException) Object '/tmp/employee.json' not found within 'hdfs.root'
> >>>   sun.reflect.NativeConstructorAccessorImpl.newInstance0():-2
> >>>   sun.reflect.NativeConstructorAccessorImpl.newInstance():62
> >>>   sun.reflect.DelegatingConstructorAccessorImpl.newInstance():45
> >>>   java.lang.reflect.Constructor.newInstance():423
> >>>   org.apache.calcite.runtime.Resources$ExInstWithCause.ex():463
> >>>   org.apache.calcite.runtime.Resources$ExInst.ex():572
> >>>   org.apache.calcite.sql.SqlUtil.newContextException():824
> >>>   org.apache.calcite.sql.SqlUtil.newContextException():809
> >>>   org.apache.calcite.sql.validate.SqlValidatorImpl.newValidationError():4805
> >>>   org.apache.calcite.sql.validate.IdentifierNamespace.resolveImpl():127
> >>>   org.apache.calcite.sql.validate.IdentifierNamespace.validateImpl():177
> >>>   org.apache.calcite.sql.validate.AbstractNamespace.validate():84
> >>>   org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace():995
> >>>   org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery():955
> >>>   org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom():3109
> >>>   org.apache.drill.exec.planner.sql.SqlConverter$DrillValidator.validateFrom():298
> >>>   org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom():3091
> >>>   org.apache.drill.exec.planner.sql.SqlConverter$DrillValidator.validateFrom():298
> >>>   org.apache.calcite.sql.validate.SqlValidatorImpl.validateSelect():3363
> >>>   org.apache.calcite.sql.validate.SelectNamespace.validateImpl():60
> >>>   org.apache.calcite.sql.validate.AbstractNamespace.validate():84
> >>>   org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace():995
> >>>   org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery():955
> >>>   org.apache.calcite.sql.SqlSelect.validate():216
> >>>   org.apache.calcite.sql.validate.SqlValidatorImpl.validateScopedExpression():930
> >>>   org.apache.calcite.sql.validate.SqlValidatorImpl.validate():637
> >>>   org.apache.drill.exec.planner.sql.SqlConverter.validate():218
> >>>   org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateNode():665
> >>>   org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateAndConvert():199
> >>>   org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan():172
> >>>   org.apache.drill.exec.planner.sql.DrillSqlWorker.getQueryPlan():282
> >>>   org.apache.drill.exec.planner.sql.DrillSqlWorker.getPhysicalPlan():162
> >>>   org.apache.drill.exec.planner.sql.DrillSqlWorker.convertPlan():127
> >>>   org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan():92
> >>>   org.apache.drill.exec.work.foreman.Foreman.runSQL():590
> >>>   org.apache.drill.exec.work.foreman.Foreman.run():275
> >>>   java.util.concurrent.ThreadPoolExecutor.runWorker():1142
> >>>   java.util.concurrent.ThreadPoolExecutor$Worker.run():617
> >>>   java.lang.Thread.run():745 (state=,code=0)
> >>
> >>
>
>
