Rafael,
Clark is using the filesystem plugin to query a Hadoop cluster. It seems odd that SHOW FILES
can list a file in a directory, yet querying that same file fails.
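
For reference, a file-type storage plugin that points at HDFS generally looks something like the sketch below. The host name and port are placeholders I made up (substitute your own NameNode address), and the workspace and format entries are trimmed to the minimum, so treat this as an illustration rather than Clark's actual config:

```json
{
  "type": "file",
  "connection": "hdfs://namenode.example.com:8020/",
  "workspaces": {
    "root": {
      "location": "/",
      "writable": false,
      "defaultInputFormat": null
    }
  },
  "formats": {
    "json": {
      "type": "json",
      "extensions": ["json"]
    }
  }
}
```

With a plugin like this registered under the name hdfs, the root workspace maps to / on the cluster, so hdfs.root.`/tmp/employee.json` should resolve to /tmp/employee.json in HDFS.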
-- C
> On Jul 23, 2020, at 11:35 AM, Rafael Jaimes III <rafjaimes@gmail.com> wrote:
>
> Hi all,
>
> It looks like the file is already 644, which should be fine.
> I'm confused about why the schema is called hdfs. dfs is the pre-built
> schema for querying HDFS and flat files such as .json, which is what you're
> trying to do. The default config for dfs also has a lot more content than
> what you pasted. Can you use the default and try again?
>
> Hope this helps,
> Rafael
>
>
> On Thu, Jul 23, 2020 at 11:30 AM Charles Givre <cgivre@gmail.com> wrote:
>
>> Hi Clark,
>> That's strange. My initial thought is that this could be a permission
>> issue. However, it might also be that Drill isn't finding the file for
>> some reason.
>>
>> Could you try:
>>
>> SELECT *
>> FROM hdfs.`<full hdfs path to file>`
>>
>> Best,
>> --- C
>>
>>
>>> On Jul 23, 2020, at 11:23 AM, Updike, Clark <Clark.Updike@jhuapl.edu> wrote:
>>>
>>> This is in 1.17. I can use SHOW FILES to list the file I'm targeting, but I cannot query it:
>>>
>>> apache drill> show files in hdfs.root.`/tmp/employee.json`;
>>> +---------------+-------------+--------+--------+----------+------------+-------------+-------------------------+-------------------------+
>>> | name          | isDirectory | isFile | length | owner    | group      | permissions | accessTime              | modificationTime        |
>>> +---------------+-------------+--------+--------+----------+------------+-------------+-------------------------+-------------------------+
>>> | employee.json | false       | true   | 474630 | me       | supergroup | rw-r--r--   | 2020-07-23 10:53:15.055 | 2020-07-23 10:53:15.387 |
>>> +---------------+-------------+--------+--------+----------+------------+-------------+-------------------------+-------------------------+
>>> 1 row selected (3.039 seconds)
>>>
>>>
>>> apache drill> select * from hdfs.root.`/tmp/employee.json`;
>>> Error: VALIDATION ERROR: From line 1, column 15 to line 1, column 18: Object '/tmp/employee.json' not found within 'hdfs.root'
>>> [Error Id: 3b833622-4fac-4ecc-becd-118291cd8560 ] (state=,code=0)
>>>
>>> The storage plugin uses the standard json config:
>>>
>>> "json": {
>>> "type": "json",
>>> "extensions": [
>>> "json"
>>> ]
>>> },
>>>
>>> I can't see any problems on the HDFS side. Full stack trace is below.
>>>
>>> Any ideas what could be causing this behavior?
>>>
>>> Thanks, Clark
>>>
>>>
>>>
>>> FULL STACKTRACE:
>>>
>>> apache drill> select * from hdfs.root.`/tmp/employee.json`;
>>> Error: VALIDATION ERROR: From line 1, column 15 to line 1, column 18: Object '/tmp/employee.json' not found within 'hdfs.root'
>>>
>>>
>>> [Error Id: 69c8ffc0-4933-4008-a786-85ad623578ea ]
>>>
>>> (org.apache.calcite.runtime.CalciteContextException) From line 1, column 15 to line 1, column 18: Object '/tmp/employee.json' not found within 'hdfs.root'
>>> sun.reflect.NativeConstructorAccessorImpl.newInstance0():-2
>>> sun.reflect.NativeConstructorAccessorImpl.newInstance():62
>>> sun.reflect.DelegatingConstructorAccessorImpl.newInstance():45
>>> java.lang.reflect.Constructor.newInstance():423
>>> org.apache.calcite.runtime.Resources$ExInstWithCause.ex():463
>>> org.apache.calcite.sql.SqlUtil.newContextException():824
>>> org.apache.calcite.sql.SqlUtil.newContextException():809
>>> org.apache.calcite.sql.validate.SqlValidatorImpl.newValidationError():4805
>>> org.apache.calcite.sql.validate.IdentifierNamespace.resolveImpl():127
>>> org.apache.calcite.sql.validate.IdentifierNamespace.validateImpl():177
>>> org.apache.calcite.sql.validate.AbstractNamespace.validate():84
>>> org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace():995
>>> org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery():955
>>> org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom():3109
>>> org.apache.drill.exec.planner.sql.SqlConverter$DrillValidator.validateFrom():298
>>> org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom():3091
>>> org.apache.drill.exec.planner.sql.SqlConverter$DrillValidator.validateFrom():298
>>> org.apache.calcite.sql.validate.SqlValidatorImpl.validateSelect():3363
>>> org.apache.calcite.sql.validate.SelectNamespace.validateImpl():60
>>> org.apache.calcite.sql.validate.AbstractNamespace.validate():84
>>> org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace():995
>>> org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery():955
>>> org.apache.calcite.sql.SqlSelect.validate():216
>>> org.apache.calcite.sql.validate.SqlValidatorImpl.validateScopedExpression():930
>>> org.apache.calcite.sql.validate.SqlValidatorImpl.validate():637
>>> org.apache.drill.exec.planner.sql.SqlConverter.validate():218
>>> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateNode():665
>>> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateAndConvert():199
>>> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan():172
>>> org.apache.drill.exec.planner.sql.DrillSqlWorker.getQueryPlan():282
>>> org.apache.drill.exec.planner.sql.DrillSqlWorker.getPhysicalPlan():162
>>> org.apache.drill.exec.planner.sql.DrillSqlWorker.convertPlan():127
>>> org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan():92
>>> org.apache.drill.exec.work.foreman.Foreman.runSQL():590
>>> org.apache.drill.exec.work.foreman.Foreman.run():275
>>> java.util.concurrent.ThreadPoolExecutor.runWorker():1142
>>> java.util.concurrent.ThreadPoolExecutor$Worker.run():617
>>> java.lang.Thread.run():745
>>> Caused By (org.apache.calcite.sql.validate.SqlValidatorException) Object '/tmp/employee.json' not found within 'hdfs.root'
>>> sun.reflect.NativeConstructorAccessorImpl.newInstance0():-2
>>> sun.reflect.NativeConstructorAccessorImpl.newInstance():62
>>> sun.reflect.DelegatingConstructorAccessorImpl.newInstance():45
>>> java.lang.reflect.Constructor.newInstance():423
>>> org.apache.calcite.runtime.Resources$ExInstWithCause.ex():463
>>> org.apache.calcite.runtime.Resources$ExInst.ex():572
>>> org.apache.calcite.sql.SqlUtil.newContextException():824
>>> org.apache.calcite.sql.SqlUtil.newContextException():809
>>> org.apache.calcite.sql.validate.SqlValidatorImpl.newValidationError():4805
>>> org.apache.calcite.sql.validate.IdentifierNamespace.resolveImpl():127
>>> org.apache.calcite.sql.validate.IdentifierNamespace.validateImpl():177
>>> org.apache.calcite.sql.validate.AbstractNamespace.validate():84
>>> org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace():995
>>> org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery():955
>>> org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom():3109
>>> org.apache.drill.exec.planner.sql.SqlConverter$DrillValidator.validateFrom():298
>>> org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom():3091
>>> org.apache.drill.exec.planner.sql.SqlConverter$DrillValidator.validateFrom():298
>>> org.apache.calcite.sql.validate.SqlValidatorImpl.validateSelect():3363
>>> org.apache.calcite.sql.validate.SelectNamespace.validateImpl():60
>>> org.apache.calcite.sql.validate.AbstractNamespace.validate():84
>>> org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace():995
>>> org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery():955
>>> org.apache.calcite.sql.SqlSelect.validate():216
>>> org.apache.calcite.sql.validate.SqlValidatorImpl.validateScopedExpression():930
>>> org.apache.calcite.sql.validate.SqlValidatorImpl.validate():637
>>> org.apache.drill.exec.planner.sql.SqlConverter.validate():218
>>> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateNode():665
>>> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateAndConvert():199
>>> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan():172
>>> org.apache.drill.exec.planner.sql.DrillSqlWorker.getQueryPlan():282
>>> org.apache.drill.exec.planner.sql.DrillSqlWorker.getPhysicalPlan():162
>>> org.apache.drill.exec.planner.sql.DrillSqlWorker.convertPlan():127
>>> org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan():92
>>> org.apache.drill.exec.work.foreman.Foreman.runSQL():590
>>> org.apache.drill.exec.work.foreman.Foreman.run():275
>>> java.util.concurrent.ThreadPoolExecutor.runWorker():1142
>>> java.util.concurrent.ThreadPoolExecutor$Worker.run():617
>>> java.lang.Thread.run():745 (state=,code=0)
>>
>>