drill-user mailing list archives

From Andries Engelbrecht <aengelbre...@maprtech.com>
Subject Re: S3 Access errors
Date Mon, 02 Mar 2015 20:24:05 GMT
Try taking out the tmp workspace in the s3 plugin config.

I don’t know whether the tmp workspace is valid the way you defined it, as opposed to creating
a separate dfs or fs plugin for tmp.
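
A minimal sketch of what that could look like, assuming the storage config you posted (quoted further down) with only the tmp workspace dropped. It is untested; the formats section is trimmed to json here for brevity and would otherwise stay as you had it:

```json
{
  "type": "file",
  "enabled": true,
  "connection": "s3n://bucket-name",
  "workspaces": {
    "root": {
      "location": "/",
      "writable": false,
      "defaultInputFormat": null
    }
  },
  "formats": {
    "json": {
      "type": "json"
    }
  }
}
```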

When I had issues with S3 connectivity, I put a copy of core-site.xml with the S3 credentials
in my Drill conf directory. Worth a try.

—Andries


On Mar 2, 2015, at 11:45 AM, Daniil Osipov <daniil.osipov@shazam.com> wrote:

> Thanks for the support everyone!
> 
> @Sudhakar: unfortunately the commands didn't help. Same failure.
> 
> @Andries: I've isolated the failure to S3 - local file access works as
> expected. The output of "show files" with the slash is the same - I've
> tried different combinations of slashes and root paths, all with the same
> failure.
> 
> @Paul - Thanks, that helps! I'll build trunk, and see if it works better.
> 
> On Mon, Mar 2, 2015 at 11:24 AM, Andries Engelbrecht <
> aengelbrecht@maprtech.com> wrote:
> 
>> Daniil,
>> 
>> Try to isolate the issue.
>> 
>> Copy the file to the local dfs or filesystem and see what it does when
>> just querying through the dfs workspace.
>> This way you know if it is an S3 or file format/extension issue.
>> 
>> 
>> Also what does
>> show files from s3n.root.`/dev/dan/cleaned_1210/clean.txt`;
>> show?
>> 
>> I noticed the path is different between the show files and the select
>> query (not that it should cause an error from what I have tested, but worth
>> a try).
>> 
>> —Andries
>> 
>> 
>> On Mar 2, 2015, at 11:17 AM, Paul Pearcy <Paul.Pearcy@blackboard.com>
>> wrote:
>> 
>>> Hi,
>>> I've had pain in the same areas.
>>> 
>>> These tickets are relevant to querying json with other extensions:
>>> https://issues.apache.org/jira/browse/DRILL-1871
>>> 
>>> https://issues.apache.org/jira/browse/DRILL-1545
>>> 
>>> 
>>> There have also been other fixes around compressed JSON failures, e.g.:
>>> https://issues.apache.org/jira/browse/DRILL-1960
>>> 
>>> 
>>> All but DRILL-1545 are fixed on the master branch.
>>> 
>>> Best Regards,
>>> Paul
>>> 
>>> 
>>> 
>>> On 3/2/15, 1:59 PM, "Daniil Osipov" <daniil.osipov@shazam.com> wrote:
>>> 
>>>> Thanks Sudhakar, I'll give this a try. Can you point me to some
>>>> documentation about extension/type handling? The actual files I'm trying
>>>> to query are compressed JSON, and have an extension .gz
>>>> 
>>>> 
>>>> On Mon, Mar 2, 2015 at 10:49 AM, Sudhakar Thota <sthota@maprtech.com>
>>>> wrote:
>>>> 
>>>>> Daniil,
>>>>> 
>>>>> Please try these two things and check again.
>>>>> 
>>>>> 1. Rename the file to clean.json.
>>>>> 2. Issue this statement before you run your query.
>>>>> 
>>>>> alter system set `store.json.all_text_mode` = true
>>>>> 
>>>>> 
>>>>> Thanks
>>>>> Sudhakar Thota
>>>>> sthota@maprtech.com
>>>>> www.mapr.com
>>>>> 
>>>>> On Mar 2, 2015, at 10:03 AM, Daniil Osipov <daniil.osipov@shazam.com>
>>>>> wrote:
>>>>> 
>>>>>> I'm continuing exploration of accessing files on S3, and running into
>>>>>> this issue:
>>>>>> 0: jdbc:drill:> SELECT COUNT(1) FROM
>>>>>> s3n.root.`/dev/dan/cleaned_1210/clean.txt`;
>>>>>> Query failed: Query failed: Failure validating SQL.
>>>>>> org.eigenbase.util.EigenbaseContextException: From line 1, column 22 to
>>>>>> line 1, column 24: Table 's3n.root./dev/dan/cleaned_1210/clean.txt' not
>>>>>> found
>>>>>> 
>>>>>> Error: exception while executing query: Failure while executing query.
>>>>>> (state=,code=0)
>>>>>> 
>>>>>> At the same time:
>>>>>> 0: jdbc:drill:> show files from s3n.`dev/dan/cleaned_1210/clean.txt`;
>>>>>> +-----------+-------------+--------+---------+-------+-------+-------------+-----------------------+-----------------------+
>>>>>> | name      | isDirectory | isFile | length  | owner | group | permissions | accessTime            | modificationTime      |
>>>>>> +-----------+-------------+--------+---------+-------+-------+-------------+-----------------------+-----------------------+
>>>>>> | clean.txt | false       | true   | 1313500 |       |       | rw-rw-rw-   | 1970-01-01 00:00:00.0 | 2014-12-10 23:51:59.0 |
>>>>>> +-----------+-------------+--------+---------+-------+-------+-------------+-----------------------+-----------------------+
>>>>>> 1 row selected (0.53 seconds)
>>>>>> 
>>>>>> My storage config is below. Any suggestions on what could be wrong, or
>>>>>> how to debug this error?
>>>>>> 
>>>>>> {
>>>>>> "type": "file",
>>>>>> "enabled": true,
>>>>>> "connection": "s3n://bucket-name",
>>>>>> "workspaces": {
>>>>>>  "root": {
>>>>>>    "location": "/",
>>>>>>    "writable": false,
>>>>>>    "defaultInputFormat": null
>>>>>>  },
>>>>>>  "tmp": {
>>>>>>    "location": "file:///tmp",
>>>>>>    "writable": true,
>>>>>>    "defaultInputFormat": null
>>>>>>  }
>>>>>> },
>>>>>> "formats": {
>>>>>>  "psv": {
>>>>>>    "type": "text",
>>>>>>    "extensions": [
>>>>>>      "tbl"
>>>>>>    ],
>>>>>>    "delimiter": "|"
>>>>>>  },
>>>>>>  "csv": {
>>>>>>    "type": "text",
>>>>>>    "extensions": [
>>>>>>      "csv"
>>>>>>    ],
>>>>>>    "delimiter": ","
>>>>>>  },
>>>>>>  "tsv": {
>>>>>>    "type": "text",
>>>>>>    "extensions": [
>>>>>>      "tsv"
>>>>>>    ],
>>>>>>    "delimiter": "\t"
>>>>>>  },
>>>>>>  "parquet": {
>>>>>>    "type": "parquet"
>>>>>>  },
>>>>>>  "json": {
>>>>>>    "type": "json"
>>>>>>  }
>>>>>> }
>>>>>> }
>>>>> 
>>>>> 
>>> 
>> 
>> 

