drill-user mailing list archives

From "Uwe L. Korn" <uw...@xhochy.com>
Subject Re: QUESTION: Drill Configuration to access S3 buckets
Date Thu, 15 Jun 2017 05:39:02 GMT
The current Drill releases use the hadoop-io libraries from the 2.7.x series. Locally I have
built against the 3.0.0 alpha (2.8 should also work) and can access the regions that require the
newer signature version. Be careful with that, though: I had to make some code changes to
get it to build with the 3.0 jars, and some unit tests failed afterwards.

Also note that 2.8/3.0 greatly improves S3 performance if you select the new (and experimental)
random-access mode in s3a. For me this gave massive improvements for queries that read only
a fraction of the columns or that hit Parquet files with multiple RowGroups.
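
For reference, a minimal sketch of how that mode could be switched on in conf/core-site.xml.
It assumes Hadoop 2.8+ jars are on Drill's classpath. The property name
fs.s3a.experimental.input.fadvise is taken from the Hadoop 2.8 s3a documentation, not from
this thread, and the 2.7.x jars that Drill currently ships do not read it:

```xml
<!-- Assumption: requires hadoop-aws 2.8.0 or newer; has no effect with 2.7.x. -->
<property>
   <name>fs.s3a.experimental.input.fadvise</name>
   <!-- "random" favors seek-heavy columnar reads such as Parquet;
        the other documented values are "normal" and "sequential". -->
   <value>random</value>
</property>
```

Random mode trades away some sequential-read throughput, so it is probably only worth trying
for columnar formats.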

> On 15.06.2017 at 06:36, Shankar Mane <shankar.mane@games24x7.com> wrote:
> 
> New AWS regions support only the Signature Version 4 protocol for S3. Other
> regions are compatible with both V2 and V4. Drill works very well if the
> region supports both signature versions.
> 
> Even after adding the endpoint, the same problem persists. Maybe Drill does
> not support the V4 protocol yet.
> 
> This V4 problem also exists with native Hadoop versions prior to 2.8.0.
> 
> 
> 
>> On 15-Jun-2017 9:49 AM, "Jack Ingoldsby" <jack.ingoldsby@gmail.com> wrote:
>> 
>> Useful to know, thanks. Also having problems with Ohio. Will try another
>> region
>> 
>>> On Wed, Jun 14, 2017, 19:46 Сергей Боровик <borovyksv@gmail.com> wrote:
>>> 
>>> Hi!
>>> I have an AWS EC2 instance with Apache Drill 1.10.0 and a configured IAM
>>> role.
>>> 
>>> I am able to access and query an S3 bucket in the US East (N. Virginia)
>>> region, but not buckets in the US East (Ohio) region. It fails with
>>> "error: system error: amazons3exception: status code 400, AWS Service:
>>> Amazon S3, AWS Request ID:9D54A8310F26582B, AWS Error Code: null,
>>> AWS Error Message: Bad Request"
>>> 
>>> 
>>> I've tried setting this property in conf/core-site.xml:
>>> 
>>> <property>
>>>    <name>fs.s3a.endpoint</name>
>>>    <value>s3.us-east-2.amazonaws.com</value>
>>> </property>
>>> 
>>> In this case Ohio fails with the same error, and N. Virginia fails with
>>> status code 301, AWS Error Code: PermanentRedirect,
>>> AWS Error Message: The bucket you are attempting to access must be
>>> addressed using the specified endpoint
>>> 
>>> 1) Is there any specific configuration that needs to be enabled in Drill
>>> for the Ohio region?
>>> 2) Does Drill not work with AWS Signature Version 4?
>>> 
>>> Thank you in advance.
>>> Any advice is much appreciated!
>>> 
>> 

