jclouds-user mailing list archives

From Yury Kats <yuryk...@yahoo.com>
Subject Re: aws-s3 etag when using multipart
Date Tue, 22 Sep 2015 15:21:48 GMT
In AWS, the ETag of a multipart object is the MD5 hash of the concatenated MD5 hashes of all parts, followed by a dash and the number of parts.
See: https://forums.aws.amazon.com/thread.jspa?messageID=456442
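
As a rough illustration of this scheme, the Java sketch below computes such a value locally. It assumes the commonly reported behavior (an MD5 over the concatenated raw 16-byte MD5 digests of the parts) and that the payload is split into exactly the same parts that were uploaded. MultipartEtag and its methods are hypothetical names for this example, not part of jclouds or the AWS SDK.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not a jclouds or AWS API.
final class MultipartEtag {

    // Raw 16-byte MD5 digest of the given data.
    static byte[] md5(byte[] data) {
        try {
            return MessageDigest.getInstance("MD5").digest(data);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Lowercase hex encoding of a byte array.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    // Assumed scheme: MD5 over the concatenated raw MD5 digests of all
    // parts, then "-" and the number of parts.
    static String multipartEtag(List<byte[]> parts) {
        byte[] concatenated = new byte[parts.size() * 16];
        int offset = 0;
        for (byte[] part : parts) {
            System.arraycopy(md5(part), 0, concatenated, offset, 16);
            offset += 16;
        }
        return toHex(md5(concatenated)) + "-" + parts.size();
    }

    public static void main(String[] args) {
        // Two tiny stand-in "parts"; real parts would be the uploaded chunks.
        List<byte[]> parts = Arrays.asList(
                "first part".getBytes(StandardCharsets.UTF_8),
                "second part".getBytes(StandardCharsets.UTF_8));
        System.out.println(multipartEtag(parts));
    }
}
```

The result only matches what S3 returns if the local part boundaries are identical to those used during the upload, so the part size has to be known to whoever performs the check.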

In general, S3 says the ETag is not a valid MD5 digest in a number of cases, including multipart uploads.
See ETag definition here: http://docs.aws.amazon.com/AmazonS3/latest/API/RESTCommonResponseHeaders.html
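
If a caller wants to decide whether a returned ETag can be compared against a client-provided MD5 at all, a shape check along the following lines may help. This is only a sketch: EtagCheck and its methods are hypothetical names, and a value that looks like 32 hex digits is not guaranteed to be the MD5 of the payload in every case S3 documents.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not a jclouds or AWS API.
final class EtagCheck {

    // 32 hex digits: has the shape of a plain MD5 digest.
    private static final Pattern PLAIN_MD5 = Pattern.compile("[0-9a-fA-F]{32}");
    // 32 hex digits, a dash, and a part count: the multipart shape.
    private static final Pattern MULTIPART = Pattern.compile("[0-9a-fA-F]{32}-[0-9]+");

    // ETag headers are often wrapped in double quotes; strip them if present.
    static String normalize(String etag) {
        String s = etag.trim();
        if (s.length() >= 2 && s.startsWith("\"") && s.endsWith("\"")) {
            s = s.substring(1, s.length() - 1);
        }
        return s;
    }

    static boolean looksLikePlainMd5(String etag) {
        return PLAIN_MD5.matcher(normalize(etag)).matches();
    }

    static boolean looksLikeMultipart(String etag) {
        return MULTIPART.matcher(normalize(etag)).matches();
    }

    public static void main(String[] args) {
        String etag = "90644a2d0c7b74483f8d2036f3e29fc5-2";
        System.out.println(looksLikePlainMd5(etag));
        System.out.println(looksLikeMultipart(etag));
    }
}
```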

On 9/22/2015 11:10 AM, Veit Guna wrote:
> Hi.
> We're using jclouds 1.9.1 with the aws-s3 provider. Until now, we have used the ETag returned by
> blobStore.putBlob() to manually verify against a client-provided hash. That worked quite well for us.
> But since we are hitting the 5GB limit of S3, we switched to the multipart() upload that jclouds offers.
> But now, putBlob() returns something like <md5-hash>-<number>, e.g. 90644a2d0c7b74483f8d2036f3e29fc5-2,
> which of course fails our validation.
> I guess this is because each chunk is hashed separately and sent to S3, so there is no complete hash
> over the whole payload that could be returned by putBlob(). Is that correct?
> During my research I stumbled across this:
> https://github.com/jclouds/jclouds/commit/f2d897d9774c2c0225c199c7f2f46971637327d6
> Now I'm wondering what the contract of putBlob() is. Should it return only valid ETags/hashes and
> otherwise return null?
> I'm asking because otherwise I would have to start parsing and validating the returned value myself
> and skip validation whenever it isn't a normal MD5 hash. My guess is that this is the hash of the
> last transferred chunk plus the chunk number?
> Maybe someone can shed some light on this :).
> Thanks
> Veit
