jackrabbit-oak-issues mailing list archives

From "Martin Wehner (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (OAK-4565) S3Backend fails to upload large metadata records
Date Fri, 15 Jul 2016 11:00:25 GMT

     [ https://issues.apache.org/jira/browse/OAK-4565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Martin Wehner updated OAK-4565:
-------------------------------
    Attachment: OAK-4565-TRUNK.patch
                OAK-4565-1.4.patch

Attached are proposed patches (for trunk & 1.4) which spool the metadata record to a temporary
file and hand that file over to the TransferManager. With this patch we were able to run a
Blob GC on our 47TB repository successfully for the first time.
It's a bit unfortunate that in the GC case the file already exists on disk and has to be
recreated, but since the signature of {{SharedDataStore}} requires an InputStream I see no
other way to solve this. It also slightly slows down the common case of creating 0-byte
marker records, but that happens so infrequently that it shouldn't be a problem.
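A minimal, self-contained sketch of the spooling approach described above (plain JDK, no AWS SDK; class and method names are illustrative, not the patch's actual code):

```java
import java.io.*;
import java.nio.file.*;

public class SpoolDemo {
    // Spool the incoming metadata stream to a temp file so the upload can
    // later be handed to the TransferManager as a File, whose length is
    // known up front (avoiding the SDK's in-memory buffering).
    static File spoolToTempFile(InputStream in) throws IOException {
        File tmp = File.createTempFile("s3-metadata-", ".tmp");
        tmp.deleteOnExit();
        Files.copy(in, tmp.toPath(), StandardCopyOption.REPLACE_EXISTING);
        // Real code would now call transferManager.upload(bucket, key, tmp)
        // and delete tmp once the upload has completed.
        return tmp;
    }

    public static void main(String[] args) throws IOException {
        byte[] record = "reference-list-contents".getBytes("UTF-8");
        File tmp = spoolToTempFile(new ByteArrayInputStream(record));
        System.out.println(tmp.length()); // content length is now known
    }
}
```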

> S3Backend fails to upload large metadata records
> ------------------------------------------------
>
>                 Key: OAK-4565
>                 URL: https://issues.apache.org/jira/browse/OAK-4565
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: blob
>    Affects Versions: 1.4.5
>            Reporter: Martin Wehner
>              Labels: gc, s3
>         Attachments: OAK-4565-1.4.patch, OAK-4565-TRUNK.patch
>
>
> If a large enough metadata record is added to an S3 DS (such as the list of blob references
collected during the mark phase of the MarkSweepGC), the upload will fail, i.e. never even start.
This is caused by {{S3Backend.addMetadataRecord()}} handing an InputStream to the S3 TransferManager
without specifying the size in the ObjectMetadata.
> A warning to this effect is logged by the AWS SDK each time you add a metadata record:

> {noformat}
> [s3-transfer-manager-worker-1] AmazonS3Client.java:1364 No content length specified for
stream data.  Stream contents will be buffered in memory and could result in out of memory
errors.
> {noformat}
> Normally this wouldn't be much of a problem, but in a repository with over 36 million
blob references the list of marked references produced by the GC is over 5GB. In this case the S3
transfer worker thread gets stuck in a seemingly endless loop trying to read the whole file
into memory and never finishes (even though the JVM has 80GB of heap), consuming resources
in the process:
> {noformat}
>    java.lang.Thread.State: RUNNABLE
> 	at org.apache.http.util.ByteArrayBuffer.append(ByteArrayBuffer.java:90)
> 	at org.apache.http.util.EntityUtils.toByteArray(EntityUtils.java:137)
> 	at org.apache.http.entity.BufferedHttpEntity.<init>(BufferedHttpEntity.java:63)
> 	at com.amazonaws.http.HttpRequestFactory.newBufferedHttpEntity(HttpRequestFactory.java:247)
> 	at com.amazonaws.http.HttpRequestFactory.createHttpRequest(HttpRequestFactory.java:126)
> 	at com.amazonaws.http.AmazonHttpClient$ExecOneRequestParams.newApacheRequest(AmazonHttpClient.java:650)
> 	at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:730)
> 	at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:505)
> 	at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:317)
> 	at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3595)
> 	at com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1382)
> 	at com.amazonaws.services.s3.transfer.internal.UploadCallable.uploadInOneChunk(UploadCallable.java:131)
> 	at com.amazonaws.services.s3.transfer.internal.UploadCallable.call(UploadCallable.java:123)
> 	at com.amazonaws.services.s3.transfer.internal.UploadMonitor.call(UploadMonitor.java:139)
> 	at com.amazonaws.services.s3.transfer.internal.UploadMonitor.call(UploadMonitor.java:47)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> 	at java.lang.Thread.run(Thread.java:745)
> {noformat} 
> The last log message by the GC thread will be like this:
> {noformat}
> *INFO* [sling-oak-observation-1273] org.apache.jackrabbit.oak.plugins.blob.MarkSweepGarbageCollector
Number of valid blob references marked under mark phase of Blob garbage collection [36147734]
> {noformat} 
> followed by the AWS warning above, after which the thread stalls waiting for the transfer to finish.
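The stack trace shows the SDK buffering the whole stream via {{EntityUtils.toByteArray}} before the PUT can start. A minimal, self-contained sketch of that buffering behavior (plain JDK, no AWS SDK; only the copy loop is modeled):

```java
import java.io.*;

public class BufferDemo {
    // What the SDK effectively does when no content length is given (cf.
    // EntityUtils.toByteArray in the stack trace): read the entire stream
    // into a byte[] before anything is sent. Fine for small records,
    // fatal for a multi-GB reference list.
    static byte[] bufferFully(InputStream in) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buf.write(chunk, 0, n);
        }
        return buf.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[1 << 20]; // 1 MiB stand-in for the ref list
        byte[] copy = bufferFully(new ByteArrayInputStream(data));
        System.out.println(copy.length); // whole payload now held in heap
    }
}
```

At 5GB, the repeated buffer growth plus the final copy is the allocation pattern the worker thread appears stuck in above.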



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
