jclouds-notifications mailing list archives

From "Dori Polotsky (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (JCLOUDS-847) S3 poor upload performance
Date Mon, 22 Apr 2019 23:50:00 GMT

    [ https://issues.apache.org/jira/browse/JCLOUDS-847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16823571#comment-16823571 ]

Dori Polotsky commented on JCLOUDS-847:

Hi Andrew,


Just to update you regarding the write performance: I prepared a test branch with the configurable
BufferedOutputStream that we discussed. However, even though the code is clean and understandable,
it unfortunately does not provide the performance boost that I was expecting. Copying the entire
payload through an additional buffer was expensive enough to cause a measurable degradation on our
platform compared to the earlier fork that we were using. Therefore the only viable option that I
see at the moment is to go back to the original solution that we deployed: a small
ByteStreams2.copy that accepts a buffer size, instead of the fixed-size Guava ByteStreams.copy,
which avoids the secondary copy of the payload. I expect to have another update on this shortly.
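
For readers following the thread, here is a minimal sketch of what a copy helper with a
caller-chosen buffer size could look like. It is not the code from the fork or the test branch:
the class name SizedCopy and its exact signature are hypothetical, and it uses only java.io.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper, named SizedCopy for illustration only.
final class SizedCopy {
    private SizedCopy() {}

    /**
     * Copies all bytes from 'from' to 'to' through a single buffer of the
     * requested size. Neither stream is flushed or closed here.
     * Returns the number of bytes copied.
     */
    static long copy(InputStream from, OutputStream to, int bufferSize) throws IOException {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException("bufferSize must be positive: " + bufferSize);
        }
        byte[] buf = new byte[bufferSize];
        long total = 0;
        int n;
        // read() may return fewer bytes than requested; write only what was read.
        while ((n = from.read(buf)) != -1) {
            to.write(buf, 0, n);
            total += n;
        }
        return total;
    }
}
```

A caller would pass the configured size, for example SizedCopy.copy(in, out, 128 * 1024). Each
chunk goes from the source straight to the destination stream, so there is no intermediate
buffering layer between the two.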


As for the other items, I will see what I can address now. The next item is probably to check
whether there is a socket leak and to fix it if there is.



> S3 poor upload performance
> --------------------------
>                 Key: JCLOUDS-847
>                 URL: https://issues.apache.org/jira/browse/JCLOUDS-847
>             Project: jclouds
>          Issue Type: Improvement
>          Components: jclouds-drivers
>    Affects Versions: 1.8.1
>         Environment: JDK 1.7.0_55, 64bit, Windows 7
> EU bucket, https
>            Reporter: Veit Guna
>            Priority: Major
>              Labels: performance
>         Attachments: s3-upload-test.zip
> Hi.
> I'm using jclouds 1.8.1 together with the Apache HttpClient module to upload files to S3.
> During tests, I found that upload performance is quite poor in comparison to jets3t or
> Windows tools like Cloudberry S3 Explorer.
> Sending a 10MB binary file on a cable connection (100 Mbit/s down, 5 Mbit/s up) to an EU
> bucket (https, default endpoints) from a Windows 7 machine (JDK 1.7.0_55, 64bit) gives the
> following results:
> jclouds: ~55 secs
> Amazon Java SDK: ~55 secs
> jets3t: ~18 secs
> S3 Explorer: ~18 secs
> On a faster connection, upload time increased to as much as 200 seconds with jclouds and the
> Amazon SDK. The rest stayed at around 18 secs.
> So I wondered where this difference comes from. I started digging into the source code of
> jclouds, jets3t and httpclient, and took a look at the network packets that are sent.
> Long story short: too small buffer sizes!
> For the payload, jclouds uses the "default" HttpEntities that HttpClient provides, such as
> FileEntity and InputStreamEntity. These use a hardcoded output buffer size of 4096 bytes.
> This seems to be no problem when the available upload bandwidth is quite small, but it appears
> to slow down the connection at higher bandwidth.
> For testing, I simply created my own HttpClient module, based on the shipped ones, and made a
> simple change that adds a 128k buffer to the entity to be sent. The result is that upload
> performance is now on par with the others, like jets3t and S3 Explorer.
> Please find attached a small maven project that can be used to demonstrate the difference.
> To be honest, I'm not deep enough into the jclouds code to provide a proper patch, but my
> suggestion would be to provide jclouds' own implementations of FileEntity and
> InputStreamEntity that use proper output buffer sizes, maybe with an option to override them
> by configuration.
> I also tried the HttpClient "http.socket.buffer-size" setting, but that had no effect. The
> 2.0.0-SNAPSHOT version shows no difference either.
> What do you guys think? Is this a known problem? Or are there other settings to increase the
> upload performance?
> BTW: The same problem exists with the default JavaUrlHttpCommandExecutorServiceModule, which
> also uses a 4k buffer. I also tried the OkHttp driver with the same results (1.8.1;
> 2.0.0-SNAPSHOT failed with an exception).
> Thanks!
> Veit
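
To illustrate the buffer-size point in the description above, the following sketch counts how
many write calls reach the underlying stream when a payload is pushed in 4096-byte chunks
through a java.io.BufferedOutputStream of a given size. It is not taken from the attached maven
project: the class name WriteCounter is hypothetical, the sink is an in-memory counter rather
than a socket, and the sketch measures call counts only, not network throughput.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical sink that records how often it is written to.
final class WriteCounter extends OutputStream {
    long calls;
    long bytes;

    @Override
    public void write(int b) {
        calls++;
        bytes++;
    }

    @Override
    public void write(byte[] b, int off, int len) {
        calls++;
        bytes += len;
    }

    /**
     * Writes payloadSize bytes in 4096-byte chunks through a
     * BufferedOutputStream of bufferSize bytes and returns how many
     * write calls reached the sink.
     */
    static long writesReachingSink(int payloadSize, int bufferSize) throws IOException {
        WriteCounter sink = new WriteCounter();
        byte[] chunk = new byte[4096];
        // Closing the buffered stream flushes whatever is still buffered.
        try (OutputStream out = new BufferedOutputStream(sink, bufferSize)) {
            for (int sent = 0; sent < payloadSize; sent += chunk.length) {
                out.write(chunk, 0, Math.min(chunk.length, payloadSize - sent));
            }
        }
        return sink.calls;
    }
}
```

For a 10 MB payload, writesReachingSink(10 * 1024 * 1024, 4096) yields 2560 calls, while a
128 KB buffer reduces that to 80. Whether fewer, larger writes translate into faster uploads
depends on the transport underneath, which this sketch does not model.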

This message was sent by Atlassian JIRA
