jackrabbit-oak-issues mailing list archives

From "Matt Ryan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (OAK-7083) CompositeDataStore - ReadOnly/ReadWrite Delegate Support
Date Tue, 01 May 2018 15:53:00 GMT

    [ https://issues.apache.org/jira/browse/OAK-7083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16459793#comment-16459793 ]

Matt Ryan commented on OAK-7083:
--------------------------------

Over the past few days I did some performance testing.  The test covers the "production/staging"
scenario.

I set up a system representing production first.  A folder named "prod" was created and 10,000
random JPEG images, each approximately 100 KB in size, were added to this folder.  Additional
renditions were generated for these images as they were added, so the actual number of blobs
was higher (around 40,000).
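
For reference, the following is a minimal sketch of how content like this can be loaded through
the JCR API.  The folder name and image size match the test, but the class, the repository
wiring, and the use of random bytes in place of real JPEG data are illustrative assumptions;
the renditions were generated by the application and are not shown here.

{code:java}
import java.io.ByteArrayInputStream;
import java.util.Random;

import javax.jcr.Binary;
import javax.jcr.Node;
import javax.jcr.Session;

public class ProdContentLoader {

    // Adds ~100 KB binaries as nt:file nodes under /prod.  In the real test
    // these were actual JPEG images, and the application generated additional
    // renditions for each one.
    public static void loadImages(Session session, int count) throws Exception {
        Node prod = session.getRootNode().addNode("prod", "nt:folder");
        Random random = new Random();
        byte[] data = new byte[100 * 1024];
        for (int i = 0; i < count; i++) {
            random.nextBytes(data);
            Node file = prod.addNode("image-" + i + ".jpg", "nt:file");
            Node content = file.addNode("jcr:content", "nt:resource");
            Binary binary = session.getValueFactory()
                    .createBinary(new ByteArrayInputStream(data));
            content.setProperty("jcr:data", binary);
            content.setProperty("jcr:mimeType", "image/jpeg");
            if (i % 100 == 0) {
                session.save();  // save in batches to keep the transient space small
            }
        }
        session.save();
    }
}
{code}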

I then cloned that system to the staging environment and created a new "stg" folder there,
adding another 10,000 random images similar to (but distinct from) those in the "prod" folder.

Once that was done I created a content package of "stg" and added that back to the production
system.  At the end, therefore, each system had a "prod" and an "stg" folder with 10,000 base
images (plus renditions) in each folder.  The content trees in each were essentially identical,
but the staging system was using {{CompositeDataStore}} to access the "prod" content read-only.

I wrote a JMeter test that would randomly choose between the "prod" and "stg" folders, then
randomly choose one of the 10,000 images in that folder and download it.  This step was
repeated 250,000 times in a single test run.  The test ran with no delays between requests,
and metrics were collected to see how quickly the test would complete.
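
The JMeter test plan itself is not attached, but the request logic is equivalent to the
plain-Java sketch below.  The host, port, and URL pattern are assumptions for illustration;
in the actual runs the randomization and metric collection were handled by JMeter.

{code:java}
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Random;

public class RandomDownloadTest {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        Random random = new Random();
        long start = System.currentTimeMillis();

        for (int i = 0; i < 250_000; i++) {
            // Randomly pick a folder, then one of the 10,000 base images in it.
            String folder = random.nextBoolean() ? "prod" : "stg";
            int image = random.nextInt(10_000);
            URI uri = URI.create("http://localhost:8080/" + folder + "/image-" + image + ".jpg");

            // No think time between requests, matching the JMeter test.
            HttpRequest request = HttpRequest.newBuilder(uri).GET().build();
            client.send(request, HttpResponse.BodyHandlers.ofByteArray());
        }

        long elapsed = System.currentTimeMillis() - start;
        System.out.printf("250000 requests in %d ms (%.1f req/sec)%n",
                elapsed, 250_000 * 1000.0 / elapsed);
    }
}
{code}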

I initially ran 8 test runs on each system.  I later decided to expand this to 20, so I did
an additional 12 runs on the staging system and then took a break for the weekend.  Coming back,
I did the additional 12 runs on the production system, but saw results that differed from the
original runs on either system.  I then reran 8 runs on the staging system and saw more
consistent results.  I'm reporting 20 runs on each system, done at comparable times between
the two, to attempt to show comparable results.  As a result, some of the abnormal results from
the original runs 11-20 on the staging system were replaced with later runs done at a similar
time to runs 11-20 on production.

I collected the following metrics for each test run:  response time (minimum, maximum, and
average), average requests per second, and average KB per second downloaded.

After collecting these metrics for 20 runs I computed min, max, and average values for each
category.  When computing the average I discarded the single lowest and highest values (a
trimmed mean) so that an outlier could not sway the average too much.

Then I compared the averaged metrics from one system to those of the other.
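
For clarity, the aggregation and comparison amount to the following (an illustrative sketch;
the class and method names are made up, but the math matches the tables below): the average for
each category drops the lowest and highest run, and the "% Change" column is the relative
difference between the two averaged values.

{code:java}
import java.util.Arrays;

public class MetricsSummary {

    // Average that discards the single lowest and highest values,
    // so one outlier run does not sway the result.
    static double trimmedAverage(double[] runs) {
        double[] sorted = runs.clone();
        Arrays.sort(sorted);
        double sum = 0;
        for (int i = 1; i < sorted.length - 1; i++) {
            sum += sorted[i];
        }
        return sum / (sorted.length - 2);
    }

    // Relative difference used in the "% Change" column; positive means
    // the value is higher with CompositeDataStore than without it.
    static double percentChange(double withoutCds, double withCds) {
        return (withCds - withoutCds) / withoutCds * 100.0;
    }

    public static void main(String[] args) {
        // Example using the averaged max response times reported below:
        // 464.28 ms without CDS vs 545.83 ms with CDS -> roughly +18%.
        System.out.printf("%.1f%%%n", percentChange(464.28, 545.83));
    }
}
{code}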

The results are below:

Staging system (with {{CompositeDataStore}}):
|*Run*|*Average Response Time (ms)*|*Min Response Time (ms)*|*Max Response Time (ms)*|*Requests/Sec*|*KB/sec*|
|1|3|1|522|240.1|24882.28|
|2|3|1|587|247.3|25635.84|
|3|3|1|647|251|26012.87|
|4|3|1|705|252.2|26143.79|
|5|2|1|718|248.6|25764.68|
|6|3|1|388|280.5|29071.79|
|7|3|1|353|269.2|27901.22|
|8|3|1|581|260.1|26953.55|
|9|3|1|537|288.6|29907.38|
|10|3|1|608|294.1|30481.24|
|11|3|1|414|288.7|29924.89|
|12|3|1|167|300.4|31136.34|
|13|3|1|593|272.9|28281.77|
|14|3|1|625|271.1|28101.01|
|15|3|1|452|272.5|28240.11|
|16|3|1|465|304.8|31594.93|
|17|3|1|525|287.6|29804.54|
|18|3|1|625|276.7|28678.59|
|19|3|1|650|273.5|28343.48|
|20|3|1|548|294.8|30551.06|

Production system (without {{CompositeDataStore}}):
|*Run*|*Average Response Time (ms)*|*Min Response Time (ms)*|*Max Response Time (ms)*|*Requests/Sec*|*KB/sec*|
|1|3|2|739|233.2|24507.37|
|2|3|1|489|254.4|26739.06|
|3|3|1|171|260.7|27398.34|
|4|3|1|526|234|24591.95|
|5|3|1|544|284.4|29888.1|
|6|3|1|520|284.8|29933.02|
|7|3|1|181|287.8|30242.71|
|8|3|1|364|268.3|28192.92|
|9|3|1|419|297.7|31283.1|
|10|3|1|155|293.3|30820.8|
|11|3|1|638|276.5|29059.47|
|12|3|1|638|197.6|20762.37|
|13|3|1|528|254.5|26741.96|
|14|3|1|619|264.5|27797.78|
|15|3|1|169|303.3|31873.11|
|16|3|1|433|298.7|31389.12|
|17|3|1|546|278.7|29288.03|
|18|3|1|522|275.2|28923.43|
|19|2|1|530|298|31322.16|
|20|3|1|520|303.9|31933.47|

Average and minimum response times are essentially identical between the two systems.  A comparison
of the other metrics is below:
|| ||Without CDS||With CDS||CDS Performance Difference||% Change||
|*Max Response Time - Min Value (ms)*|155|167|{color:#d04437}+12{color}|{color:#d04437}+7.8%{color}|
|*Max Response Time - Max Value (ms)*|739 |718|{color:#14892c}-21{color}|{color:#14892c}-2.8%{color}|
|*Max Response Time - Average Value (ms)*|464.28|545.83|{color:#d04437}+81.55{color}|{color:#d04437}+18%{color}|
|*Requests / Sec - Min Value*|197.6|240.1|{color:#14892c}+42.5{color}|{color:#14892c}+21.5%{color}|
|*Requests / Sec - Max Value*|303.9|304.8|{color:#14892c}+0.9{color}|{color:#14892c}+0.3%{color}|
|*Requests / Sec - Average Value*|274.89|273.88|{color:#d04437}-1.01{color}|{color:#d04437}-0.4%{color}|
|*KB / Sec - Min Value*|20762.37|24882.28|{color:#14892c}+4119.91{color}|{color:#14892c}+19.8%{color}|
|*KB / Sec - Max Value*|31933.47|31594.93|{color:#d04437}-338.54{color}|{color:#d04437}-1.1%{color}|
|*KB / Sec - Average Value*|28888.47|28385.23|{color:#d04437}-503.24{color}|{color:#d04437}-1.7%{color}|

The most important metrics here are average requests per second and average KB per second. 
The testing shows that the overhead added by the composite data store is minimal, and given
the variance within each group, some of the difference may simply be noise.  Of course, the
composite data store introduces additional code and processing into the path, so we should
expect some level of overhead.  I don't think the amount is enough to worry about.

The other metrics, such as the minimums and maximums, primarily serve to show whether the
composite data store causes any significant extremes outside the norm, which it does not. 
Going through each group of metrics individually, I'd draw the following conclusions:
 * *Response Time* - Response times with {{CompositeDataStore}} included fall within the reasonable
range of response times without it.  Typical response times are identical, and maximum response
times stay within the range observed without it.  It appears to have no worrisome impact on the
overall expected response time.
 * *Requests / Sec* - Again, the requests-per-second figures with {{CompositeDataStore}} included
fall within the range measured when it was not included.  The average value is slightly lower,
which is to be expected.
 * *KB / Sec* - Again, the KB-per-second figures with {{CompositeDataStore}} included fall within
the range measured when it was not included.  The average value is slightly lower, which is to
be expected.

My recommendation is to move forward with it in this state.  For the use case tested, the
overhead is not enough to worry about given the potential benefits.

> CompositeDataStore - ReadOnly/ReadWrite Delegate Support
> --------------------------------------------------------
>
>                 Key: OAK-7083
>                 URL: https://issues.apache.org/jira/browse/OAK-7083
>             Project: Jackrabbit Oak
>          Issue Type: New Feature
>          Components: blob, blob-cloud, blob-cloud-azure, blob-plugins
>            Reporter: Matt Ryan
>            Assignee: Matt Ryan
>            Priority: Major
>
> Support a specific composite data store use case, which is the following:
> * One instance uses no composite data store, but instead is using a single standard Oak
data store (e.g. FileDataStore)
> * Another instance is created by snapshotting the first instance node store, and then
uses a composite data store to refer to the first instance's data store read-only, and refers
to a second data store as a writable data store
> One way this can be used is in creating a test or staging instance from a production
instance.  At creation, the test instance will look like production, but any changes made
to the test instance do not affect production.  The test instance can be quickly created from
production by cloning only the node store, and not requiring a copy of all the data in the
data store.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
