Subject trafficserver git commit: Docs for collapsed_forwarding plugin.
Date Wed, 02 Mar 2016 01:54:14 GMT
Repository: trafficserver
Updated Branches:
  refs/heads/master 75d182c44 -> 5a0db7c2c

Docs for collapsed_forwarding plugin.


Branch: refs/heads/master
Commit: 5a0db7c2c3731e5567026ff0e58ab501028ddac4
Parents: 75d182c
Author: Sudheer Vinukonda <>
Authored: Wed Mar 2 01:53:54 2016 +0000
Committer: Sudheer Vinukonda <>
Committed: Wed Mar 2 01:53:54 2016 +0000

 .../plugins/collapsed_forwarding.en.rst         | 155 +++++++++++++++++++
 1 file changed, 155 insertions(+)
diff --git a/doc/admin-guide/plugins/collapsed_forwarding.en.rst b/doc/admin-guide/plugins/collapsed_forwarding.en.rst
new file mode 100644
index 0000000..2f75577
--- /dev/null
+++ b/doc/admin-guide/plugins/collapsed_forwarding.en.rst
@@ -0,0 +1,155 @@
+.. _admin-plugins-collapsed-forwarding:
+Collapsed Forwarding Plugin
+.. Licensed to the Apache Software Foundation (ASF) under one
+   or more contributor license agreements.  See the NOTICE file
+  distributed with this work for additional information
+  regarding copyright ownership.  The ASF licenses this file
+  to you under the Apache License, Version 2.0 (the
+  "License"); you may not use this file except in compliance
+  with the License.  You may obtain a copy of the License at
+  Unless required by applicable law or agreed to in writing,
+  software distributed under the License is distributed on an
+  KIND, either express or implied.  See the License for the
+  specific language governing permissions and limitations
+  under the License.
+This is a plugin for Apache Traffic Server that allows you to proactively
+fetch content from Origin in a way that it will fill the object into
+cache. This is particularly useful when all (or most) of your client requests
+are of the byte-Range type. The underlying problem is that Traffic Server
+is not able to cache requests / responses with byte ranges.
+To make this plugin available, you must either enable experimental plugins
+when building |TS|::
+    ./configure --enable-experimental-plugins
+Or use :program:`tsxs` to compile the plugin against your current |TS| build.
+To do this, you must ensure that:
+#. Development packages for |TS| are installed.
+#. The :program:`tsxs` binary is in your path.
+#. The version of this plugin you are building, and the version of |TS| against
+   which you are building it are compatible.
+Once those conditions are satisfied, enter the source directory for the plugin
+and perform the following::
+    make -f Makefile.tsxs
+    make -f Makefile.tsxs install
+Using the plugin
+This plugin functions as a per-remap plugin, and it takes two optional
+arguments that specify the delay between successive retries and the maximum
+number of retries.
+To activate the plugin in per-remap mode, in :file:`remap.config`, append the
+following to the relevant remap line::
+ @pparam=--delay=<delay> @pparam=--retries=<retries>
+This ATS plugin allows collapsed forwarding of concurrent requests for the same
+object. It is based on the open_write_fail_action feature, which detects a
+cache open write failure on a cache miss and returns a 502 error along with a
+special @-header indicating the reason for the 502 error. The plugin acts on the
+error by issuing an internal redirect back to itself, essentially blocking
+the request until a response arrives, at which point it relies on the read-while-writer
+feature to start delivering the object to all waiting clients. The following
+config parameters are assumed to be set for this plugin to work:
+:ts:cv:`proxy.config.http.cache.open_write_fail_action`        1
+:ts:cv:`proxy.config.cache.enable_read_while_writer`           1
+:ts:cv:`proxy.config.http.redirection_enabled`                 1
+:ts:cv:`proxy.config.http.number_of_redirections`             10
+:ts:cv:`proxy.config.http.redirect_use_orig_cache_key`         1
+:ts:cv:`proxy.config.http.background_fill_active_timeout`      0
+:ts:cv:`proxy.config.http.background_fill_completed_threshold` 0
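+The settings listed above can be checked mechanically once a configuration has been parsed.
+The following is a minimal sketch only: the ``missing_settings`` helper and its plain-dict
+input are hypothetical and are not part of Traffic Server or of this plugin.

```python
# Values the list above assumes, keyed by config variable name.
REQUIRED_SETTINGS = {
    "proxy.config.http.cache.open_write_fail_action": 1,
    "proxy.config.cache.enable_read_while_writer": 1,
    "proxy.config.http.redirection_enabled": 1,
    "proxy.config.http.number_of_redirections": 10,
    "proxy.config.http.redirect_use_orig_cache_key": 1,
    "proxy.config.http.background_fill_active_timeout": 0,
    "proxy.config.http.background_fill_completed_threshold": 0,
}


def missing_settings(current):
    """Return {name: (expected, actual)} for every setting that differs.

    `current` is a hypothetical dict of already-parsed settings. A name that
    is absent from it is reported with an actual value of None.
    """
    return {
        name: (expected, current.get(name))
        for name, expected in REQUIRED_SETTINGS.items()
        if current.get(name) != expected
    }
```

+An empty result means every assumed value is present; anything else names the variables to revisit.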
+Traffic Server has been affected severely by the Thundering Herd problem caused by its inability
+to do effective connection collapse of multiple concurrent requests for the same segment. This is
+especially critical when Traffic Server is used for use cases such as delivering
+large scale live video streaming. The problem results in a specific behavior where multiple
+requests for the same file are leaked upstream to the Origin layer, choking the upstream
+due to duplicated large file downloads or process-intensive files at the Origin layer.
+This can ultimately cause stability problems on the origin layer, disrupting the overall network.
+ATS supports several kinds of connection collapse mechanisms, including Read-While-Writer (RWW)
+and stale-while-revalidate (SWR), each very effective in dealing with a majority of the use cases
+that can result in the Thundering Herd problem.
+For a large scale video streaming scenario, there’s a combination of a large number of revalidations
+(e.g. media playlists) and cache misses (e.g. media segments) that occur for the same file. Traffic Server’s
+RWW works great in collapsing the concurrent requests in such a scenario. However, as described in
+``_admin-configuration-reducing-origin-requests``, Traffic Server’s implementation of RWW has a significant
+limitation: it can invoke RWW only when the response headers have already been received.
+This means that any number of concurrent requests for the same file that are received before the response
+headers arrive are leaked upstream, which can result in a severe Thundering Herd problem, depending on
+the network latencies (which impact the TTFB for the response headers) at a given instant.
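+To make the effect of TTFB concrete, here is a deliberately simplified model, not Traffic Server
+code. It assumes the first request reaches the origin at time 0 and that RWW becomes usable exactly
+when the response headers arrive. The ``split_by_ttfb`` function and its arguments are hypothetical.

```python
def split_by_ttfb(arrival_offsets_ms, ttfb_ms):
    """Split follow-up requests into (leaked, collapsed) lists.

    arrival_offsets_ms: arrival times of later requests for the same object,
        in milliseconds after the first request (which is not included).
    ttfb_ms: time until the origin's response headers arrive.

    Assumption of this toy model: a request arriving before the headers are
    in cache cannot use RWW and is leaked upstream.
    """
    leaked = [t for t in arrival_offsets_ms if t < ttfb_ms]
    collapsed = [t for t in arrival_offsets_ms if t >= ttfb_ms]
    return leaked, collapsed
```

+In this model a higher origin latency directly enlarges the leaked group, which is the behavior described above.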
+To address this limitation, Traffic Server supports a few “workaround” solutions, such as Open Read Retry,
+and a new feature called Open Write Fail Action from 6.0. To understand how these approaches work, it is
+important to understand the high level flow of how Traffic Server handles a GET request.
+On receiving an HTTP GET request, Traffic Server generates the cache key (basically, a hash of the request URL)
+and looks up the directory entry (dirent) using the generated index. On a cache miss, the lookup fails and
+Traffic Server then tries to get a write lock for the cache object and proceeds to the origin to download
+the object. On the other hand, if the lookup is successful, meaning the dirent exists for the generated cache
+key, Traffic Server tries to obtain a read lock on the cache object to be able to serve it from the cache. If
+the read lock is not successful (possibly because the object is being written to at that same
+instant and the response headers are not in the cache yet), Traffic Server then moves to the next step of trying
+to obtain an exclusive write lock. If the write lock is already held exclusively by another request (transaction),
+the attempt fails and at this point Traffic Server simply disables the cache on that transaction and
+downloads the object in a proxy-only mode::
+  1). Cache Lookup (lookup for the dirent using the request URL as cache key).
+    1.1). If lookup fails (cache miss), goto (3).
+    1.2). If lookup succeeds, try to obtain a read lock, goto (2).
+  2). Open Cache Read (try to obtain read lock)
+    2.1). If read lock succeeds, serve from cache, goto (4).
+    2.2). If read lock fails, goto (3).
+  3). Open Cache Write (try to obtain write lock).
+    3.1). If write lock succeeds, download the object into cache and to the client in parallel.
+    3.2). If write lock fails, disable cache, and download to the client in a proxy-only mode.
+  4). Done
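+The numbered steps can also be read as a small decision function. The sketch below is only an
+illustration of that flow under simplifying assumptions: lock results are passed in as booleans,
+and the ``Outcome`` enum and ``handle_get`` function are hypothetical names, not Traffic Server APIs.

```python
from enum import Enum


class Outcome(Enum):
    SERVED_FROM_CACHE = "served from cache"
    FILLED_CACHE = "downloaded into cache and to the client in parallel"
    PROXY_ONLY = "cache disabled, downloaded in proxy-only mode"


def handle_get(dirent_exists, read_lock_ok, write_lock_ok):
    """Follow steps 1 to 4 for a single GET request."""
    # Step 1: cache lookup. On a miss (1.1) skip straight to step 3.
    if dirent_exists:
        # Step 2: open cache read.
        if read_lock_ok:
            return Outcome.SERVED_FROM_CACHE  # 2.1
        # 2.2: read lock failed, fall through to step 3.
    # Step 3: open cache write.
    if write_lock_ok:
        return Outcome.FILLED_CACHE  # 3.1
    return Outcome.PROXY_ONLY  # 3.2: this request is leaked upstream
```

+Every path that ends in ``PROXY_ONLY`` corresponds to a request that goes to the origin without filling the cache.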
+As can be seen above, if a majority of concurrent requests arrive before response headers are received, they hit
+(2.2) and (3.2) above. Open Read Retry can help by repeating (2) after a configured delay on (2.2), thereby increasing
+the chances of obtaining a read lock and being able to serve from the cache.
+However, Open Read Retry cannot help with the concurrent requests that hit (1.1) above, jumping to (3)
+directly. Only one such request will be able to obtain the exclusive write lock, and all other requests are leaked
+upstream. This is where the recently developed ATS feature Open Write Fail Action helps. The feature
+detects the write lock failure and can return a stale copy for a Cache Revalidation or a 5xx status code for a
+Cache Miss with a special internal header <@Ats-Internal> that allows a TS plugin to take other special actions
+depending on the use case.
+The ``collapsed_forwarding`` plugin catches that error in SEND_RESPONSE_HDR_HOOK and performs an internal 3xx redirect
+back to the same host, up to the configured number of times with the configured delay between consecutive
+retries. This makes it possible to initiate RWW as soon as the response headers are received for the request that was
+allowed to go to the Origin.
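+The interaction between the delay and the retry count can be sketched with a toy model. This is
+not the plugin's implementation: the ``collapse`` function is hypothetical, time is simulated rather
+than measured, and what happens once retries run out is deliberately left unspecified.

```python
def collapse(headers_ready_at_ms, delay_ms, max_retries):
    """Simulate one blocked request being retried via internal redirects.

    headers_ready_at_ms: when the response headers of the request that went
        to the origin land in cache, relative to the write lock failure.
    delay_ms, max_retries: stand-ins for the --delay and --retries arguments.

    Returns (result, attempts_used).
    """
    elapsed_ms = 0
    for attempt in range(1, max_retries + 1):
        elapsed_ms += delay_ms  # wait before following the redirect
        if elapsed_ms >= headers_ready_at_ms:
            # Headers are cached, so this retry can be served through RWW.
            return ("read-while-writer", attempt)
    return ("retries exhausted", max_retries)
```

+For example, with a 100 ms delay and headers arriving after 250 ms, the third retry is the first one that can use RWW.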
+More details are available at

