trafficserver-users mailing list archives

From Phil Sorber <sor...@apache.org>
Subject Re: [VOTE] Release Apache Traffic Server 4.2.1 (RC0)
Date Wed, 16 Apr 2014 16:34:34 GMT
So given all this, I am voting -1 and calling this vote a failure. I am
testing Alan's new patch and hope to roll a 4.2.1-rc1 later this week.

Thanks


On Wed, Apr 16, 2014 at 9:18 AM, Alan M. Carroll <
amc@network-geographics.com> wrote:

> I was asked for a translation of my previous email, bonging the 4.2.1 RC0.
>
> The problem in 4.2.0 was a shift in the set of WKS values. These are not
> just live data but also written to the cache in the object headers so if
> they change at all, it de facto invalidates the cache. The 4.2.0 crashes
> (TS-2564) are due to this, because various secondary bits of data get
> written inconsistently which in turn causes ATS to look up the wrong data
> for header fields. For instance, the VARY field would be written out along
> with a hint about where it was in the header. When read back in 4.2.0 ATS
> would use the stored WKS index to look up the hint location and get the
> wrong location (because VARY had shifted) and use that to find the wrong
> data for VARY (possibly null or unallocated memory).
>
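
To make the index shift concrete, here is a minimal sketch in C++. Every name in it (the two WKS tables, StoredHeader, slot_hint, find_by_hint) is made up for illustration and is not the actual ATS header code; it only assumes that hints are stored in a table indexed by WKS (well-known string) index.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

using WksTable = std::vector<std::string>;

// Hypothetical WKS tables. Adding "Expect" shifts "Host" and "Vary" by one.
const WksTable kOldWks = {"Accept", "Host", "Vary"};
const WksTable kNewWks = {"Accept", "Expect", "Host", "Vary"};

// Hypothetical on-disk header: field values plus a hint table that maps
// a WKS index to the slot holding that field (-1 means "no hint").
struct StoredHeader {
  std::vector<std::string> values;
  std::array<int, 8> slot_hint;
};

int wks_index(const WksTable& table, const std::string& name) {
  for (std::size_t i = 0; i < table.size(); ++i) {
    if (table[i] == name) return static_cast<int>(i);
  }
  return -1;
}

// Write a header whose hints are indexed with the given WKS table.
StoredHeader write_header(const WksTable& table) {
  const int host = wks_index(table, "Host");
  const int vary = wks_index(table, "Vary");
  assert(host >= 0 && vary >= 0);
  StoredHeader h;
  h.values = {"example.com", "Accept-Encoding"};
  h.slot_hint.fill(-1);
  h.slot_hint[static_cast<std::size_t>(host)] = 0;
  h.slot_hint[static_cast<std::size_t>(vary)] = 1;
  return h;
}

// Look up a field through the stored hints, using the reader's WKS table.
// Returns nullptr when the hint says the field is absent.
const std::string* find_by_hint(const StoredHeader& h, const WksTable& table,
                                const std::string& name) {
  const int idx = wks_index(table, name);
  if (idx < 0) return nullptr;
  const int slot = h.slot_hint[static_cast<std::size_t>(idx)];
  if (slot < 0) return nullptr;
  return &h.values[static_cast<std::size_t>(slot)];
}
```

In this sketch, a header written with kOldWks and read back with kNewWks finds nothing for "Vary" and returns the Vary value when asked for "Host", which is the kind of wrong or missing data described above.
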
> To fix this, 4.2.1 simply clears all the hints and rewrites them when the
> object is read from disk if using a cache version earlier than 4.2.1. This
> ignores the stored values and uses only the current in-memory values.
>
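
A rough sketch of that fixup, again with hypothetical names (CachedHeader, fixup_hints, and the 420/421 version encoding are assumptions, not the real ATS structures): stored hints are thrown away and rebuilt from the field names using the current table.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical current (in-memory) WKS table.
const std::vector<std::string> kCurrentWks = {"Accept", "Expect", "Host", "Vary"};

// Hypothetical cached header. cache_version uses a made-up encoding:
// 420 stands for 4.2.0, 421 for 4.2.1.
struct CachedHeader {
  int cache_version;
  std::vector<std::pair<std::string, std::string>> fields; // name, value
  std::array<int, 8> slot_hint;                            // WKS index -> slot
};

int current_wks_index(const std::string& name) {
  for (std::size_t i = 0; i < kCurrentWks.size(); ++i) {
    if (kCurrentWks[i] == name) return static_cast<int>(i);
  }
  return -1;
}

// Clear the stored hints and rebuild them from the current table, but only
// for objects written by a cache version earlier than 4.2.1.
void fixup_hints(CachedHeader& h) {
  if (h.cache_version >= 421) return; // hints were written with this table
  h.slot_hint.fill(-1);
  for (std::size_t slot = 0; slot < h.fields.size(); ++slot) {
    const int idx = current_wks_index(h.fields[slot].first);
    if (idx >= 0) {
      h.slot_hint[static_cast<std::size_t>(idx)] = static_cast<int>(slot);
    }
  }
}
```
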
> However, it turns out that when the object is read from disk, it may be
> stored in the ram cache. If retrieved from ram cache later, it goes through
> the same logic as if it had been loaded from disk, which includes clearing
> and rewriting the hints. The ATS logic, though, doesn't lock the object for
> this because it is expected to be read-only once read from the disk. The
> TS-2564 logic violates this and thereby creates a race condition between
> two transactions that both access the same object. It is possible for one to
> check the valid hints for a field and then, while it is trying to retrieve
> the field, the other transaction can clear the hints causing the field to
> not be found. This leads to a crash because the logic assumes (reasonably)
> that if it's checked the hints and verified the field presence, the field
> is present and will be found. If the field is not found, you get a null
> pointer dereference.
>
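
The interleaving can be replayed deterministically in a single thread. The sketch below is not ATS code and uses hypothetical names throughout (SharedHeader, interleaved_lookup); it simply runs the steps of transactions A and B in the problematic order so the failed lookup is visible without real threads.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical header shared through the ram cache with no lock.
struct SharedHeader {
  std::vector<std::string> values; // slot -> field value
  std::vector<int> wks_of_slot;    // slot -> WKS index of that field
  std::array<int, 8> slot_hint;    // WKS index -> slot, -1 if no hint
};

void clear_hints(SharedHeader& h) { h.slot_hint.fill(-1); }

void rewrite_hints(SharedHeader& h) {
  clear_hints(h);
  for (std::size_t slot = 0; slot < h.wks_of_slot.size(); ++slot) {
    h.slot_hint[static_cast<std::size_t>(h.wks_of_slot[slot])] =
        static_cast<int>(slot);
  }
}

bool has_field(const SharedHeader& h, int wks_idx) {
  return h.slot_hint[static_cast<std::size_t>(wks_idx)] >= 0;
}

const std::string* get_field(const SharedHeader& h, int wks_idx) {
  const int slot = h.slot_hint[static_cast<std::size_t>(wks_idx)];
  return slot >= 0 ? &h.values[static_cast<std::size_t>(slot)] : nullptr;
}

// Replays the bad ordering: A checks, B clears, A fetches, B rewrites.
// Returns what A's fetch saw.
const std::string* interleaved_lookup(SharedHeader& h, int wks_idx) {
  const bool present = has_field(h, wks_idx);       // A: hint says present
  assert(present);
  clear_hints(h);                                   // B: fixup clears hints
  const std::string* found = get_field(h, wks_idx); // A: lookup now fails
  rewrite_hints(h);                                 // B: fixup completes
  return found;
}
```

Transaction A gets a null pointer back even though it had just verified the field, and dereferencing that result is the crash. Once B finishes, the same lookup succeeds again, which is why the failure is timing dependent.
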
> The solution is to prevent the 4.2.0 fixup from being done on objects
> retrieved from the ram cache. There's no need as the fixup was done when it
> was read from disk and put in the ram cache. There is no race condition for
> disk reads because those are not shared until after the fixup.
>
>
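
In sketch form, the change amounts to gating the fixup on where the object came from. The names below (Source, CachedObject, on_object_read, the 421 version encoding) are hypothetical and only illustrate the idea; the real patch may be structured differently.

```cpp
#include <cassert>

// Hypothetical origin of an object handed to a transaction.
enum class Source { Disk, RamCache };

// Hypothetical cached object. cache_version uses a made-up encoding
// (420 for 4.2.0, 421 for 4.2.1); fixup_count exists only to make the
// behaviour observable in this sketch.
struct CachedObject {
  int cache_version;
  int fixup_count;
};

// The fixup is needed only for old objects that were just read from disk.
// Anything coming out of the ram cache was already fixed when it went in.
bool needs_fixup(const CachedObject& obj, Source src) {
  return src == Source::Disk && obj.cache_version < 421;
}

void on_object_read(CachedObject& obj, Source src) {
  if (needs_fixup(obj, src)) {
    // Clearing and rewriting the hints would happen here. The object is
    // not yet shared with other transactions, so no lock is needed.
    ++obj.fixup_count;
  }
}
```

With this gate, an object retrieved from the ram cache is never mutated, so the read-only assumption holds again for shared objects.
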
