freemarker-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Matthew Van Gundy <mvang...@cisco.com>
Subject Re: Auto-Sanitization for FreeMarker
Date Tue, 15 Mar 2016 22:45:47 GMT
Hi Daniel,

I took a quick look at the auto-escaping feature documentation and it
looks like it's doing part of what we're doing.  Given the potential
lead time that may be involved in getting our contribution request
approved, I wouldn't put 2.3.24 on hold for our changes.  However, at
first blush, I think that we'll be able to capitalize on the changes.
In particular, the fewer explicit calls to "s?escape" the better.  Also,
explicit use of "s?no_esc" will help us out as well.

At a very high-level, in order to achieve context-sensitive
auto-sanitization, you need to do the following:

1. Track
    - Data origin integrity --- is the source trusted or untrusted?
    - Sanitizations applied --- to each value on each program path

2. At every output of a trusted value: print(trusted)
   Model the effect on the browser’s parser state (context)
   E.g. (HTML_PCDATA, “<“)  =>  HTML_BEFORE_TAG_NAME

3. At every output of an untrusted value: print(untrusted)
   If the value has not been sufficiently sanitized, sanitize it
   appropriately for the current context

I'm guessing that you're currently doing 2 & 3-lite by checking whether
you're about to output a TemplateMarkupOutputModel in FreeMarkerWriter.
If it's a TemplateMarkupOutputModel, don't escape. Otherwise, escape.
The approach is similar for context-sensitive auto-sanitization except
that we also track the browser's parser state and parse the markup
before sending it to the browser so that we know what the browser's next
parser state will be.

For tracking data flow, we label each value in the program with the
sanitization that has been applied.  Then for each operation on that
value, we define how it updates the label.  A straw man example would be
determining the integrity (trustworthiness) of boolean values:

class Boolean {
    boolean value;


    public static Boolean and(Boolean x,
                              Boolean y) {
        boolean result = x.value
                      && y.value;


        return new Boolean(result);
    }
}

would become:

class Boolean {
    boolean value;
    boolean untrusted;

    public static Boolean and(Boolean x,
                              Boolean y) {
        boolean result = x.value
                      && y.value;
        boolean new_label = x.untrusted
                         || y.untrusted);
        return new Boolean(result,
                           new_label);
    }
}

Of course the labels for tracking sanitization are more complicated.
But, in practice, we extended each Template*Model with a label field and
updated all operations to compute an updated label value.

The one sticking point was java.lang.String.  There are a number of
places in the FreeMarker Engine where java.lang.String was used instead
of freemarker.template.TemplateScalarModel.  Due to time constraints, we
modified FreeMarker to promote all java.lang.String operations on
potentially-untrusted data to the labeled
freemarker.template.TemplateScalarModel.  With heavy refactoring, it may
be possible to track string data flow external to the individual string
values and avoid that conversion.

In all our changes to src/ come out to:
 114 files changed, 11555 insertions(+), 895 deletions(-)

with 18 files changed, 6701 insertions(+) being entirely new parsing and
sanitization code that we ported and extended from Closure Templates.

Cheers,
Matt


On 3/15/16 2:12 PM, Daniel Dekany wrote:
> Hi,
> 
> FreeMarker 2.3.24 (not yet released, though it meant to be ASAP) adds
> a simple auto-escaping feature. It's not context sensitive. It's
> pretty much just a better version of #escape. I mention it as it might
> interferes on some ways with your work. This is the guide for it:
> http://freemarker.org/builds/2.3.24-rc01/_html/dgui_misc_autoescaping.html
> Note that we have TemplateMarkupOutputModel objects on runtime, which
> prevents double-escaping. As you track data flow on runtime, maybe
> that's something that you can utilize. If so, I can put 2.3.24 on hold
> and see how it could be improved to serve that goal.
> 
> As of having different branches of 2.x, I haven't seen how different
> your branch is, but as currently FM development resources are very
> scarce, it's perhaps not a realistic to maintain an additional branch.
> But, there's a plan for starting a FreeMarker 3 branch, whose purpose
> is getting rid of the backward compatibility baggage piled up in the
> last ~12 years, and fixing some known design issues, be ignoring
> backward compatibility. (If it will happen, it will be in a separate
> package, so it can be used next to FreeMarker 2.x, which will pretty
> much switch to maintenance mode.) FreeMarker 3 is also a good chance
> to put in architectural changes upon which more ambitious features can
> be built, such as more intelligent white-space handling, or maybe
> context sensitive escaping too. So I'm interested in what's needed for
> context sensitive escaping to work.
> 
> 
> Tuesday, March 15, 2016, 4:03:57 PM, Matthew Van Gundy wrote:
> 
>> Hi dev@freemarker,
>>
>> Cross-Site Scripting (XSS) vulnerabilities are embarrassingly common in
>> web applications.  A few of us in the Cisco Advanced Security
>> Initiatives Group have extended FreeMarker to support context-sensitive
>> auto-sanitization in order to automatically detect and prevent
>> Cross-Site Scripting vulnerabilities.  We would like to contribute our
>> improvements back to the open-source community.  As part of that
>> process, we'd like to solicit your feedback about whether and under what
>> conditions FreeMarker might be willing to integrate our changes or if it
>> would be better to maintain a fork that attempts to retain compatibility
>> and feature parity with FreeMarker.
>>
>> Our modified FreeMarker:
>>
>> * Precisely tracks the flow of untrusted data through the FreeMarker
>>   engine
>>
>> * Tracks the current parsing context of an HTML5-compliant browser
>>   parser
>>
>> * Automatically sanitizes untrusted data correctly for the enclosing
>>   HTML/JavaScript/CSS context where the data is to be embedded
>>
>> * Detects manually-sanitized data and avoids over-sanitization as long
>>   as the manually-applied sanitization is safe for the enclosing web
>>   page context
>>
>>
>> For example, given the template:
>>
>> Hello ${email}!
>> <button onclick="signin('${email}')">
>>
>>
>> When rendered with email = "<script>pwn()</script>" the result will be:
>>
>> Hello &lt;script&rt;pwn()&lt;/script&rt;!
>> <button onclick="signin('&lt;script&rt;pwn()&lt;/script&rt;')">
>>
>>
>> When rendered with email = "'+ pwn() +'" the result will be:
>>
>> Hello '+ pwn() +'!
>> <button onclick="signin('\'+ pwn() +\'')">
>>
>>
>> In both cases, whenever untrusted data is embedded into a web page, it
>> is sanitized safely and correctly for the enclosing context.  Our
>> approach was inspired by Closure Templates [1] and Samuel, Saxena, and
>> Song's CCS 2011 paper on Context-Sensitive Auto-Sanitization [2] with
>> the caveat that our system does not perform static analysis, it is
>> runtime only.
>>
>> As a runtime-only approach, there is a performance impact.  In our
>> current benchmarks, data flow tracking alone incurs a 1.6x slowdown
>> while full context-sensitive auto-sanitization (including data flow
>> tracking) incurs a 2x slowdown.
>>
>> Precise data flow tracking requires all string operations in the
>> FreeMarker engine to be instrumented.  It is not possible to allow data
>> flow tracking to be enabled/disabled on demand without extensive
>> refactoring.  As a result, the 1.6x slowdown due to data flow tracking
>> is always present in the current implementation.  To mitigate this for
>> applications that cannot incur a 1.6x slowdown in template rendering, we
>> provide an API-compatible replacement for freemarker.jar which can be
>> used during testing.  In this use case, our system provides a
>> verification that manual sanitization has been applied correctly for all
>> test cases seen during the normal dev-test cycle.
>>
>> We would like your opinions on whether and under what constraints,
>> FreeMarker would be interested in adopting these changes.  Would
>> FreeMarker be interested in offering a context-sensitive auto-sanitizing
>> branch which maintains feature parity with the main release but provides
>> additional safety for a performance cost?  If extensive refactoring
>> could allow data flow tracking and auto-sanitization could be
>> selectively enabled, would FreeMarker be willing to integrate those
>> changes?  Or, would releasing a separate project be the preferred route?
>>
>> Thank you for all of your hard work creating an excellent tool.  We
>> appreciate any feedback you may have.
>>
>> Matt
>>
>> [1] https://developers.google.com/closure/templates/
>> [2]
>> https://github.asig.cisco.com/ASIG/trogdor/blob/master/docs/samuel-csas.pdf
>>
> 



Mime
View raw message