Why is it common to put CSRF prevention tokens in cookies

cookiescsrfowaspSecurityweb

I'm trying to understand the whole issue with CSRF and appropriate ways to prevent it. (Resources I've read, understand, and agree with: OWASP CSRF Prevention CHeat Sheet, Questions about CSRF.)

As I understand it, the vulnerability around CSRF is introduced by the assumption that (from the webserver's point of view) a valid session cookie in an incoming HTTP request reflects the wishes of an authenticated user. But all cookies for the origin domain are magically attached to the request by the browser, so really all the server can infer from the presence of a valid session cookie in a request is that the request comes from a browser which has an authenticated session; it cannot further assume anything about the code running in that browser, or whether it really reflects user wishes. The way to prevent this is to include additional authentication information (the "CSRF token") in the request, carried by some means other than the browser's automatic cookie handling. Loosely speaking, then, the session cookie authenticates the user/browser and the CSRF token authenticates the code running in the browser.

So in a nutshell, if you're using a session cookie to authenticate users of your web application, you should also add a CSRF token to each response, and require a matching CSRF token in each (mutating) request. The CSRF token then makes a roundtrip from server to browser back to server, proving to the server that the page making the request is approved by (generated by, even) that server.

On to my question, which is about the specific transport method used for that CSRF token on that roundtrip.

It seems common (e.g. in AngularJS, Django, Rails) to send the CSRF token from server to client as a cookie (i.e. in a Set-Cookie header), and then have Javascript in the client scrape it out of the cookie and attach it as a separate XSRF-TOKEN header to send back to the server.

(An alternate method is the one recommended by e.g. Express, where the CSRF token generated by the server is included in the response body via server-side template expansion, attached directly to the code/markup that will supply it back to the server, e.g. as a hidden form input. That example is a more web 1.0-ish way of doing things, but would generalize fine to a more JS-heavy client.)

Why is it so common to use Set-Cookie as the downstream transport for the CSRF token / why is this a good idea? I imagine the authors of all these frameworks considered their options carefully and didn't get this wrong. But at first glance, using cookies to work around what's essentially a design limitation on cookies seems daft. In fact, if you used cookies as the roundtrip transport (Set-Cookie: header downstream for the server to tell the browser the CSRF token, and Cookie: header upstream for the browser to return it to the server) you would reintroduce the vulnerability you are trying to fix.

I realize that the frameworks above don't use cookies for the whole roundtrip for the CSRF token; they use Set-Cookie downstream, then something else (e.g. a X-CSRF-Token header) upstream, and this does close off the vulnerability. But even using Set-Cookie as the downstream transport is potentially misleading and dangerous; the browser will now attach the CSRF token to every request including genuine malicious XSRF requests; at best that makes the request bigger than it needs to be and at worst some well-meaning but misguided piece of server code might actually try to use it, which would be really bad. And further, since the actual intended recipient of the CSRF token is client-side Javascript, that means this cookie can't be protected with http-only. So sending the CSRF token downstream in a Set-Cookie header seems pretty suboptimal to me.

Best Answer

A good reason, which you have sort of touched on, is that once the CSRF cookie has been received, it is then available for use throughout the application in client script for use in both regular forms and AJAX POSTs. This will make sense in a JavaScript heavy application such as one employed by AngularJS (using AngularJS doesn't require that the application will be a single page app, so it would be useful where state needs to flow between different page requests where the CSRF value cannot normally persist in the browser).

Consider the following scenarios and processes in a typical application for some pros and cons of each approach you describe. These are based on the Synchronizer Token Pattern.

Request Body Approach

User successfully logs in.
Server issues auth cookie.
User clicks to navigate to a form.
If not yet generated for this session, server generates CSRF token, stores it against the user session and outputs it to a hidden field.
User submits form.
Server checks hidden field matches session stored token.

Advantages:

Simple to implement.
Works with AJAX.
Works with forms.
Cookie can actually be HTTP Only.

Disadvantages:

All forms must output the hidden field in HTML.
Any AJAX POSTs must also include the value.
The page must know in advance that it requires the CSRF token so it can include it in the page content so all pages must contain the token value somewhere, which could make it time consuming to implement for a large site.

Custom HTTP Header (downstream)

User successfully logs in.
Server issues auth cookie.
User clicks to navigate to a form.
Page loads in browser, then an AJAX request is made to retrieve the CSRF token.
Server generates CSRF token (if not already generated for session), stores it against the user session and outputs it to a header.
User submits form (token is sent via hidden field).
Server checks hidden field matches session stored token.

Advantages:

Works with AJAX.
Cookie can be HTTP Only.

Disadvantages:

Doesn't work without an AJAX request to get the header value.
All forms must have the value added to its HTML dynamically.
Any AJAX POSTs must also include the value.
The page must make an AJAX request first to get the CSRF token, so it will mean an extra round trip each time.
Might as well have simply output the token to the page which would save the extra request.

Custom HTTP Header (upstream)

User successfully logs in.
Server issues auth cookie.
User clicks to navigate to a form.
If not yet generated for this session, server generates CSRF token, stores it against the user session and outputs it in the page content somewhere.
User submits form via AJAX (token is sent via header).
Server checks custom header matches session stored token.

Advantages:

Works with AJAX.
Cookie can be HTTP Only.

Disadvantages:

Doesn't work with forms.
All AJAX POSTs must include the header.

Custom HTTP Header (upstream & downstream)

User successfully logs in.
Server issues auth cookie.
User clicks to navigate to a form.
Page loads in browser, then an AJAX request is made to retrieve the CSRF token.
Server generates CSRF token (if not already generated for session), stores it against the user session and outputs it to a header.
User submits form via AJAX (token is sent via header) .
Server checks custom header matches session stored token.

Advantages:

Works with AJAX.
Cookie can be HTTP Only.

Disadvantages:

Doesn't work with forms.
All AJAX POSTs must also include the value.
The page must make an AJAX request first to get the CRSF token, so it will mean an extra round trip each time.

Set-Cookie

User successfully logs in.
Server issues auth cookie.
User clicks to navigate to a form.
Server generates CSRF token, stores it against the user session and outputs it to a cookie.
User submits form via AJAX or via HTML form.
Server checks custom header (or hidden form field) matches session stored token.
Cookie is available in browser for use in additional AJAX and form requests without additional requests to server to retrieve the CSRF token.

Advantages:

Simple to implement.
Works with AJAX.
Works with forms.
Doesn't necessarily require an AJAX request to get the cookie value. Any HTTP request can retrieve it and it can be appended to all forms/AJAX requests via JavaScript.
Once the CSRF token has been retrieved, as it is stored in a cookie the value can be reused without additional requests.

Disadvantages:

All forms must have the value added to its HTML dynamically.
Any AJAX POSTs must also include the value.
The cookie will be submitted for every request (i.e. all GETs for images, CSS, JS, etc, that are not involved in the CSRF process) increasing request size.
Cookie cannot be HTTP Only.

So the cookie approach is fairly dynamic offering an easy way to retrieve the cookie value (any HTTP request) and to use it (JS can add the value to any form automatically and it can be employed in AJAX requests either as a header or as a form value). Once the CSRF token has been received for the session, there is no need to regenerate it as an attacker employing a CSRF exploit has no method of retrieving this token. If a malicious user tries to read the user's CSRF token in any of the above methods then this will be prevented by the Same Origin Policy. If a malicious user tries to retrieve the CSRF token server side (e.g. via curl) then this token will not be associated to the same user account as the victim's auth session cookie will be missing from the request (it would be the attacker's - therefore it won't be associated server side with the victim's session).

As well as the Synchronizer Token Pattern there is also the Double Submit Cookie CSRF prevention method, which of course uses cookies to store a type of CSRF token. This is easier to implement as it does not require any server side state for the CSRF token. The CSRF token in fact could be the standard authentication cookie when using this method, and this value is submitted via cookies as usual with the request, but the value is also repeated in either a hidden field or header, of which an attacker cannot replicate as they cannot read the value in the first place. It would be recommended to choose another cookie however, other than the authentication cookie so that the authentication cookie can be secured by being marked HttpOnly. So this is another common reason why you'd find CSRF prevention using a cookie based method.

What's happening

As it is, Internet Explorer gives lower level of trust to IFRAME pages (IE calls this "third-party" content). If the page inside the IFRAME doesn't have a Privacy Policy, its cookies are blocked (which is indicated by the eye icon in status bar, when you click on it, it shows you a list of blocked URLs).

_{(source: piskvor.org)}

In this case, when cookies are blocked, session identifier is not sent, and the target script throws a 'session not found' error.

(I've tried setting the session identifier into the form and loading it from POST variables. This would have worked, but for political reasons I couldn't do that.)

It is possible to make the page inside the IFRAME more trusted: if the inner page sends a P3P header with a privacy policy that is acceptable to IE, the cookies will be accepted.

How to solve it

Create a p3p policy

A good starting point is the W3C tutorial. I've gone through it, downloaded the IBM Privacy Policy Editor and there I created a representation of the privacy policy and gave it a name to reference it by (here it was policy1).

NOTE: at this point, you actually need to find out if your site has a privacy policy, and if not, create it - whether it collects user data, what kind of data, what it does with it, who has access to it, etc. You need to find this information and think about it. Just slapping together a few tags will not cut it. This step cannot be done purely in software, and may be highly political (e.g. "should we sell our click statistics?").

(e.g. "the site is operated by ACME Ltd., it uses anonymous per-session identifiers for its operation, collects user data only if explicitly permitted and only for the following purposes, the data is stored only as long as necessary, only our company has access to it, etc. etc.").

(When editing with this tool, it's possible to view errors/omissions in the policy. Also very useful is the tab "HTML Policy": at the bottom, it has a "Policy Evaluation" - a quick check if the policy will be blocked by IE's default settings)

The Editor exports to a .p3p file, which is an XML representation of the above policy. Also, it can export a "compact version" of this policy.

Link to the policy

Then a Policy Reference file (http://example.com/w3c/p3p.xml) was needed (an index of privacy policies the site uses):

<META>
  <POLICY-REFERENCES>
    <POLICY-REF about="/w3c/example-com.p3p#policy1">
      <INCLUDE>/</INCLUDE>
      <COOKIE-INCLUDE/>
    </POLICY-REF>
  </POLICY-REFERENCES>
</META>

The <INCLUDE> shows all URIs that will use this policy (in my case, the whole site). The policy file I've exported from the Editor was uploaded to http://example.com/w3c/example-com.p3p

Send the compact header with responses

I've set the webserver at example.com to send the compact header with responses, like this:

HTTP/1.1 200 OK 
P3P: policyref="/w3c/p3p.xml", CP="IDC DSP COR IVAi IVDi OUR TST"
// ... other headers and content

policyref is a relative URI to the Policy Reference file (which in turn references the privacy policies), CP is the compact policy representation. Note that the combination of P3P headers in the example may not be applicable on your specific website; your P3P headers MUST truthfully represent your own privacy policy!

Profit!

In this configuration, the Evil Eye does not appear, the cookies are saved even in the IFRAME, and the application works.

Edit: What NOT to do, unless you like defending from lawsuits

Several people have suggested "just slap some tags into your P3P header, until the Evil Eye gives up".

The tags are not only a bunch of bits, they have real world meanings, and their use gives you real world responsibilities!

For example, pretending that you never collect user data might make the browser happy, but if you actually collect user data, the P3P is conflicting with reality. Plain and simple, you are purposefully lying to your users, and that might be criminal behavior in some countries. As in, "go to jail, do not collect $200".

A few examples (see p3pwriter for the full set of tags):

NOI : "Web Site does not collected identified data." (as soon as there's any customization, a login, or any data collection (***** Analytics, anyone?), you must acknowledge it in your P3P)
STP: Information is retained to meet the stated purpose. This requires information to be discarded at the earliest time possible. Sites MUST have a retention policy that establishes a destruction time table. The retention policy MUST be included in or linked from the site's human-readable privacy policy." (so if you send STP but don't have a retention policy, you may be committing fraud. How cool is that? Not at all.)

I'm not a lawyer, but I'm not willing to go to court to see if the P3P header is really legally binding or if you can promise your users anything without actually willing to honor your promises.

How do browser cookie domains work

Although there is the RFC 2965 (Set-Cookie2, had already obsoleted RFC 2109) that should define the cookie nowadays, most browsers don’t fully support that but just comply to the original specification by Netscape.

There is a distinction between the Domain attribute value and the effective domain: the former is taken from the Set-Cookie header field and the latter is the interpretation of that attribute value. According to the RFC 2965, the following should apply:

If the Set-Cookie header field does not have a Domain attribute, the effective domain is the domain of the request.
If there is a Domain attribute present, its value will be used as effective domain (if the value does not start with a . it will be added by the client).

Having the effective domain it must also domain-match the current requested domain for being set; otherwise the cookie will be revised. The same rule applies for choosing the cookies to be sent in a request.

Mapping this knowledge onto your questions, the following should apply:

Cookie with Domain=.example.com will be available for www.example.com
Cookie with Domain=.example.com will be available for example.com
Cookie with Domain=example.com will be converted to .example.com and thus will also be available for www.example.com
Cookie with Domain=example.com will not be available for anotherexample.com
www.example.com will be able to set cookie for example.com
www.example.com will not be able to set cookie for www2.example.com
www.example.com will not be able to set cookie for .com

And to set and read a cookie for/by www.example.com and example.com, set it for .www.example.com and .example.com respectively. But the first (.www.example.com) will only be accessible for other domains below that domain (e.g. foo.www.example.com or bar.www.example.com) where .example.com can also be accessed by any other domain below example.com (e.g. foo.example.com or bar.example.com).

Best Answer

Request Body Approach

Custom HTTP Header (downstream)

Custom HTTP Header (upstream)

Custom HTTP Header (upstream & downstream)

Set-Cookie

Related Solutions

Cookie blocked/not saved in IFRAME in Internet Explorer

What's happening

How to solve it

Create a p3p policy

Link to the policy

Send the compact header with responses

Profit!

Edit: What NOT to do, unless you like defending from lawsuits

How do browser cookie domains work

Related Topic