They are slightly different - the ETag does not have any information that the client can use to determine whether or not to make a request for that file again in the future. If ETag is all it has, it will always have to make a request. However, when the server reads the ETag from the client request, the server can then determine whether to send the file (HTTP 200) or tell the client to just use their local copy (HTTP 304). An ETag is basically just a checksum for a file that semantically changes when the content of the file changes.
The Expires header is used by the client (and proxies/caches) to determine whether or not it even needs to make a request to the server at all. The closer you are to the Expires date, the more likely it is the client (or proxy) will make an HTTP request for that file from the server.
So really what you want to do is use BOTH headers - set the Expires header to a reasonable value based on how often the content changes. Then configure ETags to be sent so that when clients DO send a request to the server, it can more easily determine whether or not to send the file back.
One last note about ETag - if you are using a load-balanced server setup with multiple machines running Apache you will probably want to turn off ETag generation. This is because inodes are used as part of the ETag hash algorithm which will be different between the servers. You can configure Apache to not use inodes as part of the calculation but then you'd want to make sure the timestamps on the files are exactly the same, to ensure the same ETag gets generated for all servers.
I had this same question, and found some info in my searches (your question came up as one of the results). Here's what I determined...
There are two sides to the Cache-Control
header. One side is where it can be sent by the web server (aka. "origin server"). The other side is where it can be sent by the browser (aka. "user agent").
When sent by the origin server
I believe max-age=0
simply tells caches (and user agents) the response is stale from the get-go and so they SHOULD revalidate the response (eg. with the If-Not-Modified
header) before using a cached copy, whereas, no-cache
tells them they MUST revalidate before using a cached copy. From 14.9.1 What is Cacheable:
no-cache
...a cache MUST NOT use the response
to satisfy a subsequent request
without successful revalidation with
the origin server. This allows an
origin server to prevent caching even
by caches that have been configured to
return stale responses to client
requests.
In other words, caches may sometimes choose to use a stale response (although I believe they have to then add a Warning
header), but no-cache
says they're not allowed to use a stale response no matter what. Maybe you'd want the SHOULD-revalidate behavior when baseball stats are generated in a page, but you'd want the MUST-revalidate behavior when you've generated the response to an e-commerce purchase.
Although you're correct in your comment when you say no-cache
is not supposed to prevent storage, it might actually be another difference when using no-cache
. I came across a page, Cache Control Directives Demystified, that says (I can't vouch for its correctness):
In practice, IE and Firefox have
started treating the no-cache
directive as if it instructs the
browser not to even cache the page.
We started observing this behavior
about a year ago. We suspect that
this change was prompted by the
widespread (and incorrect) use of this
directive to prevent caching.
...
Notice that of late, "cache-control:
no-cache" has also started behaving
like the "no-store" directive.
As an aside, it appears to me that Cache-Control: max-age=0, must-revalidate
should basically mean the same thing as Cache-Control: no-cache
. So maybe that's a way to get the MUST-revalidate behavior of no-cache
, while avoiding the apparent migration of no-cache
to doing the same thing as no-store
(ie. no caching whatsoever)?
When sent by the user agent
I believe shahkalpesh's answer applies to the user agent side. You can also look at 13.2.6 Disambiguating Multiple Responses.
If a user agent sends a request with Cache-Control: max-age=0
(aka. "end-to-end revalidation"), then each cache along the way will revalidate its cache entry (eg. with the If-Not-Modified
header) all the way to the origin server. If the reply is then 304 (Not Modified), the cached entity can be used.
On the other hand, sending a request with Cache-Control: no-cache
(aka. "end-to-end reload") doesn't revalidate and the server MUST NOT use a cached copy when responding.
Best Answer
Yes, Expires and max-age do the same thing, but in two different ways. Same thing with Last-Modified and Etag
Expires depends on accuracy of user's clock, so it's mostly a bad choice (as most browsers support HTTP/1.1). Use max-age, to tell the browser that the file is good for that many seconds. For example, a 1 day cache would be:
Cache-Control: max-age=86400
Note that when both
Cache-Control
andExpires
are present,Cache-Control
takes precedence. readYou're right, Last-Modified should be better. Although it's a time, it's sent by the server. Hence no issue with user's clock. It's the reason why Last-Modified is better than Expires. The browser sends the Last-Modified the server sent last time it asked for the file, and if it's the same, the server answsers with an empty response «304 Not Modified»
Also, you must note that ETag can be useful too, because Last-Modified has a time window of one second. So you can't distinguish two different sources with the same Last-Modified value. [2]
Etag needs some more computing than Last-Modified, as it's a signature of the current state of the file (similar to a md5 sum or a CRC32).
All files can benefit caching. You've got two different approaches:
The more you cache, the faster your pages will show up. But it's a difficult task to flush caches, so use with care.
A good place to learn all this with many explanations: http://www.mnot.net/cache_docs/
[2]: rfc7232 Etag https://www.rfc-editor.org/rfc/rfc7232#section-2.3