The browser cache stores copies of recently requested resources on the user's local disk. When a visitor requests the same resource again, the browser can load it directly from disk, which reduces data transfer with the server, lightens the server's load, and speeds up page response.
A good caching strategy avoids loading the same resources repeatedly and improves overall page load speed. Browser caching strategies are usually divided into strong cache and negotiation cache. Typical HTTP caches store only responses to GET requests and do not cache responses to other request methods.
In theory, once a resource is cached, it could stay in the cache permanently. In practice, the space for storing copies is limited, so the cache periodically deletes some of them, a process called cache eviction. In addition, when a resource on the server is updated, the corresponding copy in the cache should be updated too. Because HTTP is a client/server protocol, the server cannot notify the client directly when a resource changes, so both parties must agree on an expiration time for the resource. Before the expiration time, the cached copy is fresh; after it, the cached copy is stale. The eviction algorithm is used to replace stale copies with fresh ones.
Note that a stale copy is not simply cleared or ignored. When a client makes a request and the cache finds a matching stale copy, the cache first attaches an If-None-Match header to the request and sends it to the origin server to check whether the copy is still fresh. If the server returns 304 (Not Modified), the copy is still fresh. This response carries no entity body, which saves bandwidth. If the server determines from If-None-Match or If-Modified-Since that the resource has changed, it returns the entity content of the resource. The above request process can be summarized as follows:
1. The browser first checks the strong cache using the expires and cache-control response headers. If the cache is hit and has not expired, the browser uses the local copy directly.
2. Otherwise, the browser sends a request to the server, which checks the negotiation cache using last-modified and etag. If the match is successful, the server responds with 304 and does not return the resource data; the browser still reads the resource from the cache. If there is no match, the server returns the resource with a 200 response.
Strong cache controls how long a cached copy stays valid in the local browser through Expires and Cache-Control.
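The request flow summarized above can be sketched as a small simulation. This is a minimal illustration rather than how any browser is implemented: CachedCopy, origin_server, load, the numeric clock, and the fixed 60-second lifetime are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class CachedCopy:
    body: str
    etag: str
    stored_at: float  # time the copy was stored or last revalidated (seconds)
    max_age: float    # freshness lifetime in seconds


def origin_server(current_etag, current_body, if_none_match=None):
    """Hypothetical origin server: returns (status, body, etag)."""
    if if_none_match == current_etag:
        return 304, None, current_etag  # a 304 carries no entity body
    return 200, current_body, current_etag


def load(cache, url, now, server_state):
    """Return (source, body), where source says how the body was obtained."""
    current_etag, current_body = server_state
    copy = cache.get(url)
    # Step 1: strong cache. A fresh copy is used without contacting the server.
    if copy is not None and now - copy.stored_at < copy.max_age:
        return "strong cache", copy.body
    # Step 2: negotiation cache. A stale copy is revalidated with its ETag.
    validator = copy.etag if copy is not None else None
    status, body, etag = origin_server(current_etag, current_body, validator)
    if status == 304:
        copy.stored_at = now  # the copy is considered fresh again
        return "304 revalidated", copy.body
    cache[url] = CachedCopy(body, etag, stored_at=now, max_age=60)
    return "200 downloaded", body
```

Calling load repeatedly with an increasing now value walks through all three outcomes: a first download, a strong-cache hit while the copy is fresh, and a revalidation once it is stale.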
Expires is a header introduced in HTTP 1.0 that gives the expiration time of a resource as an absolute time returned by the server. Because the browser compares this time against its local clock, changing the local time, or any skew between the client and server clocks, can make the cache expire too early or too late. For a resource request made before the Expires time, the browser reads the cache directly and does not contact the server.
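To illustrate why Expires depends on the local clock, here is a minimal sketch of the freshness check using Python's standard library. The function name is_fresh_by_expires is hypothetical, and treating an unparseable date as already expired is a simplifying assumption for this example.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def is_fresh_by_expires(expires_header, now):
    """Return True while `now` is still before the absolute Expires time."""
    try:
        expires_at = parsedate_to_datetime(expires_header)
    except (TypeError, ValueError):
        return False  # assumption: an invalid date counts as already expired
    if expires_at.tzinfo is None:
        # Dates without a usable zone are interpreted as UTC in this sketch.
        expires_at = expires_at.replace(tzinfo=timezone.utc)
    return now < expires_at
```

Passing a different now value for the same header changes the result, which is exactly the weakness described above: the outcome depends on what the local clock reports.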
Cache-Control appeared in HTTP 1.1, and its max-age directive takes priority over Expires when both are present. It expresses freshness as a relative duration rather than an absolute time. The header can be used in both requests and responses, and its directives define the caching strategy:
- Cache-Control: no-store: The cache must not store anything about the client request or the server response. Every request downloads the complete response from the server.
- Cache-Control: no-cache: The cache may store the response, but must revalidate it with the server before every reuse. In other words, the browser keeps a copy, sends a validation request each time, and uses the local copy if the server answers 304 or the new response body if it answers 200.
- Cache-Control: public / private: public means the response may be stored by any cache, including intermediaries such as proxies and CDNs. private means the response is intended for a single user: intermediaries must not store it, and it may only be kept in the browser's private cache. A response that carries neither directive is not automatically private, so user-specific responses should be marked private explicitly.
- Cache-Control: max-age=31536000: Sets the freshness lifetime. The directive has the form max-age=<seconds> and gives the maximum time the cached copy is considered fresh; 31536000 seconds is 365 days. The time is measured relative to when the response was generated, not as an absolute date.
- Cache-Control: must-revalidate: Once the copy is stale, the cache must validate it with the server before using it; an expired copy is never served unvalidated. Normally this directive is unnecessary, because an expired strong cache falls back to negotiation caching anyway. However, the HTTP specification allows clients to use an expired copy directly in certain special circumstances, such as when the validation request fails or when directives such as stale-while-revalidate or stale-if-error are configured. must-revalidate requires the cache to revalidate in all circumstances after expiration.
When a request for a resource does not hit the strong cache, the browser sends a request to the server to check whether the negotiation cache is hit. If it is hit, the response is 304 (Not Modified) and carries no entity data. If it is not hit, the server returns 200 along with the resource entity data. Negotiation caching is managed with two header pairs: Last-Modified / If-Modified-Since and ETag / If-None-Match.
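How the Cache-Control directives described above lead either to a strong-cache hit or to negotiation caching can be sketched as follows. This is a simplified illustration: parse_cache_control and cache_decision are hypothetical helpers, the parser splits on commas and does not handle quoted values that contain commas, and public, private, and the stale-* extensions are ignored.

```python
def parse_cache_control(header):
    """Split a Cache-Control value into a {directive: value-or-None} dict."""
    directives = {}
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, value = part.partition("=")
        directives[name.strip().lower()] = value.strip().strip('"') or None
    return directives


def cache_decision(header, age_seconds):
    """Return 'do not store', 'revalidate', or 'use cache' for a response."""
    directives = parse_cache_control(header)
    if "no-store" in directives:
        return "do not store"
    if "no-cache" in directives:
        return "revalidate"  # stored, but checked with the server every time
    max_age = directives.get("max-age")
    if max_age is not None and max_age.isdigit() and age_seconds < int(max_age):
        return "use cache"   # still fresh: strong cache hit
    return "revalidate"      # stale or no lifetime given: negotiation cache
```

A result of revalidate corresponds to the negotiation step: the browser sends a conditional request and reuses its copy only if the server answers 304.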
The Last-Modified / If-Modified-Since pair was introduced in HTTP 1.0. Last-Modified is the time the resource was last modified on the server. On later requests the browser adds an If-Modified-Since request header, set to the Last-Modified value from the previous response, to ask the server whether the resource has been updated since that date. If it has, the new resource is sent back; if not, the server answers 304. However, a file's modification time can change even though its content has not, for example when the file is opened and saved again, which changes Last-Modified and invalidates the cache unnecessarily. ETag appeared in HTTP 1.1 to address this.
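The server side of the Last-Modified / If-Modified-Since exchange can be sketched like this. The helper names last_modified_header and respond_if_modified_since are hypothetical, and ignoring an invalid header in favor of a full 200 response is an assumption of this example rather than a description of any particular server.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def last_modified_header(mtime):
    """Format a UTC datetime as an HTTP date for the Last-Modified header."""
    return format_datetime(mtime.replace(microsecond=0), usegmt=True)


def respond_if_modified_since(mtime, if_modified_since):
    """Return 304 if the resource is unchanged since the client's date, else 200."""
    # HTTP dates carry whole seconds only, so sub-second precision is dropped.
    mtime = mtime.replace(microsecond=0)
    if if_modified_since is None:
        return 200
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # assumption: ignore an invalid header, send the resource
    if since.tzinfo is None:
        since = since.replace(tzinfo=timezone.utc)
    return 304 if mtime <= since else 200
```

Because the comparison happens at whole-second precision, two modifications within the same second produce the same Last-Modified value and cannot be told apart by this check.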
ETag works like a fingerprint: any change to the resource's content changes the ETag, regardless of the last modification time. An ETag identifies a specific version of a resource. The If-None-Match request header sends the most recently returned ETag to the server, asking whether the resource's ETag has changed. If it has, the new resource is sent back. ETag takes priority over Last-Modified. ETag is mainly useful in the following situations:
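- A file's modification time changes periodically while its content stays the same, and the client should not treat it as modified and GET it again.
- A file is modified very frequently, for example N times in 1s. If-Modified-Since can only check at a granularity of seconds and cannot detect such frequent modifications.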
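The ETag / If-None-Match check can be sketched in the same way. Deriving the ETag from a truncated SHA-256 hash of the content is only one common choice and is an assumption here; servers are free to generate ETags differently. The names make_etag and respond_if_none_match are hypothetical.

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Derive an ETag from the content, so it changes only when the bytes change."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond_if_none_match(body, if_none_match):
    """Return (status, etag, payload); payload is None for a 304 response."""
    etag = make_etag(body)
    if if_none_match is not None:
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        # If-None-Match uses weak comparison, so a W/ prefix is ignored.
        candidates = [t[2:] if t.startswith("W/") else t for t in candidates]
        if "*" in candidates or etag in candidates:
            return 304, etag, None
    return 200, etag, body
```

Since the tag depends only on the bytes of the resource, re-saving a file without changing it still yields a 304, and several edits within one second each produce a different tag.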