Strong Cache and Negotiation Cache

Browser cache is the storage of resources that users have recently requested on the local disk of the browser. When visitors access the same resource again, the browser can directly load the resource from the local disk, reducing data transmission with the server, reducing the burden on the server, and speeding up the page response speed.

Description

A good cache strategy can reduce the repetitive loading of resources and improve the overall loading speed of web pages. Usually, browser cache strategies are divided into strong cache and negotiation cache. Common HTTP caching can only store GET responses and will not cache other types of responses. In theory, once a resource is cached, it should be permanently stored in the cache. However, due to limited space for storing resource copies, the cache periodically deletes some copies, a process called cache eviction. On the other hand, when a resource on the server is updated, the corresponding resource in the cache should also be updated. Because HTTP is a C/S protocol, the server cannot directly notify the client to update the cache when updating a resource, so both parties must agree on an expiration time for the resource. Before the expiration time, the cache copy of the resource is fresh. After the expiration time, the cache copy of the resource becomes stale. The eviction algorithm is used to replace stale resource cache copies with fresh ones. Note that a stale resource cache copy will not be directly cleared or ignored. When a client makes a request, if the cache retrieves a corresponding stale resource cache copy, the cache will first attach an If-None-Match header to this request and then send it to the target server to check whether this resource copy is still fresh. If the server returns 304 (Not Modified), it means that this resource copy is fresh. Note that this response will not contain entity information. This way, some bandwidth can be saved. If the server determines after checking with If-None-Match or If-Modified-Since that the resource has expired, it will return the entity content of that resource. The above request process can be summarized as follows:

  • When a browser initiates a request for a resource, it first checks if there is a cache locally. If there is a cache, it checks whether the cache has expired through expires and cache-control. If the cache is hit and the cache has not expired, it uses the local cache directly.
  • If the local cache is not hit, the browser sends a negotiation request to the server to verify whether the resource is hit by negotiation caching through last-modified and etag. If the match is successful, the server will respond with 304, but will not return the data of the resource. Instead, it will still read the resource from the cache. If it is not hit, the resource will be returned and the response will be 200.

Strong Cache

Strong cache controls the validity period of the cache in the local browser through Expires and Cache-Control.

Expires

Expires is a Header proposed in HTTP 1.0 that represents the expiration time of a resource. It describes an absolute time returned by the server. Expires is limited to local time, and if the local time is modified, it may cause the cache to become invalidated. For a resource request, if it is within Expires, the browser will directly read the cache and no longer request the server.

Expires: Sun, 14 Jun 2020 02:50:57 GMT

Cache-Control

Cache-Control appeared in HTTP 1.1 and has a higher priority than Expires. It represents a relative time and is supported by both request and response headers, defining cache strategies by providing different values.

Cache-Control: max-age=300
  • Cache-Control: no-store: The cache must not store any content about the client request and server response. Each request initiated by the client will download the complete response content.
  • Cache-Control: no-cache: The cache will store the server response content, but it cannot be provided to the browser until it is revalidated with the server for freshness. In simple terms, the browser will cache the server's response resources, but on each request, the cache must assess the validity of the cached response with the server, negotiate if the cache is available, and determine whether to use local cached resources or server response resources based on the response being 304 or 200.
  • Cache-Control: public || private: public means that the response can be cached by any intermediary such as an intermediate proxy, CDN, etc. The default response is private, where private means that the response is exclusive, and intermediaries cannot cache this response. This response can only be applied to the browser's private cache.
  • Cache-Control: max-age=31536000: The response has the maximum expiration time, and the directive is max-age=<seconds>, indicating the maximum time the resource can be cached to remain fresh. max-age is the number of seconds from the time the request is initiated.
  • Cache-Control: must-revalidate: When the must-revalidate directive is used, it means that when the cache considers using a stale resource, it must first validate its status. Expired caches will not be used. In normal circumstances, it is not necessary to use this directive because in the case of expired strong cache, negotiation caching will occur. However, the HTTP specification allows clients to use expired caches directly in certain special circumstances, such as when the validation request fails or when configuring special directives such as stale-while-revalidate, stale-if-error, etc. The must-revalidate directive requires the cache to revalidate under any circumstances after expiration.

Negotiation Caching

When the browser's request for a particular resource does not hit the strong cache, it sends a request to the server to verify whether the negotiation cache is hit. If the negotiation cache is hit, the response to the request is 304 (Not Modified), and no entity data is carried. If it is not hit, it returns 200 and carries the resource entity data. Negotiation caching uses the pair of Last-Modified, If-Modified-Since and ETag, If-None-Match to manage it.

Last-Modified If-Modified-Since

Last-Modified, If-Modified-Since was introduced in HTTP 1.0. Last-Modified represents the last modification date of the local file. The browser will add If-Modified-Since in the request header, which is the value of the last response's Last-Modified, to ask the server if there have been updates to the resource since that date. If there have been updates, the new resource will be sent back. However, if the cached file is opened locally, it will cause the Last-Modified to be modified. Therefore, ETag appeared in HTTP 1.1.

ETag If-None-Match

ETag is like a fingerprint, and changes in the resource will cause the ETag to change, regardless of the last modification time. ETag can ensure that each resource is unique. The request header field If-None-Match will send the last returned ETag to the server, asking if the ETag of the resource has been updated. If it has changed, a new resource will be sent back. The priority of ETag is higher than Last-Modified. The specific use of ETag is mainly considered in the following situations:

  • Some files may change periodically, but their content does not change, such as only the modification time changes. In this case, we do not want the client to think that the file has been modified and re-GET.
  • Some files are modified very frequently, such as being modified within seconds, for example, N times in 1s. If-Modified-Since can only check the granularity of seconds, and cannot detect such frequent modifications.
  • Some servers cannot accurately obtain the last modification time of the file.

Daily Question

https://github.com/WindrunnerMax/EveryDay

Reference

https://zhuanlan.zhihu.com/p/60357719 https://www.jianshu.com/p/9c95db596df5 https://segmentfault.com/a/1190000008956069 https://github.com/amandakelake/blog/issues/41 https://juejin.im/post/5ccfccaff265da03ab233bf5 https://developer.mozilla.org/zh-CN/docs/Web/HTTP/Caching_FAQ