Optimizing Drupal Performance - Internal Page Cache

The Internal Page Cache is a core module in Drupal responsible for caching pages requested by anonymous users.

When a page is cached and an anonymous user makes a new request, Drupal does not need to perform any rendering or page-building processes. It simply retrieves the rendered page from the cache and sends it to the client.

The reason it only applies to anonymous users and not authenticated users is that the page returned to the client must have exactly the same content for all users.

In the case of authenticated users, although part of the content may be the same for everyone, there are always elements that can vary, such as the user block displaying the user's name or other user-specific information.

For these cases, there is the Dynamic Page Cache module, which caches pages for both anonymous and authenticated users while leaving the user-specific parts out of the cached copy.

Functionality

Cache Bin

To store and manage cached pages, the Internal Page Cache defines its own cache bin, named “page”, so its cached objects are stored separately from those of the other cache bins in Drupal.

Figure: Page Cache bin definition

If we access the database of an installation where this module is active and no alternative cache backend (such as Redis or Memcache) is used, we will see a table called “cache_page”.

In this table, cached response objects are stored, consisting mainly of the following columns:

  • CID: The cache ID, built from the URL of the request, used as the unique identifier to retrieve the cached object.
  • DATA: Stores the serialized response object.
  • EXPIRE: The expiration date of the cached object. By default, it will be -1, indicating that it does not expire.
  • TAGS: The cache tags of all components that make up the page. This way, if any of these components is modified, the cached object is invalidated, and the page is rebuilt.
  • CHECKSUM: Thanks to this field, Drupal can quickly determine if any of the cache tags associated with the object have been invalidated.

Figure: Page cache bin table
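
To make the role of the TAGS and CHECKSUM columns more concrete, here is a minimal Python sketch of the idea. It is a conceptual model, not Drupal's PHP code: the names `TagCounters`, `CacheRow`, `store` and `is_valid` are hypothetical, and it assumes the checksum is the sum of per-tag invalidation counters recorded when the row is written.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

PERMANENT = -1  # same meaning as the default EXPIRE value: no time-based expiry


@dataclass
class TagCounters:
    """Hypothetical stand-in for the storage that counts invalidations per tag."""
    counts: Dict[str, int] = field(default_factory=dict)

    def invalidate(self, tag: str) -> None:
        # Invalidating a tag only bumps its counter; cached rows are untouched.
        self.counts[tag] = self.counts.get(tag, 0) + 1

    def checksum(self, tags: Tuple[str, ...]) -> int:
        # Sum of the invalidation counters of all the given tags.
        return sum(self.counts.get(tag, 0) for tag in tags)


@dataclass
class CacheRow:
    cid: str
    data: bytes
    expire: int
    tags: Tuple[str, ...]
    checksum: int


def store(counters: TagCounters, cid: str, data: bytes,
          tags: Tuple[str, ...], expire: int = PERMANENT) -> CacheRow:
    # The checksum is a snapshot of the tag counters at write time.
    return CacheRow(cid, data, expire, tags, counters.checksum(tags))


def is_valid(row: CacheRow, counters: TagCounters, now: int) -> bool:
    # A row is stale if it has expired or if any of its tags was invalidated
    # after the row was written (the current checksum no longer matches).
    if row.expire != PERMANENT and row.expire < now:
        return False
    return row.checksum == counters.checksum(row.tags)


counters = TagCounters()
row = store(counters, "https://example.com/node/1", b"<html>...</html>",
            ("node:1", "config:system.site"))
print(is_valid(row, counters, now=1_700_000_000))  # True
counters.invalidate("node:1")
print(is_valid(row, counters, now=1_700_000_000))  # False
```

The point of the comparison is that validity can be decided from two integers, without inspecting each component of the page again.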

HTTP Middleware

The module defines its own HTTP middleware (see the Middleware API) to intercept each HTTP request before it reaches the main kernel.

Figure: HTTP middleware


Broadly speaking, and leaving aside the individual checks it performs, this middleware first determines whether the request is suitable for caching. If it is, it looks for a cached object and returns it if one exists; if not, it delegates the generation of the response to the rest of the application and, before sending the response back to the client, stores it in the cache for future requests.
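
The flow just described can be sketched in a few lines of Python. This is a simplified model rather than the module's actual PHP implementation: the class `PageCacheMiddleware`, the dictionary-based request and the policy in `is_cacheable_request` (safe HTTP method, no session) are assumptions made for illustration.

```python
from typing import Callable, Dict, Tuple

# A response is modelled as (status code, headers, body).
Response = Tuple[int, Dict[str, str], str]


class PageCacheMiddleware:
    """Wraps the next handler and short-circuits requests that hit the cache."""

    def __init__(self, app: Callable[[dict], Response]) -> None:
        self.app = app                      # the "main kernel" in this model
        self.cache: Dict[str, Response] = {}

    def is_cacheable_request(self, request: dict) -> bool:
        # Simplified policy (assumption): safe methods from sessionless users.
        return request["method"] in ("GET", "HEAD") and not request.get("session")

    def handle(self, request: dict) -> Response:
        if not self.is_cacheable_request(request):
            # Not eligible: pass straight through, no cache involvement.
            return self.app(request)

        cid = request["url"]
        cached = self.cache.get(cid)
        if cached is not None:
            status, headers, body = cached
            return status, {**headers, "X-Drupal-Cache": "HIT"}, body

        # Cache miss: let the application build the page, then store it.
        status, headers, body = self.app(request)
        if status == 200:
            self.cache[cid] = (status, dict(headers), body)
        return status, {**headers, "X-Drupal-Cache": "MISS"}, body


calls = []


def render_page(request: dict) -> Response:
    calls.append(request["url"])  # record every real render
    return 200, {"Content-Type": "text/html"}, "<h1>Hello</h1>"


middleware = PageCacheMiddleware(render_page)
anonymous = {"method": "GET", "url": "https://example.com/", "session": False}
first = middleware.handle(anonymous)
second = middleware.handle(anonymous)
print(first[1]["X-Drupal-Cache"], second[1]["X-Drupal-Cache"], len(calls))  # MISS HIT 1
```

The second request never reaches `render_page`, which is the saving the module provides for anonymous traffic.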

How to Know if a Page is Cached

The way to know if a page is being cached by the Internal Page Cache is to check the response headers in the browser's inspector.

When the module is active, it adds a new header to the responses that it can cache: “X-Drupal-Cache”.

Figure: Response headers - Page cache

Therefore, if you visit a page after logging in, you will not see this header in the response because, as mentioned earlier, the module only applies to anonymous users. However, if you visit it as an anonymous user, you will see it.

Values of the X-Drupal-Cache Header:

  • MISS: Indicates that there was no cached object for that response when the request was made, so the page returned is not from the cache. The next time a request is made for this same page, the value will be HIT.
  • HIT: Indicates that a cached page is being returned.
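
The same check can be done outside the browser. The sketch below is a small Python helper, with the hypothetical name `page_cache_status`, that interprets the two values listed above from a dictionary of response headers; how the headers are fetched is left to the caller.

```python
from typing import Mapping


def page_cache_status(headers: Mapping[str, str]) -> str:
    """Describe what the X-Drupal-Cache response header says about a page."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    value = normalized.get("x-drupal-cache")
    if value is None:
        return "no X-Drupal-Cache header in the response"
    if value.upper() == "HIT":
        return "served from the page cache"
    if value.upper() == "MISS":
        return "not in the cache when the request was made"
    return f"unrecognized value: {value}"


print(page_cache_status({"Content-Type": "text/html", "X-Drupal-Cache": "HIT"}))
# served from the page cache
```

With the standard library, the headers of a real page could be obtained with `urllib.request.urlopen(url).headers` and passed to the helper as `dict(response.headers.items())`. The request must be sent without a session cookie so that it counts as anonymous.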

Particularities

Max Age

Max Age is not taken into account by the Internal Page Cache. Even if a specific max-age value is set, the module ignores it and never invalidates or regenerates the cached object on that basis.

One possible solution, if we want to set invalidation based on Max Age, is to create a custom cache tag and associate it with the cached objects to which we want to apply a certain expiration.

Then, we must configure a cron job to run at the desired interval to invalidate the cache tag.
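
The cron-based approach can be sketched as follows. This is a Python model of the logic only: `make_cron_task` and the tag name `mymodule:hourly` are hypothetical. In a real Drupal module the equivalent would be PHP code in a `hook_cron()` implementation calling `Cache::invalidateTags()`.

```python
import time
from typing import Callable


def make_cron_task(invalidate_tag: Callable[[str], None], tag: str,
                   interval_seconds: float,
                   clock: Callable[[], float] = time.time) -> Callable[[], bool]:
    """Build a task that invalidates `tag` at most once per interval.

    Cron may run more often than the desired expiration, so the task keeps
    track of its last invalidation and skips runs that come too early.
    """
    last_run = None

    def run() -> bool:
        nonlocal last_run
        now = clock()
        if last_run is not None and now - last_run < interval_seconds:
            return False  # too soon, keep the cached pages
        invalidate_tag(tag)  # every page carrying this tag is now stale
        last_run = now
        return True

    return run


invalidated = []
task = make_cron_task(invalidated.append, "mymodule:hourly", interval_seconds=3600)
task()  # first cron run invalidates immediately
task()  # a run right after it is skipped
print(invalidated)  # ['mymodule:hourly']
```

Only the pages that carry the custom tag are affected, so the rest of the page cache keeps its permanent entries.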

Cache Context

Cache contexts are not applied to pages served to anonymous users from the Internal Page Cache. As we have seen, the module returns the cached object directly, so cache contexts are only taken into account when the page is built, before it is cached.

Invalidation

The only ways to make the Internal Page Cache regenerate a cached page are to invalidate one of the cache tags associated with that page or to clear the cache manually.

Recommendations

Drupal.org recommends enabling this caching system on small to medium-sized sites or installations. If the site has a larger size or traffic, other caching layers, such as a proxy cache, should be considered.

In fact, if the module is enabled and a proxy cache exists, there will be two layers of caching performing the same function, so it is recommended to disable the Internal Page Cache. To do this, simply uninstall the module.

Luis Ruiz

Senior Drupal Developer