HTTP ETags for Better Website Speed

January 15, 2013

HTTP/1.1 introduced a header called the entity tag, shortened to ETag. It lets a client browser ask a server application whether the content it already has is the currently available version, and lets the server quickly determine this and respond with either a simple ‘Yes’ (304 Not Modified, with no body) or ‘No, here’s the latest content’ (200 with the new content).
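
As a rough sketch of that exchange in plain Ruby: the method names etag_match? and conditional_get are my own, not from Rack or Rails, and the comma split is a simplification that is fine for hex-digest ETags but not for every legal ETag value.

```ruby
# Hypothetical helper: does the client's If-None-Match header match the
# server's current ETag? Handles a missing header, "*", and a list of tags.
def etag_match?(if_none_match, current_etag)
  return false if if_none_match.nil? || if_none_match.strip.empty?
  return true if if_none_match.strip == "*"

  # If-None-Match uses weak comparison, so ignore a leading W/ on either side.
  strip_weak = ->(tag) { tag.strip.sub(/\AW\//, "") }
  candidates = if_none_match.split(",").map(&strip_weak)
  candidates.include?(strip_weak.call(current_etag))
end

# Hypothetical helper: answer a GET with [status, headers, body].
def conditional_get(if_none_match, current_etag, body)
  if etag_match?(if_none_match, current_etag)
    [304, { "ETag" => current_etag }, ""]    # "Yes, your copy is current"
  else
    [200, { "ETag" => current_etag }, body]  # "No, here's the latest content"
  end
end

first_visit  = conditional_get(nil, '"abc"', "<html>...</html>")
repeat_visit = conditional_get('"abc"', '"abc"', "<html>...</html>")
```

The first visit has no If-None-Match header, so it gets the full body; the repeat visit sends back the ETag it was given and gets the empty 304.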

The Wikipedia article on ETags goes into some of the technical details, but doesn’t really give any hint as to how to use them when building a web site of our own.

How to generate a good ETag?

A good ETag value is one that different content is unlikely to share. Since an ETag only has to distinguish versions of a single URL of your website, this process doesn’t need the strongest cryptographic hash available to you, but accidental collisions must still be rare. It also needs to be fast: there’s no point having a performance feature that is slower than not using it at all.

For this use, the MD5 hash works really well. MD5 is broken for security purposes, because different content can be deliberately crafted to have the same hash value, but for ETag generation that deficiency isn’t important. MD5 digests can be generated very quickly, so they add very little overhead.

At the time of writing, both Rack and Rails use MD5 digests when creating ETag values from the content passed to them.
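
To make that concrete, here is a minimal sketch using Ruby’s standard Digest library. The helper name md5_etag is my own invention; the only real API used is Digest::MD5.hexdigest.

```ruby
require "digest/md5"

# Hypothetical helper: a quoted ETag derived from the response body.
# The surrounding double quotes are part of the ETag header syntax.
def md5_etag(body)
  %("#{Digest::MD5.hexdigest(body)}")
end

old_tag = md5_etag("<h1>Hello</h1>")
new_tag = md5_etag("<h1>Hello!</h1>")

# The same content always gives the same tag; a one-character change
# gives a different one.
puts old_tag == md5_etag("<h1>Hello</h1>")
puts old_tag == new_tag
```

The two puts lines print true and then false.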

What to include in the ETag calculation?

Obviously, whatever data you get from your database to return to the client should be included, but if you are rendering the content in some way using templates, then you need to also include the contents of these files in the process to build a useful ETag. If all you are returning is a JSON rendering of the data, then all you need is the actual data to be returned.
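
One way to picture this is a digest fed with both the data and the template source. This is only a sketch under my own assumptions: page_etag is a made-up helper, the data is serialized with JSON.generate, and the templates are passed in as strings (in a real application you would read them from disk with File.read).

```ruby
require "digest/md5"
require "json"

# Hypothetical helper: digest the data together with the source of every
# template used to render it, so a change to either produces a new ETag.
def page_etag(data, template_sources)
  digest = Digest::MD5.new
  digest << JSON.generate(data)
  # A NUL separator keeps adjacent inputs from running together.
  template_sources.each { |source| digest << "\0" << source }
  %("#{digest.hexdigest}")
end

article = { "id" => 1, "title" => "HTTP ETags" }
view    = "<h1><%= title %></h1>"

html_tag = page_etag(article, [view])  # data plus template
json_tag = page_etag(article, [])      # a JSON response needs only the data
```

Editing the view string changes html_tag but leaves json_tag alone, which is exactly the distinction between HTML and JSON responses described above.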

How does Ruby on Rails deal with ETags?

Ruby on Rails has great support for both ETag generation and detection in a server request.

Since Ruby on Rails is a Rack application, the default middleware includes Rack::ETag and Rack::ConditionalGet, which, between them, provide basic support without needing to change your application. I’ve previously written about this free Rails Performance feature.
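
To show how two middleware can split this job, here is a toy version in plain Ruby. ToyETag and ToyConditionalGet are stand-ins I wrote for illustration; they are not the real Rack::ETag and Rack::ConditionalGet, which handle many more cases. The only Rack convention assumed is that an app responds to call(env) and returns [status, headers, body].

```ruby
require "digest/md5"

# Toy stand-in for the ETag step: add an ETag header to 200 responses
# that don't already carry one.
class ToyETag
  def initialize(app)
    @app = app
  end

  def call(env)
    status, headers, body = @app.call(env)
    if status == 200 && !headers.key?("ETag")
      headers = headers.merge("ETag" => %("#{Digest::MD5.hexdigest(body.join)}"))
    end
    [status, headers, body]
  end
end

# Toy stand-in for the conditional-GET step: turn a 200 into an empty 304
# when the request's If-None-Match equals the response's ETag.
class ToyConditionalGet
  def initialize(app)
    @app = app
  end

  def call(env)
    status, headers, body = @app.call(env)
    if status == 200 && headers["ETag"] && env["HTTP_IF_NONE_MATCH"] == headers["ETag"]
      [304, headers, []]
    else
      [status, headers, body]
    end
  end
end

# A trivial Rack-style app: env in, [status, headers, body] out.
app   = ->(_env) { [200, { "Content-Type" => "text/plain" }, ["Hello"]] }
stack = ToyConditionalGet.new(ToyETag.new(app))

_, headers, _ = stack.call({})
status, _, body = stack.call({ "HTTP_IF_NONE_MATCH" => headers["ETag"] })
puts status
puts body.inspect
```

The second request prints 304 and an empty body. Note that the inner app still builds the whole response both times; only the transfer is saved.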

Can Rails do better?

Rails also adds methods to your Controller code to allow you to process ETags yourself and reduce the overhead of creating and sending a response, getting to Step 5 of my Six Steps to a Faster Rails App.

The main method is fresh_when. The other helper, stale?, is implemented as a call to fresh_when followed by returning a boolean that indicates what that call did. If fresh_when decides that the ETag value in the request matches the ETag value it calculates from the object you pass it, it sets the response to 304 Not Modified, which causes Rails to bypass rendering the view.
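
The relationship between the two helpers can be modelled in a few lines of plain Ruby. This is a toy, not Rails’ source: ToyController and its instance variables are made up, and the ETag is just an MD5 of the object’s to_s rather than whatever Rails derives from a record.

```ruby
require "digest/md5"

# Toy controller modelling how stale? builds on fresh_when.
class ToyController
  attr_reader :status, :response_etag

  def initialize(if_none_match)
    @if_none_match = if_none_match  # ETag sent by the client, or nil
    @status = nil                   # nil means "nothing decided, go render"
  end

  # Compute the ETag; answer 304 if the client already has this version.
  def fresh_when(etag:)
    @response_etag = %("#{Digest::MD5.hexdigest(etag.to_s)}")
    @status = 304 if @if_none_match == @response_etag
  end

  # Run the same check, then report whether rendering is still needed.
  def stale?(etag:)
    fresh_when(etag: etag)
    @status != 304
  end
end

first = ToyController.new(nil)
first.stale?(etag: "article-1-v1")   # true: no ETag sent, so render

again = ToyController.new(first.response_etag)
again.stale?(etag: "article-1-v1")   # false: status is now 304, skip rendering
```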

You can use either fresh_when or stale?, depending on how your code is structured. The Rails documentation example for using fresh_when looks like this:

def show
    @article = Article.find(params[:id])
    fresh_when(:etag => @article, :last_modified => @article.created_at.utc, :public => true)
end

If you have more processing to do after the freshness check, but only when you’re going to render the view, then you could write that same method with stale? like this:

def show
    @article = Article.find(params[:id])
    if stale?(:etag => @article, :last_modified => @article.created_at.utc, :public => true) 
      @sidebar_data = Sidebar.load
    end
end

What about the Layouts and Views that generate my HTML page?

The big problem with this example, though, is that if the action is rendered as an HTML page using a Rails view, the @article object alone isn’t enough for the ETag calculation. You also need some way to detect whether the view code, the layout template, and maybe even the CSS and JavaScript files have changed since the last request. This isn’t a problem for API responses, where the data is all that is returned. What we would like to be able to write is something like this:

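def show
    @article = Article.find(params[:id])
    respond_to do |format|
      format.html {
        # {{layout}} and {{view}} are placeholders, not real Ruby: they stand
        # for values identifying the layout and view, which Rails doesn't supply.
        fresh_when(:etag => [ @article, {{layout}}, {{view}} ], :last_modified => @article.created_at.utc, :public => true)
      }
      format.json {
        # For JSON the record alone is enough.
        if stale?(:etag => @article, :last_modified => @article.created_at.utc, :public => true)
          render :json => @article
        end
      }
    end
end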

We need something to make sure we include all the parts that make up a response in the creation of the ETag value. Currently, Rails does not have that piece.

In an upcoming post, I’ll show how I built a module to help me get closer to handling this last case.

Summary

  • What is an ETag?
  • How do I make a good ETag?
  • How does Ruby on Rails deal with ETags?
  • Rails adds fresh_when and stale? so you can process ETags
  • Rails doesn’t use layout or view files when calculating an ETag

ETags are simple to generate and use. Understanding and using them may be the easiest way to make your website faster, and a faster site makes happier users.
