Scaling Static Content

Scaling Static Content for Continuous Deployment within a Content Delivery Network

By Andy Michaelis, Jordan Piedt, and Crystal Augustus

Summary

This article describes how uShip Engineering configured its tools and processes in a Continuous Deployment environment so that end users always receive fresh static content. The latest static content is loaded in real time, based on the latest release.

Intro

We offload traffic and speed up our main production website (http://www.uship.com) and related websites using Edgecast as

  • a content delivery network (CDN) and
  • an application delivery network (ADN).

Edgecast gives us the following benefits:

  • offloading requests from our web servers and
  • caching static content on the edge of the Internet.

We also use Strangeloop to predictively load, buffer, and combine static content based on popular user flows to our site.

Issue

However, our Continuous Deployment process created an undesirable side effect: it became difficult to keep static content up to date during each push. During our rolling deployments, our servers are updated asynchronously, and an Edgecast request is not tied to any one particular server. So, if a request for a new static item comes in the middle of a deploy, there is a high chance that Edgecast retrieves the content from a server still running the old version.
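To put a rough number on that "high chance", here is a minimal sketch. It assumes each origin fetch from the CDN lands on one server in rotation with equal likelihood, which is a simplification of round-robin routing as seen from the CDN's side. The function name and the example server counts are illustrative, not taken from our deployment.

```python
from fractions import Fraction


def stale_fetch_probability(total_servers: int, updated_servers: int) -> Fraction:
    """Chance that a single origin fetch hits a server still on the old release.

    Assumption: the fetch is equally likely to land on any server in rotation.
    """
    if total_servers <= 0:
        raise ValueError("total_servers must be positive")
    if not 0 <= updated_servers <= total_servers:
        raise ValueError("updated_servers must be between 0 and total_servers")
    stale_servers = total_servers - updated_servers
    return Fraction(stale_servers, total_servers)


if __name__ == "__main__":
    # Hypothetical rotation of 8 servers with 2 already updated.
    print(stale_fetch_probability(8, 2))
```

Early in a rolling deploy almost every server is stale, so the first CDN fetch for a new file version is the one most likely to cache the wrong content.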

Developers had to fight the CDN by making unfortunate but necessary secondary timestamp pushes (internally called ‘whitespace’ pushes), because sometimes our CDN made a request to a server that did not yet have the new copy of a static file (JavaScript or CSS, for example). This effort involved making a non-functional change to one of those files (adding a new line or space, for example) to ensure the timestamp changed so the CDN would pick up a new copy.

Solution Discovery

We looked into adding a query string by appending a version such as “9:1234” to each static file's URL, based on the timestamp of that file. The issue appears when a static content URL such as

In this instance, we could not guarantee that the pull came from a server that had the latest content. Typically, a user’s session is sticky to a server. However, when the request is routed through Edgecast, it does not necessarily return to that same server.

When the CDN sees a request for a new file version, the CDN will need to fetch and cache that new file. To retrieve the file, the CDN must process a GET request to www.uship.com which will be processed via round robin to one of our servers. One of our first attempts at resolving this issue was purging Edgecast via their API, but the latency involved in updating all Edgecast servers still left us with out-of-sync content.

Another possible solution was to manually update the static files after a deployment, forcing a timestamp change so that Edgecast would retrieve a new version. The issue with this solution was that we had to wait for the last server to finish updating, so users could see errors in the meantime, before Edgecast had the latest file.

The Fix

In our current environment, each load-balanced server in rotation hosts various sites, including both http://static.uship.com and http://www.uship.com. We set http://static.uship.com to always contain the latest static content.

1.) When deploying code, first we update the static site on every load-balanced server with all the new, timestamped content.
2.) Then, once the content for the static sites is updated, we continue to push all the new code asynchronously to all of the servers in rotation.

So, during a deployment, we update each server twice: first with the updated static content and then with the latest code.
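The ordering of the two passes can be sketched as follows. This is a simplified illustration rather than our actual deployment tooling: `push_static` and `push_code` are hypothetical callables standing in for whatever copies files to a server, and the code pass runs sequentially here for readability even though the real push is asynchronous.

```python
from typing import Callable, List, Sequence


def deploy(
    servers: Sequence[str],
    push_static: Callable[[str], None],
    push_code: Callable[[str], None],
) -> List[str]:
    """Run a two-pass deploy and return the order in which steps happened."""
    log: List[str] = []

    # Pass 1: every server's static site receives the new content first,
    # so any server can answer a CDN fetch with the latest files.
    for server in servers:
        push_static(server)
        log.append(f"static:{server}")

    # Pass 2: only after all static sites are current does new code roll out.
    for server in servers:
        push_code(server)
        log.append(f"code:{server}")

    return log


if __name__ == "__main__":
    # Hypothetical server names; the lambdas do nothing.
    for step in deploy(["web1", "web2"], lambda s: None, lambda s: None):
        print(step)
```

The point of the ordering is that no server starts emitting new versioned URLs until every server in rotation can already serve the files those URLs refer to.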

Before a server with the latest code is added back into rotation, it builds up a list of static content versions based on each file's timestamp. If an end user then requests content that is not yet in Edgecast, that request is guaranteed to bring the newest version into our CDN, because the static site on every server already holds it. All requests for old content are handled by Edgecast and do not propagate to our servers.
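One way to build such a list is to walk the static directory and record each file's modification time. The sketch below assumes the static content lives in a single directory tree on disk and uses the file's mtime in whole seconds as its version; the function name `build_version_map` and the returned shape (URL path to integer version) are illustrative choices, not a description of our production code.

```python
import os
from typing import Dict


def build_version_map(static_root: str) -> Dict[str, int]:
    """Map each static file's URL path to a version derived from its mtime."""
    versions: Dict[str, int] = {}
    for dirpath, _dirnames, filenames in os.walk(static_root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            # Turn the on-disk path into a URL-style path such as /css/site.css.
            relative = os.path.relpath(full_path, static_root).replace(os.sep, "/")
            # Whole-second modification time serves as the file's version.
            versions["/" + relative] = int(os.path.getmtime(full_path))
    return versions
```

Building the map once per deployment and keeping it in memory avoids touching the file system on every page render.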

The list of static content versions is cached, and each file's version is appended to its URL as the “9:1234” query string. This cache is destroyed and rebuilt on each server during its deployment. In the event we have to do a rollback, we simply revert the http://static.uship.com files and rebuild the static cache on each server with the old code.

Regarding the “9:1234” query string, the first part, “9”, is a global version number; changing it lets us invalidate all static content on our site when the need arises. The second part, “1234”, comes from the file's Last-Modified value, the same timestamp reported in the HTTP Last-Modified response header. We also added the ability to bypass Edgecast at a configuration level in case of issues.
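Putting the two parts together, URL generation might look like the sketch below. Several details are assumptions made for illustration: the query parameter name `v`, the placeholder CDN host `cdn.example.com`, the choice of http://static.uship.com as the origin used when the CDN is bypassed, and the `use_cdn` flag standing in for the configuration-level bypass.

```python
GLOBAL_VERSION = 9  # bump this to invalidate every static URL at once

CDN_BASE = "http://cdn.example.com"      # hypothetical CDN host name
ORIGIN_BASE = "http://static.uship.com"  # assumed origin when bypassing the CDN


def versioned_url(
    path: str,
    file_version: int,
    *,
    global_version: int = GLOBAL_VERSION,
    use_cdn: bool = True,
) -> str:
    """Return the URL for a static file with a 'global:file' version attached."""
    if not path.startswith("/"):
        raise ValueError("path must start with '/'")
    base = CDN_BASE if use_cdn else ORIGIN_BASE
    # The parameter name "v" is an assumption; only the value format
    # (global version, colon, file version) comes from the article.
    return f"{base}{path}?v={global_version}:{file_version}"


if __name__ == "__main__":
    print(versioned_url("/css/site.css", 1234))
    print(versioned_url("/css/site.css", 1234, use_cdn=False))
```

Because the version is part of the URL, a changed file looks like a brand-new resource to the CDN, while unchanged files keep their cached copies.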

Join the Conversation

  • How does your Continuous Deployment environment successfully overcome challenges to updating static content?
  • What other performance-enhancing technologies or methods do you use?

Tweet us @uShipEng or comment below.
