In a pair of blog posts, Tenni Theurer of Yahoo's performance team demonstrates that on many large Web sites, only about 20% of the time in serving a page request is spent on generating dynamic content, and that the remaining 80% is consumed by the browser fetching static page elements.
When setting out to improve a Web site's performance, most developers think first of beefing up server-side processing of Web page requests. That strategy, while important, may miss a crucial factor in Web scalability: client-side caching of content.
Most performance optimization today are made on the parts that generate the HTML document (apache, C++, databases, etc.), but those parts only contribute to about 20% of the user’s response time. It’s better to focus on optimizing the parts that contribute to the other 80%.
Using a packet sniffer, we discover what takes place in that other 80%... [When measuring page loading of www.yahoo.com], only 10% of the time is spent here for the browser to request the HTML page, and for apache to stitch together the HTML and return the response back to the browser. The other 90% of the time is spent fetching other components in the page including images, scripts and stylesheets.
Reducing the number of HTTP requests has the biggest impact on reducing response time and is often the easiest performance improvement to make.
It’s important to differentiate between end user experiences for an empty versus a full cache page view. An “empty cache” means the browser bypasses the disk cache and has to request all the components to load the page. A “full cache” means all (or at least most) of the components are found in the disk cache and the corresponding HTTP requests are avoided...
Strategies such as combining scripts, stylesheets, or images reduce the number of HTTP requests for both an empty and a full cache page view. Configuring components to have an Expires header with a date in the future reduces the number of HTTP requests for only the full cache page view...
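The two strategies in the passage above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method Yahoo! uses: the function names, the one-year lifetime, and the sample script contents are hypothetical, and a real site would normally set the header in its web server configuration.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def combine_components(sources):
    """Join several script or stylesheet bodies into one payload, so the
    browser makes one HTTP request instead of len(sources) requests."""
    return "\n".join(sources)


def far_future_expires(now, days=365):
    """Return an HTTP-date string `days` after `now`, for an Expires header.

    `now` must be a timezone-aware datetime in UTC.
    """
    return format_datetime(now + timedelta(days=days), usegmt=True)


# Hypothetical contents of two separate script files.
scripts = ["function a() {}", "function b() {}"]
bundle = combine_components(scripts)

# A fixed timestamp keeps the example reproducible; a server would use
# datetime.now(timezone.utc) instead.
now = datetime(2007, 1, 4, 12, 0, 0, tzinfo=timezone.utc)
headers = {"Expires": far_future_expires(now)}
```

Combining helps every visitor, because the bundle costs one request whether or not the cache is empty. The Expires header only helps visitors who already have the component in their cache.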
Theurer presents data showing that even if pages are designed for client-side caching, many users will still experience a site with an empty cache:
The performance team at Yahoo! ran an experiment to determine the percentage of users and page views with an empty cache on some of Yahoo!’s most popular pages...
40-60% of Yahoo!’s users have an empty cache experience and ~20% of all page views are done with an empty cache. To my knowledge, there’s no other research that shows this kind of information.
And I don’t know about you, but these results came to us as a big surprise. It says that even if your assets are optimized for maximum caching, there are a significant number of users that will always have an empty cache. This goes back to the earlier point that reducing the number of HTTP requests has the biggest impact on reducing response time. The percentage of users with an empty cache for different web pages may vary, especially for pages with a high number of active (daily) users. However, we found in our study that regardless of usage patterns, the percentage of page views with an empty cache is always ~20%.
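A little arithmetic shows why the ~20% figure matters. The sketch below uses the 20% share of empty-cache page views from the study; the component counts (30 before combining, 10 after, and a single request for a full-cache view) are made-up numbers chosen only for illustration.

```python
def expected_requests(empty_cache_share, requests_empty, requests_full):
    """Average HTTP requests per page view, weighting empty-cache and
    full-cache views by how often each occurs."""
    full_cache_share = 1 - empty_cache_share
    return (empty_cache_share * requests_empty
            + full_cache_share * requests_full)


EMPTY_CACHE_SHARE = 0.20  # ~20% of page views, per the Yahoo! study

# Hypothetical page: 30 components, all cacheable, so a full-cache view
# needs only the HTML document itself (1 request).
before = expected_requests(EMPTY_CACHE_SHARE, requests_empty=30, requests_full=1)

# Same page after combining scripts, stylesheets, and images down to
# 10 components (again a made-up figure).
after = expected_requests(EMPTY_CACHE_SHARE, requests_empty=10, requests_full=1)

print(f"before combining: {before:.1f} requests per page view")
print(f"after combining:  {after:.1f} requests per page view")
```

Under these assumptions, the empty-cache views account for most of the average even though they are only a fifth of the traffic, which is why cutting the number of components pays off even on a site that already caches well.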
What techniques do you use to reduce the number of HTTP requests to your pages?
I've done this kind of analysis of a web app using Firebug, a plugin for Firefox. Its very visual representation of resource loading makes it easy to see which resources need caching headers added. http://getfirebug.com/
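The same check can be roughed out in code once the response headers for each resource have been collected, for example by copying them from such a tool. This is a minimal sketch, not a Firebug feature: the URLs and header values are hypothetical, and treating the presence of either Expires or Cache-Control as "has caching headers" is a simplification, since a Cache-Control value can also forbid caching.

```python
def lacks_caching_headers(headers):
    """Return True if a response carries neither an Expires nor a
    Cache-Control header. Header names are compared case-insensitively."""
    names = {name.lower() for name in headers}
    return "expires" not in names and "cache-control" not in names


# Hypothetical response headers, keyed by resource URL.
responses = {
    "/img/logo.png": {"Content-Type": "image/png"},
    "/js/app.js": {
        "Content-Type": "text/javascript",
        "Expires": "Fri, 04 Jan 2008 12:00:00 GMT",
    },
    "/css/site.css": {"Content-Type": "text/css", "cache-control": "max-age=3600"},
}

# Resources that would be re-requested on every page view.
needs_headers = [url for url, hdrs in responses.items()
                 if lacks_caching_headers(hdrs)]
```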