Tuesday, April 15, 2008

Etags - Performance slowdowns / network bandwidth waste

A friend and ex-colleague of mine (Anand) introduced me today to the Firefox plugin YSlow.

I quickly installed it on Firefox since I already had Firebug installed. This is an awesome plugin!

I ran it on our internal website and it flagged a number of things, but what caught my eye was ETags. I had never known what they were and wanted to dig into them a little more.

So what are Etags?
From Yahoo:

Entity tags (ETags) are a mechanism that web servers and browsers use to determine whether the component in the browser's cache matches the one on the origin server. (An "entity" is another word for a "component": images, scripts, stylesheets, etc.) ETags were added to provide a mechanism for validating entities that is more flexible than the last-modified date. An ETag is a string that uniquely identifies a specific version of a component. The only format constraints are that the string be quoted. The origin server specifies the component's ETag using the ETag response header.
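
To make the quote concrete, here is a minimal sketch of the validation step in Python. It is not how any particular web server implements it: the function names (make_etag, respond) are made up for illustration, and hashing the content is just one assumed way to produce a quoted version string.

```python
import hashlib
from typing import Optional

def make_etag(body: bytes) -> str:
    # Hypothetical scheme: a quoted hash of the content identifies this version.
    return '"' + hashlib.md5(body).hexdigest() + '"'

def respond(body: bytes, if_none_match: Optional[str]):
    # Compare the ETag the browser sent (If-None-Match) with the current one.
    etag = make_etag(body)
    if if_none_match == etag:
        # Cached copy is still valid: answer 304 with an empty payload.
        return 304, {"ETag": etag}, b""
    # No cached copy, or it is stale: send the full component.
    return 200, {"ETag": etag}, body

css = b"body { margin: 0; }"
first_status, first_headers, _ = respond(css, None)
second_status, _, second_payload = respond(css, first_headers["ETag"])
```

The first request has nothing to validate and gets the full component; the second sends back the ETag it received and gets the short 304 answer.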

Here's the entire article: Etags

Yahoo goes on to describe the problem with ETags:

The problem with ETags is that they typically are constructed using attributes that make them unique to a specific server hosting a site. ETags won't match when a browser gets the original component from one server and later tries to validate that component on a different server, a situation that is all too common on Web sites that use a cluster of servers to handle requests.

Apparently, the problem exists with both Apache and IIS 5.0/6.0 servers. What does this mean for load-balanced servers? If a request switches from one web server to another for any reason, the cached image is downloaded again in full instead of being confirmed with a quick 304 response. That wastes bandwidth and probably hurts performance. (Basically, with ETags on load-balanced servers, your proxy caching and content caching don't work as well as intended!)
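
A rough sketch of why the validation fails across servers, in Python. It assumes an Apache-style ETag built from the file's inode, size, and modification time; the helper names and all the numbers are hypothetical. The point is only that the inode differs per machine even when the file itself is identical.

```python
def apache_style_etag(inode: int, size: int, mtime: int) -> str:
    # Assumed format: quoted inode-size-mtime in hex.
    return '"%x-%x-%x"' % (inode, size, mtime)

def validate_etag(server_etag: str, if_none_match: str) -> int:
    # 304 only when the browser's cached ETag equals this server's ETag.
    return 304 if if_none_match == server_etag else 200

# The same file (same size, same mtime) deployed to two servers.
# Inode numbers come from each server's own filesystem, so they differ.
SIZE, MTIME = 4096, 1208217600
etag_a = apache_style_etag(inode=0x10A24, size=SIZE, mtime=MTIME)
etag_b = apache_style_etag(inode=0x2F7C1, size=SIZE, mtime=MTIME)

# Browser cached the file from server A, then revalidates.
status_same_server = validate_etag(etag_a, if_none_match=etag_a)
status_other_server = validate_etag(etag_b, if_none_match=etag_a)
```

Revalidating against server A gives a 304, while the identical file on server B gives a 200 and a full download, because only the inode part of the tag changed.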

By removing the ETag completely (especially in load-balanced scenarios), you let the browser validate against the Last-Modified header instead.
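
Here is a small Python sketch of that Last-Modified validation. It assumes the file's modification time is the same on every server (for example, because deployment preserves timestamps); the function names are made up for illustration.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Optional

def last_modified_header(mtime: int) -> str:
    # Format a Unix timestamp as an HTTP date ending in "GMT".
    return format_datetime(datetime.fromtimestamp(mtime, tz=timezone.utc), usegmt=True)

def validate_last_modified(server_mtime: int, if_modified_since: Optional[str]) -> int:
    # No validator sent: return the full component.
    if if_modified_since is None:
        return 200
    since = parsedate_to_datetime(if_modified_since)
    modified = datetime.fromtimestamp(server_mtime, tz=timezone.utc)
    # 304 when the file has not changed since the browser's cached copy.
    return 304 if modified <= since else 200

MTIME = 1208217600  # hypothetical mtime, identical on server A and server B
header_from_a = last_modified_header(MTIME)

# Browser got the file from server A, then revalidates against server B.
status_on_b = validate_last_modified(MTIME, header_from_a)
status_after_change = validate_last_modified(MTIME + 60, header_from_a)
```

Since nothing machine-specific goes into the date, either server can answer with a 304 as long as the timestamps match.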

I'm only reiterating what has already been said very eloquently, so be sure to read the article. It includes fixes for both Apache and IIS (with a link to Microsoft's article).
