Friday, February 25, 2011

Performance Tips of the RIA All-Stars




CODING and LOADING

This guide will summarize the performance-related DOs and DON'Ts of JavaScript and Web Site development as espoused by well-known pundits like Douglas Crockford. The publications listed below are surveyed, but since they overlap, this guide can act as both summary and cheat sheet for their catalog of techniques. The tips fall into two major categories, CODING and LOADING: i.e. making JavaScript and CSS performant, and minimizing the apparent time needed to load responsive user interfaces.

  • [HPW] High Performance Web Sites, Steve Souders, O'Reilly, 2007
  • [JGP] JavaScript: The Good Parts, Douglas Crockford, O'Reilly, 2008
  • [EFW] Even Faster Web Sites, Steve Souders, O'Reilly, 2009
  • [HPJ] High Performance JavaScript, Nicholas Zakas, O'Reilly, 2010

As with any optimization effort, the number one rule is to only optimize the portions of a system that are bottlenecks as actually measured at run time. Often, optimization techniques have serious downsides that require tradeoff judgments to be made. For example, Crockford [JGP, P.65] says both that "Regular expressions usually have a significant performance advantage over equivalent string operations in JavaScript" AND "regular expressions can be very difficult to maintain and debug". Souders [EFW, P.6] says "Everything is a trade-off. When optimizing for performance, do not waste time trying to speed up code that does not consume a significant amount of the time. Measure first. Back out of any optimization that does not provide an enjoyable benefit." Galbraith [EFW, P.9] says "developers seeking to create responsive, high-performance web sites can't, and shouldn't, go about achieving that goal by optimizing every single piece of code as they write it. The opposite is true: a developer should optimize only what isn't fast enough."

On the other hand, there are basic practices (like not putting operations inside a loop that can be done once outside the loop) that can be followed right from the start. As Zakas [HPJ, P.XII] says, "JavaScript forces the developer to perform the optimizations that a compiler would normally handle in other languages". This catalog will contain a bit of each category.

And finally, in a very few cases, these authors contradict each other; for example, bitwise operators are either really fast, or really slow, depending on who is judging. Crockford [JGP, P.112], in Appendix B: Bad Parts, says bitwise operators are "very slow", whereas, Zakas [HPJ, P.156], in Use the Fast Parts, says bitwise operators are "incredibly fast"!

Caveats:
  • This guide assumes that the reader is already familiar with basic but modern JavaScript/Web development, and leaves to another guide the task of bringing the reader up to that level.
  • While the books referenced above contain a wealth of development advice, this guide will only summarize those tips that are related to improved performance.
  • Each tip is a summary of a high level idea. Please refer to the book sections referenced for more implementation details as well as discussion of alternate techniques for more specialized situations.


Section 1: Coding Tips

  • "JavaScript forces the developer to perform the optimizations that a compiler would normally handle" - N. Zakas
Arrays are not Arrays...
[JGP, P.58,105] "[Conventional] arrays can be very fast data structures. Unfortunately, JavaScript does not have anything like this kind of array. ... their performance can be considerably worse than real arrays". Arrays in JavaScript are basically objects (i.e. dynamic bags of properties) whose property names are merely the string version of the integer subscripts.
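As a minimal illustration of this point (the example is not from the book), an "array" element is really just an object property whose name is the string form of the index:
    var a = [];
    a[3] = 'x';                 // actually creates a property named "3"
    // a.length is now 4; length is simply the largest numeric property name plus one
    // a['3'] === a[3] is true, because the numeric subscript is converted to a string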
...but String Arrays are faster than String Concatenation
[JGP, P.78][EFW, P.99] "If you are assembling a string from a large number of pieces, it is usually faster to put the pieces into an array and join them than it is to concatenate the pieces with the + operator." The more pieces being concatenated together (think very big loops here), the greater the advantage of using Array.join.
    var a = []; 
    a.push('a');  a.push('b');  a.push('c');  a.push('d'); 
    var c = a.join('');   // c is 'abcd' and faster than c = 'a'+'b'+'c'+'d';
[EFW, P.99] While modern browsers have improved string performance to the point that this advice may be out of date, nevertheless, "The performance decrease of the array technique in other browsers is typically much less than the performance increase gained in [pre-version-8] Internet Explorer".
Fastest implementation of the missing String.trim function
[EFW, P.101] JavaScript is missing a string trim function, so it is usually implemented by the application. Research into the fastest way to trim strings in JavaScript found that the following function consistently performs better than other variations:
    function trim( text ){ 
        // strip leading whitespace in a single regular-expression pass
        text = text.replace(/^\s+/, ""); 
        // then walk backward from the end to find the last non-whitespace character
        for (var i=text.length-1; i>=0; --i) { 
            if (/\S/.test(text.charAt(i))) { 
                text = text.substring( 0, i+1 ); 
                break; 
            }
        }
        return text; 
    }
Faster to process an array of properties than sorting an enumeration
[JGP, P.24] "The for in statement can loop over all of the property names in an object... [but] there is no guarantee on the order of the names." So, rather than having to sort them (not to mention filtering out all of the unwanted inherited properties), use a normal for loop against an array of property names in the desired order. E.G.
    var i, pn = [ 'first-name', 'middle-name', 'last-name' ]; 
    for (i=0; i < pn.length; ++i) { 
        document.writeln( pn[i] + ': ' +  personObject[ pn[i] ] ); 
    }
Use Simple Regular Expressions instead of string manipulation
[JGP, P.65, 69] "Regular expressions usually have a significant performance advantage over equivalent string operations in JavaScript...[however] regular expressions can be very difficult to maintain and debug...[and] the part of the language that is least portable is the implementation of regular expressions. Regular expressions that are very complicated or convoluted are more likely to have portability problems. Nested regular expressions can also suffer horrible performance problems in some implementations. Simplicity is the best strategy."
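As a small illustration in the spirit of that advice (the variable name is hypothetical), one simple regular expression can replace a chain of string operations:
    // does fileName end in ".js"?
    var isJsFile  = /\.js$/.test(fileName);          // simple regex form
    var isJsFile2 = fileName.slice(-3) === '.js';    // equivalent string-operation form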
Eval is Evil (and so are its aliases)
[JGP, P.110] Slow performance is just one reason to not use eval() and its equivalents. Between the two lines below, line one "will be significantly slower because it needs to run the compiler just to execute a trivial assignment statement...The Function constructor is another form of eval, and should similarly be avoided. The browser provides setTimeout and setInterval functions that can take string arguments or function arguments. When given string arguments, setTimeout and setInterval also act as eval. The string argument form also should be avoided."
    eval("myValue = myObject." + myKey + ";"); 
    myvalue = myObject[ myKey ];
Always store multi-accessed values in local variables
[HPJ, P.20-33] [EFW, P.79-88] "Generally speaking, you can improve the performance of JavaScript code by storing frequently used object members, array items, and out-of-scope variables in local variables. You can then access the local variables faster than the originals. ... A good rule of thumb is to always store out-of-scope values in local variables if they are used more than once within a function. ... Literal values and local variables can be accessed very quickly, whereas array items and object members take longer. ... Local variables are faster to access than out-of-scope variables because they exist in the first variable object of the scope chain. The further into the scope chain a variable is, the longer it takes to access. Global variables are always the slowest to access because they are always last in the scope chain. ... Nested object members incur significant performance impact and should be minimized. ... The deeper into the prototype chain that a property or method exists, the slower it is to access."
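A minimal sketch of that rule of thumb (the element ID and function names are hypothetical): look up out-of-scope values and nested members once, then reuse the locals:
    // slower: repeated lookups of the global document and of the nested member
    function updateSlow() {
        document.getElementById('status').innerHTML = 'Saving...';
        document.getElementById('status').className = 'busy';
    }
    // faster: each value is looked up once and then accessed as a local variable
    function updateFast() {
        var doc = document;                           // out-of-scope value cached locally
        var status = doc.getElementById('status');    // nested member cached locally
        status.innerHTML = 'Saving...';
        status.className = 'busy';
    }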
Avoid the with statement
[HPJ, P.33] "Avoid the with statement because it augments the execution context scope chain. ... Also, be careful with the catch clause of a try-catch statement because it has the same effect." The net effect of each is that it makes the previously local variables no longer local, and hence, slower.
Optimize your loops because JavaScript won't
[EFW, P.93] "loops are a frequent source of performance issues in JavaScript, and the way you write loops drastically changes its execution time. Once again, JavaScript developers don't get to rely on compiler optimizations that make loops faster" Beyond the normal refactoring of code to be outside of a loop wherever possible, there are a couple of operations of the loop itself that must be hand optimized. The first version of the loop below runs much slower than the second because it is looking up the length property of values over and over, and the comparison of i-- to zero is much faster than comparing i to length.
    //traditional unoptimized for loop
    var values = [1,2,3,4,5];
    for (var i=0; i < values.length; ++i){ process(values[i]); }
 
    //over 50% faster version (note that it visits the items in reverse order)
    var length = values.length;   // look up .length once, outside the loop
    for (var i=length; i--;){ process(values[i]); } 
Use Cooperative Multitasking for Long-Running Scripts
[EFW, P.102] "One of the critical performance issues with JavaScript is that code execution freezes a web page. Because JavaScript is a single-threaded language, only one script can be run at a time per window or tab. This means that all user interaction is necessarily halted while JavaScript code is being executed." In general, when attempting to simulate true simultaneous execution of multiple processes (like JavaScript and the user interface), each process must either volunteer to regularly yield control temporarily to other processes (cooperative multitasking), or have control regularly taken from it (preemptive multitasking). The only option with JavaScript is voluntarily yielding control to the UI thread by splitting your logic into pieces and having each piece schedule the next piece with a time delay in between. "Generally speaking, no single continuous script execution should take longer than 100 milliseconds; anything longer than that and the web page will almost certainly appear to be running slowly to the user". The setTimeout function takes a function and a delay amount as parameters. "When the delay has passed, the code to execute is placed into a queue. The JavaScript engine uses this queue to determine what to do next. When a script finishes executing, the JavaScript engine yields to allow other browser tasks to catch up. The web page display is typically updated during this time in relation to changes made via the script. Once the display has been updated, the JavaScript engine checks for more scripts to run on the queue. If another script is waiting, it is executed and the process repeats; if there are no more scripts to execute, the JavaScript engine remains idle until another script appears in the queue."
Use Memoization to cache intermediate calculations
[JGP, P.44] One of the benefits of doing true "functional programming" is that it is easy to avoid performing calculations that have already been done. True functional programming has the rule that the result of a function depends totally on its parameters; i.e. it will always give the same result given the same parameters. This means that a cache object could be kept by a function to store the result for each set of parameters as they are encountered. When a function is called, the first thing it does is try to look up a cached result for the parameters it is given, and return it if found. If the result is not found, then it is calculated and saved in the cache before returning the result.

A simple example is the Fibonacci function. The first version below is the traditional implementation and the second uses a memoization helper to prevent recalculation. Since Fibonacci calls itself recursively, the savings add up very quickly.

// traditional implementation
function fibonacci(n){ return n<2 ? n : fibonacci(n-1) + fibonacci(n-2); }; 
 
// simple memoization helper for functions-with-a-single-numeric-parameter
var memoizer = function( mementoArray, lambda )
{ 
    var memoizedLambda = function( n ) { 
        var result = mementoArray[n]; 
        if (typeof result !== 'number')
            mementoArray[n] = result = lambda( memoizedLambda, n );
        return result;
    };
 
    return memoizedLambda; 
};
 
// memoized version [ with preloaded memos for f(0)=0 and f(1)=1 ]
var fibonacci = memoizer( [0,1], function(myself,n){ return myself(n-1) + myself(n-2); } );
Avoid CSS expressions
[HPW, P.51] "CSS expressions are a powerful (and dangerous) way to set CSS properties dynamically. They're supported in Internet Explorer version 5 and later... The expression method is simply ignored by other browsers, so it is a useful tool for setting properties in Internet Explorer to create a consistent experience across browsers. ... CSS expressions are re-evaluated when the page changes, such as when it is resized... The problem with expressions is that they are evaluated more frequently than most people expect. Not only are they evaluated whenever the page is rendered and resized, but also when the page is scrolled and even when the user moves the mouse over the page. ... CSS expressions benefit from being automatically tied to events in the browser, but that's also their downfall."

The major technique used to avoid CSS expressions is to have your own JavaScript function registered as an event listener triggered by only the appropriate event(s). It should set the dynamic CSS property to the desired recalculated value.
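For example (a hedged sketch; the element ID and the sizing rule are hypothetical), an expression() that recalculates a width can be replaced by a handler that runs only when the window is resized:
    // recalculate the style only when the window is actually resized,
    // instead of on every render, scroll, and mouse movement
    function adjustSidebarWidth() {
        var sidebar = document.getElementById('sidebar');
        sidebar.style.width = Math.max(200, document.body.clientWidth / 4) + 'px';
    }
    window.onresize = adjustSidebarWidth;
    adjustSidebarWidth();   // set the initial value once at load time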

CSS Selectors are Backwards
[EFW, P.194] "The impact of CSS selectors on performance derives from the amount of time it takes the browser to match the selectors against the elements in the document. ... Consider the following rule:     #toc > LI { font-weight: bold; } Most of us, especially those who read left to right, might assume that the browser matches this rule by moving from left to right, and thus, this rule doesn't seem too expensive. In our minds, we imagine the browser working like this: find the unique toc element and apply this styling to its immediate children who are LI elements. We know that there is only one toc element, and it has only a few LI children, so this CSS selector should be pretty efficient. In reality, CSS selectors are matched by moving from right to left! With this knowledge, our rule that at first seemed efficient is revealed to be fairly expensive. The browser must iterate over every LI element in the page and determine whether its parent is toc. This descendant selector example is even worse:     #toc A { color: #444; } Instead of just checking for anchor elements inside toc, as would happen if it was read left to right, the browser has to check every anchor in the entire document. And instead of just checking each anchor's parent, the browser has to climb the document tree looking for an ancestor with the ID toc. If the anchor being evaluated isn't a descendant of toc, the browser has to walk the tree of ancestors until it reaches the document root."


Section 2: Loading Tips

  • "Nothing else can happen while JavaScript code is being executed" - N. Zakas
  • "Rule 1: Make Fewer HTTP Requests" - Steve Souders
  • "Only 10-20% of the end user response time is spent downloading the HTML document.
    The other 80-90% is spent downloading all the components in the page." - the Performance Golden Rule
What is Fast Enough
[EFW, P.9] "It's fine to say that code needs to execute ''as fast as possible'', but ... exactly what is ''fast enough'' ... Jakob Nielsen is a well-known and well-regarded expert in the field of web usability; the following quote addresses the issue of ''fast enough'':
The response time guidelines for web-based applications are the same as for all other applications. These guidelines have been the same for 37 years now, so they are also not likely to change wit whatever implementation technology comes next.
0.1 second: Limit for users feeling that they are directly manipulating objects in the UI. For example, this is the limit from the time the user selects a column in a table until that column should highlight or otherwise give feedback that it's selected. Ideally, this would also be the response time for sorting the column-if so, users would feel that they are sorting the table.
1 second: Limit for users feeling that they are freely navigating the command space without having to unduly wait for the computer. A delay of 0.2-1.0 seconds does mean that users notice the delay and thus feel the computer is ''working'' on the command, as opposed to having the command be a direct effect of the users' actions. Example: If sorting a table according to the selected column can't be done in 0.1 seconds, it certainly has to be done in 1 second, or users will feel that the UI is sluggish and will lose the sense of ''flow'' in performing their task. For delays of more than 1 second, indicate to the user that the computer is working on the problem, for example by changing the shape of the cursor.
10 seconds: Limit for users keeping their attention on the task. Anything slower than 10 seconds needs a percent-done indicator as well as a clearly signposted way for the user to interrupt the operation. Assume that users will need to reorient themselves when they return to the UI after a delay of more than 10 seconds. Delays of longer than 10 seconds are only acceptable during natural breaks in the user's work, for example when switching tasks.
In other words, if your JavaScript code takes longer than 0.1 seconds to execute, your page won't have that slick, snappy feel; if it takes longer than 1 second, the application feels sluggish; longer than 10 seconds, and the user will be extremely frustrated. These are the definitive guidelines to use for defining 'fast enough.'"
Put Stylesheets at the Top
[HPW, P.44] "In their effort to improve one of the most visited pages on the Web, the Yahoo portal team initially made it worse by moving the [CSS] stylesheet to the bottom of the page. They found the optimal solution by following the HTML specification and leaving it at the top. Neither of the alternatives, the blank white screen or flash of unstyled content, are worth the risk. ... If you have a stylesheet that's not required to render the page, with some extra effort you can load it dynamically after the document loads."
Put JavaScript at the Bottom
[HPJ, P.3] [HPW, P.45] "Put all <script> tags at the bottom of the page, just inside of the closing </body> tag. This ensures that the page can be almost completely rendered before script execution begins." "[While modern browsers] allow parallel downloads of JavaScript files...JavaScript downloads still block downloading of other resources, such as images. And even though downloading a script doesn't block other scripts from downloading, the page must still wait for the JavaScript code to be downloaded and executed before continuing." "Since each <script> tag blocks the page from continuing to render until it has fully downloaded and executed the JavaScript code, the perceived performance of this page will suffer. Keep in mind that browsers don't start rendering anything on the page until the opening <body> tag is encountered. Putting scripts at the top of the page typically leads to a noticeable delay, often in the form of a blank white page" "This is the Yahoo Exceptional Performance team's first rule about JavaScript: put scripts at the bottom."
Use External Scripts but Group them Together...
[HPJ, P.4] [HPW, P.55] "In Raw Terms, Inline [scripts are] Faster... [but] using external files in the real world generally produces faster pages. This is due to ... the opportunity for JavaScript and CSS files to be cached by the browser." "Group scripts together. The fewer <script> tags on the page, the faster the page can be loaded and become interactive. ... when dealing with external JavaScript files, each HTTP request brings with it additional performance overhead, so downloading one single 100 KB file will be faster than downloading four 25 KB files. It's helpful to limit the number of external script files that your page references. ... Typically, a large website or web application will have several required JavaScript files. You can minimize the performance impact by concatenating these files together into a single file and then calling that single file with a single <script> tag. The concatenation can happen offline using a build tool or in real time using a tool like the Yahoo combo handler."
...ON THE OTHER HAND, don't group them All together
(aka: Download Most JavaScript in a Nonblocking Fashion)
[HPJ, P.5-9] [EFW, P.22, 51, 73] "It turns out that Facebook executes only 9% of the downloaded JavaScript functions by the time the onload event is called." "Limiting yourself to downloading a single large JavaScript file will only result in locking the browser out for a long period of time, despite it being just one HTTP request. To get around this situation, you need to incrementally add more JavaScript to the page in a way that doesn't block the browser. The secret to nonblocking scripts is to load the JavaScript source code after the page has finished loading...[i.e.] after the window's load event has been fired." "Dynamic script loading is the most frequently used pattern for nonblocking JavaScript downloads due to its cross-browser compatibility and ease of use." "A new <script> element can be created very easily using standard DOM methods:
    var script = document.createElement("script"); 
    script.type = "text/javascript"; 
    script.src = "file1.js"; 
    document.getElementsByTagName("head")[0].appendChild(script); 
... [file1.js] begins downloading as soon as the element is added to the page. The important thing about this technique is that the file is downloaded and executed without blocking other page processes... It's generally safer to add new <script> nodes to the <head> element instead of the <body>, especially if this code is executing during page load. ... This works well when the script is self-executing but can be problematic if the code contains only interfaces to be used by other scripts on the page. In that case, you need to track when the code has been fully downloaded and is ready for use. This is accomplished using events that are fired by the dynamic <script> node. ... You can dynamically load as many JavaScript files as necessary on a page, but make sure you consider the order in which files must be loaded. ... [Many] browsers will download and execute the various code files in the order in which they are returned from the server. You can guarantee the order by [daisy] chaining the downloads ...[or] concatenate the files into a single file where each part is in the correct order. That single file can then be downloaded ... (since this is happening asynchronously, there's no penalty for having a larger file)." "The challenge in splitting your JavaScript code is to avoid undefined symbols." "Preserving the order of JavaScript is critical, and this is true for CSS as well."
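The books describe a loadScript() helper built on this pattern; a simplified sketch of it (omitting the onreadystatechange branch that older Internet Explorer versions require) might look like:
    // nonblocking loader with a "ready" callback; a production version
    // also needs the IE onreadystatechange branch described in the books
    function loadScript(url, callback) {
        var script = document.createElement("script");
        script.type = "text/javascript";
        script.onload = function () {   // fires once the file has downloaded and executed
            callback();
        };
        script.src = url;
        document.getElementsByTagName("head")[0].appendChild(script);
    }

    loadScript("file1.js", function () {
        // file1.js is now available; it is safe to call its interfaces here
    });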
Separate Bootstrap Code from the bulk of the JavaScript
[HPJ, P.10] [EFW, P.27] "The recommended approach to loading a significant amount of JavaScript onto a page is a two-step process: first, include the code necessary to dynamically load JavaScript, and then load the rest of the JavaScript code needed for page initialization. Since the first part of the code is as small as possible, potentially containing just the loadScript() function, it downloads and executes quickly, and so shouldn't cause much interference with the page. Once the initial code is in place, use it to load the remaining JavaScript [in a nonblocking fashion with a callback invoked on load completion]. For example:
    <script type="text/javascript" src="myScriptLoader.js"></script>
    <script type="text/javascript">
      myScriptLoader( "the-rest.js", function(){ MyApplication.init(); } );
    </script>
Place the loading code just before the closing </body> tag [so that] when the second JavaScript file has finished downloading, all of the DOM necessary for the application has been created and is ready to be interacted with, avoiding the need to check for another event (such as window.onload) to know when the page is ready for initialization." "The concept of a small initial amount of code on the page followed by downloading additional functionality is at the core of the YUI 3 design."
Prevent redirects due to missing trailing URL slashes
[HPW, P.76] "A redirect is used to reroute users from one URL to another. ...the main thing to remember is that redirects make your pages slower. ... One of the most wasteful redirects happens frequently and web developers are generally not aware of it. It occurs when a trailing slash (/) is missing from a URL that should otherwise have one. For example, http://yahoo.com/astrology results in a redirect to http://yahoo.com/astrology/ . The only difference is the addition of a trailing slash. ... Note that a redirect does not happen if the trailing slash is missing after the hostname. For example, http://www.yahoo.com does not generate a redirect."
Never put an inline script after a <link> tag
[HPJ, P.4][EFW, P.75] "Steve Souders has found that an inline script placed after a <link> tag referencing an external stylesheet caused the browser to block while waiting for the stylesheet to download. This is done to ensure that the inline script will have the most correct style information with which to work. Souders recommends never putting an inline script after a <link> tag for this reason."
Minify JavaScript
[HPW, P.69] "Minification is the practice of removing unnecessary characters from code to reduce its size, thereby improving load times. When code is minified, all comments are removed, as well as unneeded whitespace characters (space, newline, and tab). In the case of JavaScript, this improves response time performance because the size of the downloaded file is reduced. ... The most popular tool for minifying JavaScript code is JSMin (http://crockford.com/javascript/jsmin) ... Gzip compression has the biggest impact [about 70%], but minification further reduces file sizes [by about 20%]." "The savings from minifying CSS are typically less than the savings from minifying JavaScript because CSS generally has fewer comments and less whitespace than JavaScript. The greatest potential for size savings comes from optimizing CSS; i.e. merging identical classes, removing unused classes, etc."
Remove Duplicate Scripts
[HPW, P.85] "It hurts performance to include the same JavaScript file twice in one page. This mistake isn't as unusual as you might think. A review of the 10 top U.S. web sites shows that two of them (CNN and YouTube) contain a duplicated script. ... The two sites that have duplicate scripts also happen to have an above-average number of scripts (CNN has 11; YouTube has 7). ... One way to avoid accidentally including the same script twice is to implement a script management module in your templating system. ... While tackling the duplicate script issue, add functionality to handle script dependencies and versioning of scripts too."
Use a Content Delivery Network
[HPW, P.18] "The average user's bandwidth increases every year, but a user's proximity to your web server still has an impact on a page's response time. ... A content delivery network (CDN) is a collection of web servers distributed across multiple locations to deliver content to users more efficiently. ... CDNs are used to deliver static content, such as images, scripts, stylesheets, and Flash. Static files are easy to host and have few dependencies. That is why a CDN is easily leveraged to improve the response times for a geographically dispersed user population." Keeping in mind the Performance Golden Rule that says "Only 10-20% of the end user response time is spent downloading the HTML document. The other 80-90% is spent downloading all the components in the page", [Tests and Yahoo experience show that] components hosted on a CDN loaded 18-20% faster than with all components hosted from a single web server."
Use Server Compression
[HPW, P.39] "If an HTTP request results in a smaller response, the transfer time decreases because fewer packets must travel from the server to the client. ... [Using server-side preprocessing] to compress HTTP responses... is the easiest technique for reducing page weight and it also has the biggest impact. Gzip is currently the most popular and effective compression method. It is a free format (i.e., unencumbered by patents or other restrictions). ... Servers choose what to gzip based on file type, but are typically too limited in what they are configured to compress. Many web sites gzip their HTML documents, but it's also worthwhile to gzip your scripts and stylesheets, and in fact, any text response including XML and JSON. Image and PDF files should not be gzipped because they are already compressed. ... Generally, it's worth gzipping any file greater than 1 or 2K."
Add "Long Shelf Life" Headers
[HPW, P.32] "Browsers (and proxies) use a cache to reduce the number of HTTP requests and decrease the size of HTTP responses, thus making web pages load faster. A web server uses the Expires header to tell the web client that it can use the current copy of a component until the specified time. ... The Cache-Control header was introduced in HTTP/1.1 to overcome limitations with the Expires header. ... [it] uses the max-age directive to specify [in seconds] how long a component may be cached. ... If the browser has a copy of the component in its cache, but isn't sure whether it's still valid, a conditional GET request is made. ... The browser is essentially saying, ''I have a version of this resource with the following last-modified date. May I just use it?'' ... If the component has not been modified since the specified date, the server ... skips sending the body of the response, resulting in a smaller and faster response. Conditional GET helps pages load faster, but they still require making a roundtrip between the client and server to perform the validity check. ... Those conditional requests add up [easily adding over 50% to the response time for subsequent views of a typical web page.]" Configuring your web servers to generate headers that specify the predicted lifespan of each component will maximize cache effectiveness. Using versioning to make each component "immutable" will allow "far future Expires" (i.e. very long lifespans) to be specified thus avoiding unnecessary Conditional GETs.
Remove ETags
[HPW, P.89] "Entity tags (ETags) are a mechanism that web servers and browsers use to validate cached components. Reducing the number of HTTP requests needed for a page is the best way to accelerate the user experience. You can achieve this by maximizing the browser's ability to cache your components, but the [default] ETag header thwarts caching when a web site is hosted on more than one server. ... The problem with ETags is that they are typically constructed using attributes that make them unique to a specific server hosting a site. ... The end result is that ETags generated by Apache and IIS for the exact same component won't match from one server to another. ... If you have components that have to be [version-stamped] based on something other than the last-modified date, [customized] ETags are a powerful way of doing that. If you don't have the need to customize ETags, it is best to simply remove them. ... removing the ETag altogether would avoid these unnecessary and inefficient downloads of data that's already in the browser's cache."
Use Iframes Sparingly
[EFW, P.191] "Even blank iframes are expensive. They are one to two orders of magnitude more expensive than other DOM elements. When used in the typical way (<iframe src="url"></ifram>), iframes block the onload event. This prolongs the browser's busy indicators, resulting in a page that is perceived to be slower. ... Although iframes don't directly block resource downloads in the main page, there are ways that the main page can block the iframe's downloads. ... The browser's limited connections per server are shared across the main page and iframes, even though an iframe is an entirely independent document. ... With all of these costs, it's often best to avoid the use of iframes, and yet a quick survey shows that they are still used frequently. ... An alternative way to [use frames] with better performance would be for the main page to create a DIV to hold the contents of the [frame]. When the main page requests the [frame's] external script asynchronously, the ID of this DIV could be included in the script's URL. The [frame's] JavaScript would then insert the [frame] in the page by setting the innerHTML of the DIV. This approach is also more compatible with [frames] that take over a large part of the window and thus cannot be constrained by an iframe. The use of iframes is declining as these other techniques for inserting." ON THE OTHER HAND, iframes have been retained and even enhanced in HTML5.
Flush the Document Chunks Early
[EFW, P.191] While the backend server is generating its response to a web page request, the browser (and the user) must sit and wait. "In most cases, the browser waits for the HTML document to arrive before it starts rendering the page and downloading the page's resources." In order for the user to get some immediate feedback, and for the browser to start downloading images and the like, the backend server logic should flush an initial chunk of HTML back to the browser before starting any "slow" logic needed on its end to marshal the remainder of the HTML. To do this, HTTP/1.1 "chunked encoding" needs to be supported by both the browser and server, and the server logic needs to "flush" its output buffer back to the browser after an appropriate HTML split point has been reached. "This is exactly what's needed to combat the two shortcomings of a slow HTML document: blocked rendering and blocked downloads." ON THE OTHER HAND, early flushing and chunked encoding have many gotchas which can make them problematic in the real world.
Reduce DNS Lookups...
[HPW, P.73] [EFW, P.171] "the Domain Name System (DNS) maps hostnames to IP addresses, just as phonebooks map people's names to their phone numbers. ... DNS has a cost. It typically takes 20-120 milliseconds for the browser to look up the IP address for a given hostname." Minimizing how many different domain names your page references reduces DNS lookup delays. "Google is the preeminent example of this, with only one DNS lookup necessary for the entire page."
...ON THE OTHER HAND, Shard your Domain
[EFW, P.161-8] Most browsers limit the number of resources that they will simultaneously download from any particular domain name. "Some web pages have all their HTTP requests served from one domain. Other sites spread their resources across multiple domains. ... sometimes increasing the number of domains is better for performance, even at the cost of adding more DNS lookups." Splitting resource downloads between multiple domain names (e.g. foo.com and www.foo.com) is known as Domain Sharding. Having a particular resource always associated with a particular domain name is needed to maximize caching. "Research published by Yahoo shows that increasing the number of domains from one to two improves performance, but increasing it above two has a negative effect on load times. The final answer depends on the number and size of resources, but sharding across two domains is a good rule of thumb."
Make Fewer and Thinner Image Requests
[HPW, P.10] [EFW, P.133] "Images are an easy place to improve performance without removing features. Often, we can make substantial improvements in the size and number of images with little to no reduction in quality. (1) If you use multiple hyperlinked images (say a row of buttons), [client-side] image maps may be a way to reduce the number of HTTP requests without changing the page's look and feel. An image map allows you to associate multiple URLs with a single image. The destination URL is chosen based on where the user clicks on the image. (2) Like image maps, CSS sprites allow you to combine images [onto one large compound image such that each individual image can be cropped via CSS] ... One surprising benefit is reduced download size. Most people would expect the combined image to be larger than the sum of the separate images because the combined image has additional area used for spacing. In fact, the combined image tends to be smaller than the sum of the separate images as a result of reducing the amount of image overhead (color tables, formatting information, etc.). (3) Optimizing images begins with creative decisions made by the designer about the minimum number of colors, resolution, or accuracy required for a given image [allowing the use of lossy optimizations]. ... Once the quality choice has been made, use [automated] nonlossy compression to squeeze the last bytes out of the image. ... fantastic open source tools exist for optimizing images.
  • Start by choosing the appropriate format: JPEG for photos, GIF for animations, and PNG for everything else. Strive for PNG8 whenever possible.
  • Crush PNGs, optimize GIF animations, and strip JPEG metadata from the images you own. Use progressive JPEG encoding for JPEGs more than 10 KB in file size.
  • Avoid AlphaImageLoader.
  • Optimize CSS sprites.
  • Create modular sprites if your site has more than two to three pages.
  • Don't scale images in HTML.
  • Generated images should be crushed, too. Once generated, they should be cached for as long as possible. Convert images to PNG8 and determine whether 256 colors is acceptable."