The holy grail of modern and fast web application development

03.09.2013

Finding an optimal strategy for implementing modern and fast web applications has been likened to searching for the elusive Holy Grail: all of the current ways to implement such applications come with their own set of drawbacks.

During a recent workshop with a customer, we discussed the different approaches for implementing modern and fast web applications. After we demonstrated what is possible with client-side and server-side rendering approaches, the following question was raised:

Why do some companies who evangelized an all client-side JavaScript approach tend to move back to server-side rendering?

This question started a very positive discussion that motivated us to look in detail into the reasons behind those companies' decisions.

In this article we focus solely on the different approaches to rendering pages within web applications - a detailed view of the features of current web stacks (or the patterns used therein) is not covered. To distinguish the basic principles of the different approaches we concentrate on page rendering and speed.

Let’s start looking at the basic principles: The pages of a modern web application can either be:

  • rendered completely on the server (server-side): no JavaScript is involved in the page assembly whatsoever;
  • rendered completely on the client (client-side): the server only delivers a basic skeleton of the application’s markup that contains directives for loading the initial JavaScript and CSS; the JavaScript then does the page rendering;
  • a mixture of both (mixed): this covers the various approaches that are neither purely client-side nor purely server-side, e.g. the initial page assembly is done on the server and only fragments of the page are reloaded / updated via JavaScript.

Server-Side

In the early days of web development this was the dominant approach to rendering the pages of web applications: all templating and page assembly was done on the server. JavaScript played little to no role in page rendering.

The initial request loads everything that is necessary to display a page: all markup, content, JavaScript and CSS. Given only a little variation in the page’s content, every request results in a comparable payload for the page itself (JavaScript and CSS can be cached on the client, however). Fragments of the page that have not changed (e.g. header, footer, navigation, sidebars) have to be transferred repeatedly.

Server-side rendering is still widely used among current web frameworks (e.g. ERB templating in Ruby on Rails). The major drawback of a pure server-side approach is the full round trip necessary on every page change, which impacts performance: a user’s request (e.g. clicking a link) results in the page being re-rendered completely (server-side caching might speed up the response, though).
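To make this concrete, here is a minimal sketch of server-side page assembly in TypeScript. All names (`renderFullPage`, `HEADER`, `FOOTER`, the `Page` shape) are hypothetical and not taken from any framework; the point is only that the complete document, including the unchanged fragments, is produced for every request.

```typescript
// Hypothetical page model; names are illustrative only.
interface Page {
  title: string;
  body: string;
}

// Shared fragments that are identical on every page.
const HEADER = "<header><nav>Home | Search | About</nav></header>";
const FOOTER = "<footer>Example footer</footer>";

// Escape text so that it can be placed safely inside HTML.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Assemble the complete document on the server.
// No client-side JavaScript is involved in the page assembly.
function renderFullPage(page: Page): string {
  return [
    "<!DOCTYPE html>",
    `<html><head><title>${escapeHtml(page.title)}</title></head><body>`,
    HEADER, // sent again with every response
    `<main>${escapeHtml(page.body)}</main>`,
    FOOTER, // sent again with every response
    "</body></html>",
  ].join("\n");
}
```

Calling `renderFullPage` for two different pages yields two complete documents that both contain the same header and footer, which is exactly the repeated transfer described above.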

Client-Side

Client-side rendering works a little differently: the initial request only loads the basic layout (skeleton) of an application’s page and the necessary JavaScript and CSS. The JavaScript is then parsed and executed by the browser and takes over the assembly of the page (e.g. using Handlebars).

With this approach the initial loading of a page might take a little longer since the necessary JavaScript is larger (compared to the previous approach) - it has to include business logic such as models, controllers and routing logic as well as the page’s templates. Content can be inlined as JSON but is usually loaded asynchronously via JavaScript, e.g. from a RESTful API returning JSON.
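The following sketch illustrates the client-side flow: a template that ships with the application's JavaScript is combined with JSON content. The tiny `renderTemplate` function is a stand-in for a real template engine and is not the Handlebars API; the `SearchResult` shape and the template are assumptions made for illustration.

```typescript
// Hypothetical shape of one item returned by a JSON API.
interface SearchResult {
  title: string;
  city: string;
}

// Escape text so that it can be placed safely inside HTML.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Minimal stand-in for a template engine: replaces {{key}} with the escaped value.
// Unknown keys are rendered as an empty string.
function renderTemplate(template: string, data: Record<string, string>): string {
  return template.replace(/\{\{(\w+)\}\}/g, (_match: string, key: string) =>
    Object.prototype.hasOwnProperty.call(data, key) ? escapeHtml(data[key]) : ""
  );
}

// The template is part of the application's JavaScript and is loaded only once.
const RESULT_TEMPLATE = "<li>{{title}} ({{city}})</li>";

// Turn a JSON payload (as it might arrive from an API) into markup on the client.
function renderResults(json: string): string {
  const results = JSON.parse(json) as SearchResult[];
  const items = results.map((r) =>
    renderTemplate(RESULT_TEMPLATE, { title: r.title, city: r.city })
  );
  return `<ul>${items.join("")}</ul>`;
}
```

In a browser, the returned markup would be written into the skeleton delivered by the initial request; only the JSON has to travel over the wire for each new set of results.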

Since routing is handled by JavaScript, subsequent requests only load new / updated content, which allows for smaller payloads for all further requests compared to the server-side approach. Unfortunately, pure client-side rendering approaches usually have some drawbacks:

  • SEO: The bots of search engines usually have a hard time indexing JavaScript-only applications. There are ways to overcome this (e.g. with PhantomJS), but this is additional work that is usually not necessary when using server-side rendering approaches
  • Memory: The memory needed by a JavaScript application can easily add up to more than 100MB on the client when memory leaks are created accidentally (e.g. by leaving ghost views behind in Backbone.js applications)
  • Caching: Caching on the client is limited in most environments (localStorage defaults vary from 5MB to 25MB depending on the user agent) and is also hard to invalidate manually
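The memory point can be illustrated with a small sketch. The `Model` and `View` classes below are hypothetical and deliberately not the Backbone.js API; they only show the mechanism: as long as a model holds a listener that belongs to a view, the view cannot be garbage collected and keeps reacting to changes, even if it is no longer displayed.

```typescript
type Listener = () => void;

// Hypothetical minimal model with change notifications.
class Model {
  private listeners: Listener[] = [];

  on(listener: Listener): void {
    this.listeners.push(listener);
  }

  off(listener: Listener): void {
    this.listeners = this.listeners.filter((l) => l !== listener);
  }

  // Notify every registered listener.
  change(): void {
    this.listeners.slice().forEach((l) => l());
  }

  listenerCount(): number {
    return this.listeners.length;
  }
}

// Hypothetical view that re-renders whenever its model changes.
class View {
  renderCount = 0;

  private readonly onChange: Listener = () => {
    this.renderCount++;
  };

  constructor(private readonly model: Model) {
    model.on(this.onChange);
  }

  // Without this call the model keeps a reference to the view:
  // the view becomes a "ghost" that still renders and is never freed.
  remove(): void {
    this.model.off(this.onChange);
  }
}
```

A view that is replaced during client-side navigation should therefore always be removed explicitly; forgetting `remove()` for every page change lets the number of live views grow with each navigation.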

Mixed

Implementation is not limited to pure server-side or pure client-side approaches. A mixture of both is perfectly possible, and the degree to which pages are rendered on the server or on the client varies between approaches: e.g. only the initial page is rendered on the server and from there on a client-side rendering approach takes over.

Combining server-side and client-side approaches usually results in two different code bases in different languages that need to be maintained: e.g. a Ruby on Rails based application (or Sinatra based API) with a Backbone.js based JavaScript front-end.

This is where rendr, the new kid on the block, comes in. It intends to give developers the freedom to choose what should be rendered where. The idea is to provide a base for creating modern and fast web applications that overcome the known issues outlined before (i.e. performance, SEO, maintainability) and combine the advantages of both worlds.

Performance

Let’s look at client-side vs. server-side performance first.

In a blog post, Karl Seguin pointed out that client-side rendering must be slower by definition when it comes to page rendering.

He states that the initial loading time of a page utilizing client-side rendering has to be slower since more JavaScript has to be downloaded, parsed and executed. Additional HTTP requests are also required to load the content (unless it is inlined as a JavaScript object, which in turn adds to the initial payload). Considering the hard facts, i.e. comparing the time it takes to load and execute all code, he is probably right.

But there are ways to lower the amount of JavaScript loaded initially by modularizing your JavaScript (e.g. with RequireJS), and to save the additional content request by using bootstrapped models (e.g. as done in Backbone applications). Apart from that, caching is possible on a more fine-grained level on the client and on the server (basically you can cache “everything” apart from raw data that changes all too often).
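Bootstrapping models means that the server inlines the initial data into the page so that the client-side application can start without a further request. A minimal sketch of the server part, assuming a hypothetical global named `__BOOTSTRAP__` (the name and both helper functions are our own, not part of Backbone):

```typescript
// Serialize data for embedding inside a <script> element.
// The characters <, > and & are written as JSON unicode escapes so that the data
// cannot close the script element early; the parsed value stays the same.
function toInlineJson(data: object): string {
  return JSON.stringify(data)
    .replace(/</g, "\\u003c")
    .replace(/>/g, "\\u003e")
    .replace(/&/g, "\\u0026")
    .replace(/\u2028/g, "\\u2028")
    .replace(/\u2029/g, "\\u2029");
}

// Build the script element that hands the initial data to the client application.
// The global name __BOOTSTRAP__ is an assumption made for this example.
function renderBootstrapScript(data: object): string {
  return `<script>window.__BOOTSTRAP__ = ${toInlineJson(data)};</script>`;
}
```

The client-side code would then read the global once at start-up and populate its models from it instead of calling the API, at the price of a larger initial payload.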

He continues to state that it doesn’t really matter if you transfer gzip’ed HTML or gzip’ed JSON (e.g. for a search results page). Leaving aside that content + markup is larger than content alone - he might have a point here as well.

Still, with the client-side approach you only load the application’s templates once (e.g. inlined in your application’s JavaScript) and any subsequent request only loads the content as gzip’ed JSON.
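The trade-off between a larger initial download and smaller follow-up requests can be expressed as a simple cost model. The sketch below is purely illustrative: the `TransferCost` shape and the numbers in the usage note are invented, and real measurements depend on compression, caching and latency.

```typescript
// Hypothetical transfer cost of one rendering approach, in kilobytes.
interface TransferCost {
  initialKb: number; // one-off cost: application JavaScript, templates, CSS
  perViewKb: number; // cost of every page view, including the first one
}

// Total amount of data transferred after a number of page views.
function totalKb(cost: TransferCost, pageViews: number): number {
  return cost.initialKb + cost.perViewKb * pageViews;
}

// Smallest number of page views after which the client-side approach has
// transferred strictly less data than the server-side approach.
// Returns null if the client-side approach never catches up.
function breakEvenViews(server: TransferCost, client: TransferCost): number | null {
  const savingPerView = server.perViewKb - client.perViewKb;
  if (savingPerView <= 0) {
    return null;
  }
  const extraInitial = client.initialKb - server.initialKb;
  return Math.max(0, Math.floor(extraInitial / savingPerView) + 1);
}
```

With made-up values of 60 KB per server-rendered page, and 300 KB of initial JavaScript plus 10 KB of JSON per view for the client-side variant, `breakEvenViews` returns 7: from the seventh page view on, the client-side approach has transferred less data in total.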

That being said, I would not completely agree with his conclusion. Several other aspects might influence the decision between a pure client-side and a pure server-side approach and, more importantly, the time that passes until the user is presented with a page they can interact with:

  • user base: If you know your user base and can be sure that they will use modern browsers and hardware, the additional JavaScript execution times can be put into perspective.
  • initial page load time: A page that relies on a huge amount of data might be better off presenting something to the user quickly and loading the data asynchronously in the background - instead of loading a page with all content rendered on the server.
  • business logic: If your site relies on a fairly large amount of business logic, you might be better off modularizing this logic instead of forcing the user to load all JavaScript within the initial request (again: modularizing the JavaScript).

What to choose then?

Well then, what to choose performance-wise? Pure server-side? Pure client-side? Mixed?

The good news is that you don’t necessarily have to choose between pure server-side and pure client-side approaches. For most use cases something in the middle (a mixed approach) makes the most sense and allows for the highest degree of freedom.

What probably matters most in terms of user experience is the initial loading time of a page - if your page loads too slowly, users will wander away. Twitter described this as “time to first tweet” (the time a user has to wait until they see the first tweets in a timeline); more generally it is the time to content - the time that passes until the user sees a page they can interact with (additional content may be loaded asynchronously in the background).

When comparing the performance of the different approaches outlined above with regard to the initial page load, it should be obvious that the client-side approach is going to be slower than the server-side approach (as Karl already pointed out): there is more JavaScript that the browser has to load, parse and execute, and there is at least one additional request to retrieve the content.

If the requirement is to get a fast web application, the answer is fairly easy: comparing pure server-side with pure client-side rendering, this battle is mostly won by server-side rendering approaches - provided that you refrain from using too much JavaScript and leave the performance of the user’s browser out of the equation. But we were talking about fast and modern web applications, so you’ll probably end up with something in the middle. Considering time to content, there are at least two advantages with a mixed approach:

  1. The initial page load is faster (less markup, no content), the user sees the page earlier and
  2. since the application’s JavaScript has already been loaded and cached, subsequent page requests are faster (only the content and a little markup have to be loaded)

This means that even with more payload being transferred (in total until the page has been assembled completely) with client-side rendering approaches, the user sees content earlier. This is especially true for modern browsers running on modern hardware. For a mobile related context or a user base that is known to have less powerful hardware and limited browsers a server-side approach might be the better choice.

So what does such a mixed solution look like? Let’s examine some examples from the field.

Tales from the field

airbnb (Relaunch of their mobile app in January 2013)

airbnb rewrote their current Backbone.js (client) + Ruby on Rails (server) based mobile application in January 2013. They used Node.js + Backbone on the server making use of their newly developed rendr library.

The advantage of the new approach was a faster initial page load, because real HTML is served by the server on the first request, and this HTML is also fully crawlable.

In extreme cases their previous search results page took up to 10 seconds until the results had been loaded completely. With their new approach they brought that down to 2 seconds. This was achieved by loading HTML directly and loading most of the required JS asynchronously. The user can interact with the page even before everything has been loaded (the time to content we discussed earlier).

Let’s take our search results page, for example. Under the old design, before any search results could be rendered in the client, first all of the external JavaScript files had to download, evaluate, and execute. Then, the Backbone router would look at the URL to determine which page to render, and thus which data to fetch. Then, our app would make a request to the API for search results. Finally, once the API request returned with our data, we could render the page of search results. Keep in mind all of this has most likely happened over a mobile connection, which tends to have very high latency. All of these steps add up to a “time to content” that can be more than 10 seconds in extreme cases.

Twitter (Performance improvement in May 2012)

In 2010 Twitter overhauled their architecture completely, moving to an all client-side approach (#NewTwitter) that relied heavily on front-end rendering with a REST API delivering the content. This offered a lot of advantages but lacked support for server-side optimizations, so in 2012 they moved back to rendering on the server. After that, the initial page load times could be cut to one-fifth of what they were before.

Twitter did not drop the client-side rendering approach completely: they still bootstrap a modular JavaScript application that handles all the interactivity after the page has been rendered initially. This application makes use of CommonJS and Asynchronous Module Definition (AMD) to load only those resources that are explicitly used. The assembly is done by a tool similar to RequireJS. Most of the client-side infrastructure is handled by a new framework developed by Twitter called Flight.

It’s agnostic on how requests are routed, which templating language you use, or even if you render your HTML on the client or the server

The most important metric for Twitter is time to first tweet (the time it takes from clicking a link until the first tweet is visible in the timeline). This has been measured utilizing the Navigation Timing API. With their previous architecture (pure client-side) the user did not see anything until the JavaScript had been downloaded and executed - older hardware and older browsers also had an important impact on the time to first tweet. By rendering the page completely on the server initially, they were able to cut the time to first tweet to one-fifth of what it was.
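A measurement of this kind can be sketched as follows. The Navigation Timing API exposes timestamps such as `navigationStart` and `responseStart` on `window.performance.timing` (this first version of the API has since been superseded by Navigation Timing Level 2). How exactly Twitter determined the moment the first tweet became visible is not covered here, so the `contentVisibleAt` parameter and the helper names are assumptions.

```typescript
// The subset of the browser's performance.timing object used in this sketch.
// All values are timestamps in milliseconds since the Unix epoch.
interface NavigationTimingLike {
  navigationStart: number;
  responseStart: number;
}

interface TimingReport {
  timeToFirstByteMs: number;
  timeToContentMs: number;
}

// Compute durations relative to the start of the navigation.
// contentVisibleAt is the timestamp at which the application considers
// its main content visible (for example, right after rendering the first item).
function measureTimeToContent(
  timing: NavigationTimingLike,
  contentVisibleAt: number
): TimingReport {
  return {
    timeToFirstByteMs: timing.responseStart - timing.navigationStart,
    timeToContentMs: contentVisibleAt - timing.navigationStart,
  };
}
```

In a browser that supports the API, this could be called as `measureTimeToContent(window.performance.timing, Date.now())` directly after the first piece of content has been rendered, and the result reported back to the server for aggregation.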

Basecamp (Basecamp Next in February 2012)

Basecamp used another approach that differs a little from the client-side rendering outlined above. They used Stacker (“an advanced pushState-based engine for sheets”). As with the client-side rendering approach, the very first request loads all required CSS, JavaScript (and other assets). Every subsequent request triggers an HTTP request that returns only the HTML that changed (not the content). Since the HTML is assembled server-side, you save the JavaScript compilation step in the browser (so plain HTML instead of JavaScript MVC + JSON). The Stacker engine was built by Basecamp specifically for the sheet-based UI that they have, but the general concept is also available with pjax (https://github.com/defunkt/jquery-pjax). They also make heavy use of caching page elements, down to single list items, to achieve the speed they currently deliver. Even though they did not go for the full MVC JS approach with Basecamp, they are using an MVC approach for their calendar.
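The server side of such a pjax-style setup can be sketched in a few lines. jquery-pjax marks its requests with an `X-PJAX` request header; everything else in the sketch (the function names, the markup and the lower-cased header map) is hypothetical and has nothing to do with Basecamp's actual Stacker implementation.

```typescript
// Request headers with lower-cased names, as found in many server environments.
type RequestHeaders = Record<string, string | undefined>;

// Render only the part of the page that changes between navigations.
// The title is assumed to be trusted, already escaped text.
function renderSheet(title: string): string {
  return `<section class="sheet"><h1>${title}</h1></section>`;
}

// Wrap a fragment in the full layout for regular (non-pjax) requests.
function wrapInLayout(fragment: string): string {
  return [
    "<!DOCTYPE html>",
    "<html><head><title>Example</title></head><body>",
    `<div id="container">${fragment}</div>`,
    "</body></html>",
  ].join("\n");
}

// Return the bare fragment for pjax requests and the complete page otherwise.
function respond(headers: RequestHeaders, title: string): string {
  const fragment = renderSheet(title);
  return headers["x-pjax"] !== undefined ? fragment : wrapInLayout(fragment);
}
```

The first visit (without the header) receives the complete page, while later navigations receive only the fragment, which the client swaps into the container element and records in the browser history via pushState.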

Conclusion

So what is the Holy Grail approach of developing modern and fast web applications? The most realistic answer is: it depends.

Leaving performance optimization techniques (e.g. caching, domain sharding, SPDY integration) aside, you have to look carefully at the nature of your application: there is certainly no one-size-fits-all solution.

What kind of application are you going to develop? An application that consists of one single view? Something like Google Mail? Or a more complex application with a multitude of different views? What kind of performance is important to you? Overall page speed? Time to content? Scalability?

Apart from that, other factors might be important, e.g. what are the existing skills of your team?

Even though the three examples from airbnb, Twitter and Basecamp all outline different technological stacks, they share common ideas:

  • get the time to first content (or time to first tweet respectively) down to a minimum, mostly by rendering the initial page on the server
  • present the user with a highly interactive interface by using a client-side approach where necessary. This happens to varying degrees (from replacing HTML partials within a page to more complex client-side applications).
  • try to reuse as much as possible on the client and on the server.

I guess the last point is probably the most important one. One of the nicest things about the approach that airbnb chose is that you can use the same language on the client and the server and that you can shift parts of the application virtually freely between client and server while maintaining only a single codebase. Of all the mixed approaches presented, this seems to be the most interesting one.
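The idea of a single codebase can be illustrated with a view that is a pure function from data to markup. The sketch below is not the rendr API; `renderTimeline`, `renderInitialPage`, `updateTimeline` and the `Tweet` shape are hypothetical names that only show how the same rendering code might be called in both environments.

```typescript
// Hypothetical content item.
interface Tweet {
  author: string;
  text: string;
}

// Escape text so that it can be placed safely inside HTML.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Shared view code: no dependency on the server or on the browser.
function renderTimeline(tweets: Tweet[]): string {
  const items = tweets.map(
    (t) => `<li><b>${escapeHtml(t.author)}</b> ${escapeHtml(t.text)}</li>`
  );
  return `<ol class="timeline">${items.join("")}</ol>`;
}

// Server: embed the rendered markup in the first response.
function renderInitialPage(tweets: Tweet[]): string {
  return `<!DOCTYPE html><html><body><div id="app">${renderTimeline(tweets)}</div></body></html>`;
}

// Client: anything with an innerHTML property, e.g. a DOM element.
interface HtmlContainer {
  innerHTML: string;
}

// Client: re-render the same view when new data arrives as JSON.
function updateTimeline(container: HtmlContainer, tweets: Tweet[]): void {
  container.innerHTML = renderTimeline(tweets);
}
```

Because both entry points call the same `renderTimeline` function, the markup produced by the server for the first request and the markup produced in the browser for later updates cannot drift apart.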

As stated before, the rendr library is still pretty new and it is as yet unclear where the journey ends. We still believe in the proven stack of Rails (or Sinatra) with a Backbone.js front-end for future applications, but will definitely evaluate the rendr approach in one of our next projects.