faster Wagns

Warning: this is going to get geeky. It's an exploration of current and future server optimization for hosts supporting multiple Wagn websites. No one doing a simple Wagn installation should have to worry about such things.

The new pack/modules system is pushing us to reconsider our hosting architecture. Without packs it was relatively easy for multiple Wagn sites to share Rails processes. On our old servers, all sites shared all the same code but used different databases (or, technically, different PostgreSQL schemas). With packs, each site can use different chunks of code, so the old way won't work anymore. Moreover, the new Cloudstore deployment framework, which gives us dramatically simplified installation, maintenance, migration, and scaling, has pushed us toward greater independence in each site's data and management.

In this article I'm going to describe a bit of what we've done so far in our own hosting setup and invite the Wagn community into a discussion of where we should go. If you're reading this because you want to speed up your own hosting setup, be sure to see the Wagn in production card, which outlines some concrete steps you can take to make your Wagn site run faster.

Stack

This discussion will center around a conventional Apache2/Passenger stack. There are many choices for hosting Rails sites, and to be clear, we did not come to our current arrangement by testing them all. In fact, it is quite likely that there are performance gains to be had by using the likes of nginx or thin. Our rationale for using Apache2/Passenger at this point is that:

Apache2 is currently the webserver supported by Cloudstore, which, as mentioned, buys a ton for the Wagn community.
Apache2/Passenger is a dominant Rails hosting setup that we need to support strongly.
The performance challenges within Wagn are greater than those in the server setup, so it makes sense to prioritize server simplicity until those challenges are addressed.

One very simple improvement that we'll be making very soon (this week or next) to our stack involves updating to Ruby 1.9.x. We did not take this on in our original migration to Cloudstore because we didn't want to conflate it with the 1000 other changes in that release, but the recent Ruby versions offer dramatic performance benefits over the 1.8.x versions our servers have been using to date. Judging by performance on our development servers, Ruby 1.9.x appears to use about a third less memory per process.

If you are using other servers/stacks, please pay special attention to the static assets discussion below. And let us know what you're learning!

Spawning

A little background for those who see the word "spawning" and think of salmon.

When someone loads a webpage, the request is handled by a "process" on a webserver that is sometimes called a "listener" or a "handler". When there aren't enough such processes to handle the requests coming in, the server "spawns" a new one. Spawning can take a long time when starting from scratch. It entails loading up all of Ruby on Rails, all of Wagn, and all the packs. On our current production server, for example, it takes about 5 seconds to load our full environment, and that's before the server has started to consider the specifics of the request. Obviously, we don't want folks to wait over 5 seconds for webpages.

The other option is, of course, not loading from scratch. It is much preferable that your spawning be largely "pre-loaded". Doing this well is key to optimizing our site performance.

Our current setup is very inefficient in this regard whenever a site has not been visited in a while. Basically, extra processes for any given site can be spawned very quickly after the first one is spawned, but processes for separate Wagn sites make no use of the fact that they share tons of code with other Wagn sites. So Wagn site A doesn't benefit at all from the loading already done for Wagn site B.

This is happening because Passenger's default spawn method loads an entire application as one beast. When a request comes for a website with no current processes, it reloads Rails, Wagn, and the site's packs, even if Rails and Wagn are currently in use for several other sites. Since each website can have different packs and different environments, they are treated as if they weren't related at all.

We're currently trying to use Passenger's "smart" spawning, which will mean the Rails framework can be pre-spawned for all the Wagn sites. Rails has a lot of code – much more than Wagn – so this should be a big gain. With luck, we'll have this working within the week.
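The idea behind pre-spawning can be illustrated with plain Ruby fork: a parent process loads the shared code once, and every child forked from it starts with that code already in memory, so it only has to load what is site-specific. What follows is a toy sketch under that assumption, not Passenger's actual implementation. The method names, the site names, and the sleep that stands in for slow loading are all made up for illustration.

```ruby
# Toy illustration of preload-then-fork spawning (needs a platform with fork).
# All names are hypothetical; sleep stands in for slow code loading.

SHARED = {}

# Pretend to load the code shared by every site (Rails, in this article's case).
def load_shared_framework
  sleep 0.2 # stand-in for an expensive load
  SHARED[:framework] = "loaded in pid #{Process.pid}"
end

# Pretend to load the code that differs per site.
def load_site_packs(site)
  "packs for #{site}"
end

# Fork a worker for one site and return what it reports back over a pipe.
def spawn_worker(site)
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    # The child inherits SHARED from the parent; the framework is not reloaded.
    writer.puts "#{site}: framework #{SHARED[:framework]}, #{load_site_packs(site)}"
    writer.close
    exit!(0) # leave immediately, skipping at_exit hooks inherited from the parent
  end
  writer.close
  report = reader.read.chomp
  reader.close
  Process.wait(pid)
  report
end

load_shared_framework # the slow part is paid once, in the parent
PARENT_PID = Process.pid
REPORTS = %w[site_a site_b].map { |site| spawn_worker(site) }
puts REPORTS
```

Each report names the parent's pid as the place where the framework was loaded, even though the report itself was written by a child. That is the whole trick: the expensive load happens once and every site's process reuses it.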

In the long term we could do even better and make sure that everything but the packs themselves are pre-spawned. This would mean some rearchitecting of our code load order to make sure packs and site-specific data are loaded entirely separately from the core. And it may well mean developing an Apache module (mod_wagn) to handle the nuances.

Assets

With Wagn 1.8 we introduced full permission checks on all files (including images, javascript, etc). This is fantastic, but it means that dramatically more requests are being handled by Wagn itself, whereas in traditional Rails applications these are managed by the server (e.g. Apache) alone. This is well worth it for the added security for sensitive material but quite wasteful when you consider how much time Wagn is spending checking permissions on cards that everyone can see! Our next step here is to let Apache handle the public stuff alone while anything restricted goes through Wagn.

If you're using a webserver besides Apache, please note that this could be an area that needs some work. That's because Wagn uses an Apache module called "xsendfile" to give the file-handling responsibilities back to Apache after it has checked permissions. This frees up Wagn to move on to the next request. It should be possible to do something analogous in other webservers, but this is not a case that we've developed for. If you are willing to give this some attention, we will try to support you in the endeavor!
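For anyone attempting this on another server, here is roughly what the hand-off looks like, written as a bare Rack-style app. It is a minimal sketch and not Wagn's actual code: FileCard, can_read?, the FILES table, and the file paths are invented for the example. The one real piece is the X-Sendfile response header, which is what the xsendfile module looks for.

```ruby
# Minimal Rack-style sketch of "check permissions, then hand the file back
# to the webserver". FileCard, can_read?, FILES and the paths are hypothetical.

FileCard = Struct.new(:path, :readers) do
  # :anyone means the file is public; otherwise readers lists allowed users.
  def can_read?(user)
    readers == :anyone || readers.include?(user)
  end
end

FILES = {
  "/files/logo.png"   => FileCard.new("/var/wagn/files/logo.png", :anyone),
  "/files/budget.pdf" => FileCard.new("/var/wagn/files/budget.pdf", ["alice"])
}

# A Rack app is any object that responds to call(env) and returns
# [status, headers, body].
FILE_APP = lambda do |env|
  card = FILES[env["PATH_INFO"]]
  user = env["REMOTE_USER"]

  if card.nil?
    [404, { "Content-Type" => "text/plain" }, ["Not found"]]
  elsif card.can_read?(user)
    # Empty body: the webserver sees the header and streams the file itself,
    # so the Ruby process is free to take the next request.
    [200, { "X-Sendfile" => card.path }, []]
  else
    [403, { "Content-Type" => "text/plain" }, ["Forbidden"]]
  end
end

status, headers, = FILE_APP.call("PATH_INFO" => "/files/budget.pdf", "REMOTE_USER" => "alice")
puts "#{status} #{headers["X-Sendfile"]}"
```

On nginx the analogous mechanism is the X-Accel-Redirect header, which points at an internal location rather than a filesystem path, so the permission check could stay the same and only the header-setting line would change.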

Other

Once we've plucked all this low-hanging fruit, we'll likely be back to optimizing Wagn's internals. It's clear that there are many opportunities in there, but we'll need to do more profiling/benchmarking before it's obvious where to pluck next. Here are some hunches.

Our nested permissions-checking has tended to thwart simple page caching, so to date we have focused on honing our card cache. As we grow, we will want to dig deeper and find more and more opportunities for speed. For example, when a page nests no restricted cards, we should be able to cache the whole thing. The challenge moving forward is to figure out how all that should work! For example, Gerry Gleason has proposed that Wagn inclusions might be handled as nested Rack requests.
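To make the whole-page case concrete: a page would be safe to cache in full only if neither it nor anything nested inside it, at any depth, is restricted. The sketch below shows that check and nothing more. Card here is a hypothetical stand-in with just a name, a restricted flag, and a list of nested cards; it is not Wagn's Card class, and the example cards are made up.

```ruby
# Sketch: a page can be cached whole only if neither it nor anything it
# nests (at any depth) is restricted. Card is a hypothetical stand-in.

Card = Struct.new(:name, :restricted, :nested) do
  def page_cacheable?(seen = {})
    # Already visited: skip it. This avoids rechecking shared cards and
    # guards against cards that nest each other in a cycle.
    return true if seen[name]
    seen[name] = true
    !restricted && nested.all? { |card| card.page_cacheable?(seen) }
  end
end

LOGO       = Card.new("logo", false, [])
SALARIES   = Card.new("salaries", true, [])
ABOUT_PAGE = Card.new("about", false, [LOGO])
STAFF_PAGE = Card.new("staff", false, [ABOUT_PAGE, SALARIES])

puts ABOUT_PAGE.page_cacheable? # nests only public cards
puts STAFF_PAGE.page_cacheable? # nests a restricted card
```

A real version would also have to decide when a cached page gets thrown away, since changing the permissions on any nested card changes the answer.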

As our Rules system matures, there will be caching opportunities there, and I suspect the views system will only offer more. Trickier, potentially, is WQL caching, but that may grow more important as more and more sites feature more and more dynamically queried content.

Finally, some of this may be pushed to the Wagneering level. One could imagine Snapshot cards that allow wagneers to take manual snapshots of a result and save it as card content, for example.
