
Surviving the Dragon’s Den: Vertical Scaling

Posted by David in APC, PHP, echolibre, industry, performance
Monday, April 6th, 2009 at 13:39

According to Wikipedia, the Dragon’s Den is:

a venture-capitalist television programme that originated in Japan where the format is owned by Sony. The format, which now airs internationally, consists of entrepreneurs pitching their ideas in order to secure investment finance from business experts — the “Dragons”.

As some may already know, in 2009 the television show began in Ireland on RTÉ ONE. This post covers the technical considerations encountered when a web site / application appears on national television.

Some background


One of our clients, RentCollectors.ie, appeared on the Dragon’s Den (#ddire) on 2 April 2009. The site is hosted on a single virtual private server (VPS), so it has limited memory, CPU and disk space. It runs Ubuntu with PHP 5.2.3, Apache 2.2.4 and MySQL 5.0.45 (the famous LAMP stack). Those are the stable versions packaged for the Ubuntu release we are running, and we stuck with them to keep dependency management consistent, but in general you should use the latest stable versions.

RentCollectors is a rather simple service with a good few database interactions and a complex but well-designed backend that lets multiple agents work from anywhere in the country. Prior to that day, extreme performance wasn’t a concern. Developing with caching in mind, making the site scalable should it ever need to scale, and making sure it could hold high loads were more of a theory we applied to the architecture while developing it.

So what happened? Well, a few days before the television show, the client warned us that the website would be under heavy load as it was going to be on national television later that week. I decided to make sure that the server, considering its hardware capabilities, would hold the load of being publicised on TV.

I started by benchmarking the website with Apache Benchmark (ab) to find out how much it could take. It turned out to be handling about 35 requests per second. That made my bat phone ring: 35 requests per second is anything but acceptable for a website that is going to be on national television. We hadn’t yet turned on the different levels of caching, so I knew there was room to improve in a rather short amount of time and without a rewrite. In the end I managed to bump the server up to 185 requests per second without hogging it (the website was still accessible from other computers on other networks). 185 requests per second is not extreme, far from it, but for a single VPS, being able to handle the load was the most important thing. We could have simply plugged in a few more web servers and scaled horizontally (out), but I was convinced that we could hold the load with that single server. It was a challenge and a test.

Vertically Scaling a single VPS

So instead of scaling out, we decided to scale vertically (up). Strictly speaking, that means adding CPU or RAM to a single system; for us it began with getting more out of the one machine we had. We had Apache start more server processes and keep more spare (reserved) children ready. This gave us the buffer we needed as the load went up.
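
As a rough sketch of what that kind of Apache tuning looks like, here is an illustrative configuration fragment. It assumes Apache 2.2 with the prefork MPM (the usual pairing with mod_php), which the post does not confirm, and the numbers are placeholders rather than the values used on the RentCollectors server.

```apache
# Illustrative values only; assumes the prefork MPM.
<IfModule mpm_prefork_module>
    StartServers          10    # processes started at boot
    MinSpareServers       10    # idle processes kept ready for new requests
    MaxSpareServers       25    # idle processes above this count are stopped
    MaxClients           150    # upper bound on simultaneous requests
    MaxRequestsPerChild 2000    # recycle a process after this many requests
</IfModule>
```

Whatever values you pick, MaxClients multiplied by the memory one Apache process uses has to fit in the RAM of the VPS, otherwise the server starts swapping under load.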

Next, we changed the MySQL settings so that more query results could be cached. That way, the MySQL server wouldn’t have to do all the work again each time the exact same query was executed. Before you go and increase the query cache on your MySQL servers, please make sure that you know what you are doing. Having the query cache enabled is good when the tables do not change too often, which suits websites that generate content pages, navigation, footers, etc. If a table gets modified, every cached result that depends on that table is flushed. One way or the other, before you start using the MySQL query cache, I suggest you read the MySQL documentation about it. More generally, you should also read the MySQL Performance Blog; you will most likely learn new things about MySQL and performance.
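
For reference, the query cache is controlled by a handful of server variables. The fragment below is a minimal sketch for a MySQL 5.0-era my.cnf; the sizes are illustrative assumptions, not the values we used.

```ini
[mysqld]
# Illustrative values only.
query_cache_type  = 1      # cache every cacheable SELECT
query_cache_size  = 32M    # total memory reserved for cached results
query_cache_limit = 1M     # do not cache single results larger than this
```

Running SHOW STATUS LIKE 'Qcache%' afterwards shows how many queries are being served from the cache. Note that the query cache was removed in MySQL 8.0, so this only applies to older servers.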

After optimizing the MySQL settings, we realized that the public pages were highly unlikely to be modified during the television show. So as well as caching at the MySQL level, we took a different direction: if we could take the MySQL server out of the picture completely, it would be even better.

So we set up APC and used our EcholibreApc class. We first used APC to cache the bytecode and then we set apc.stat to “0”. apc.stat is set to “1” by default, which means that each time a cached script is requested, APC checks whether the file on disk has changed before returning the cached version. With apc.stat set to “0” that check is skipped. Beware! It’s much faster, but a change to a PHP file will not take effect until you restart your web server (or clear the cache with apc_clear_cache(), as pointed out in the comments below).

APC runs by default with apc.shm_size set to “30”, meaning that APC will use 30 MB of shared memory. Memory is cheap and easy to add, so in our case we made the shared memory size bigger to make sure that we could store everything we needed. After all this, the database only had to work once every 30 minutes, which reduced the response time and the server load and memory usage considerably.
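
Put together, the APC settings discussed above would look something like this in php.ini. This is a sketch: apc.stat = 0 comes from the post, while the 64 MB size is only an assumed example of "bigger than the default".

```ini
; Illustrative php.ini fragment for the APC extension.
apc.enabled  = 1
apc.stat     = 0     ; skip the per-request file modification check
apc.shm_size = 64    ; shared memory in MB (newer APC releases expect a suffix, e.g. 64M)
```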

Here’s the very simple EcholibreApc class — APC utility — in case you are interested or want to use it:

/**
 * Echolibre Apc Class
 *
 * A very simplistic class that acts as a layer on top of the
 * Alternative PHP Cache (APC).
 *
 * One of the reasons for this is that the APC version we ran
 * would not store arrays more than "1" level deep. This class
 * wraps around json_encode and json_decode. If you want to know
 * why we use json* instead of serialize, well, you should run your
 * own micro-benchmarks. You could be surprised ;-)
 *
 * @author  David Coallier
 * @see     http://php.net/apc
 * @version 0.1.2
 * @license New BSD
 */
class EcholibreApc
{
    /**
     * Fetch a variable
     *
     * This static method will fetch a value
     * from the apc cache and either return
     * false if the variable doesn't exist in the
     * cache or a json_decoded array/object.
     *
     * @param  string $name   The name of the variable to fetch from the cache
     * @return mixed  Either bool false or a json_decoded object.
     */
    public static function fetch($name)
    {
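        $var = apc_fetch($name);

        // Strict comparison: apc_fetch() returns false on a miss, but a
        // stored JSON string such as "0" or "" would also be == false.
        if ($var === false) {
            return false;
        }

        return json_decode($var);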
    }
 
    /**
     * Store a variable
     *
     * This static method will store a json_encoded string
     * in the apc cache for the amount of time supplied by
     * the third argument.
     *
     * @param string $name  The name of the variable to store
     * @param mixed  $value The mixed variable to store in the cache
     * @param int    $time  The time to live, in seconds, of the variable in the cache.
     * @return void
     */
    public static function store($name, $value, $time = 3600)
    {
        apc_store($name, json_encode($value), $time);
    }
 
    /**
     * Delete a variable
     *
     * This static method will delete a variable
     * from the apc cache.
     *
     * @param string $name  The name of the variable to delete from the cache.
     * @return void
     */
    public static function delete($name)
    {
        apc_delete($name);
    }
}
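
To illustrate how a class like this keeps the database idle between refreshes, here is a minimal cache-aside sketch. It is not our production code: ArrayCache is a hypothetical in-memory stand-in with the same fetch/store shape as EcholibreApc so that the example runs without the APC extension, loadListings() is a made-up placeholder for a real query, and the 1800-second lifetime simply mirrors the 30-minute refresh mentioned above.

```php
<?php
// Hypothetical stand-in for EcholibreApc: same fetch/store shape, but
// backed by a plain array so the sketch runs without APC installed.
class ArrayCache
{
    private static $items = array();

    public static function fetch($name)
    {
        if (!isset(self::$items[$name])) {
            return false;
        }
        list($expires, $json) = self::$items[$name];
        if ($expires < time()) {          // entry is past its lifetime
            unset(self::$items[$name]);
            return false;
        }
        return json_decode($json);
    }

    public static function store($name, $value, $time = 3600)
    {
        self::$items[$name] = array(time() + $time, json_encode($value));
    }
}

// Placeholder for a real database query; counts how often it is called.
function loadListings(&$queryCount)
{
    $queryCount++;
    return array('Dublin', 'Cork', 'Galway');
}

// Cache-aside: try the cache first, hit the "database" only on a miss.
function getListings(&$queryCount)
{
    $listings = ArrayCache::fetch('listings');
    if ($listings === false) {
        $listings = loadListings($queryCount);
        ArrayCache::store('listings', $listings, 1800); // 30 minutes
    }
    return $listings;
}

$queryCount = 0;
$first  = getListings($queryCount);
$second = getListings($queryCount); // served from the cache
```

The second call never reaches loadListings(). With APC installed, the same flow would go through EcholibreApc::fetch() and EcholibreApc::store() instead of the stand-in.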

After the Apache, MySQL and APC/PHP tweaks, we simply made sure that sessions were not read from and written to disk by saving them to /dev/shm (shared memory), and that was it for backend changes.
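
In php.ini terms, moving sessions into shared memory comes down to the save path. A minimal sketch, assuming the default file-based session handler:

```ini
; Keep session files in shared memory instead of on disk.
session.save_handler = files
session.save_path    = "/dev/shm"
```

Keep in mind that anything in /dev/shm is lost on reboot, and that the directory has to be writable by the user the web server runs as.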

Final Precautions

We looked at what else could be done to improve the user experience. Helgi Þormar Þorbjörnsson and his expertise on frontend caching came in very handy as we decided to cache more data in the user’s browser, gzip the content sent to the browser, and reduce the footprint of the JavaScript files and CSS stylesheets. After compacting the multiple JavaScript files we had and running a few stress tests, we ended up with a website and a server that were in very good condition and most likely ready to be on television.
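
The browser caching and gzip steps can be expressed with Apache's mod_deflate and mod_expires. The fragment below is only a sketch of that approach, not our actual configuration: the MIME types and lifetimes are assumptions and depend on how your server labels its static files.

```apache
# Illustrative values only.
<IfModule mod_deflate.c>
    # Compress text responses before sending them to the browser.
    AddOutputFilterByType DEFLATE text/html text/css application/x-javascript
</IfModule>

<IfModule mod_expires.c>
    # Let browsers reuse static files instead of requesting them again.
    ExpiresActive On
    ExpiresByType text/css                 "access plus 1 week"
    ExpiresByType application/x-javascript "access plus 1 week"
    ExpiresByType image/png                "access plus 1 month"
</IfModule>
```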

The decisive night arrived. The Dragon’s Den was on television, our turn came, and we started seeing connections to port 80 (netstat -nat | grep :80 | wc -l) going up, yet from a few servers around the world I was still able to access the site as if no one was online. About 15-20 minutes later, the connection count had gone back to its usual level with no sign of downtime whatsoever. We had made it, without a single glitch. Dragon’s Den is on at 22h00 and after this test, it was time to sleep. A very peaceful night, much more peaceful than the number of connections rising on the web server :)


Comments (7 Responses)

As an avid watcher of the Irish Dragon’s Den, I think this is the first example of a pitched site that *hasn’t* buckled under the pressure. Well done to the EchoLibre team!

So how many requests per second was the machine able to serve in the end? Certainly more than 185? ;-)

Great article, I love such war stories!

Martin Fjordvald

If you want to squeeze even more power out of the machine then consider switching away from Apache to either Nginx or Lighttpd. I run a website that, on a busy day, will do upwards of 1 million dynamic page requests. (as counted by google analytics) We run it on just two medium-level servers, one for the database and one for the php code. Our main problem was that we ran out of memory and the server started swapping, so by switching from Apache to Nginx (which uses fast-cgi) we managed to drastically reduce our memory usage since not every process had php embedded.

Alternatively the same can be done using something like Squid to cache pages as well.

We’re using APC with stat=0, but restarting the server isn’t necessary, you can clear the file cache with the apc_clear_cache function.

The other thing is you don’t have to encode the variable when you store it into APC — it’s automatically serialized when written and deserialized when read.

@gasper_k Actually the randomness of APC can be sometimes quite tricky between versions.

The version we had initially on that server had a problem, which a few other people also ran into, where it wouldn’t serialize (store) arrays nested beyond a certain depth. Thus the need to encode the value as one large string before storing it in the cache. Also, a microbenchmark will show you that json_encode and json_decode are actually faster than serialize/unserialize (at least in PHP 5.3).
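
The speed claim is easy to check for yourself. Below is a minimal microbenchmark sketch; the sample payload and the iteration count are arbitrary assumptions, and the result will vary with the PHP version and the shape of your data, so no particular outcome is implied.

```php
<?php
// Sample payload (made up): a nested array, the kind of value being cached.
$data = array(
    'agent'    => array('id' => 42, 'name' => 'Example Agent'),
    'listings' => array(
        array('id' => 1, 'town' => 'Dublin', 'rent' => 1200),
        array('id' => 2, 'town' => 'Cork',   'rent' => 950),
    ),
);

// Time $iterations round trips through an encode/decode pair.
function timeRoundTrip($encode, $decode, $data, $iterations)
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $decode($encode($data));
    }
    return microtime(true) - $start;
}

$iterations = 100000;

$jsonSeconds = timeRoundTrip(
    'json_encode',
    function ($s) { return json_decode($s, true); },
    $data,
    $iterations
);
$serializeSeconds = timeRoundTrip('serialize', 'unserialize', $data, $iterations);

printf("json:      %.4f s\n", $jsonSeconds);
printf("serialize: %.4f s\n", $serializeSeconds);
```

json_decode() is called with true here so that associative arrays come back as arrays. The EcholibreApc class calls it without that flag, so it returns stdClass objects for them.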

You are right about the apc_clear_cache() function, completely forgot about it. Thanks for your comment, I’m sure our readers will be glad to get a hint about apc_clear_cache :)

@MartinFjordvald You are totally right :) We use lighttpd for another project of ours that has to serve about 1000 pages per second (in theory) and in a distributed environment it works fine.

Squid is an interesting alternative, and I think you’d like Varnish :) If you don’t know it already it’s located at http://varnish.projects.linpro.no/


echolibre limited is registered in Ireland, company number 451576. Directors: Eamon Leonard, J.D Fitz.Gerald. Registered Office: 64 Dame Street, Dublin 2, Ireland.