Keep it Simple, Stupid.
We had the "luxury" of one week from design to implementation. As a result, the charming idea of demonstrating our "cloud-like" site built on modern technology was summarily put out of our minds. To be perfectly honest, given the time scale, the developer lucky enough to be chosen had a nifty precept for selecting the technology: "You have full choice of the technology, but you have one week." It would be PHP/MySQL, which isn't exactly everybody's favourite! It would, nevertheless, allow us to release a tested site within the tight time frame.
To adequately sustain the load generated by such an event, several servers would be needed. And here we hit our first stumbling block: Gandi does not [yet] offer a load-balancing solution for its hosting platform! Never mind, we'll use the old yet faithful round-robin DNS method, with a low TTL so that we can quickly remove a front-end server from production in case of an incident.
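Round-robin DNS needs nothing more than several A records for the same name, served with a short TTL. A zone-file fragment of the idea might look like this (the hostnames and addresses are invented for illustration, not taken from the actual zone):

```
; 300-second TTL: a failed front-end can be pulled from
; rotation and clients will stop hitting it within minutes
www  300  IN  A  192.0.2.10
www  300  IN  A  192.0.2.11
www  300  IN  A  192.0.2.12
```

Resolvers rotate through the records, spreading clients across the front-ends without any dedicated load-balancer in front.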
Our Linux distribution for this occasion would be Ubuntu 9.10 - because it is reasonably up-to-date, with the 2.6.27 kernel.
Small Servers - Go Forth and Multiply!
The best way to sustain the high loads for our solution is to split the functionality among several small servers. This way we would maintain a minimum level of "vertical" scalability (you can dynamically increase the memory and CPU allocated to a server), and the architecture provides "horizontal" scalability.
In this way we could easily add resources if we started to feel the pinch, and add servers or migrate shares from one server to another as load requirements dictate. There are numerous advantages to using multiple small servers:
- each server gets a minimum of one full CPU core, burstable (yes, one whole core, even with a single share -- that's new, by the way!)
- assured resilience, with shares spread somewhat randomly across a few hundred different physical servers.
- specifically in a virtualised environment, the memory performance is best with less than 1GB of memory.
- if you have 4 servers of one share each, rather than one big server of 4 shares, each of them can dynamically grow to 8 shares (32 in total), or to 24 shares (96 in total) after a reboot, all without any modification to the architecture. A single big server, however, would only scale to 8 shares without a reboot, or 24 after one.
- resources may be easily moved towards servers that need it most
We commence with a simple architecture:
- 24 (!) servers of one share each, to manage the PHP website: 10 for the English site, 10 for the French site, and an additional 2 of each for IPv6.
- 2 servers of 4 shares each for replicated memcached to reduce database load and manage sessions.
- 1 MySQL server of 4 shares, which contains the pre-generated promotion codes (they actually all fit within memory, so the database itself should really be pretty bored doing nothing...)
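"Replicated" memcached here simply means writing every key to both nodes and reading from whichever one answers, so losing a node loses no sessions. A minimal Python sketch of that idea (plain dicts stand in for real memcached clients; the class and key names are ours, not from the original setup):

```python
class ReplicatedCache:
    """Write to every node; read from the first node that has the key."""

    def __init__(self, nodes):
        # each node only needs get/set semantics, like a memcached client
        self.nodes = nodes

    def set(self, key, value):
        for node in self.nodes:
            node[key] = value  # a real client would also pass a TTL

    def get(self, key):
        for node in self.nodes:
            if key in node:
                return node[key]
        return None


# Two "nodes" standing in for the two 4-share memcached servers.
cache = ReplicatedCache([{}, {}])
cache.set("session:42", {"lang": "fr"})
cache.nodes[0].clear()  # simulate losing one memcached node
assert cache.get("session:42") == {"lang": "fr"}  # survives via the replica
```

With sessions living in the cache rather than on any one front-end, the round-robin DNS can send a visitor to a different web server on every request.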
After a couple of lovely overloads and a bit of code review, the database would finally be greatly spared thanks to memcached (see the section "Lightweight Code is Worth More than a Thousand Beefy CPUs").
One of our administrators put his fingers to work on the site to create and configure 24 servers -- at the same time! Clearly, the release of the hosting API, or a bulk-creation function in the admin interface, would have been welcome. (Thanks to him in this case!)
Lock Down (somewhat) the Machines
A default installation always needs a few finishing touches. The very fact of opening a MySQL database on the "public" network made us a little edgy. So, out came 'netstat', and we shut down non-critical services listening on public ports. With the help of tcp wrappers (hosts.allow, hosts.deny), all of the "private" interfaces were also locked down (sshd and mysql accessible only from the web farm).
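With tcp wrappers, that policy boils down to two small files; something along these lines (the network addresses are invented for illustration):

```
# /etc/hosts.deny -- refuse everything by default
ALL: ALL

# /etc/hosts.allow -- then permit only the web farm's network
sshd:   192.0.2.0/255.255.255.0
mysqld: 192.0.2.0/255.255.255.0
```

Deny-all plus explicit allows means a newly installed daemon is unreachable until someone deliberately opens it up.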
Finally, it behooved us to pay close attention to the PHP code and MySQL queries; the safest way to avoid SQL injection is to bind all of the parameters after a prepare(). This also helps reduce load on the database when the same statement is run through several execute() calls.
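The prepare-then-bind pattern keeps the query text and its parameters separate, so user input can never be interpreted as SQL. A small sketch of the principle (shown with Python's sqlite3 rather than the site's PHP, purely for illustration; the table and column names are invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE codes (code TEXT, used INTEGER)")
conn.execute("INSERT INTO codes VALUES ('ABC123', 0)")

user_input = "ABC123' OR '1'='1"  # a classic injection attempt

# The ? placeholder is bound after the statement is prepared:
# the input is treated strictly as data, never as SQL.
row = conn.execute(
    "SELECT used FROM codes WHERE code = ?", (user_input,)
).fetchone()
assert row is None  # the injection string matches no promotion code
```

Had the input been concatenated into the query string instead, the `' OR '1'='1` fragment would have matched every row.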
One important detail: since the site should allow a user to send an email to any "arbitrary" address, it was absolutely critical to limit its potential for abuse by some clever black-hat as much as possible. At the very minimum, the number of sent emails per promotion code was limited, in addition to very close monitoring.
Set Up the Development and Deployment Environment
Sharing data between the sites would in effect add a single point of failure, as well as a potential architectural bottleneck. So we decided to deploy the content of the site locally on each of the servers. We would use one server for development and staging, and later for developing and testing updates. A quick script and some 'rsync' would allow rapid deployment across the entire front-end architecture. Simple! (some would say ;) )
A few moments before the operation, more as a precaution than a cure, all of the virtual machines were bumped from one to two shares. Using the statistics interface, from day one, one could see that the virtual machines were essentially sitting "twiddling their thumbs" from boredom ;)
It would have been cool, at that very moment, to drop back to a single share per server, make use of Gandi's "Autoflex", or even, given the actual load observed, set up a scheduled flex for each hour to hand out the promotion codes! Unfortunately, with all hands on deck, we missed the opportunity to demonstrate this [econo-techno-ecological ;)] feature.
Lightweight Code is Worth More than a Thousand Beefy CPUs
Even though we physically had several thousand CPUs and a few terabytes of RAM at our fingertips, Tuesday turned out to be somewhat chaotic and worthy of note here. After a Monday that handled the load very well, the "smooth" execution of our one and only SELECT COUNT suddenly degraded and became excruciatingly slow (300 ms). We had naively assumed that this "lone" query, on a table held entirely in memory, wouldn't be an issue; as such, it was executed on every page of the site. The many simultaneous reads, coupled with the UPDATE operations on the promotion codes, caused lock contention in the database, despite the near-idle system load.
The usual knee-jerk reaction to such a situation is to increase the number of shares to support the load. It's great for a quick-fix temporary solution, but it's not enough!
A fresh analysis of the system, some hard questions about the code, and the use (or salvation) of memcached restored optimal performance. A modification of the database queries involved would probably have been equally prudent.
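The memcached fix is the classic cache-aside pattern: serve the counter from the cache and only fall back to SQL when the cached value expires, so the per-page COUNT stops hammering the database. A rough Python sketch of the idea (the function and key names are ours; a dict with expiry timestamps stands in for memcached):

```python
import time

CACHE = {}      # stand-in for memcached: key -> (value, expiry time)
CACHE_TTL = 5   # seconds; a slightly stale counter is fine for a web page

def cached_count(db_count):
    """Return the code count, hitting the database at most once per TTL."""
    entry = CACHE.get("codes:count")
    if entry and entry[1] > time.time():
        return entry[0]                    # served from cache: no SQL at all
    value = db_count()                     # the one real SELECT COUNT(*) call
    CACHE["codes:count"] = (value, time.time() + CACHE_TTL)
    return value

calls = []
fake_db = lambda: calls.append(1) or 41    # pretend database query
assert cached_count(fake_db) == 41
assert cached_count(fake_db) == 41
assert len(calls) == 1                     # second call never touched the "database"
```

With every front-end reading the same cached key, the database sees one COUNT every few seconds instead of one per page view, and the lock contention with the UPDATEs disappears.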
The moral of the story: the code, indexes, architecture (etc.) are the cornerstones of your ability to support usage load, and if they are "CPU friendly", they will save the day. Otherwise a catastrophe could be lurking, or at the very least, the unnecessary purchase of additional shares.
Also, as we said earlier somewhat tongue-in-cheek -- it's eco-friendly!
- 36 shares total, but we could have done it with less (*sniff*)
- 5% CPU usage at peak
- 4000 requests per front-end web server in the first minute of each hour (roughly 1400 requests/second total)
- a minimum of 11 seconds to hand out 1000 promotion codes.
- a maximum of 40 minutes to hand out the same number of promotion codes, during the Tuesday incident described above.