Details on the big rig

So now that it’s roughly working, here’s what we’re doing for LUGRadio Live this year. When you’re exhibiting, shows can get a bit dull if all you do is jump out and ask folk about hosting.  With a community event like LUGRadio, a) this approach is pretty crass, and b) we’d probably know the answer from everyone there within about an hour.  So this time I thought we’d try to contribute to the event and show our studly Linux skills with a super large mobile gaming rig!

Nobody was going to be impressed if we wheeled in a rack of web servers, but I thought it wouldn’t cost that much more to turn our regular value server range, of which we build loads every week, into games clients, and to bring an old fashioned LAN party, monitors and all, along to the show.  And that’s pretty much what we’ve done – the plan looks a bit like this:

The plan is we would wheel in this rack, turn on the computers and they would instantly be ready to play the excellent Team Fortress 2 – a team-based shooter that is great to look at, and easy to pick up.  The problems I had to work out were:

  • the systems would be a pain to set up identically;
  • like most games, Team Fortress 2 is a closed source commercial product written for Windows, not Linux;
  • we have to run the whole thing, monitors and all, off one or two normal 13A mains feeds.

All of these problems pointed towards discless clients running the game under WINE – and it all seems to be working.

First I built the server – a dual-core Athlon BE-2400 with 4GB RAM, and a couple of extra network ports.  It runs the recently released Ubuntu 8.04 (Hardy).  But it wasn’t quite a regular install – I made sure not to fill the whole hard drive because I wanted to make a second installation on the same machine which would be the client.  That way I could build and test both images on a single piece of hardware just by rebooting and picking another partition.  Once I had the game working on the client image, and the server working on the server image, I got to the hard part – getting the game to boot and run from a discless system.

Our servers attempt to PXE boot themselves as a first boot option; in the data centre this lets our customers rescue their systems because they’re not reliant on the hard drive contents.  In the context of our gaming rig, network booting is the only option: it saves power on the clients, and it means I don’t have to worry about systems getting broken on the day.  If somebody breaks a system, or it crashes, I just reboot it and there’s no harm done other than losing a seat for 3-4 minutes.  It also means I can bring along spares and there’s no set up to do – just plug in a system, turn it on, and it’s ready to play.

PXE boots work like this:

  1. the client system turns on its network interface and starts issuing DHCP requests
  2. the server replies with an IP address, and the address which the client should boot from (which will also be the server)
  3. the client does a TFTP transfer of a program to help finish the process, and runs it
  4. the boot program (we’re using PXELinux) transfers its configuration over TFTP, then (from the configuration) starts to pull over the Linux kernel and an initrd (all still over TFTP)
  5. PXELinux runs the kernel, the kernel runs the initrd, and the initrd mounts the root filesystem over NFS on the server.
  6. the kernel starts the system as normal, with no local storage required, hoorah.
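The configuration that PXELinux fetches in step 4 is a small text file.  As a rough sketch only, here is a bit of Python that writes out the sort of thing it might contain.  It uses the stock kernel NFS-root parameters (root=/dev/nfs, nfsroot=, ip=dhcp) rather than linstub’s own options, and the server address, export path and file names are all made up for illustration.

```python
def render_pxelinux_config(server_ip, nfs_root, kernel="vmlinuz", initrd="initrd.img"):
    """Return the text of a pxelinux.cfg/default file for an NFS-root client.

    All arguments are illustrative assumptions, not the values used on the rig.
    """
    # Kernel command line: fetch the initrd over TFTP, then mount / over NFS, read-only.
    append = " ".join([
        f"initrd={initrd}",
        "root=/dev/nfs",
        f"nfsroot={server_ip}:{nfs_root}",
        "ip=dhcp",
        "ro",
    ])
    lines = [
        "DEFAULT client",
        "LABEL client",
        f"  KERNEL {kernel}",
        f"  APPEND {append}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical server address and export path.
    print(render_pxelinux_config("192.168.0.1", "/srv/client-root"), end="")
```

The output would normally be saved as pxelinux.cfg/default under the TFTP root, next to the kernel and initrd it names.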

As I said, I’d installed a completely normal Ubuntu system, and it needs a few alterations to run remotely over NFS.  A typical system needs to write its PID files and log to its log files, and will complain or stop early if it can’t.  You can take a scalpel to your target root filesystem and just hack out the bits that generate errors, but don’t.  That way you end up with a system where basic functions like apt stop working and debugging becomes harder – you want the system to remain as normal as possible so that you can boot it normally to upgrade software at a later point.

So there are two main tricks I’ve used: the first is the linstub initrd which is an idea I had a while ago to make initrds a bit less painful to use.  Basically Linux initrds are usually built by a distribution to boot a single system in a single configuration.  I wanted to be able to boot lots of different machines on the same kernel, which would normally involve building lots of different initrds.  With linstub you can configure how the system setup works from the command line; so you can tell it where the root filesystem should live, what RAID arrays to expect and so on, and it tries to autodetect things where they’re not specified.  This is different from a normal initrd where most of these things are hardwired, and any slight change will result in the system not booting.  I am using it to NFS mount an otherwise regular Ubuntu root filesystem.

The next problem is that Ubuntu will start to complain very early on that it can’t write to the filesystem (which is what we want, remember: no writing to the disc means no broken clients), resulting in a broken X.  Fixing it is quite easy but took me a couple of attempts – I wrote a couple of horrible scripts to copy chunks of the filesystem into tmpfs mounts, then bind-mount the copies back over the original mount points.  Steam was a bit trickier as it has 7GB of game files, so my script needed to copy some of the files, but not the big game files – these were symlinked back to the read-only versions on the NFS share.  Yucky, but it works.
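To give a flavour of the copy-or-symlink trick, here is a minimal Python sketch.  It is not the script I actually used: the function name and the size threshold are assumptions for illustration, and it leaves out the tmpfs and bind mounts themselves, which need root.

```python
import os
import shutil


def build_writable_copy(src, dst, max_copy_bytes):
    """Mirror src into dst: copy small files, symlink large ones back to src.

    dst is assumed to live on a writable filesystem such as a tmpfs mount.
    Returns two lists of relative paths: (copied, linked).
    """
    src = os.path.abspath(src)
    copied, linked = [], []
    # os.walk does not descend into symlinked directories; this sketch skips them.
    for dirpath, _dirnames, filenames in os.walk(src):
        rel_dir = os.path.relpath(dirpath, src)
        target_dir = os.path.normpath(os.path.join(dst, rel_dir))
        os.makedirs(target_dir, exist_ok=True)
        for name in filenames:
            source = os.path.join(dirpath, name)
            target = os.path.join(target_dir, name)
            rel_path = os.path.normpath(os.path.join(rel_dir, name))
            if os.path.islink(source) or os.path.getsize(source) <= max_copy_bytes:
                # Small file (or existing symlink): make a private, writable copy.
                shutil.copy2(source, target, follow_symlinks=False)
                copied.append(rel_path)
            else:
                # Big file: point back at the read-only original to save RAM.
                os.symlink(source, target)
                linked.append(rel_path)
    return copied, linked
```

After something like build_writable_copy("/path/to/steam", "/path/on/tmpfs", 1024 * 1024) has run, the tmpfs copy would be bind-mounted back over the original directory (mount --bind) so that programs see a writable tree at the usual path.  Both paths here are placeholders.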

Finally we just use Gnome’s auto login on the client and stick a link into .config/autostart to fire up the game.  The client machines power on, start X, start Steam, start Team Fortress and connect to the game server for instant play with no Windows and no tedious installations.
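One way to do the autostart part is to drop a .desktop entry into .config/autostart.  The sketch below writes one from Python; the entry name, file name and launcher command are hypothetical stand-ins, and I am assuming a wrapper script that starts Steam and the game under WINE rather than showing the exact command line.

```python
import os


def render_autostart_entry(name, command):
    """Return the text of a minimal desktop entry that runs command at login."""
    return "\n".join([
        "[Desktop Entry]",
        "Type=Application",
        f"Name={name}",
        f"Exec={command}",
        "",
    ])


def install_autostart_entry(home, filename, entry_text):
    """Write entry_text to <home>/.config/autostart/<filename> and return its path."""
    autostart_dir = os.path.join(home, ".config", "autostart")
    os.makedirs(autostart_dir, exist_ok=True)
    path = os.path.join(autostart_dir, filename)
    with open(path, "w") as handle:
        handle.write(entry_text)
    return path
```

For example, install_autostart_entry("/home/gamer", "game.desktop", render_autostart_entry("Game", "/home/gamer/start-game.sh")) would create the entry for a made-up user and launcher script.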

This post has been a bit rushed, so I’ll make another one in a couple of days when we’ve finished building the client machines and have tested it all out more thoroughly, with a fuller run-down of the configuration in case anyone fancies making a similar rig.