Attention hosting companies: This software builds your business

Dear colleagues and competitors in the hosting industry,

Our businesses are built on an enormous foundation of free software.

Web hosting remains solid, but we’ve all lost customers in the last 10 years to proprietary email platforms, and have collectively just accepted it. Mail solutions based exclusively on free software are looking second rate compared to Gmail, Exchange and other vertical systems where we can’t add value through our expertise, or give users the choice of where they host.

Facebook and Twitter continue to use their muscle to wind internet messaging back to the 1980s. That was a time when sending a message between big commercial networks was a privilege and not a right. So you had multiple addresses, or people you had to pay to talk to, or people you just couldn’t talk to because those big networks wanted to lock people in. We’re getting back to that state again now.

In 2012, Mozilla publicly pulled staff from future development of Thunderbird. It used to be the best email client, bar none. But it had fallen behind before, and now looks like it’ll continue to do so, given how large a code base it is.

All of this bad news means hosting companies lose business. Users see a better experience with one of the big guys, and move towards them and their proprietary platforms. Among many other things, we all need to see a better free email client, and the prospect of anyone starting afresh seemed pretty remote.

Geary: A beautiful, modern, open source email client.

Geary is a brand new desktop email client – in the early stages of development.

It’s being developed by a San Francisco-based non-profit called Yorba – you may already know them – they’re the talented and proven group of hackers who built the slick photo manager for the Linux desktop called Shotwell.

Yorba simply want to put beautiful, functional software out there, for free, with no strings attached, and with no plans to lock away the best features for paying customers. They’re funded by donations and consulting work and have turned to a crowdfunding campaign to pay for the development.

That means Yorba are writing more of the type of software that will continue to build all our companies. When clearly talented people come together to help us, just for the love of what they do, we must support them to see results.

They’re looking for $100,000 to finish Geary. Bytemark have pledged $2500, and if only a handful of us do the same, we have a great chance of seeing an amazing new email client coming to fruition in 2013.

Geary will help us all to sell more servers, and to grow our industry on open standards, not limited interoperability with giants. So please take a look at their project and donate what you can.

Thanks for reading,

Matthew Bloch
Managing Director, Bytemark Hosting


AMD’s new fight, and why we love Bulldozer

The commentary around Bulldozer, AMD’s latest processor line, is that it’s disappointing, a catastrophe, absolutely positively awful and so on, for miles and miles. And it’s a shame for AMD that Ars, Extreme Tech and the usual suspects have no imagination beyond their benchmarks when it comes to judging processors.

The numbers don’t lie of course – the benchmarks show that Bulldozer’s best is slower than Intel’s, by a long way, and most of those benchmarks care about absolute amount of data processed, numbers crunched and so on. Ars Technica concludes:

AMD compromised single-threaded performance in order to allow Bulldozer to run more threads concurrently, and that trade-off simply hasn’t been worth it.

But everyone’s performance tests make an assumption: that the processor is working for one person at a time, and that person wants it to crunch through as many numbers as possible for their benefit. Those numbers might be rendering frames of Battlefield 3, editing a huge photo or ripping a CD. Benchmarks sum it up the same way – however many ways a single job can be split to use a processor’s cores, they’re only interested in who gets to the finish line fastest.

And Intel still wins, I get it. Even AMD admit in a recent interview that their older processors still perform better for a lot of people:

We understand our customers make purchase decisions based on how they use their PCs, and in many cases our AMD Phenom™ II processors are a great (purchase).

They are struggling to pitch themselves against Intel to the gamers, power-hungry desktop PC users and the benchmark sites. They know they can’t get the same excitable press any more, not for years and another generation of processor. That’s probably why they’re temporarily giving up this fight on pure brawn, laying off hundreds in their PR and marketing departments a couple of weeks ago.

But instead of doing one thing for one person, let’s instead assume that a processor is under siege from 300 warring factions, all wanting to run separate and unpredictable work loads. The benchmark that interests us is: assuming we are running 299 “hostile” jobs, how quickly does that 300th job complete? If we vary those 299 jobs in nature, how reliable is the performance of that 300th? The time of that one lonely, slow, job is what I’m interested in.

For BigV where we are running a massively “multi-tenant” system, we really are planning to put in the order of 300 different customers on a single server. It’s far more important to have a reliable average performance for a virtual machine than the absolute fastest possible performance. I’ve not (yet) seen any benchmarks that test in this way, but it seems to us that more separate hardware cores must achieve this goal better than a single software-switched core.
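The benchmark described above could be sketched roughly as follows. This is only a toy under stated assumptions: background threads in one Python process stand in for the “hostile” tenants (a real test would use separate processes or virtual machines), and the names `busy_work`, `time_probe_under_load` and `summarise` are made up for illustration. The point is what gets measured: the time of one probe job under load, and how much that time varies, rather than total throughput.

```python
import statistics
import threading
import time


def busy_work(iterations):
    """A small CPU-bound job: add up some squares."""
    total = 0
    for i in range(iterations):
        total += i * i
    return total


def time_probe_under_load(hostile_jobs, probe_iterations=200_000):
    """Time one probe job while `hostile_jobs` background threads keep spinning."""
    stop = threading.Event()

    def hostile():
        # Assumption: a thread that never stops working models a noisy tenant.
        while not stop.is_set():
            busy_work(10_000)

    workers = [threading.Thread(target=hostile, daemon=True)
               for _ in range(hostile_jobs)]
    for worker in workers:
        worker.start()
    try:
        start = time.perf_counter()
        busy_work(probe_iterations)
        return time.perf_counter() - start
    finally:
        # Always stop the background load, even if the probe fails.
        stop.set()
        for worker in workers:
            worker.join()


def summarise(samples):
    """Return (mean, coefficient of variation) for a list of probe times."""
    mean = statistics.mean(samples)
    return mean, statistics.pstdev(samples) / mean
```

Running `summarise([time_probe_under_load(n) for n in (10, 50, 299)])` gives an average probe time and a measure of its spread; for a multi-tenant host, a low spread matters more than a low mean.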

If that weren’t the case I’d find it hard to understand investment in massively parallel server systems like the 768-core Atom server or super low-power 480-core ARM systems. Big multi-tenant systems don’t need to be the fastest, they just need to perform consistently.

For BigV, we just want more cores. The speed is almost irrelevant. While AMD’s performance is within the same ball park as Intel’s, they’ll work out very nicely for us virtual machine-mongers. And the low price of Bulldozer chips is just icing – a 32-core, 128GB system is extremely affordable and helps us keep our pricing to customers low.

AMD are gearing up for a different fight, and they don’t need press from the benchmark sites to prove them the winners. So while my gaming PC will stick with a Phenom for a while yet, BigV is going to be using an awful lot of Bulldozer chips in the near future.

The cloud is your install script

Why people jump from shared to dedicated hosting

In the beginning, there was “shared” hosting. And that was all the hosting there was. A UNIX beard would give you an account on their big UNIX server, and set up Apache for hundreds of users to put their web pages, CGI scripts and PHP. The Apache server invented hosting, it still works the same today as it did 15 years ago, and it’s still everywhere.

But demanding customers eventually hit a problem – a shared hosting platform only has one Perl or PHP version on the system, and eventually the UNIX beard has to upgrade it to make new users happy. Older customers suddenly “feel the beard” behind their hosting and their stuff breaks. That makes them grumpy when the beard hadn’t needed to interfere for years. It happens all the time, and gets us a steady stream of new customers.

So for a few years, the solution was to go Dedicated – pick a big company who would buy dirt cheap servers and rent them, whole, by the month. You can ditch the UNIX beard who broke your site, hooray! He also probably backed the server up, helped solve your programming problems, and worried about the server’s up time, but you won’t notice that until your server goes wrong in 2-3 years’ time.

This is about where we started our business in 2002, with the very clever User-Mode Linux project. We used it to offer dedicated-style hosting at £15 per month, where much larger companies were charging 3 times that price. And because it was grungy and unpolished and you had to be a kernel wizard to use it, the big hosts stayed away from our business a good long time!

We even went into Dedicated when we had the money, because we wanted people to be able to grow beyond VMs (it also helped that Pete didn’t mind driving a 400-mile round trip to London – a lot).

Why people want to jump from dedicated

So we’re really proud to still have some customers running Redhat 9 systems, doing old, important things for them. (But hey – you two guys – I’d check your firewall is nice and tight). And we’ve never had a “throwaway” attitude to dedicated servers. Lots of hosts tell you there’s nothing they can do for your dedicated server when it breaks, and here, have a new one.  We occasionally spent hours scraping a half-dead hard disc for a customer’s un-backed up data. Keeping the same system going is really important to us.

But that’s the reason customers run screaming from Dedicated servers: they break and fail and you suddenly miss the UNIX beard who might have got your files back from last week. So our dedicated servers have always come with “added beard” when required, just to keep them going a little bit longer.

Help! Back to shared hosting again!

Sometimes customers add more dedicated servers, or hire their own beard. But plenty feel that there’s something new in “the cloud”, which means either:

  1. A “platform-as-a-service” cloud, or our old friend shared hosting. It’s not as simple as the old days with Apache, but you are sharing web servers or databases with other customers, and you’re back to relying on our old friend the UNIX beard. This time he has brought a complicated, proprietary set of hosting scripts with him. You can’t have them of course, but he promises that even though you’ll pay more, you’ll save by not employing a system administrator, or needing to worry about performance. Unlike most shared hosting, he may not even own or have access to the hardware he’s deploying for you.
  2. If you still like your own sovereignty, you might pick an “infrastructure-as-a-service” cloud. They provision virtual or dedicated servers whenever you ask for them, and route you IP addresses and run shared services for things like DNS. That’s what I used to call a “hosting company” and I reckon they’re selling, well, dedicated servers again.

There are no hosting services out there that don’t fall into one of these camps, but it’s a basic distinction as to how complicated your infrastructure is, and who can screw it up for you.

The UNIX beard running his university Apache server in 1995 was the original “platform as a service”. And anyone renting UK2’s cheap servers ten years ago, when they were the cheapest option, was an “infrastructure as a service” pioneer.

But no no, my application is really in the cloud

You’ve got up-to-the-minute live replicas of your important data? Backups too? Your install is scripted, and rehearsed such that when your hosting provider goes down, you can simply redeploy elsewhere at a moment’s notice? You use chef or puppet or cfengine and you never make any manual changes to your server that you haven’t tested and pushed via a source control system? And you’ve got accounts with multiple service providers?

If so, well done, you must be 1 in 1000 hosting customers. You realise that the real cloud isn’t your hosting provider, it’s every hosting provider, and your ability to both use the best of them, and depend on none of them. The cloud is your install script.
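To make “the cloud is your install script” a little more concrete, here is a minimal sketch of the idea behind tools like chef, puppet and cfengine: describe the state you want, apply it, and make sure that applying it a second time changes nothing. It is not how any of those tools is implemented; the function names (`ensure_file`, `apply`, `rehearse`) and the “desired state is a mapping of file paths to contents” model are assumptions made up for illustration.

```python
import tempfile
from pathlib import Path


def ensure_file(path, content):
    """Make `path` contain exactly `content`; return True if anything changed."""
    path = Path(path)
    if path.exists() and path.read_text() == content:
        return False  # already in the desired state, so do nothing
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(content)
    return True


def apply(desired_state, root):
    """Apply a {relative path: content} mapping under `root`; list what changed."""
    return [rel for rel, content in sorted(desired_state.items())
            if ensure_file(Path(root) / rel, content)]


def rehearse(desired_state):
    """Install into a scratch directory twice; the second run should be a no-op."""
    with tempfile.TemporaryDirectory() as root:
        first = apply(desired_state, root)
        second = apply(desired_state, root)
    return first, second
```

If `rehearse` reports changes on the second run, the script is not repeatable, and you would find that out at the worst possible moment: when redeploying on a different provider in a hurry.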

In reality, most companies without a dedicated 24/7 operations team just aren’t ready. They bed their services down in a reliable hosting provider, have a critical process or two here and there, graft on a server for a new business function and pay for 24 hour support. Sometimes they can even run their entire business successfully for years on a server, or a single redundant pair, and nothing goes wrong!

And the reason isn’t because they’re slapdash, or lazy, or stuck in the past, it’s because most businesses aren’t Amazon and don’t ever need that kind of scale, and preparing for it reduces your overall reliability and up time.

Maciej Ceglowski, creator of antisocial bookmarking site pinboard.in said this about modern hosting practices in an interview a few months back:

I believe that relying on very basic and well-understood technologies at the architectural level forces you to save all your cleverness and new ideas for the actual app, where it can make a difference to users.

His site runs on a handful of very large servers, and managed to handle a huge influx of users (caused by the bungled announcements around the rival del.icio.us bookmarking service being closed) without needing any fashionable just-in-time hosting provision.

And John Kozubik of rsync.net wrote a lament two years ago about the failure of a rival caused by overcomplicated architecture:

When you don’t have stars in your eyes, and aren’t preparing for your IPO filing and the “hockey sticking” of your business model, you can do sensible things like keep regular files on UFS2 filesystems on standalone FreeBSD systems.

This is, of course, laughable in the “real world”. You couldn’t possibly support thousands and thousands of customers around the globe, for nearly a decade, using such an infrastructure. Certainly not without regular interruption and failure.

Except when you can, I guess.

…and illustrates with examples of two of his servers that have been up for 350 and 950 days respectively.

BigV: proudly built for worst established practices

These are the people for whom we built BigV (not literally those two people, that would be a bit stalker-ish).

With BigV we might say it’s built for worst practices first, because we know they last longer than you might want. We know that relying on single servers for a while is more reliable than trying to build for Google-size overnight. We want to make it easy to throw up a server, back it up, back out of configuration mistakes and “push the walls out” when the server gets too small for the load. Right now you can go up to 120GiB RAM and 40TiB disc in a single server, and after the beta we’re not stopping there.

We can do that because we’re a lean company with lots of nice customers, not a huge number of staff, and we have plenty to drop on HP kit. HP do some very nice kit, and we want you to be able to build monster machines, and not be stuck having to glue together 8 or 16GiB machines with a cheaper hosting company. We want you to be able to use private VLANs, SSD storage, huge amounts of RAM, arbitrary disc snapshots, and all the toys that used to need expensive dedicated servers.

It’s not that you can’t build a minute-by-minute scalable cloud with lots of CPUs if you need them – you can! Look! But we recognise that most users won’t ever need to, and so we’re trying to replicate that “zen garden” isolated feel of a Dedicated server, without any of the overheads. We’re still doing the “big cloud” stuff, but looking at it from another angle: that you’ll start your site small and simple, and only maybe need to grow it big and complicated.

Our beta is still open for signups, and we’re both expanding our cluster and accelerating the pace at which we send out V-Keys. We’re looking forward to seeing what our customers build when they can shake off the vanity of scaling before they need to, and deliver their sites with a single cloud hosting company they trust.

On V-Keys, not trusting your security vendor

I know, I know, we’re still polishing BigV and it’s nearly August. Thanks to the folks who came and saw us last Friday – we had a marvellous party and briefly showed off some of BigV’s tricks.

While we’ve been hammering away at stability, documentation and bug fixes, I wanted to explain why we’re adding one last security feature.

BigV accounts will have access to shedloads of computing power – even in the beta, that’s hundreds of gigabytes of memory, and multi-gigabit network connections. And our beta is free-of-charge, so our users aren’t limited by how much money they might want to spend. Because of that, we didn’t want to allow our users’ accounts to be compromised - from day one. So we bought a load of these:

[Photo: a V-Key]

They are V-Keys and give us the magic of two-factor authentication.

One-factor authentication usually means you supply a secret to prove who you are. A username, password and certificate are all just data, and have to be stored in a file. If a bad guy can quietly copy that file they were stored in, he can pretend to be you straight away. With a physical token like our V-Key, you can’t copy it. You will need a V-Key to use BigV (at least for the early stages of the beta), and an attacker would need to physically steal it from you to "borrow" your account. 

V-Keys are easy to use. You retain a username and password, but our high-security accounts (which will likely be all of them, to start with) will require you plug in your key and press its button. It acts like a keyboard, and "types" a one-off password to our servers every time you activate it. The password proves to the server that you have a particular V-Key in your possession. The key can’t be copied, even if you’ve plugged it into a computer controlled by a bad guy. Each password can only be used once.
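The “each password can only be used once” property can be sketched in a few lines. This is not the real Yubikey one-time password format, which is different; it swaps in a plain HMAC over a press counter purely to show the shape of the check. The class names `Token` and `Verifier` and the `key_id:counter:mac` layout are assumptions for illustration.

```python
import hashlib
import hmac


class Token:
    """Stands in for the physical key: holds a secret and counts button presses."""

    def __init__(self, key_id, secret):
        self.key_id = key_id
        self._secret = secret
        self._counter = 0

    def press(self):
        """'Type' a fresh one-off password."""
        self._counter += 1
        mac = hmac.new(self._secret, str(self._counter).encode(),
                       hashlib.sha256).hexdigest()
        return f"{self.key_id}:{self._counter}:{mac}"


class Verifier:
    """Server side: knows each key's secret and the last counter it accepted."""

    def __init__(self):
        self._secrets = {}
        self._last_counter = {}

    def enrol(self, key_id, secret):
        self._secrets[key_id] = secret
        self._last_counter[key_id] = 0

    def verify(self, otp):
        try:
            key_id, counter_text, mac = otp.split(":")
            counter = int(counter_text)
        except ValueError:
            return False  # not in the expected three-part format
        secret = self._secrets.get(key_id)
        if secret is None:
            return False  # unknown key
        expected = hmac.new(secret, str(counter).encode(),
                            hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected.encode(), mac.encode()):
            return False  # not produced by this key's secret
        if counter <= self._last_counter[key_id]:
            return False  # replayed or stale password
        self._last_counter[key_id] = counter
        return True
```

Note that the verifier has to hold every key’s secret, which is exactly why it matters who runs that server and keeps that database.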

If you know about computer security, you’ll recognise these as Yubikeys with a sticker on. But they’re not factory-fresh. We’ve reprogrammed them to only work with BigV, because they’re more secure that way.

Yubikeys are a great product, but they trade off the best possible security they could offer for convenience. Instead of demanding that their customers manage their own keys, Yubico program them for you, so that they can verify them later over the internet, and save you the bother of setting up your own servers. That means Yubico keep a giant database of every key they’ve ever manufactured. They also have the albatross round their necks of keeping the giant database online at all times; if it goes down, none of their customers can log in. Worse, they also need to keep it secure from hackers. The more successful Yubico becomes, the more hackers will be interested in their database. If a hacker got a copy of Yubico’s database, he could fake any Yubikey that was ever issued.

It seems likely that when RSA, the market leaders in security tokens, were broken into (only a few months ago), their own database of keys might have been stolen. RSA must have had a pretty good reason to offer to replace any token with a new one, but have still not publicly admitted the depth of the breach – as a customer you have to take their word for it that they have responded in your best interests.

So if veteran security vendors can’t keep their customers’ secrets, I’m not going to trust anyone else to do the same.  Yubico’s servers are as big a target, and eventually we would start to worry that another company has the ability to compromise our customers’ hosting.

I’m not bashing Yubico – their keys are easier to sell when they do the programming, and it’s still more secure than a simple password. But the reason Yubikeys are a better product than RSA’s SecurID is that 1) Yubico tell you exactly how the keys work, and 2) they give you tools to reprogram them. We don’t need to trust the people that sold them to us as much. And that is something security vendors should compete on.

Back to the future (of hosting)

Thanks to all the folks who’ve signed up for our beta at http://bigv.io/

We’re polishing BigV to a shiny finish before the beta software goes out in June.  It’s not the finished product, but we think we’ve got enough to show off how the future of hosting should work (cue trumpets).  We have 512GB of RAM to play with, 64 CPU cores, 12TB of regular discs, and 3TB of SAS discs.  We may be pushed for space to accommodate all the beta testers, because we want those people to have the run of the hardware.

I say "the future of hosting", but unlike lots of other hosts, Bytemark are still going to be selling servers, just like the old days.  But BigV will give you flexible billing, so you can fire up and pay for a server for only a few hours – if that’s all you need. It will bring flexibility, so you can change your servers’ RAM or disc space instantly too.  And it will bring resilience: so we will have the capability to shift customers’ servers around our cluster of hardware, if we think any of it is going to fail, or needs maintenance, or an upgrade.  If we’re feeling particularly clever this might become automatic.

Too clever by half

But as Amazon’s monstrous outage shows (an outage that would bury any other host’s reputation for reliability) it is possible to overthink failover protocols.  We recognised this as the biggest risk with BigV and I thought you might be interested to hear about our architecture in more detail.  The promise that "it’s a magic cloud, and you don’t need to worry" won’t persuade anyone for much longer. You’re going to want to know the risks you’re taking by subscribing to a big virtual hosting platform.

I will shock you to your core by telling you that Bytemark’s current virtual machine product is a set of hairy Ruby, shell & Perl scripts. It was originally written in about eight weeks while I was being under-stimulated in a temp job in 2002.  The scripts got passed around in "maintenance mode" for many years and have survived various attempts to "rewrite them properly", including one to use Xen (we dodged a bullet there).

But they do a lot, and we all know how they work, and how they fail. More importantly their virtual machines’ uptimes are mostly in the hundreds of days, spoiled only by the occasional hardware upgrade.

The same, but better

This is still what we want – long up times, permanent discs, and easy upgrades.  And we still think that’s what our customers will want.

The most important things we wanted to add were:

  1. reliable live migration – so we can upgrade our hardware without the laborious work of emailing customers, and spoiling their up times;
  2. VM snapshots – so a customer can "checkpoint" their whole system  before a major upgrade, and back out if it goes wrong;
  3. access to all of KVM’s great features – graphical consoles,  installations from CD, direct network access, and anything else they’d be able to do if they had the server in front of them;
  4. a handy tool for provisioning, server upgrades and maintenance; a uniform interface to the software, rather than the 1980s text  console you get at the moment (I should say "as well as");
  5. really flexible storage, so that servers could use terabytes, not  just a few tens of gigabytes;
  6. a sane software development process and test rig, so we could add features to our live system without errors.

With BigV we’ve turned our simple system into a distributed one, full of features and with (maybe) the minimum possible complication.  To do this, BigV has three types of servers instead of one:

The Brains hold the database of all virtual machines in a BigV cluster, and run the gateway for customers to issue requests for servers.

The Heads are packed full of CPU cores and memory, and run the KVM processes, aka virtual machines.  They don’t have any storage.

The Tails are high-spec servers, but have RAID cards, a normal amount of RAM, and lots of directly-attached discs which can be hot-swapped.

The heads and tails are always connected to the brains, and one of the brains takes on the role of master brain.  That’s the one that keeps a complete list of every virtual machine and disc in the cluster, and that’s what you (the customer) talk to when you ask to provision a new VM.

The heads and tails are also connected to a 10-gigabit storage network, so that the KVM processes can talk to their discs really quickly.

The brain can decide to move either virtual machines or discs between any pair of heads and tails, without having to reboot affected systems.  So that gives us our hardware nirvana - no live customer system need ever be tied to a piece of hardware again.
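As a rough illustration of the bookkeeping described above (not BigV’s actual code), the master brain can be thought of as a record of which head runs each VM and which tail holds each disc. The class and method names here are invented, and “migration” is reduced to updating that record; the hard part, moving a running KVM process or a live disc, is left out entirely.

```python
from dataclasses import dataclass, field


@dataclass
class Brain:
    """Master record of where every VM and disc currently lives."""
    heads: set = field(default_factory=set)
    tails: set = field(default_factory=set)
    vm_location: dict = field(default_factory=dict)    # VM name -> head name
    disc_location: dict = field(default_factory=dict)  # disc name -> tail name

    def provision(self, vm, head, disc, tail):
        """Record a new VM on a head with its disc on a tail."""
        if head not in self.heads or tail not in self.tails:
            raise ValueError("unknown head or tail")
        self.vm_location[vm] = head
        self.disc_location[disc] = tail

    def migrate_vm(self, vm, new_head):
        """Move a VM to another head; its disc stays where it is."""
        if new_head not in self.heads:
            raise ValueError("unknown head")
        self.vm_location[vm] = new_head

    def migrate_disc(self, disc, new_tail):
        """Move a disc to another tail; the VM using it stays where it is."""
        if new_tail not in self.tails:
            raise ValueError("unknown tail")
        self.disc_location[disc] = new_tail

    def drain_head(self, head):
        """Move every VM off `head`, e.g. before a hardware upgrade."""
        others = sorted(self.heads - {head})
        if not others:
            raise RuntimeError("no other head to move VMs to")
        moved = sorted(vm for vm, h in self.vm_location.items() if h == head)
        for i, vm in enumerate(moved):
            # Spread the VMs round-robin across the remaining heads.
            self.migrate_vm(vm, others[i % len(others)])
        return moved
```

Because VMs and discs are tracked separately, either can move without the other, which is what frees a customer’s system from any one piece of hardware.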

How it’ll screw up

I’m still mapping out the ways in which this system will break, and may find out a few more during testing – the main one is network segmentation.  We rely on a lot of different local networks between the heads, tails and brains.  If those get misconfigured, the worst hazards might be bringing down every VM at once or freezing disc I/O.

If a head is disconnected, the brain can simply spread its VMs around to other heads.  The rule at the moment is that this happens after an unexpected disconnection period of two minutes.  If the head gets back in touch, nothing happens.  If it doesn’t, the head is under instructions to kill all of its VMs, and the brain will assume this has happened.
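The two-minute rule can be sketched like this. It is a simplification under stated assumptions: time is passed in as plain seconds, the names (`GRACE_SECONDS`, `head_must_kill_vms`, `Brain.tick`) are invented, and VMs are spread round-robin over whichever heads are still connected. The important part is that both sides apply the same deadline, so the brain can safely assume the lost head has killed its VMs before starting them elsewhere.

```python
GRACE_SECONDS = 120  # the two-minute disconnection period


def head_must_kill_vms(last_brain_contact, now):
    """Head side: once out of touch for the grace period, kill all local VMs."""
    return now - last_brain_contact >= GRACE_SECONDS


class Brain:
    """Brain side: tracks lost heads and re-homes their VMs after the grace period."""

    def __init__(self, placement):
        # placement maps head name -> list of VM names running there
        self.placement = {head: list(vms) for head, vms in placement.items()}
        self.lost_since = {}

    def head_disconnected(self, head, now):
        # Keep the first time we noticed, not the most recent one.
        self.lost_since.setdefault(head, now)

    def head_reconnected(self, head):
        # Back within the grace period: nothing happens.
        self.lost_since.pop(head, None)

    def tick(self, now):
        """Move VMs off heads lost for too long; return {vm: new head}."""
        moved = {}
        for head, since in self.lost_since.items():
            if now - since < GRACE_SECONDS:
                continue
            survivors = sorted(h for h in self.placement
                               if h not in self.lost_since)
            if not survivors:
                continue  # nowhere to put them; leave for a human
            for i, vm in enumerate(self.placement[head]):
                target = survivors[i % len(survivors)]
                self.placement[target].append(vm)
                moved[vm] = target
            self.placement[head] = []
        return moved
```

A real deployment would probably want the head to give up slightly before the brain does, so that small timing differences can never leave two copies of a VM running.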

With tails, if your data is stuck on a "broken" tail, your data stays disconnected until that gets fixed.  However there are humans in this system too, and well-tested RAID setups, and redundant power supplies, and mirrored memory, and SAS switches.  We know how disc-based systems break, we monitor them, and we’ll fix them.  We’d rather run the risk of a couple of hundred VMs going down in rare circumstances, for a few minutes, than risk people’s data with an automatic recovery mechanism.

In a situation where random cables are pulled out of a BigV cluster, and then put back again, we expect affected VMs (which could be all of them, depending on which cables are pulled!) to do one of two things: freeze or reboot.  Nothing worse.  And it should be a stable system – once it’s put back together, everything will reboot the way it was. There are a few safeguards to prevent two copies of a customer’s VM from running, but I’m still trying to imagine the kind of failure that would make this possible.

Soon, soon…

That still sounds really complicated, but it’s as simple as I could make it while fulfilling our objectives.  Assuming it works well in testing, BigV will open a lot of doors for us commercially, letting us offer servers that nobody else is.  Plus we have all the benefits of an in-house technology.

The beta is still on for June, so if you’ve not already expressed an interest, head to http://bigv.io/ and do it!

What if the web browser comes second?

The web has the largest reach, and is the front to most mass-market software projects – but have you noticed that even the newest web browsers still can’t do some really basic things?  I think that developers who focus only on the web browser are skirting round some great ideas just to avoid that old-fashioned, native software development.

What can you do in a browser?  Video, some basic games, word processors, endless to-do lists; most of what you could do in the 1980s but without having to wait to load anything from a cassette tape, and not worrying as much about how you were going to save your work.  That’s quite good, that’s a bit of an improvement (from the 80s).

Most advances from the 90s haven’t been implemented very well by web browsers yet, and so lots of software developers live without them.  Here are some things that my Acorn A3000 could do in about 1993, compared to web-based software now:

  • multitasking – Chrome and Firefox multitask about as well as in 1993, i.e. just fine until one program (tab) goes crazy, and all my programs (tabs) freeze up.  At least I only need to restart my web browser when that happens, and not the whole computer.
  • 3D graphics – we have WebGL! The linked demos worked on my bleeding-edge browser.  For a bit.  Then the browser froze and the tabs crashed.  Nobody has done as ambitious or exciting a 3D game on the web as Star Fighter 3000 (that’s not to say Chrome can’t render 3D graphics, but nobody has chanced their arm on a whole game – yet).
  • multiple programming languages – pick any one you like, as long as it compiles to Javascript! Google have NativeClient but it’s not a sure bet to succeed yet. Or you can run your own code on your server, and only have to do the “fast” bits in Javascript, which is pretty ungainly.
  • complex creative tools - if you want to lay out pages for print, or compose music, or draw complicated diagrams and multi-stage artwork, and have these programs work together to produce a finished product between them, the Acorn still wins hands-down on quality and depth of software.  Even cut & paste between Google Docs is fraught, even when you use it in Chrome.

I’m 90% certain the web will be the way to go, in the long run. All this stuff is being worked on, reinvented.  But in the mean time software developers need to consider looking beyond the web.  Maybe the web browser can come second to a native application, not first?

By thinking about the actual computers people use, and not the wobbly Hypercard clone that sits on top of most of them, there’s still masses to be done. There are rich, brilliant services to be delivered that haven’t even been invented yet because of the perceived difficulty of writing “native” software.  But look at companies who rolled their sleeves up and got on with it:

  • Dropbox is making a killing on their file sharing service, which is delivered through a native client on stodgy desktop PCs and Macs.  That can’t work without little hooks all over a user’s PC, but they need those hooks to make the magic folder work. There’s no other way.  Off the back of this peerless integration, they effectively become a hosting company, photo sharing service and a data synchronisation API for programmers.
  • Skype’s VoIP software and network have not been copied by anything on the web, at least not as well. Again, look at the necessary integration – webcams, subtle notifications, nice snappy interface – none of these things can be done cleanly in a web browser.
  • Mozy (and others) install backup software on your computer and slurp off every file so they can give it back to you when you lose your PC. Web browsers don’t let their downloaded programs near your local files, let alone allowing them to create shadow volumes or filesystem wizardry.

Windows & Mac specific knowledge is scarce compared to knowledge of PHP, HTML and Javascript.  But if you have an idea that can only function with some desktop integration, native software is still the only way. There are cross-platform toolkits, languages, and build tools. Sure you have to test a lot more. And learn a lot more. But that’s the price of your software being able to do a lot more.

While development for mobile phones is big business if you’re selling anything “social”, the laptop & desktop computer is still where any truly creative and complex art is produced. Those creatives with massive screens, massive drives and fast connections want the best software; while they’re relying on the web browser to deliver to them, they’re often being sold short. There are still big opportunities outside of the web browser.

Why Bytemark didn’t use Xen

We’re circling in on a test release of our new hosting system, called BigV.  Bytemark have never thrown more hardware or programmers at a project for so long, so if you’re in our fan club you might like the results.  I’m a little shy of making promises just yet, but I’ll write about a few of the problems we’re solving, and give you a bit of history, and why I think Xen is fading away.

Bytemark got started in the hosting business with virtualisation, and knowing how emulators work.  My personal background was contributing to the PC emulator product on Acorn’s RISC OS (back in 1997), as well as a couple of Java virtual machines, so I know how to build an emulator, and what’s magic about them.

VMware was king of the hill for years – they pulled off a genuinely amazing technical feat, and had a monopoly on practical PC-on-PC virtualisation from about 1998 to 2002, building a huge business from it.  We couldn’t afford their fees when we started in 2002, so went with User-Mode Linux.  That was a very practical (and free) way of running Linux on Linux, and along with a few other companies, we sold lots of virtual machines to geeks.

In 2004, Xen looked a-mazing, providing virtualisation on Linux by some clever source modifications, and a new hypervisor.

If you jumped through the necessary hoops, did a lot of work rebuilding your kernel and selecting your hardware carefully, it absolutely flew.  We even built a trial version of our virtual machine platform in 2005 that used Xen, and tried it with real customers for a few months.  It almost worked, but we were never confident to roll it out.  There were a few reasons, and bear in mind I’m talking ancient history here:

  1. We’ve always been picky about our hardware, and habitually had to build our own Linux kernels to take advantage of newer chips.  Xen’s slow integration into Linux put the brakes on this, and made us nervous that we would have to stick with particular hardware.  For years, it was really hard to make Xen boot reliably on a variety of hardware, and our virtual machine hosts changed fairly regularly.
  2. Xen was never happy providing its unique virtualisation magic separate from its other aspirations.  There was (still is?) a compulsory management layer, a program called xend which used to crash and leave you with no idea what the hypervisor was doing.  It had undocumented layers of code, and the only safe way to interact with it was through the xm command line.  When xend died, we were lucky if it restarted; normally we had to reboot the whole system, and all the virtual machines.  So it seemed impossible to make our control software reliable, and I needed some help.
  3. I collared Ian Pratt, one of Xen’s creators, after a conference talk in 2005. I asked whether XenSource were interested in helping a real hosting provider solve some problems with Xen and our few hundred customers.  Knowing the answer I think, he asked “sure, how many data centres and how much money do you have?”  Pffft, thanks.  I’d chipped in a little but assumed they had all the help they needed.
  4. I spoke to a prominent kernel hacker at (I think) LUGRadio Live 2007.  I asked about virtualisation, and Redhat’s recently-announced integration of Xen into their operating system.  To paraphrase, he did not think the results pretty, and confirmed my suspicion that Redhat’s libvirt project was simply paving a migration path for something better to come along.  It was built to avoid a lock-in to a technology the developers simply didn’t like.  A year later, they bought Qumranet and the KVM developers.  That deal was probably already being prepared the year before.

It was after this last conversation that I felt OK about leaving our systems on User-Mode Linux for a bit longer; even though it was slower than Xen, I didn’t want to compromise our management tools.  We kicked off our migration to KVM in 2009 and apart from a scheduled reboot each, our customers only noticed a better speed compared to User-Mode Linux.  We got a few more VMs onto each host as well, and didn’t have to make huge changes to our tools either.

I’m not an economic crystal ball gazer, but I do know code and coders, and when something new like this comes along, a better-integrated solution will win over the fast one every time.

In 2011, Xen imposes too much overhead on developers and sysadmins compared to the alternatives.  It had two unique selling points, and they’ve not been unique for a while:

Firstly, Xen brought speed through paravirtualization, but integration into Linux came too late.  While it was still a unique feature, it was a pain to use.  Between virtio in 2008 and Intel/AMD’s on-chip virtualisation features, it’s now completely unnecessary, even if the integration work is done.

Secondly, live migration was brought by KVM, also in 2007/8.

Xen’s “missing” feature, unmodified operating system support (i.e. installing Windows etc. off a CD), was based on the exact same code that KVM used, the completely brilliant qemu project.  So with the state of those three features, there is just no technical advantage to the Xen codebase. That’s why I think in the next few years, most companies with Xen deployments will be replacing them with KVM – because KVM can do everything that Xen does, but makes it easier to supervise and write tools around.
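Whether a Linux host has the Intel/AMD on-chip virtualisation features mentioned above can usually be read from the CPU flags in /proc/cpuinfo, where vmx marks Intel's extensions and svm marks AMD's. Here is a minimal sketch, assuming that file format; the helper name hardware_virt_flags is made up for illustration.

```ruby
# Return the hardware virtualisation flags found in /proc/cpuinfo-style text.
# "vmx" is the Intel flag, "svm" is the AMD one.
def hardware_virt_flags(cpuinfo_text)
  flags_line = cpuinfo_text.lines.find { |line| line.start_with?('flags') }
  return [] unless flags_line

  # Everything after the first colon is a space-separated list of flags.
  flags = flags_line.split(':', 2).last.split
  flags & %w[vmx svm]
end

# Only meaningful on Linux; elsewhere the file simply isn't there.
if File.readable?('/proc/cpuinfo')
  found = hardware_virt_flags(File.read('/proc/cpuinfo'))
  puts found.empty? ? 'no hardware virtualisation flags found' : "found: #{found.join(', ')}"
end
```

An empty result may also mean the feature is switched off in the BIOS, or that you are already inside a virtual machine.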

There are even more options than that: VirtualBox works nicely for desktops and small setups, Hyper-V works better for Windows, and of course VMWare is a lot cheaper than it was; they clearly have the lead in management tools if you don’t mind paying for them.  At the hackier end of the spectrum, lguest appears to replace most of what User-Mode Linux did – I’d be very surprised if some hosting companies weren’t using that already.

So that’s where we are with Xen – it was a great idea, it didn’t work for us, but obviously works very nicely for Amazon and lots of smaller hosts.  I’m glad Bytemark gave it a miss; we intend to provide the absolute best of the KVM emulator to hosting customers when BigV launches later this year.

The bumpy ride to IPv6

The internet is nearly full.  The space that has been used up isn’t physical – there are still data centres to be built and new web sites to put up, and increasing demand for both – it’s the address space.  Vint Cerf and the other researchers who built the internet’s foundations simply had to pick a maximum possible number of computers that could possibly want to communicate on this mesh of 1970s computer networks that were to be joined into one – and that number was 4.2 billion – coincidentally, about the world population in 1977.  That’s one globally interconnected computer for every person on the planet, at a time when a computer was a hobbyist’s luxury.  If the internet’s pioneers had imagined 1.7 billion possible users, maybe they would have erred on the side of caution.  Between mobile phones, web sites themselves, connections into every sufficiently advanced home, we have less than 400 million left, and there’s little doubt that we will hit a wall some time in 2012 (as predicted accurately and consistently for the last few years).

There is a thoroughly designed long-term solution, and it’s ready to go.  IPv6 allows for a million trillion addresses for every square millimetre on the planet.  That ought to be enough.  Probably.  The reason we’re not using it is simply that every device “on” the internet must understand it, and not everything does yet.  Telecoms companies don’t supply upgraded routers to their subscribers because there are no web sites that are only addressable through IPv6.  And no web site owners are going to set up their sites exclusively on an IPv6 address because they would not see any visitors.
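The numbers above can be checked with a few lines of arithmetic. This is only a back-of-envelope sketch: the address widths (32 and 128 bits) are fixed by the protocols, but the figure of roughly 510 million square kilometres for the Earth's surface is my assumption.

```ruby
# Address space sizes follow from the address widths: 32 bits and 128 bits.
ipv4_addresses = 2**32
ipv6_addresses = 2**128

# Assumption: Earth's total surface area is about 510 million square km.
earth_surface_km2 = 510_000_000
mm2_per_km2 = 10**12 # (10**6 mm per km) squared
earth_surface_mm2 = earth_surface_km2 * mm2_per_km2

ipv6_per_mm2 = ipv6_addresses / earth_surface_mm2

puts "IPv4 addresses:        #{ipv4_addresses}"
puts "IPv6 addresses per mm2: #{ipv6_per_mm2}"
```

Under that assumption the result comes out at roughly two-thirds of a million trillion (10**18) per square millimetre, so the claim holds as an order of magnitude.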

There is a way to avoid everybody on the internet having to upgrade their equipment overnight (without limiting growth), but it’s not a great one.

NAT – the engineers’ terror

Network Address Translation is a compromise – if the post office had the same problem, it would be delivering your whole street’s post to one box at the end, and leaving people to come and pick it up from there.  Or in the bad old days before mobiles, one household of five people would share a single phone.

In the same way, network address translation means several devices share a single IPv4 address.  In a domestic setting, it’s a well-trodden compromise – a £20 router in the home can happily allow a handful of devices to reliably communicate, but at the cost of none of them being addressable.  So you can’t run a gaming server, or accept voice-over-IP calls, without separate workarounds for each program you want to run.
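To make the compromise concrete, here is a toy model of the translation table such a router keeps. It is a sketch only: the class name ToyNat, the addresses and the starting port are invented, and a real NAT also tracks protocols, remote endpoints and timeouts.

```ruby
# Toy NAT: many private (address, port) pairs share one public address.
class ToyNat
  def initialize(public_ip, first_port = 40_000)
    @public_ip = public_ip
    @next_port = first_port
    @outbound = {} # [private_ip, private_port] => public_port
    @inbound = {}  # public_port => [private_ip, private_port]
  end

  # A device inside sends a packet out: allocate (or reuse) a public port.
  def translate_outbound(private_ip, private_port)
    key = [private_ip, private_port]
    unless @outbound.key?(key)
      @outbound[key] = @next_port
      @inbound[@next_port] = key
      @next_port += 1
    end
    [@public_ip, @outbound[key]]
  end

  # A packet arrives from outside: deliverable only if a mapping exists.
  def translate_inbound(public_port)
    @inbound[public_port]
  end
end

nat = ToyNat.new('203.0.113.7')
p nat.translate_outbound('192.168.1.10', 5060) # laptop
p nat.translate_outbound('192.168.1.11', 5060) # games console
p nat.translate_inbound(40_001)                # reply to the console
p nat.translate_inbound(5060)                  # unsolicited call: no mapping
```

The last lookup returns nothing, which is the whole problem: nobody outside can start a conversation with a device inside unless a mapping has been set up first.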

As the address crunch takes hold, broadband customers can expect to have their real addresses taken away, or priced at a premium.  Instead your provider will do the NAT for you – now instead of one address per customer, it will be one address per hundreds of customers, mediated by a new and overstretched piece of equipment.  Gradually the addresses move away from the actual devices that wanted them, and into the centre of the network, stopping any device on the edge from communicating fully.

NAT is an engineer’s horror because it adds complexity to any kind of diagnostic process.  Anyone trying to trace a fault from one point in the network to another must be able to interrogate NAT devices along the way, or else he may not be able to find it.  Telling one connection from one hundred becomes more difficult.  30 years of network tools built on a model of direct “end to end” communication will need rapid improvement to help engineers make sense of this patchwork internet where “source” and “destination” simply don’t mean what they used to.  There’s no doubt that as long as we have widespread NAT, the internet will be at worst, less reliable, and at best, much more expensive to maintain.

But since it’s the cheapest immediate solution, NAT is inevitable.  The global IPv4 network is a unique and interesting upgrade – unlike Microsoft or Google rolling out a new software release, nobody will be forced to upgrade.  What must force the technical change is economics.

When it’ll be cheaper to give out IPv6

Firstly, equipment refresh happens.  Not suddenly, but over the years routers need replacing, both at home and in the heart of the network.  Newer equipment does support IPv6, usually whether you want it or not.  Network equipment companies don’t make money from being behind the times, and want to assure their customers that they have the latest technology.  So IPv6 starts as another unused feature tickbox on a piece of equipment that you bought anyway – the equipment is ready when everyone else is, and is creeping into the field (for instance, your home PC or laptop is probably the first completely IPv6-enabled device you own).

By 2012 or 2013, unused IPv4 addresses will be impossible or expensive to come by, and this will affect broadband suppliers (on the lowest margins, and traditionally requesting one IPv4 address per customer) first.  NAT might give them some advantage in preserving addresses, but will start to break in ways that users will start to see, and broadband suppliers will only be able to offer excuses, or an expensive upgrade to a “real address”.

Some network engineer at some broadband company is going to flip.  Instead of suggesting that they give out unrouteable IPv4 addresses to the next batch of subscribers, they are going to suggest giving out unrouteable IPv6 addresses to go with their new IPv6-supporting routers.  The difference is that IPv6 addresses might, one day, become reliably routable, and in the mean time, both can be disguised with NAT.

Behind their routers, customers can use a mixture of (translated) IPv4 and (global) IPv6 addresses, both of which are translated to IPv6 before being sent over the broadband line, and the ISP might translate back to IPv4 where the user is accessing IPv4 sites.  There is a point at which that will seem less ridiculous than multi-layered NAT!

[Figure: IPv6 NAT diagram]

One day the plan above won’t seem so silly (though someone might draw it better).

That gives both customers and providers the best of both worlds – real connectivity, both ways, for modern equipment that can support IPv6, and a workable (but steadily inferior) connection for older IPv4 devices.
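One way the IPv4-inside-IPv6 step could work is to embed the 32-bit IPv4 address in the low bits of an IPv6 address under an agreed prefix. The sketch below assumes the 64:ff9b::/96 prefix described in RFC 6052; that choice is mine for illustration, not something the plan above depends on, and the helper names are made up.

```ruby
require 'ipaddr'

# Assumption: the 64:ff9b::/96 prefix from RFC 6052.
NAT64_PREFIX = IPAddr.new('64:ff9b::')

# Embed an IPv4 address in the low 32 bits of an IPv6 address.
def ipv4_to_ipv6(ipv4_string)
  v4 = IPAddr.new(ipv4_string)
  IPAddr.new(NAT64_PREFIX.to_i | v4.to_i, Socket::AF_INET6)
end

# Recover the IPv4 address from the low 32 bits again.
def ipv6_to_ipv4(ipv6_addr)
  IPAddr.new(ipv6_addr.to_i & 0xffff_ffff, Socket::AF_INET)
end

v6 = ipv4_to_ipv6('192.0.2.33')
puts v6
puts ipv6_to_ipv4(v6)
```

Because the mapping is pure arithmetic, the customer's router and the ISP's translator need no shared state to agree on it.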

The transition

Once that has happened with enough ISPs, the economic burden will start to switch to those that have held out with their underdeveloped IPv4-only networks.  They will be unable to take on new customers without degrading their service – so an increasing number of people will be accessing the internet from IPv6 addresses.

In the mean time, hosting companies have been upgrading their networks to support IPv6, and forming peering relationships with IPv6-supporting telcos.  This happens because:

  1. hosting is a higher margin business, so network engineers can more easily justify far-sighted work;
  2. hosting networks are a good deal simpler and newer than access (broadband) networks, so there is less to configure, and less ancient equipment to worry about;
  3. it will eventually become very expensive to start a new hosting network offering IPv4 addresses, and pointless to invest in doing that when enough broadband suppliers are using IPv6.

Most hosting providers (us included) will give you an IPv6 and IPv4 address already.  We might not advertise or document it very well, but the service is already there.  The next time we (collectively) refresh our service offerings, you should start to see IPv6 connectivity working “out of the box” rather than having to set it up yourself.

At some point, hosts and access networks who have both sites and customers on IPv6 see the light!  The telcos can leave their overcomplicated NAT to rot as more of their customers are accessing sites on IPv6 addresses, and demand for NAT starts to plateau.

With enough broadband providers running IPv6 to their customers, new hosting companies can flourish running IPv6-only services, and bold new broadband providers will simply “buy in” their legacy IPv4 connectivity.

Bumps

I’ve no idea how long that process will take – the only certainty is when IPv4 will run out (can’t link to that enough!) and kick off a set of consequences that push IPv6 traffic into the majority.

I’ve only sketched the events that I think will matter in getting there.  In the transition I’d expect to see all or some of the following happening:

 

  • Someone declaring a “next-next generation IP” that limps along with old routers, and tries to paint IPv6 as overcomplicated and unnecessary;
  • IPv4 address trading, probably at eye-watering prices that temporarily stifle new entrants to the market of IP services;
  • IPv4 address routing wars and instability, and a challenge to the legitimacy of the regional internet registries (“hey, whose addresses are these anyway?”);
  • A boom for new special-purpose NAT devices, all of which advertise just how deep you can stack them and/or clever IPv6 translations (this is probably already happening).

 

To hedge my bets, I could see the transition derailed if the market for internet services fails to expand without limit – e.g. oil shortage, economic instability, natural disasters – if the internet doesn’t need to grow for some reason (just ask Chris Martenson), we don’t need IPv6.

Or maybe Facebook, or Google, or Apple, or some new upstart, changes people’s internet usage such that they don’t want a diversity of sites or services addressed through an independent body (hosting services from Facebook, anyone?), and we grow to love one gatekeeper – if that happens – we don’t need IPv6.

Otherwise, we’ll start to need it next year, and will be desperate by about 2013 – economics must kick us into gear.

Ruby gems, and when we’ll be shot of them

There comes a time in every sufficiently ambitious Ruby programmer’s life that one will butt up against gems, Ruby’s own packaging system.  Outside of the cosy confines of one’s laptop, on servers and embedded systems, the system contains design mistakes that make them unmanageable.  Here’s how we cope at Bytemark.  But how did it get here?

The world of programmers

So computer languages have always had some idea of a library, a set of functions, classes and other extensions that make the language more useful.  At the earliest and most basic, #include <stdio.h> tells a C compiler that you wish to load a library of functions for input and output called stdio – printing lines of text, opening files and so on.

In Ruby, there are two sets of libraries which have shipped with the language.  These are the core and standard libraries, and Ruby programmers know that they’re always available.  If you want to program with network sockets, you say require ‘socket’ and you know that you can use TCPServer and friends to start writing high-level network programs.  In common with other scripting languages, if a programmer wanted to use another library on his system, he would install its files into /usr/local/lib/ruby, and could then ‘require’ those files in any program.
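As a quick illustration of what the standard library gives you for a single require, here is a tiny echo exchange using TCPServer and TCPSocket. It is a sketch rather than a pattern to copy: the loopback address, the "echo:" prefix and the single-connection thread are arbitrary choices.

```ruby
require 'socket'

# Port 0 asks the operating system for any free port.
server = TCPServer.new('127.0.0.1', 0)
port = server.addr[1]

# Serve exactly one connection in the background: read a line, echo it back.
echo_thread = Thread.new do
  client = server.accept
  line = client.gets
  client.puts("echo: #{line}")
  client.close
end

socket = TCPSocket.new('127.0.0.1', port)
socket.puts('hello')
reply = socket.gets.chomp
socket.close

echo_thread.join
server.close
puts reply
```

Nothing here needed installing; the one require line is the entire "packaging" story, which is the simplicity the rest of this post is mourning.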

That’s the same simple method that Perl, Python and many other scripting languages have used, and it’s easy to describe and understand.  It’s also easy for Operating System vendors and Linux distributions to package.

The world of system administrators

Since the early 90s, the Debian project defined the state of the art in large-scale, long-term system administration.  Before Debian (and yum, and the other systems it inspired), admins either routinely did their own software builds, or ran crude packaging systems which were no more complex than a tar file.  Debian rejected the idea that system administrators needed to constantly reinstall to "blow away the cobwebs", and hammered out packages and a distribution structure that has allowed admins to avoid reinstalling some individual systems for 10+ years.

Debian have packages for everything from word processors to hardware drivers.  An administrator can define the state of a whole computer system just through listing what packages are installed, and a handful of configuration files – no clicking through installers, or manual stages.  Debian’s thousands of volunteers have built and tested stable packages based on the cream of free software.  Users of a Debian-based OS can have their pick of the best of it, and know it will work together.  So in 2009 you can say "I want a system with openoffice installed", and Debian’s package management system will perform a complex set of resolutions to find out what libraries openoffice depends on.  It will install those in the correct order, right down to the operating system kernel and graphics drivers, so that after downloading, unpacking and running setup scripts, the user goes from having an empty system to a working word processor.
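The "correct order" part is essentially a topological sort of the dependency graph. Here is a toy sketch using Ruby's standard tsort library; the package names and dependencies are invented and heavily simplified, and Debian's real resolver also deals with versions, conflicts and alternatives.

```ruby
require 'tsort'

# A dependency graph: package name => names of the packages it depends on.
class PackageGraph
  include TSort

  def initialize(dependencies)
    @dependencies = dependencies
  end

  def tsort_each_node(&block)
    @dependencies.each_key(&block)
  end

  def tsort_each_child(package, &block)
    @dependencies.fetch(package, []).each(&block)
  end

  # Dependencies always come before the packages that need them.
  def install_order
    tsort
  end
end

# Hypothetical package data, for illustration only.
dependencies = {
  'openoffice' => ['libgraphics', 'libc'],
  'libgraphics' => ['libc', 'graphics-driver'],
  'graphics-driver' => ['kernel'],
  'libc' => ['kernel'],
  'kernel' => []
}

graph = PackageGraph.new(dependencies)
puts graph.install_order.join(' -> ')
```

A circular dependency makes tsort raise TSort::Cyclic, which is one of many cases a real package manager has to handle more gracefully.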

This same process works the same on a hundred Debian servers, even when those servers have very different hardware underneath.  This is still an awesome achievement, and Debian are still leading the way in defining how software installation on computers should work.

Crucially, Debian package programming libraries for lots of languages, and authors writing for Debian systems can rely on scores of them being available in a predictable way.  But right now, in Debian and most other Linux distributions, there are some major gaps in coverage for widely-used Ruby libraries, which means dependent applications can’t be packaged.  How did this situation come about?

Blame the web app startups!

Well, at some point in the last 10 years, the rise of Google caused all programmers everywhere to lose their minds.  They decided that users shouldn’t install their own programs on their own computers, but instead should trust installation and data storage to the programmer exclusively.  Sophisticated, fast, reliable programs started being replaced by simple, slow flakey ones as a result.

Just a joke!  We all love web applications really.

No – more reasonably speaking, a handful of smart programmers, all at once, had found a way out of the dreary mire of web applications in 2004-5, and a frontier was formed with Rails and various other programming libraries leading the charge.

On this frontier, distributing and installing applications doesn’t matter.  Paul Graham was probably the first to say that when you’re selling a web-based application, you have an advantage that you can use any language you want because you don’t have to distribute it.  As long as the programmer can install his  program on his own little set of servers, and handle his users’ data storage, nobody else ever had to see his code, what libraries he used, or how it was put together.  But also it means he doesn’t have to conform to traditional system administration practices either, and I think this is the root of the problem.

Within the Ruby community, packaging for widespread distribution has been an afterthought to pushing forward the frontier of what the language can do.  Distributable Ruby applications are thin on the ground (because right now, most people writing web apps want to sell access to them, rather than distribute them for free), but there are lots of great libraries out there.

Problem solved?

But instead of settling on a simple method of managing local installations like Perl folk did with CPAN, Rubyists settled on Rubygems.

In theory, a system administrator types gem install hoopystuff to install a gem called ‘hoopystuff’.  The gem program goes away, finds the hoopystuff gem, and installs it on his system.  If there’s anything that needs compiling, gem compiles it and puts it in the right place.  Then any program that wants hoopystuff can start to use it.  There is a master list of gems maintained, a network of mirrors and a signing mechanism, a lot like other packaging systems.  There is some wheel re-inventing going on, but it means Ruby programmers don’t have to worry about supporting every possible system, which seems like a win.  It even allows programmers to install libraries on a shared system, without needing full privileges over it, which is a useful feature.

But the biggest mistake made in Gems was to add to the language.  In Java, or C, or Python, or any other language, to include a library, you do the same thing, regardless of who installed the library, or where.  But in Ruby, a gem command was added to the language.  And you need the rubygems library included first in order to use that command.  So if a programmer wants to use the hoopystuff library he’s installed as a gem, the obvious doesn’t work any more:

require 'hoopystuff'

Instead he has to do:

require 'rubygems'
gem 'hoopystuff'
require 'hoopystuff'

But if he has installed hoopystuff through his system distribution, rather than Rubygems, this will fail!  So a thorough way of including the library has to be a full six lines of code:

begin
  require 'rubygems'
  gem 'hoopystuff'
rescue LoadError => no_gems_error
  # no Rubygems library installed, or no 'hoopystuff' gem
end
# either way we need to do this, if this fails the library definitely isn't here
require 'hoopystuff'

This is now the only portable way of asking for a library in Ruby – hardly the principle of least surprise.  And through the fog, many library authors assume that the ‘gem’ command is available where it’s not.  So packaging almost any contemporary Ruby application involves altering its code.  The Debian packagers are asking Ruby authors very nicely to bear this in mind, but to little effect.

To add to the confusion, the gem command also allows multiple versions of the same package on a single system.  This means that instead of simply asking for gem ‘hoopystuff’, the programmer can ask for ‘hoopystuff’ version 1.23.  Unfortunately, if another part of the same program asks for ‘hoopystuff’ version 1.5, Ruby will die at that point, saying that it can’t load two versions of the same gem in the same program.  I don’t think I’m going out on much of a limb when I say nobody needs this feature and if you think you do, you’re not clever enough to use it properly.  I have "fixed" plenty of conflicting gem invocations in live apps where two pieces of code demand different versions of a gem, when the same one works fine for both.
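The conflict can be reproduced without installing anything, using the version classes that ship with Rubygems. This is a sketch: the version numbers are the made-up ones from the paragraph above, and relaxing the second requirement to ">= 1.5" is my example of the kind of fix I mean, not the only possible one.

```ruby
require 'rubygems'

# The one version of hoopystuff that is actually going to be loaded.
installed = Gem::Version.new('1.23')

strict_a = Gem::Requirement.new('= 1.23')   # what one part of the app demands
strict_b = Gem::Requirement.new('= 1.5')    # what another part demands
relaxed_b = Gem::Requirement.new('>= 1.5')  # what it probably meant

puts strict_a.satisfied_by?(installed)
puts strict_b.satisfied_by?(installed)
puts relaxed_b.satisfied_by?(installed)
```

Rubygems compares version segments as integers, so 1.23 counts as newer than 1.5 and the relaxed requirement is met by the same single version.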

Why can’t we convert?

It’s this last requirement to allow multiple versions that makes gems fundamentally incompatible with every other package management system – Debian, Redhat, SuSE … all of them allow one version to be installed on a system.  So it’s impossible to do a clean mapping of gems onto .debs or .rpms or any other mature packaging structure, because once you add enough applications into the mix, they can all be demanding different versions of the same Gem, and these demands are expected to be met.

At the start of 2009, the talented guys at Phusion made an attempt called DebGem where they took every version of every gem they could find, and made a Debian package out of it, baking the Gem version number into the name of the package.  It looked weird, and appeared to work.  But the project has been silent since April, and the Linux distributions they supported are fading into irrelevance.  My guess is they couldn’t stomach the amount of manual work needed to tweak every Ruby programmer’s misunderstandings about Gems.  (but Phusion dudes, if it was just the expensive hosting, Bytemark will still donate as many mirrors & build hosts as you need).

In contrast, the Perl folk have a simple site that routinely converts Perl packages into compatible debs, and allows them to be installed, and integrated into official distributions easily.  All because the packaging system is simpler.

So what are the options left for a Ruby programmer who wants to ship portable software to a wide variety of users?

Option 1: A traditional build system

What I’m doing with a couple of our projects is to forget that Gems ever existed.  Every shared library is frozen and checked into the project under an external directory where they stay unless I need newer versions.  Then I have a Rakefile which runs two jobs more usually seen in compiled programs – a build, and an install.

build compiles any native extensions that the program needs, and install copies the whole program, and all its built dependencies into /usr/lib/myprogram.  Finally it adds the binaries that I want to run to /usr/bin/myprogram but these are just stubs which set up the load paths to my "pure" Ruby environment.
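The install job and the stubs could look something like the sketch below, written as plain Ruby functions so it runs on its own; in a Rakefile the same bodies would sit inside task blocks. The directory layout (lib and external under the project root) and the function names are assumptions for illustration, not our actual build code.

```ruby
require 'fileutils'

# Copy the program and its frozen libraries into dest_root
# (for example /usr/lib/myprogram).
def install_program(source_root, dest_root)
  FileUtils.mkdir_p(dest_root)
  %w[lib external].each do |dir|
    src = File.join(source_root, dir)
    FileUtils.cp_r(src, dest_root) if File.directory?(src)
  end
end

# Text of a stub executable that sets up the load paths, then starts
# the real program by requiring its main file.
def stub_source(dest_root, main_file)
  lib_dir = File.join(dest_root, 'lib')
  external_glob = File.join(dest_root, 'external', '*', 'lib')
  <<~RUBY
    #!/usr/bin/env ruby
    $LOAD_PATH.unshift(#{lib_dir.inspect})
    Dir[#{external_glob.inspect}].each { |dir| $LOAD_PATH.unshift(dir) }
    require #{main_file.inspect}
  RUBY
end

puts stub_source('/usr/lib/myprogram', 'myprogram')
```

The stub is the only file that knows where things were installed, so the program itself stays free of path assumptions.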

In addition, because I don’t want to have to fix the source code of all the gems I’m using (usually around 10-15), these loader stubs actually load a fake Rubygems library.  The library just ensures that the gem command does nothing, and stops me having to worry about changes to the libraries’ code.
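A fake Rubygems library of this kind can be very small. The sketch below shows the idea rather than our actual file: it replaces the gem command with one that accepts anything and does nothing, and the frozen_load_paths helper and the external/<name>/lib layout are assumptions for illustration.

```ruby
# Directories of frozen libraries to put on the load path.
# Assumed layout: <app root>/external/<library name>/lib
def frozen_load_paths(app_root)
  Dir.glob(File.join(app_root, 'external', '*', 'lib')).sort
end

module Kernel
  # Stand-in for the Rubygems gem command: accept any name and any
  # version requirements, activate nothing, never raise.
  def gem(*_args)
    false
  end
  private :gem
end

# Library code written for Rubygems keeps working unchanged:
gem 'hoopystuff', '>= 1.5'
puts 'gem call ignored; libraries come from the load path instead'
```

On Ruby 1.8 a file like this would be saved as rubygems.rb in a directory the stub puts first on the load path, so that require 'rubygems' picks it up instead of the real thing.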

The down sides are pretty well understood – my 2000 lines of code, which would have been a tiny, architecture-independent package, have to be built as one large package, once for each architecture (we need two at Bytemark).  If I wanted to distribute it any further I would probably want a wider variety of packages, for a few more distributions.  But I have all the usual down sides of managing my own packages – checking for security bugs and doing my own code updates, much larger & slower-to-install packages, and so on.  But the major up side: I can use Debian and apt-get to install and maintain it reliably on hundreds of servers.

Option 2: Repackage each library

For the long term, Patrick is working out how to gently modify the source of around 50 popular Ruby libraries so that they form normal Debian packages without needing Rubygems installed.  That will help me kick all this duplicated library code out of individual projects and back into packages where they belong.

This is semi-automated but still has many manual elements that Pat is working through.  The Debgems folk had given up on doing this in the general case, so we’re focussing on just the set of Gems we need to run all our code, and trying to integrate our work with Debian’s.  I notice an Ubuntu team has also got a bunch of reasonably new packages into Ubuntu’s universe package list – I hope this will make it back into Debian, or that we can use them for Bytemark’s systems.

I can even see scope for going all the way back to the start, and making a small fork of the core Ruby language & interpreter to fix the packaging problems – that is an extreme possibility, but with at least two new Ruby implementations becoming more relevant lately, it might not be the MRI (Matz’s Ruby Implementation, the original one) that stays relevant in the long run, and leadership on packaging could easily change.

But right now this approach is going against the grain of what almost all Ruby authors are doing; surgery is needed to library authors’ code, which is the cause of all this rot.  But unfortunately that’s the price that we have to pay to go back to a simpler, working library system.

Option 3: Wait… about five years?

I think that the worst mistake of Rubygems is being undone – from the next major Ruby version, the gem command is no longer necessary in the common case.  So if you require  ‘hoopystuff’ in Ruby 1.9, the require statement will implicitly look for a gem called ‘hoopystuff’.  This might seem like a trivial timesaver for library and application authors, after all, it was only six lines I was complaining about.  But but but… it means that the sanctioned way of including libraries is back to how it used to be, just one require statement, one namespace.

That means a lot less work in repackaging gems, but only when library authors have got the message and Ruby 1.8 installations cease to be relevant – so that’s where my five year estimate comes from.

Start now, avoid the gem traps

If you’re a Ruby author who cares about distributing your software to more than just other programmers’ laptops, you only need to take some simple actions with your existing Gems to make them compatible, following the tips that the Debian folk wrote years ago.  I’d add the following to these tips though:

 

  1. don’t use the gem command in your main code at all, use a loader program that pulls it in if you need it.  In almost all cases it is going away in 1.9, and good riddance;
  2. if you provide a Ruby library called foobar, make sure your gem is also called foobar, and preferably only provides a single module called Foobar;
  3. don’t use capital letters in your gem name – amazingly there are already some gems in the namespace that differ only by case!
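For the first tip, the loader can be the only file that ever mentions Rubygems. Here is a minimal sketch, assuming a made-up helper called activate_gems_if_available; your main code then uses plain require statements and nothing else.

```ruby
# Loader-side helper: try to activate each named gem, but carry on
# quietly if Rubygems or the gem is missing. Returns the names that
# were activated.
def activate_gems_if_available(gem_names)
  require 'rubygems'
  gem_names.select do |name|
    begin
      gem name
      true
    rescue LoadError
      # Not installed as a gem; a system package may still provide it.
      false
    end
  end
rescue LoadError
  # No Rubygems on this system at all.
  []
end

activated = activate_gems_if_available(['hoopystuff'])
puts "activated as gems: #{activated.inspect}"

# From here on, the application and its libraries just say:
#   require 'hoopystuff'
```

If the library was installed by the system's package manager rather than as a gem, nothing is activated and the plain require finds it on the normal load path.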

And finally, try building a native package for your favourite system!  Debian at least is quite easy.  It will take you a few more hours, but your library will simply be easier to manage for system administrators when you’re done.  Let’s not wait years for these design mistakes to atrophy away – fix your libraries and help make Ruby a first-class component of every OS.

Broadband ban: wrong, impossible, boneheaded

This is expanded from a post to a mailing list in response to the recently announced government U-turn on broadband policy.  After having dinner with a movie studio boss (or maybe also listening to Andrew Lloyd-Webber standing up for his cash-strapped friends) Peter Mandelson is now stating that  "repeat offenders" who share films and music through their broadband internet connections will have their connections cut off.

I don’t understand why the music & film industry gets to suggest an extrajudicial process to fight their battles, and why any politician would take them seriously (unless they’ve just had a smashing lunch with a lobbyist).  Kangaroo justice is easy, here are some more ideas!

  • People who run out of restaurants without paying their bill should have to pay double at supermarkets for a month?
  • Accused shoplifters should be excluded from parking in the city centre on a weekend?
  • Marijuana growers have to run their house on a single 13A fuse?

You can make up all kinds of "short sharp shock" punishments that sound fair to a medieval baron, but they amount to the same thing: meddling with people’s private contractual arrangements without due legal process.

The broadband cut-off proposal is not only unfair, it’s technically impossible.  PCs are not yet closed systems which can tell copyrighted files from free ones, despite a spirited but failed effort from Microsoft.  So if your computer can’t tell which files are copyrighted, mine can’t either, nor can those at a media company.  To find out whether a file is copyrighted, you need to download it, then listen to or watch it.  But instead the evidence that is routinely presented to us is in the form of an infringement notice from a subcontractor of a movie/music firm, usually demanding that we immediately cease service to a particular customer, but threatening no specific action.  We used to pass them on as a bit of a curio, now they come so frequently that we bin them.

Their implication is that of course you understand this is copyright material, of course we are trustworthy to impart this to you, and therefore of course you will accede to our demands.  Well, no, no and therefore – no.  Keeping up with movie releases is not my strong point, so a file called "District 9" could be someone’s thesis on town planning, or a user mod for a video game, or just about anything other than 90 minutes of copyrighted hokey sci-fi.  No I won’t download it to check, it’s neither instant nor obvious.  And the movie and music industries have been suing both children and dead people’s estates in years of litigation, so should I take these notices on trust?  Doesn’t seem likely.

If ISPs had all agreed that these infringement notices were terribly fair, that’s one thing.  But they are universally lashing out at Mandelson’s proposal.  The government can no more force ISPs to co-operate in abusing their own customers than force Lord Lloyd-Webber to stage Pee! The Musical!, the chorus prancing through the stalls urinating on the audience (and I have sat through By Jeeves, so I wouldn’t rule that concept out as an improvement, L-W).

I simply don’t believe there is a problem – true artists will always create, talent can always make a living, and their public will always find a way of being cheap.  If the middle men can’t hack being in the middle without resorting to legislating their right to a business model, they need to make a new one.  I’m not interested in helping them out.