Following our recent post about BigV, we’ve been working hard to improve I/O reliability, and have made some significant strides in that area – as well as picking off a few other issues. Here’s some of what we’ve been up to:
Splitting up storage pools
This is more of a systems administration task, but we’re in the middle of (invisibly) migrating people’s discs onto more, smaller storage pools. This limits the amount of damage any one disc can do, so it’s worthwhile all by itself.
Guaranteed I/O rates
Although we’ve not yet decided what rates are acceptable for each grade of storage, the development work necessary to enforce a maximum rate is now complete. While it’s still turned off at the moment, if a disc starts to cause trouble for its neighbours, we can now intervene to restore service for everyone much more effectively than before. Hooray!
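For the curious, one common way to enforce a maximum I/O rate is a token bucket. The sketch below is purely illustrative and is not BigV’s actual implementation: the class name, the parameters and the numbers are all assumptions made for the example.

```python
class TokenBucket:
    """Hypothetical limiter: allow `rate` requests per second, bursts up to `burst`."""

    def __init__(self, rate, burst, now=0.0):
        self.rate = rate      # tokens added per second (assumed unit: I/O requests)
        self.burst = burst    # maximum number of tokens the bucket can hold
        self.tokens = burst   # start full, so an idle disc can burst straight away
        self.last = now       # timestamp of the previous refill

    def allow(self, now, cost=1.0):
        """Return True if a request arriving at time `now` may proceed."""
        elapsed = now - self.last
        self.last = now
        # Refill in proportion to the time elapsed, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


bucket = TokenBucket(rate=10.0, burst=8.0)
first_burst = sum(bucket.allow(0.0) for _ in range(50))
after_half_second = sum(bucket.allow(0.5) for _ in range(50))
print(first_burst, after_half_second)
```

A noisy disc gets its initial burst and is then held to the refill rate, which is the property that protects its neighbours.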
Guaranteed I/O rates aren’t just about limiting maximum speeds, of course. While we were performing some “invisible” VM migrations at the end of last week, a few sharp-eyed customers noticed unusual stalls, leading to spikes in latency for some of their I/O requests, seemingly at random. We’ve fairly confidently tracked this down to an issue with the configuration of BigV’s storage network, which we’ve rectified.
The technical details are a bit abstruse, but BigV’s storage network is based on the next-generation Internet protocol, IPv6. Each disc has its own IPv6 address, which is an essential component of our unique live migration system. Each head has a neighbour table, which maps IPv6 addresses to Ethernet addresses – each of which corresponds to a particular tail. That table has a limited (but configurable) size, and re-discovering entries (using IPv6 neighbour discovery) is slow, and subject to rate-limiting.
On our larger heads, the table was too small, so an unlucky request would have to wait for neighbour discovery to finish before it could be serviced – and that neighbour discovery was pushing another disc’s entry out of the table, setting up problems for later. We’ve increased the size of the table, and have been unable to reproduce the observed spikes since. Phew. And hooray!
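To see why a table that is only slightly too small hurts so much, here is a toy simulation. It assumes least-recently-used eviction and made-up addresses, neither of which is taken from the real kernel or from BigV; the `NeighbourTable` class is hypothetical and only models the "discover, insert, evict" cycle described above.

```python
from collections import OrderedDict


class NeighbourTable:
    """Toy bounded table mapping IPv6 addresses to tails (LRU eviction assumed)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()
        self.discoveries = 0  # each one stands in for a slow neighbour discovery

    def lookup(self, ipv6_addr):
        if ipv6_addr in self.entries:
            self.entries.move_to_end(ipv6_addr)  # mark as recently used
            return self.entries[ipv6_addr]
        # Miss: pretend to run neighbour discovery, then cache the answer.
        self.discoveries += 1
        self.entries[ipv6_addr] = f"tail-for-{ipv6_addr}"
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used entry
        return self.entries[ipv6_addr]


def count_discoveries(capacity, discs, rounds):
    """Touch every disc in turn, `rounds` times, and count the slow lookups."""
    table = NeighbourTable(capacity)
    for _ in range(rounds):
        for disc in discs:
            table.lookup(disc)
    return table.discoveries


discs = [f"2001:db8::{i:x}" for i in range(1, 9)]  # 8 documentation-range addresses
print(count_discoveries(8, discs, 10), count_discoveries(6, discs, 10))
```

With room for all eight discs, only the first pass pays for discovery. With room for six, every lookup in this access pattern evicts an entry that is about to be needed again, so every lookup is slow.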
Fewer head crashes
“No head crashes” would be better, but we still can’t promise that. A kernel panic on Friday gave us the information we needed to track down the cause of crashes on two of our newest heads; we shouldn’t see this issue again on them.
It’s possible our weekend’s tail crash is being caused by a similar memory exhaustion issue, but we’ve not finished investigating that yet – it’s much less clear than the head situation, as we don’t assign a defined amount of RAM to each disc to use as cache, relying on the kernel to manage the available memory instead.
bigv bug works again
The most recent client version (0.8) mistakenly sent “bigv bug” requests to a non-existent address. This has been rectified by making the address do what the client expects – so there’s no need to upgrade the version you have.
“Internal error” messages on some bigv-client operations
Finally, some slow bigv-client operations had recently begun failing with “Internal error” messages. Although the requested operation would generally complete, this has been confusing for many of our customers, and for me as well. We’ve tracked it down to a sneaky 15-second request timeout in Pound, a component we recently introduced to take over the SSL part of BigV’s web interface.
Fifteen seconds is still a long time for any request to take, so we’re not happy that we’ve had to increase it; a rework of some BigV internals that should allow us to do away with multi-second waits is in the works. In the meantime, the client should once again be allowed to hang on until the operation is complete.
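The failure mode is easy to reproduce in miniature. The sketch below is not Pound, nor any BigV code: the function names are invented and the durations are scaled down to fractions of a second. It just shows a front end giving up on a slow back-end operation that nonetheless finishes.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError


def slow_operation(duration, completed):
    """Stand-in for a slow back-end request; records that it really finished."""
    time.sleep(duration)
    completed.append("done")
    return "ok"


def proxy_request(duration, proxy_timeout, completed):
    """Stand-in for a front end that waits at most `proxy_timeout` seconds."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(slow_operation, duration, completed)
        try:
            return future.result(timeout=proxy_timeout)
        except TimeoutError:
            # The caller is told the request failed, but the back-end work
            # carries on; leaving the `with` block waits for it to finish.
            return "Internal error"


completed = []
print(proxy_request(0.3, 0.05, completed), completed)  # timeout shorter than the work
print(proxy_request(0.05, 1.0, completed))             # timeout raised above the work
```

Raising the timeout makes the error go away, but the operation is just as slow as before, which is why the internals rework is the real fix.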
We’re still working
Although the above represents some significant movement in terms of BigV’s stability and performance, we’re aware we’ve still got a lot to do, and won’t be slacking off any time soon. Do get in touch via email@example.com if you’d like to discuss any of our ongoing or planned work in this area, or anything raised in this blog post. Or, indeed, anything!