How we sync the time on thousands of servers from our York data centre

If you’ve ever struggled with the accuracy of your cheap digital watch, then you’ll know that electronic devices can be quite rubbish at keeping the time. Computer hardware is also terrible at keeping the time. Without constant correction, all our servers would gain or lose about one second per day.

It’s essential that all our servers know the correct time and remain synchronised – we often have to cross-reference logged events between different computers. Modern cryptography also relies on systems agreeing on the time – it’s the basis of most “one-time passwords” (think Google Authenticator) and allows users to make encrypted connections to websites.
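To make the one-time password point concrete, here is a minimal sketch of a time-based code generator in the style of RFC 6238. The shared secret and the timestamps are illustrative values (they match the RFC's published test data), not anything Bytemark uses.

```python
import hashlib
import hmac
import struct


def totp(secret: bytes, unix_time: int, step: int = 30, digits: int = 6) -> str:
    """Time-based one-time password in the style of RFC 6238 (HMAC-SHA1)."""
    counter = unix_time // step  # which 30-second window we are in
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation, as described in RFC 4226
    value = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)


secret = b"12345678901234567890"  # illustrative shared secret only

server_code = totp(secret, 1111111100)     # the server's idea of "now"
client_in_sync = totp(secret, 1111111109)  # client 9 s ahead, same 30 s window
client_drifted = totp(secret, 1111111111)  # client has slipped into the next window

print(server_code == client_in_sync)  # codes agree while both are in the same window
print(server_code == client_drifted)  # codes disagree once the clocks drift apart
```

Both sides derive the code from the current 30-second window, so a client whose clock has wandered into a different window produces a code the server does not expect.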

That’s why Bytemark relies on time servers to establish the time and communicate it to other devices using the standard Network Time Protocol (NTP).

"Network Time Protocol servers and clients" by Benjamin D. Esham (bdesham), based upon Ntp.png by en:User:Kim Meyrick. This image includes Appointment-new.svg and X-directory-remote-server.svg, public-domain icons from the Tango project. Licensed under Public Domain via Commons - https://commons.wikimedia.org/wiki/File:Network_Time_Protocol_servers_and_clients.svg#/media/File:Network_Time_Protocol_servers_and_clients.svg

A diagram showing the relationships between the various levels of NTP servers. The blue numbers are the stratum numbers; yellow arrows show a direct connection, while red arrows show a network connection.

NTP is designed to allow the time to be accurately transmitted from time servers across networks to clients, accounting for their inherent latency. It uses a hierarchical system of time sources.
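As a rough illustration of how the latency is accounted for, NTP's standard calculation uses four timestamps from one request/response exchange. The sketch below uses made-up timestamps and a hypothetical function name; it assumes the network delay is about the same in each direction, which is the assumption the protocol itself makes.

```python
def ntp_offset_and_delay(t1: float, t2: float, t3: float, t4: float) -> tuple[float, float]:
    """Estimate clock offset and round-trip delay from one NTP exchange.

    t1: client clock when the request was sent
    t2: server clock when the request arrived
    t3: server clock when the reply was sent
    t4: client clock when the reply arrived
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2  # how far the client is behind the server
    delay = (t4 - t1) - (t3 - t2)         # time on the wire, excluding server processing
    return offset, delay


# Made-up exchange: the client clock is 0.250 s behind the server, each network
# leg takes 20 ms, and the server spends 1 ms handling the request.
offset, delay = ntp_offset_and_delay(t1=100.000, t2=100.270, t3=100.271, t4=100.041)
print(f"offset {offset:.3f} s, round-trip delay {delay:.3f} s")
```

If the two network legs are not symmetric, the offset estimate is off by up to half the difference, which is one reason low, stable latency to a time server matters.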

Each level in the hierarchy is known as a “stratum”. A Stratum 0 time source is typically a high-precision clock device, e.g. an atomic clock (known as a “reference clock”). Stratum 1 time servers are connected directly to a Stratum 0 device and take their time from it; Stratum 2 time servers request the time from Stratum 1 servers over the network, and so on. The higher the stratum number, the more steps the time server is removed from the reference clock, and the less accurate it is likely to be. To compensate, time servers at Stratum 2 and beyond request the time from multiple servers at the level above.

We’ve always run our own Stratum 2 time servers (at ntp.bytemark.co.uk), so our servers don’t need to look outside Bytemark’s network for their time source. But when we built our own data centre, we saw an opportunity to install our own Stratum 0 server, using a simplified Global Positioning System (GPS) receiver as the reference clock.

This works because the satellites in the GPS constellation (nominally 24) carry synchronised atomic clocks, each broadcasting its time down to Earth. With signals from at least 4 satellites, a GPS receiver can compare the time stamped in each broadcast with the time it was received, allowing its location to be pinpointed. We can access that time data by installing a GPS receiver on the side of our data centre and using it as a source for our time servers.
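The arithmetic behind that is simple: the signal travels at the speed of light, so the gap between broadcast and reception gives a distance. The sketch below is illustrative only; the 67 ms travel time is an assumed round figure for a satellite roughly 20,000 km away, and the function name is hypothetical.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458


def distance_from_travel_time(broadcast_time_s: float, received_time_s: float) -> float:
    """Distance in metres covered by a signal between broadcast and reception."""
    return SPEED_OF_LIGHT_M_PER_S * (received_time_s - broadcast_time_s)


# Assumed example: a signal that took 67 ms to arrive.
distance_km = distance_from_travel_time(0.0, 0.067) / 1000

# Why the clocks have to be so good: distance error caused by 1 microsecond of timing error.
error_per_microsecond_m = SPEED_OF_LIGHT_M_PER_S * 1e-6

print(f"{distance_km:.0f} km away; 1 microsecond of error is about {error_per_microsecond_m:.0f} m")
```

A timing error of one microsecond shifts the computed distance by about 300 metres, which is why the satellites carry atomic clocks and why the receiver ends up with a very accurate idea of the time as a by-product.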

Bytemark’s time server setup

  • Stratum 0 reference clock: GPS receiver at YO26 data centre.
  • Stratum 1: An NTP server based on a low-power PC.
  • Stratum 2: Bytemark’s publicly accessible time servers, which peer with each other and get their time from Stratum 1.

At each stage of the installation, we’ve been careful to

  • minimise latency potentially introduced through cabling
  • secure services against attack and exploitation
  • maintain maximum uptime through ongoing monitoring.

Our weather-proof external GPS receiver is a syncboxRED. It provides RS-232 (serial) connectivity and emits a PPS (pulse per second) signal accurate to 1 microsecond.

The cables from both outputs are fed into the data centre roof space, where we’ve mounted a Soekris 4501 communications PC on an immediately adjacent beam. The Soekris (codename Mr Wolf) is optimised for low-power, extremely long-life operation, so it requires no special cooling or maintenance.

Mr Wolf runs OpenBSD with ntpd (not OpenNTPD) and is configured to do three things:

  • read the GPS time signal over serial
  • use the PPS signal to keep its own internal clock accurate
  • respond to our own Stratum 2 time servers (a pool of three servers in sync with each other).

Our firewall restricts public access to Mr Wolf, and we use our managed system monitoring tool, custodian, to alert us if anything seems amiss. Our existing Stratum 2 servers provide the time to any system that requests it, informed by our own reference clock and others across the world.
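For a sense of what "requesting the time" looks like on the wire, here is a minimal sketch of a simple (SNTP-style) client. It is not how ntpd itself talks to its peers, just the smallest exchange that works: one 48-byte UDP packet out, one back. The function names are hypothetical, and the code reads only the stratum and the server's transmit timestamp.

```python
import socket
import struct

NTP_EPOCH_OFFSET = 2_208_988_800  # seconds between 1900-01-01 (NTP) and 1970-01-01 (Unix)


def build_request() -> bytes:
    """48-byte client request: leap indicator 0, version 3, mode 3 (client)."""
    return bytes([0x1B]) + bytes(47)


def parse_response(packet: bytes) -> tuple[int, float]:
    """Return (stratum, server transmit time as Unix seconds) from a reply."""
    if len(packet) < 48:
        raise ValueError("NTP reply too short")
    stratum = packet[1]
    seconds, fraction = struct.unpack(">II", packet[40:48])
    return stratum, seconds - NTP_EPOCH_OFFSET + fraction / 2**32


def query(host: str, timeout: float = 2.0) -> tuple[int, float]:
    """Send one request to UDP port 123 on `host` and parse the reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_request(), (host, 123))
        packet, _ = sock.recvfrom(512)
    return parse_response(packet)
```

Calling `query("ntp.bytemark.co.uk")` (the hostname mentioned earlier in this post) would be expected to return the server's stratum and its current time, assuming the server is reachable from your network. A real client would also record its own send and receive times so it can correct for network delay.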

Results

Hosting a Stratum 0 server within Bytemark’s network is another way to decrease our reliance on external providers. That said, we were initially surprised to see that one of the servers in our time server pool preferred a time server located in the Netherlands over our local box!

ntpq -p c.ntp.bytemark.co.uk
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 LOCAL(0)        .LOCL.          13 l   6h   64    0    0.000    0.000   0.000
-mrwolf.ntp.byte .GPS.            1 u  255 1024  377    2.334    9.405   2.386
+ntp2.rrze.uni-e .GPS.            1 u  341 1024  377   21.176   -1.654   2.366
*ntp1.nl.uu.net  .PPS.            1 u  718 1024  377   14.258   -2.952   2.688
+ntp0.jonatkins. .GPS1.           1 u  361 1024  377   14.304   -2.169   2.337
-statler.bytemar 103.7.151.4      2 u  390 1024  375    2.343   -5.208   4.282
-2001:1af8:4100: 193.79.237.14    2 u  241 1024  376   20.216   -8.996   8.541

(delay/offset/jitter in milliseconds!)
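The character in front of each name is how ntpq shows what the server currently thinks of that source: `*` is the one it is following, `+` marks candidates it is happy to combine, and `-` marks outliers it has set aside. The sketch below parses the output above to pull that out; the parsing function is a hypothetical helper, not part of ntpq.

```python
NTPQ_OUTPUT = """\
 LOCAL(0) .LOCL. 13 l 6h 64 0 0.000 0.000 0.000
-mrwolf.ntp.byte .GPS. 1 u 255 1024 377 2.334 9.405 2.386
+ntp2.rrze.uni-e .GPS. 1 u 341 1024 377 21.176 -1.654 2.366
*ntp1.nl.uu.net .PPS. 1 u 718 1024 377 14.258 -2.952 2.688
+ntp0.jonatkins. .GPS1. 1 u 361 1024 377 14.304 -2.169 2.337
-statler.bytemar 103.7.151.4 2 u 390 1024 375 2.343 -5.208 4.282
-2001:1af8:4100: 193.79.237.14 2 u 241 1024 376 20.216 -8.996 8.541
"""


def parse_peers(text: str) -> list[dict]:
    """Split each ntpq peer line into its tally character and a few columns."""
    peers = []
    for line in text.splitlines():
        if not line.strip():
            continue
        tally, fields = line[0], line[1:].split()
        peers.append({
            "tally": tally,
            "remote": fields[0],
            "refid": fields[1],
            "stratum": int(fields[2]),
            "delay_ms": float(fields[7]),
            "offset_ms": float(fields[8]),
            "jitter_ms": float(fields[9]),
        })
    return peers


peers = parse_peers(NTPQ_OUTPUT)
system_peer = next(p for p in peers if p["tally"] == "*")
own_stratum = system_peer["stratum"] + 1  # a server sits one stratum below the peer it follows

print(system_peer["remote"], "is the selected source; this server is stratum", own_stratum)
for p in peers:
    print(f'{p["tally"]} {p["remote"]:<16} delay {p["delay_ms"]:>7.3f}  offset {p["offset_ms"]:>7.3f}')
```

In this snapshot Mr Wolf has the lowest delay of the Stratum 1 sources, but its offset (9.405 ms) sits well away from the others, which is consistent with it being marked as an outlier rather than selected.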

Time servers are literally what keep the internet ticking. Why not try setting up a Stratum 0 time server yourself? With the right equipment, it’s relatively simple and could make a fun Raspberry Pi project. Let us know in the comments if you have any suggestions for how to do it.

Thanks to Næþ’n Lasseter for the technical details & Matthew Bloch for reviewing this post.