A Second that could make a difference and bring your server down!!! – The buggy lil ‘leap second bug’
The weekend was pretty disastrous for websites around the Internet. First, storms in the United States knocked out power in Amazon’s data centres, and with it, around 1% of American websites. This included popular websites like Foursquare, Instagram, Pinterest and Netflix. Then, just as websites were hobbling back to life, the “leap second” bug struck.
Timekeepers had announced plans to add an extra second to June 30, to compensate for the gradual, irregular slowing of Earth’s rotation. This “leap second” is added to Coordinated Universal Time (UTC), which is derived from International Atomic Time (TAI), to ensure that our clocks stay in sync with “solar time”. So far so good, but where do the websites come in?
Many computers use Network Time Protocol (NTP) to keep their clocks synchronised with UTC. When the timekeepers added a second after 23:59:59 UTC on Saturday, just like they said they would, all hell broke loose. Servers, especially those running some versions of Java and Linux, choked on the “leap second”, bringing down with them some of the most popular websites in the world.
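To see why one extra second can trip software up, it helps to remember that the inserted second is labelled 23:59:60 UTC, a timestamp many time libraries simply cannot represent. The sketch below uses Python’s standard `datetime` module purely as an illustration. The helper name `can_represent` is hypothetical, and this is not the code path that actually failed in Linux or Java.

```python
from datetime import datetime, timedelta


def can_represent(year, month, day, hour, minute, second):
    """Return True if datetime accepts the given wall-clock fields.

    Hypothetical helper, used here only for illustration.
    """
    try:
        datetime(year, month, day, hour, minute, second)
        return True
    except ValueError:
        return False


# The last ordinary second of 30 June 2012 is representable...
before = datetime(2012, 6, 30, 23, 59, 59)

# ...but the leap second itself, 23:59:60, is rejected (seconds must be 0..59).
leap_ok = can_represent(2012, 6, 30, 23, 59, 60)

# Adding one second jumps straight to midnight, skipping the leap second.
after = before + timedelta(seconds=1)

print(leap_ok)
print(after.isoformat())
```

Software that assumes every minute has exactly 60 seconds has to decide what to do with that unrepresentable moment, and that decision is where bugs tend to hide.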
According to BuzzFeed, Reddit, Gawker Media sites, StumbleUpon, Yelp, Foursquare, LinkedIn, and Meetup were among the websites affected by the “leap second bug”. All websites mentioned have since patched their servers and are back running normally.
It is a little surprising to note that there are still pieces of code struggling with the “leap second” phenomenon, since this is nothing new. Saturday was the 25th instance of a “leap second” being added to a day, with the last one on 31st December 2008. Companies like Google and Opera have in the past explained their strategies to work around any potential “leap second bugs”, so there’s no excuse for major software vendors to be caught unawares.
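One workaround of this kind is to spread the extra second thinly over a longer window, so that no clock ever has to show 23:59:60. Google has described such an approach as a “leap smear”. The sketch below is a deliberately simple linear version: the function name `smeared_offset`, the 24-hour window and the linear shape are all assumptions made for illustration, not a description of any vendor’s actual implementation.

```python
def smeared_offset(elapsed, window=86400.0, leap=1.0):
    """Return how much of the leap second has been absorbed so far.

    elapsed: seconds since the smear window opened
    window:  length of the smear window in seconds (assumed: 24 hours)
    leap:    size of the adjustment in seconds (one leap second)
    """
    if elapsed <= 0:
        return 0.0           # window not yet open: no adjustment
    if elapsed >= window:
        return leap          # window closed: full second absorbed
    return leap * elapsed / window  # linear ramp in between


# Halfway through the window, half of the leap second has been absorbed.
halfway = smeared_offset(43200.0)
print(halfway)
```

With a ramp like this, each reported second is only very slightly longer than a real one, so applications see a clock that never repeats or skips a value.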