AWS comes clean about recent Sydney outage

Public Cloud provider said both primary and backup power failed

Amazon Web Services has highlighted the issues behind its power-related outage in its Sydney availability zone on Sunday night.

At 4pm Sydney time, the company reported a power issue at the Sydney region datacentres that deliver its EC2 and S3 services.

The blackout led to disruption for Sydney citizens and AWS clients, as the outage took out major websites such as Foxtel Play, Channel Nine, Domain and Domino’s Pizza.

Partners including Comunet, Bulletproof, RXP Services and Strut Digital were also affected, with many working through the night alongside clients to resolve business-critical challenges.

In a recent blog post, AWS explained how every instance is served by main utility power and a generator-backed diesel rotary uninterruptible power supply (DRUPS), as two independent power delivery sources.

AWS said that if either source provides power, the instance will maintain availability. The DRUPS, as the secondary source, stores power and starts up if the main utility power is compromised.

However, during the severe weather, the instances that lost power lost access to both primary and secondary power and, consequently, the backup generators could not take over the load.

AWS described the power failure as an ‘unusually long voltage sag’, as opposed to ‘a complete outage’, and said that the unexpected nature of the voltage sag caused the set of breakers responsible for isolating the DRUPS from utility power to fail to open quickly enough.

“Normally, these breakers would assure that the DRUPS reserve power is used to support the datacenter load during the transition to generator power. Instead, the DRUPS system’s energy reserve quickly drained into the degraded power grid,” the company explained.

“The rapid, unexpected loss of power from DRUPS resulted in DRUPS shutting down, meaning the generators which had started up could not be engaged and connected to the datacenter racks. DRUPS shutting down this rapidly and in this fashion is unusual and required some inspection.”

In remediation, AWS said it will add additional breakers to ensure a quicker break from degraded utility power, allowing the generators to activate before the UPS systems are depleted.
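The failure and the planned fix both come down to a timing race: the DRUPS reserve has to outlast the generator start-up, and it drains far faster while still connected to a sagging grid. The sketch below is only an illustration of that race, not AWS's actual control logic. The `PowerEvent` fields, the `rides_through` function and every number used are hypothetical assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class PowerEvent:
    """Hypothetical parameters for one utility voltage sag (illustrative only)."""
    breaker_open_s: float      # time for breakers to isolate the DRUPS from the grid
    generator_ready_s: float   # time for generators to start and be ready for load
    reserve_s: float           # seconds of datacentre load the isolated DRUPS can carry
    grid_drain_factor: float   # how much faster the reserve drains while tied to the grid


def rides_through(event: PowerEvent) -> bool:
    """Return True if the DRUPS reserve lasts until the generators can take over."""
    reserve = event.reserve_s

    # Phase 1: breakers still closed, so the reserve also feeds the degraded grid.
    connected = min(event.breaker_open_s, event.generator_ready_s)
    reserve -= connected * event.grid_drain_factor
    if reserve <= 0:
        return False  # DRUPS shuts down before the generators can be engaged

    # Phase 2: DRUPS isolated, so the reserve only supports the datacentre load.
    reserve -= event.generator_ready_s - connected
    return reserve > 0


# Assumed numbers: the only difference between the two cases is breaker speed.
fast = PowerEvent(breaker_open_s=0.1, generator_ready_s=10.0,
                  reserve_s=15.0, grid_drain_factor=20.0)
slow = PowerEvent(breaker_open_s=1.0, generator_ready_s=10.0,
                  reserve_s=15.0, grid_drain_factor=20.0)

print(rides_through(fast))
print(rides_through(slow))
```

With these assumed values, the fast-breaker case keeps enough reserve to reach generator power while the slow-breaker case does not, which is the behaviour the additional breakers are meant to prevent.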

The company added that it will also prioritise fixing the ‘latent bug’ that disabled the automatic recovery systems for customer instances.

AWS said more than 80 per cent of the impacted customer instances and volumes were online and operational by 1 am PDT, after power was restored at 11:46 pm PDT.

According to Comunet chief executive Mark Ogden, 100 of his clients were affected in total, and issues for all but one were resolved within three hours.

However, this was not the case for all. Strut Digital chief executive Zack Levy told ARN that his engineers were still restoring services at 3:30 am.

“We apologise for any inconvenience this event caused. We know how critical our services are to our customers’ businesses. We are never satisfied with operational performance that is anything less than perfect, and we will do everything we can to learn from this event and use it to drive improvement across our services,” AWS said.

AWS channel partners recently told ARN the interruption has shown that businesses should review their architecture model and strategy before jumping on the Cloud bandwagon.

Holly Morgan