The violent storms that rocked the mid-Atlantic states last night and early this morning delivered the second blow in a rapid one-two punch that affected East Coast users of Amazon’s Elastic Compute Cloud (EC2) and related offerings on the eve of the unofficial July 4th weekend.
The Amazon facility that houses its Availability Zone for the US-East-1 region lost power just before midnight Friday night. Amazon staff managed to restore power within a matter of minutes and recovered roughly half of the affected EC2 instances and a third of the affected Elastic Block Store (EBS) volumes within the first three hours. Employees continue their recovery efforts and warn that some EBS volumes may have inconsistent data due to the power outage, according to a posting on Amazon Web Services’ Service Health Dashboard.
Earlier, on Friday morning, EC2 users also experienced connectivity issues in the same Availability Zone when multiple network devices there suffered routing table exhaustion.
“Some of the devices in this [Availability Zone] had not been reconfigured to handle the increase in required routing table space, and this exceeded the allocated capacity on those devices,” according to another posting on Amazon Web Services’ Service Health Dashboard. “This caused excess load on the router control processor which in turn caused high levels of packet loss. When this high level of packet loss occurred, EC2 instances and EBS volumes connected to those devices lost network connectivity. Once we discovered the root cause, the affected network devices were reconfigured to mitigate this problem and connectivity to impacted instances and volumes was restored. We are currently putting better monitoring in place across all of our network devices globally to prevent this from happening again.”
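The kind of monitoring Amazon describes can be illustrated with a minimal sketch that flags devices whose routing tables are approaching their allocated capacity. This is not Amazon's tooling: the `RouterStats` record, the `devices_near_exhaustion` function and the 80 percent threshold are all hypothetical, and a real system would gather the route counts from the devices themselves.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RouterStats:
    """Hypothetical snapshot of one network device's routing table."""
    name: str
    routes_in_use: int    # routes currently installed
    route_capacity: int   # routing table space allocated on the device


def utilization(stats: RouterStats) -> float:
    """Return the fraction of allocated routing table space in use."""
    if stats.route_capacity <= 0:
        raise ValueError(f"{stats.name}: route_capacity must be positive")
    return stats.routes_in_use / stats.route_capacity


def devices_near_exhaustion(devices: List[RouterStats],
                            threshold: float = 0.8) -> List[str]:
    """Return names of devices at or above the utilization threshold.

    The 0.8 default is an illustrative assumption, not a recommended value.
    """
    return [d.name for d in devices if utilization(d) >= threshold]
```

Checking every device against a threshold below 100 percent is meant to raise an alert before a table fills, rather than after packet loss has already begun.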
All of this comes on the heels of a similar outage suffered by the same facility on June 18.