Amazon says it is “all hands on deck” trying to resolve the massive cloud server failure in Northern Virginia that brought down many big-name social networking websites.
Yesterday’s AWS server failure crashed social websites such as Reddit, Foursquare, Quora, paper.li, Hootsuite (along with its ow.ly and ht.ly link services), Livefyre and ContactMe, among many others. The website Ec2disabled.com claims to list the sites brought down by the outage, including parts of the New York Times, parts of Sony’s online gaming ecosystem and parts of the Nvidia site.
At 4 am GMT (11 pm PDT) Amazon’s AWS status site issued an update on the outage saying the company was “working hard” to resolve the problem:
Just a short note to let you know that the team continues to be all-hands on deck trying to add capacity to the affected Availability Zone to re-mirror stuck volumes. It’s taking us longer than we anticipated to add capacity to this fleet. When we have an updated ETA or meaningful new update, we will make sure to post it here. But, we can assure you that the team is working this hard and will do so as long as it takes to get this resolved.
The outage has caused widespread frustration for the affected social websites and their owners. Livefyre, which manages the comment system on this site, had its engineers work through the night to get its ecosystem running with limited functionality. The company said:
The whole web has learned some lessons from this outage, and we will be building resiliency against situations like this the second Amazon fully recovers. We are currently architecting different scenarios, and working towards the best solution for preventing catastrophic outages like this. Stability is the most important thing to us, and it will be our first priority.
Speaking to this site on Twitter, ContactMe said it was getting little information from Amazon AWS:
@thesociable Unfortunately we know as much as you guys. You can find updates that they have been posting here:http://bit.ly/tDWlI
These sites, like others, depend on Amazon repairing the problem. More than 24 hours after the servers collapsed, Amazon AWS has not provided a time frame for when they will be fully back online. Only some sites have returned to full functionality.
Amazon AWS allows web companies to rent server space from Amazon at a fraction of the cost of setting up and managing a server farm of their own. Google and Microsoft offer similar services.