Data center's 'perfect storm' causes site downtime
It was one of those moments when all of the red warning lights started flashing across all of our systems at once. The abrupt, widespread start of the incident foreshadowed the scope of the problem to come. The service interruption on 12 November, reported on our status website as it was happening, was the longest downtime event we've experienced in four years. Our deep apologies for the downtime at this important time of year for retail sales, though there is never a good time for this to happen.
The disruption was compounded by another extraordinary power failure the day before, 11 November, first reported here on our status website. That earlier event was the first service interruption caused by a failure of the data center's systems (networking, power, cooling, etc.) in over two years.
E-business Coach will be examining possible ways to diversify its primary e-commerce server cluster, or otherwise make it more resilient to such an incident in the future.
Our data center has already launched an examination into why its multiple redundant power and cooling systems didn't withstand the chain of events that started with a freeway traffic accident outside the data center in Dallas / Fort Worth. Given the event on 11 November as well, you can be sure changes are forthcoming to harden the redundancy of the emergency power and cooling systems.
Ultimately, E-business Coach is responsible for meeting your site's uptime service level target of 99.95%. These two incidents jeopardize that target for the month of November. We will work with each of our clients to reach a remedy that satisfies you.
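For a sense of scale, here is a rough sketch of how much downtime a 99.95% target leaves room for in a 30-day month such as November. The function name is ours, chosen for illustration, and the sketch assumes the target is measured over the full calendar month with no exclusions (for scheduled maintenance, for example), which may differ from how a given service agreement defines it.

```python
from fractions import Fraction


def allowed_downtime_minutes(target_percent: str, days_in_month: int) -> Fraction:
    """Return the downtime budget, in minutes, for a monthly uptime target.

    Assumption: uptime is measured over every minute of the calendar month.
    Fractions are used so the result is exact rather than a rounded float.
    """
    total_minutes = days_in_month * 24 * 60
    unavailable_share = 1 - Fraction(target_percent) / 100
    return unavailable_share * total_minutes


# November has 30 days, i.e. 43,200 minutes in total.
budget = allowed_downtime_minutes("99.95", 30)
print(float(budget))  # 21.6
```

In other words, under these assumptions the target allows a little under 22 minutes of downtime across the whole month, which is why two incidents on consecutive days put it at risk.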