Outages
40-hour Cloudflare outage: Tier 3 DC power failure exposes unknown software dependencies, triggers rethink
"Dependencies shouldn’t have been so tight, should have failed more gracefully, and we should have caught them"
Azure
Upstream utility disturbance triggers brief bout of sweating, with "a small amount of storage nodes" needing to be recovered manually in the wake of the Azure incident.
AWS
"our engineers have been focused on mitigating the impact of the delayed replication through changes made to replication subsystems, resource adjustments and other modifications"
News
Fire-fighting was not helped by Global Switch’s fire suppression system “running out of water”. The incident also introduced water and soot contamination. Google Cloud’s affected racks had to be taken apart, thoroughly cleaned and reassembled before they could be restarted.
AWS
AWS admitted that “customers may also have experienced issues when attempting to initiate a Call or Chat to AWS Support” during the incident. What happened to recent architectural changes designed to avoid this?
Featured
Over 90 Google Cloud services were knocked offline in its Paris region after a major data centre incident – believed to have been caused by a water leak triggering a fire in a battery room of a co-location data centre. The incident also briefly triggered the complete global outage of its
Enterprise IT
“Even at a depth of five metres, our fibre optic cables are not safe from concrete drills,” lamented Deutsche Telekom AG on February 14, sharing pictures of damage caused by construction workers in Frankfurt. Needless to say, the telco was not the only company wringing its hands, as downstream customer
Featured
4,341 trades cancelled. But why does DR system need manual shutdown?
Cloud
"The network connectivity issue is occurring with devices across the Microsoft Wide Area Network"
Featured
A serious IT outage at the Federal Aviation Administration (FAA), which forced it to halt all US departing flights on Wednesday, January 11, has been attributed by the transportation agency to a “damaged database file”. “At this time, there is no evidence of a cyber attack. The FAA is working diligently
Cloud
“We are aware of an incident at Google’s data center in Council Bluffs, Iowa.”
Cybersecurity
Critical care software outages hit NHS as a result