Outages

News | Dec 15, 2023

'Tis the season: Box gifts customers an early weekend with Friday outage

File sharing and collaboration service Box fell victim to an extended outage that affected all of its services on Friday.

telco | Nov 08, 2023

Nationwide Aussie telco outage cause "too technical" to explain: The answer may be in a (heavily redacted) Canadian report

How not to share a root cause analysis: Lessons from Australia's Optus and Canada's Rogers...

ChatGPT | Nov 08, 2023

ChatGPT suffers major (but swiftly fixed) outage

Two outages back-to-back came the day of OpenAI's new models and services launch and appear to have grown more severe today...

Outages | Nov 06, 2023

40-hour Cloudflare outage: Tier 3 DC power failure exposes unknown software dependencies, triggers rethink

"Dependencies shouldn’t have been so tight, should have failed more gracefully, and we should have caught them"

Azure | Oct 23, 2023

Azure West Europe wobbles after generator failover fail

Upstream utility disturbance triggers brief bout of sweating with "a small amount of storage nodes" needing to be recovered manually in the wake of the Azure incident.

AWS | Oct 19, 2023

AWS’s S3 data replication falters in US-EAST-1 as hyperscaler tackles "backlog"

"our engineers have been focused on mitigating the impact of the delayed replication through changes made to replication subsystems, resource adjustments and other modifications"

News | Jul 28, 2023

Revisiting *that* Google outage: Fire, flooding, (then running out of water) and a “regional Spanner” failure

Fire-fighting was not helped by Global Switch’s fire suppression system “running out of water”. The incident also introduced water and soot contamination. Google Cloud’s affected racks had to be taken apart, thoroughly cleaned and reassembled before they could be restarted.

AWS | Jun 14, 2023

"Plus ça change"? - Why did AWS Support fail with US-EAST-1 again?

AWS admitted that “customers may also have experienced issues when attempting to initiate a Call or Chat to AWS Support” during the incident. What happened to recent architectural changes designed to avoid this?

Featured | Apr 27, 2023

Fire, water, hit Google Cloud in Paris - with global impact

Over 90 Google Cloud services were knocked offline in its Paris region after a major data centre incident – believed to have been caused by a water leak triggering a fire in a battery room of a co-location data centre. The incident also briefly triggered the complete global outage of its
