An entire top-level domain got knocked offline and nothing could be done because Arizona was asleep
It's always DNS.
The entire .club top level domain (TLD) and over a million websites using it got knocked offline for up to four hours this week after DNS issues at upstream registry GoDaddy. The issue also affected HSBC's .hsbc domain.
As customers fretted, Peter van Dijk of open source DNS software and services provider PowerDNS opted to do some basic sleuthing. But upon calling GoDaddy customer support, he says, he was told that "the right team is in Arizona, they will wake up in a few hours, we have no way to escalate". (There goes that SLA...)
"This is highly unusual. Entire TLDs do not typically just drop off the internet like this," DomainIncite's Kevin Murphy, the first to report the incident on October 7, noted on Thursday.
Cloudflare was among those responding to customers as the issue dragged on through Thursday (October 7): "We are currently experiencing connectivity issues with .club domains Registry. As a result, the DNS information cannot be queried for the domains. We have contacted the corresponding upstream registry, and as soon as we have any updates regarding the issue, we will update the status post," the CDN and security company said.
GoDaddy bought the .club TLD along with 27 others in April 2021 from Minds + Machines Group Limited for $120 million. It now has more than 14 million domain names under management, across 240 TLD extensions.
The issue was DNS-related, GoDaddy later confirmed to customers, as affected users noted that the TLD’s name servers, all found at nic.club, were not responding. (All six official name servers, ns1.dns.nic.club, ns2.dns.nic.club, ns3.dns.nic.club, ns4.dns.nic.club, ns5.dns.nic.club and ns6.dns.nic.club, use the same IPv4 and IPv6 addresses.)
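For readers who want to run this kind of check themselves, the following is a minimal sketch, using only the Python standard library, of how one might test whether a given name server answers an NS query for a zone. It is an illustration rather than the method anyone quoted here used: the function names build_query and responds are our own, and any server address passed in is the caller's assumption (the address in the usage note is a documentation placeholder, not a real .club server).

```python
import socket
import struct


def build_query(name: str, qtype: int = 2, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query packet (qtype 2 = NS, class 1 = IN)."""
    # Header: ID, flags (0 = standard query, no recursion), QDCOUNT=1, other counts 0.
    header = struct.pack("!HHHHHH", query_id, 0, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.split(".")
        if label
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)


def responds(server_ip: str, zone: str, timeout: float = 2.0) -> bool:
    """Return True if server_ip sends back a DNS reply whose ID matches our query."""
    query = build_query(zone)
    family = socket.AF_INET6 if ":" in server_ip else socket.AF_INET
    with socket.socket(family, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, (server_ip, 53))
            reply, _ = sock.recvfrom(512)
        except OSError:  # covers timeouts and unreachable hosts
            return False
    # A valid DNS message has a 12-byte header; the first two bytes echo the query ID.
    return len(reply) >= 12 and reply[:2] == query[:2]
```

Calling responds("192.0.2.1", "club") with a real name server address in place of the placeholder returns False when nothing answers within the timeout, which is roughly what outside observers reported seeing. The check only confirms that some reply arrived; it does not inspect the response code or the answer itself.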
GoDaddy did not respond to a request for comment from The Stack.
UltraDNS identified the problem as starting at 10:30 UTC and being fixed by 14:15 UTC.
PowerDNS's van Dijk told The Stack: "Based on what I could observe publicly, my impression is that the configuration for two zones (.club and .hsbc, but not many people care about .hsbc) got broken in a way that managed to replicate to their six sets of name servers, and then failed.
"In most DNS clustering/replication systems, when data is allowed in, it is already correct, and won't break a whole setup. However, when two or three pieces of software need to agree on what's right, there's always edge conditions when something makes it past the first goalpost only to light the second goal on fire entirely."
He added: "The reason that I say 'for two zones' is that various other zones owned/operated by GoDaddy and hosted by Neustar were unaffected, which, to me, makes it less likely that this was a Facebook-style 'we broke everything', or a 'we pushed a software update that broke everything'.
"Even on the six broken name server IPs for .club, the nic.club zone was still operational. It is also possible I'm entirely wrong - looking at a system from the outside only gets you so far."
The incident sparked fresh debate on Reddit's r/sysadmin community about the credibility or otherwise of newer TLDs like .vip or .horse, which many admins said they had blocked outright for security reasons.
(Domain squatting and typosquatting remain startlingly effective as a method of attacking organisations, e.g. by setting up mail servers that capture emails sent to mis-typed domains. In one classic recent example of this, a penetration tester captured an email intended for IT support at a large pharmaceutical company, promptly picked up the phone masquerading as the helpful team there to fix the issue, gained remote access, and it was game over...)
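Defenders sometimes enumerate the likely mis-typings of their own domain so they can register or monitor them before someone else does. The sketch below is one simple way to do that in Python, covering only omitted, doubled and swapped characters in the first label. The function name typo_variants is our own and example.com is a placeholder; real monitoring tools consider many more kinds of typo.

```python
def typo_variants(domain: str) -> set[str]:
    """Single-edit typos of the label left of the first dot."""
    label, dot, rest = domain.partition(".")
    variants = set()
    for i in range(len(label)):
        # Omitted character, e.g. "exampl".
        variants.add(label[:i] + label[i + 1:])
        # Doubled character, e.g. "exammple".
        variants.add(label[:i] + label[i] * 2 + label[i + 1:])
        # Adjacent characters swapped, e.g. "exmaple".
        if i + 1 < len(label):
            variants.add(label[:i] + label[i + 1] + label[i] + label[i + 2:])
    # Drop the correct spelling and any empty label produced by the edits.
    variants.discard(label)
    variants.discard("")
    return {v + dot + rest for v in variants}
```

Calling typo_variants("example.com") yields candidates such as exmaple.com and exampl.com, each of which could then be checked for an existing registration or mail server.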
The TLD-curious can see a full list of the delegation details of top-level domains here.