Microsoft’s “top notch” China hack post-mortem was "troubling" speculation

"The loss of a signing key is a serious problem, but the loss of a signing key through unknown means is far more significant... Microsoft’s customers did not have essential facts needed to make their own risk assessments."

“This report was top notch. It has reset the bar for what transparency in incident reporting looks like”, said one joyous pundit in the wake of Microsoft’s Sep. 6 post-mortem into how a Chinese APT stole a powerful cryptographic key – then used it to hack multiple federal agencies.

But despite sounding boldly definitive, Microsoft’s post-mortem was in fact speculative and had no evidential grounding, a cutting new report by the Cyber Safety Review Board (CSRB) has found – with the Board saying it was “troubled” by Redmond’s response when it aired its concerns.

Microsoft’s widely circulated suggestion that the key had been leaked in a “crash dump” was as devoid of hard evidence as any other of the 49 hypotheses it continues to investigate – “including some scenarios as wide-ranging as the adversary possessing a theoretical quantum computing capability to break public-key cryptography or an insider who stole the key during its creation,” the CSRB said in an April 2 report. 

“Customers (private sector and government) relied on these public representations in Microsoft’s blogs,” the CSRB emphasised in its report.

“The loss of a signing key is a serious problem, but the loss of a signing key through unknown means is far more significant because it means that the victim company does not know how its systems were infiltrated and whether the relevant vulnerabilities have been closed off,” it added. 

“Left with the mistaken impression that Microsoft has conclusively identified the root cause of this incident, Microsoft’s customers did not have essential facts needed to make their own risk assessments about the security of Microsoft cloud environments in the wake of this intrusion.”

“Microsoft told the Board early in this review that it believed that the errors in the blog were ‘not material.’ The Board disagrees”, it added.

The CSRB says Microsoft's "CEO and Board should develop, and share publicly, a plan with specific timelines to make fundamental, security-focused reforms across the company and its full suite of products, and then hold leaders at all levels of the company accountable...”

Just to be crystal clear: “As of the date of this report, Microsoft does not know how or when Storm-0558 obtained the signing key”, it emphasised.

Break it down for me… 

“Our investigation found that a consumer signing system crash in April of 2021 resulted in a snapshot of the crashed process (“crash dump”)... In this case, a race condition allowed the key to be present in the crash dump…” wrote Microsoft on September 6, sounding decisive and definitive about how the powerful consumer signing key had found its way out.

It added: “We found that this crash dump, believed at the time not to contain key material, was subsequently moved from the isolated production network into our debugging environment on the internet connected corporate network… After April 2021, when the key was leaked to the corporate environment in the crash dump…” (The Stack’s italics.)

Microsoft caveated these findings in its blog, adding “due to log retention policies, we don’t have logs with specific evidence of this exfiltration by this actor, but this was the most probable mechanism by which the actor acquired the key” – but this suggested that it merely lacked evidence of exfiltration, not that it lacked evidence of the dump or of the key’s presence in it.

The CSRB said: “Soon after publishing that [September 6] blog, Microsoft determined it did not have any evidence showing that the crash dump contained the 2016 MSA key. This led Microsoft to assess that the crash dump theory was no longer any more probable than other theories as the mechanism by which the actor had acquired the key, which Microsoft chose to leave uncorrected for more than six months after publishing its September 6 blog. The Board is troubled that Microsoft neglected to publicly correct this known error for many months.” 

Six months later, after “several written follow up questions from the Board regarding the blog”, Redmond added an update to the blog.
