CrowdStrike bug maxes out 100% of CPU, requires Windows reboots
"Note: This is 100% of a single core. In an 8-core system for example, an additional 12.5% of unexpected total CPU load would be experienced..."
Cybersecurity firm CrowdStrike has scrambled to roll back a detection logic update that caused its endpoint agent for Windows – in the company’s own words – to “consume 100%” of a machine's CPU core.
One customer alleged that the bug had bricked tens of thousands of endpoints across a hospital group’s IT estate. They claimed in a now-deleted and panicked-sounding post on Reddit that as some machines are being used for surgery “we cannot reboot them without killing patients.”
The Stack could not independently verify that allegation, but CrowdStrike has now confirmed the bug and its impact in a customer advisory. In a deeply frustrating outcome for many users, the advisory notes that the fix requires a reboot, something administrators overseeing machines that run 24/7 or host mission-critical workloads cannot easily do.
It told customers, in a gated note seen by The Stack, that “on June 26, 2024 at 8:27 PM ET (2024-06-27 @ 0027 UTC), CrowdStrike released a detection logic update for the Memory Scanning prevention policy capability found in the Falcon sensor for Windows. This logic exposed a bug in Memory Scanning that exists in sensor versions 7.15 and earlier. The result of the bug is a logic error in the CsFalconService that can cause the Falcon sensor for Windows to consume 100% of a single CPU core.”
It added: “Note: This is 100% of a single core. In an 8-core system for example, an additional 12.5% of unexpected total CPU load would be experienced. CrowdStrike has rolled back the detection logic update.
“On hosts where the increased CPU usage results in significantly impacted system performance, sensor functionality may be degraded. We recommend rebooting immediately to ensure normal operations.
“Windows hosts can be fully remediated by rebooting the system. We recommend you take this step if possible. DO NOT attempt to upgrade, downgrade or uninstall the sensor without first rebooting the host, as: An attempted sensor upgrade will not address the issue, and the upgrade will fail as upgrade process is locked; Disabling/reenabling the Memory Scanning prevention policy will not address the issue,” its advisory said.
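The advisory's arithmetic generalises to other machines: one saturated core accounts for a smaller share of total CPU the more cores a host has. The short Python sketch below illustrates that calculation only. The function name `total_cpu_share` is hypothetical and is not part of any CrowdStrike tooling, and the sketch assumes all cores are weighted equally.

```python
def total_cpu_share(saturated_cores: int, total_cores: int) -> float:
    """Percentage of total CPU capacity consumed when `saturated_cores`
    cores run at 100%. Hypothetical helper for illustration only;
    assumes every core counts equally toward total capacity."""
    if total_cores <= 0:
        raise ValueError("total_cores must be positive")
    if not 0 <= saturated_cores <= total_cores:
        raise ValueError("saturated_cores must be between 0 and total_cores")
    return 100.0 * saturated_cores / total_cores


if __name__ == "__main__":
    # One core pinned at 100%, as described in the advisory.
    for cores in (2, 4, 8, 16):
        share = total_cpu_share(1, cores)
        print(f"{cores:>2} cores: {share:.2f}% of total CPU")
```

For the advisory's 8-core example this gives 12.5%, while the same bug on a 2-core machine would take up half of the available CPU.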
One security professional empathised on X but noted the importance of testing updates on one machine “before rolling out to the whole fleet…” Critics meanwhile said the onus was on CrowdStrike to test its updates more robustly before deploying them to its extensive customer base.
The bug, however short-lived and swiftly rolled back, is likely to frustrate executives worried about reputational damage after an impressive run of results: ARR grew 33% year-on-year to $3.65 billion, June 4 earnings showed.
CrowdStrike has been growing bullishly and taking market share from Microsoft in the wake of a string of damaging incidents at Redmond. (Microsoft's incidents were based on poor security practices and several breaches rather than bugs impacting downstream customer performance.)
As CrowdStrike’s CEO George Kurtz said on its last earnings call: “Following yet another major Microsoft breach… we received an outpouring of requests from the market for help. We decided enough is enough, there's a widespread crisis of confidence among security and IT teams within the Microsoft security customer base.”
CrowdStrike told The Stack: "CrowdStrike is aware of and investigating customer reports of Falcon systems consuming higher than expected CPU. The issue has been identified and isolated and a fix has been deployed. This is not a security incident – customer systems remain protected. We are working with affected customers to resolve this matter as quickly as possible."