Analysis Last week, at 0409 UTC on July 19, 2024, antivirus maker CrowdStrike released an update to its widely used Falcon platform that caused Microsoft Windows machines around the world to crash.

The impact was extensive. Supply chain firm Interos estimates 674,620 direct enterprise customer relationships of CrowdStrike and Microsoft were affected. Microsoft said 8.5 million Windows machines failed. Beyond a massive amount of IT remediation time, the results included global flight and shipping delays caused by widespread Windows system failures.

The cause, to the extent so far revealed by CrowdStrike, was “a logic error resulting in a system crash and blue screen (BSOD) on impacted systems.”

That crash stemmed from quite possibly mangled data that somehow found its way into a Falcon configuration file called a Channel File, which controls the way CrowdStrike’s security software works.

Channel Files are updated over time by CrowdStrike and pushed to systems running its software. In turn, Falcon on those machines uses information in the files to detect and respond to threats. This is part of Falcon’s behavioral-based mechanisms that identify, highlight, and thwart malware and other unwanted activities on computers.

In this case, a configuration file pushed to millions of Windows computers running Falcon confused the security software to the point where it crashed the whole system. On rebooting, an affected box would almost immediately start up Falcon and crash all over again.

According to CrowdStrike, Channel Files on Windows machines are stored in the following directory:

C:\Windows\System32\drivers\CrowdStrike

The files use a naming convention that starts with “C-” followed by a unique identifying number. The errant file’s name in this case started with “C-00000291-”, followed by various other numbers, and ended with the .sys extension. But these are not kernel drivers, according to CrowdStrike; indeed, they are data files used by Falcon, which does run at the driver level.
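To make the naming convention concrete, here is a minimal Python sketch that picks out names starting with “C-00000291-” and ending in .sys. The pattern is inferred only from the description above, the wildcard accepts any characters in the middle rather than just numbers, and the sample file names are invented for illustration; this is not a CrowdStrike tool.

```python
from fnmatch import fnmatchcase

# Pattern inferred from the naming described above: the "C-00000291-" prefix,
# then any other characters, then the .sys extension. This is an assumption.
CHANNEL_291_PATTERN = "C-00000291-*.sys"


def is_channel_file_291(filename: str) -> bool:
    """Return True if the name looks like a Channel File 291 data file."""
    return fnmatchcase(filename, CHANNEL_291_PATTERN)


# Hypothetical file names, invented for illustration only.
sample_names = [
    "C-00000291-00000000-00000001.sys",
    "C-00000290-00000000-00000001.sys",
    "notes.txt",
]

# Keep only the names that match the Channel File 291 pattern.
matches = [name for name in sample_names if is_channel_file_291(name)]
```

Only the first sample name matches. The comparison is case-sensitive because `fnmatchcase` is used, which keeps the behavior the same on every platform.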

That is to say, the broken configuration file was not a driver executable but it was processed by CrowdStrike’s highly trusted code that is allowed to run within the operating system context, and when the bad file caused that code to go off the rails, it brought down the whole surrounding operating system – Microsoft Windows in this saga.

“Channel File 291 controls how Falcon evaluates named pipe execution on Windows systems. Named pipes are used for normal, interprocess or intersystem communication in Windows,” CrowdStrike explained in a technical summary published over the weekend.


“The update that occurred at 04:09 UTC was designed to target newly observed, malicious named pipes being used by common C2 frameworks in cyberattacks. The configuration update triggered a logic error that resulted in an operating system crash.”

Translation: CrowdStrike spotted malware abusing a Windows feature called named pipes to communicate with that malicious software’s command-and-control (C2) servers, which typically instruct the malware to perform all sorts of bad things. CrowdStrike pushed out a configuration file update to detect and block that misuse of pipes, but the config file broke Falcon.

While there has been speculation that the error was the result of null bytes in the Channel File, CrowdStrike insists that’s not the case.

“This is not related to null bytes contained within Channel File 291 or any other Channel File,” the cybersecurity outfit said, promising further root cause analysis to determine how the logic flaw occurred.

Specific details about the root cause of the error have yet to be formally disclosed – CrowdStrike CEO George Kurtz has just been asked to testify before Congress over this matter – though security experts, such as Google Project Zero guru Tavis Ormandy and Objective-See founder Patrick Wardle, have argued convincingly that the offending Channel File in some way caused Falcon to access information in memory that simply wasn’t present, triggering a crash.

It appears Falcon reads entries from a table in memory in a loop and uses those entries as pointers into memory for further work. When at least one of those entries was not correct or present, as a result of the config file, and instead contained a garbage value, the kernel-level code used that garbage as if it was valid, causing it to access unmapped memory.

That bad access was caught by the processor and operating system, and sparked a BSOD because at that point the OS knows something unexpected has happened at a very low level. It’s arguably better to crash in this situation than attempt to continue and scribble over data and cause more damage.
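The failure pattern described above, using table entries as addresses without checking them, can be modeled in a few lines of Python. This is a simplified user-space analogy, not CrowdStrike’s code: the buffer, the table contents, and the function names are all hypothetical. An out-of-range index stands in for an access to unmapped memory, and Python’s IndexError stands in for the fault that brings down the kernel.

```python
# Simplified model: "memory" is a small fixed-size buffer and each table
# entry is an offset into it. All names and values here are hypothetical.
MEMORY = bytes(range(16))


def read_unchecked(table):
    """Use every entry as an address without validating it first."""
    return [MEMORY[entry] for entry in table]


def read_checked(table):
    """Validate each entry before use and skip any that fall outside memory."""
    values = []
    for entry in table:
        if isinstance(entry, int) and 0 <= entry < len(MEMORY):
            values.append(MEMORY[entry])
    return values


good_table = [0, 3, 7]
bad_table = [0, 3, 0x9C]  # last entry is a garbage value, far out of range

try:
    read_unchecked(bad_table)
    crashed = False
except IndexError:
    # Python raises an exception here; kernel-level code would fault instead.
    crashed = True

# The checked reader drops the garbage entry and returns the valid values.
safe_values = read_checked(bad_table)
```

In user space a bad access like this can be caught and handled. Code running at the driver level has no such safety net, which is why the same mistake there takes the whole operating system down with it.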

Wardle told The Register the crash dump and disassembly make it clear that the crash arose from trying to use uninitialized data as a pointer – a wild pointer – but further specifics remain unknown.

“We still don’t have the exact reason, though, why the channel file triggered that,” he said.

The Register spoke with cybersecurity veteran Omkhar Arasaratnam, general manager of OpenSSF, about how things fell apart.

Arasaratnam said the exact cause remains a matter of speculation because he doesn’t have access to the CrowdStrike source code or the Windows kernel.
