Dive Brief:
- CrowdStrike's ill-fated update was live for 78 minutes, the company said in new details shared Monday in a filing with the Securities and Exchange Commission. The defective software update it deployed Friday quickly rendered global IT networks non-operational.
- The sensor configuration update for CrowdStrike’s Falcon sensor software was released at 4:09 UTC on Friday, shortly after midnight in the Eastern time zone, the company said in the SEC filing. “We identified and isolated the issue and the update was reverted at 5:27 UTC.”
- The company reiterated that the outages it caused for customers using certain Windows systems were not the result of a cyberattack, and it pointed to remediation information and updates published on its blog. Systems running Falcon sensor for Windows version 7.11 and above that downloaded the updated configuration during those 78 minutes were “susceptible to a system crash,” CrowdStrike said.
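The 78-minute figure follows directly from the two timestamps in the filing. A minimal sketch of that arithmetic in Python, assuming the calendar date of Friday, July 19, 2024 (the brief says only "Friday") and a fixed UTC-4 offset for Eastern Daylight Time:

```python
from datetime import datetime, timedelta, timezone

# Assumption: the date is July 19, 2024; the brief itself gives only the clock times.
released = datetime(2024, 7, 19, 4, 9, tzinfo=timezone.utc)
reverted = datetime(2024, 7, 19, 5, 27, tzinfo=timezone.utc)

# Length of the window during which the defective update could be downloaded.
exposure_minutes = int((reverted - released).total_seconds() // 60)

# Assumption: Eastern Daylight Time modeled as a fixed UTC-4 offset.
EDT = timezone(timedelta(hours=-4), "EDT")
released_eastern = released.astimezone(EDT)

print(exposure_minutes)                       # 78
print(released_eastern.strftime("%H:%M %Z"))  # 00:09 EDT
```

The release time converts to 12:09 a.m. Eastern, consistent with "shortly after midnight."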
Dive Insight:
Even though CrowdStrike pulled the update 78 minutes after its release, the damage was already done for many customers on certain Windows systems, which encountered an endless reboot loop and the blue screen of death.
CrowdStrike’s update impacted an estimated 8.5 million Windows devices, less than 1% of all Windows machines, according to Microsoft.
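Taken together, Microsoft's two figures imply a floor on the total number of Windows machines. A rough sketch of that arithmetic, reading "less than 1%" as a strict upper bound on the affected share (the variable names are illustrative, not from Microsoft):

```python
affected_devices = 8_500_000   # Microsoft's estimate, as cited above
max_affected_percent = 1       # "less than 1%" read as a strict upper bound

# If affected / total < 1%, then total > affected * 100 / 1.
min_total_devices = affected_devices * 100 // max_affected_percent

print(f"More than {min_total_devices // 1_000_000} million Windows machines")
# More than 850 million Windows machines
```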
Automated, cloud-based software commonly follows the practice of continuous integration and continuous delivery (CI/CD), under which updates are deployed to many customers at once and at scale.
CrowdStrike’s Falcon platform runs on cloud-native architecture and a set of automated tools and processes in the CI/CD pipeline, according to the company's product descriptions. This includes components for automated testing and a staging environment for quality assurance and A/B testing before the application is deployed into production.
The company has not explained how the defective update, which triggered a logic error resulting in a system crash, made it into the hands of customers. CrowdStrike did not respond to a request for comment.
In the SEC filing, CrowdStrike CFO Burt Podbere said the situation is evolving, adding: “We continue to evaluate the impact of the event on our business and operations.”
Though CrowdStrike reverted the update, customer systems that had already received it and were crashing due to the defect could not simply go back to the previous stable version. CrowdStrike demonstrated self-remediation steps for impacted customers in a video released Monday.
Many impacted customers had to fix the issue manually, working through a multistep process internally.
“Almost all of our systems were hit and that meant more than 26,000 computers and devices had to be manually fixed by technicians, one at a time, at each of our contact centers and 365 airports around the world,” United Airlines CEO Scott Kirby said Monday in a post on LinkedIn.
Kirby described the event as “the most widespread technology outage the world has ever experienced.”