The recent CrowdStrike Falcon Sensor software update, which caused chaos across the world by leaving Windows machines stuck in an endless boot loop, was certainly a wake-up call for many.

First, a workaround for anyone still experiencing the ‘blue screen of death’, which CrowdStrike published before releasing the corrected update. Follow the steps below:

  • After a few failed restarts, you should see the Recovery screen.
  • Click on: See advanced repair options > Troubleshoot > Advanced options > Startup Settings > Restart
  • Once the Startup Settings page is shown, press 4 (or F4) to select: Enable Safe Mode
  • In File Explorer, find the drive your Windows operating system is on (usually the C drive)
  • Go to: Windows > System32 (folder) > drivers (folder) > CrowdStrike (folder)
  • Search for any file matching C-00000291*.sys and delete it.
  • Restart the computer
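
For anyone who has to repeat this on several machines, the file search and delete steps above could be scripted. The following is only a rough sketch in Python, assuming Python is available on the machine while in Safe Mode and that Windows is installed on the C drive. The function name and the dry_run switch are our own inventions for illustration, not something CrowdStrike provides.

```python
from pathlib import Path

# The file pattern comes from the steps above. The default folder is an
# assumption: it presumes Windows is installed on the C: drive.
CHANNEL_FILE_PATTERN = "C-00000291*.sys"
DEFAULT_DRIVER_DIR = Path(r"C:\Windows\System32\drivers\CrowdStrike")


def remove_faulty_channel_files(
    driver_dir: Path = DEFAULT_DRIVER_DIR, dry_run: bool = True
) -> list[Path]:
    """Return the matching files; delete them only when dry_run is False."""
    if not driver_dir.is_dir():
        return []  # Folder not found, so there is nothing to do.
    matches = sorted(driver_dir.glob(CHANNEL_FILE_PATTERN))
    if not dry_run:
        for path in matches:
            path.unlink()  # Permanently deletes the file.
    return matches


if __name__ == "__main__":
    # Hypothetical usage: list what would be removed without deleting anything.
    for path in remove_faulty_channel_files(dry_run=True):
        print(f"Would delete: {path}")
```

Running it with dry_run left at True only lists the files it finds, which makes it easy to check the result before deleting anything for real.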

Hopefully that’s got you back to something more familiar.

Reference & further reading:

https://www.crowdstrike.com/falcon-content-update-remediation-and-guidance-hub/

https://blogs.microsoft.com/blog/2024/07/20/helping-our-customers-through-the-crowdstrike-outage/

https://www.crowdstrike.com/wp-content/uploads/2024/07/Tech-Alert-Windows-crashes-related-to-Falcon-Sensor-2024-07-19.pdf

 

As the dust settles – What have we learnt?

Winston Churchill is often credited with saying ‘never waste a good crisis’, and the advice holds! Every crisis comes with a set of lessons that should be looked at very carefully and acted upon.

Eggs in one IT basket

Relying on a single system is asking for trouble! Although less than 1% of Windows devices were affected, it was the major businesses that were hit the hardest, because so many of them use CrowdStrike, widely regarded as the No.1 company for endpoint security. This is why the effects were so massive. It doesn’t matter how large a company is or how good its reputation: there are not, and never will be, absolute guarantees against vulnerabilities in systems. If a monopoly fails, everything fails… not good.

Code can be dangerous

We don’t know exactly what was wrong with the code in this update, and we may never know. What we do know is that it should never have been shipped to customers. The chaos that ensued from this code error could have been a great deal worse! Bad code is dangerous: it can destroy entire systems, wipe data and be very expensive to fix. Although some may not think so right now, we got off lightly with this code error and sloppy rollout.

Test, test, test… until there’s nothing left to test!

If a Quality Assurance team doesn’t do its job properly, this is what happens. I’m sure there will be some very nervous people at CrowdStrike after the impact of this bad update. End users must feel confident that updates from software companies will not affect their lives in any way but for the better.

Staged rollouts and roll back measures

A simultaneous global rollout of an update to all of an organisation’s systems shouldn’t happen! Of course, staged rollouts bring issues of their own, such as teams working on different versions of the same software. However, when it comes to mission-critical systems where a failure is completely unacceptable, caution must be exercised with upgrades.

Procedures to roll back to a working version should be implemented and made available to users, making it easy to rectify such problems swiftly.
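
To make the idea concrete, here is a minimal sketch in Python of a staged rollout with a roll back step. It assumes machines are grouped into ‘rings’ (for example test, pilot, then everyone else), and that your own tooling supplies the deploy, health check and roll back actions. All the names are hypothetical and this is not how any particular vendor does it.

```python
from typing import Callable, Sequence


def staged_rollout(
    rings: Sequence[Sequence[str]],
    deploy: Callable[[str], None],
    is_healthy: Callable[[str], bool],
    roll_back: Callable[[str], None],
) -> bool:
    """Update one ring at a time; undo everything if a ring turns unhealthy."""
    updated: list[str] = []
    for ring in rings:
        for machine in ring:
            deploy(machine)
            updated.append(machine)
        # Only move on to the next, larger ring if this one is still healthy.
        if not all(is_healthy(machine) for machine in ring):
            for machine in reversed(updated):
                roll_back(machine)
            return False
    return True
```

The point of the design is that a bad update stops at the first small ring it breaks, so the larger rings are never touched and the machines already updated are returned to the previous version.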

A Disaster Recovery plan is a must-have!

If your business doesn’t have a robust data backup and disaster recovery plan in place, get one as soon as possible. As mentioned earlier, we got off lightly with this outage. Having your company’s digital assets backed up in trustworthy storage is vital to safeguard its continuity and ensure a swift rebound from any disaster that may befall it.

How can we protect ourselves from an IT Disaster?

This incident is certainly a stark reminder that even a well-prepared company with robust, scheduled IT maintenance can be caught unawares through no fault of its own. Because our modern IT systems are so interconnected, a single failure can have massive consequences.

Employing an expert IT support team can go a long way towards preventing extended downtime and ensuring that a business’s digital assets are safe and sound. They can maintain and monitor critical systems, and ‘react’ to potential issues quickly.

Having a business audited for potential flaws or weaknesses in its systems and protocols should be a priority for any company. Prepare an IT Disaster Recovery plan, make sure everyone is aware of it, update this plan regularly and test that it works. Remember, it’s not just issues like code errors you need to protect against! Natural disasters, hardware failures, infrastructure failures, cyber-attacks or even human error can be to blame.

Also, back up everything. Even cloud data needs to be backed up to protect it. Consult experts to scope out the specific requirements for your business.
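
One simple way to ‘test that it works’ is to check that every file in the original location also exists in the backup with identical contents. The sketch below does this in Python by comparing SHA-256 checksums. It assumes the backup is a plain folder copy you can read directly, and the function names are our own. Real backup products usually have their own verification tools, which should be preferred where available.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file, read in 1 MB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_backup_problems(source_dir: Path, backup_dir: Path) -> list[str]:
    """List source files that are missing from, or different in, the backup."""
    problems = []
    for source_file in sorted(p for p in source_dir.rglob("*") if p.is_file()):
        relative = source_file.relative_to(source_dir)
        backup_file = backup_dir / relative
        if not backup_file.is_file():
            problems.append(f"missing: {relative.as_posix()}")
        elif sha256_of(source_file) != sha256_of(backup_file):
            problems.append(f"differs: {relative.as_posix()}")
    return problems
```

An empty list means every source file was found in the backup with matching contents. Anything else is worth investigating before you need to rely on that backup.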

 

Qdos Digital Solutions – We will be your IT department. We have it covered.

For more information, go to: https://qdos.digital/networks-it-support

Contact us at: https://qdos.digital/contact-us

Or call us on: 020 8763 8732 and ask about getting IT Support for your business.

 

Supporting articles:

https://qdos.digital/blog/it-support-sos-what-do-when-disaster-strikes

https://qdos.digital/data-backup-disaster-recovery

https://qdos.digital/networks-it-support

Get Protected with Qdos Digital

CrowdStrike IT outage continues to cause global disruption | BBC News

20 Jul 2024

A massive tech failure that caused chaos around the world on Friday is continuing to cause disruption into the weekend. Cyber-security firm CrowdStrike has apologised after an update to its antivirus software - which is designed to protect Microsoft Windows devices from malicious attacks – instead caused a global outage. The outage caused thousands of flight cancellations and delays across the world, while banking, healthcare and payment systems were also affected. But, while the software bug has been fixed, experts say the manual reboot of each affected Microsoft computer will take a huge amount of work – and may take some time.
