Fault in CrowdStrike caused airports, businesses and healthcare services to languish in ‘largest outage in history’

Services began to come back online on Friday evening after an IT failure that wreaked havoc worldwide. But full recovery could take weeks, experts have said, after airports, healthcare services and businesses were hit by the “largest outage in history”.

Flights and hospital appointments were cancelled, payroll systems seized up and TV channels went off air after a botched software upgrade hit Microsoft’s Windows operating system.

It came from the US cybersecurity company CrowdStrike, and left workers facing a “blue screen of death” as their computers failed to start. Experts said every affected PC may have to be fixed manually, but as of Friday night some services started to recover.

As recovery continues, experts say the outage underscored concerns that many organizations are not well prepared to implement contingency plans when a single point of failure such as an IT system, or a piece of software within it, goes down. But these outages will happen again, experts say, until more contingencies are built into networks and organizations introduce better back-ups.

  • ByteOnBikes@slrpnk.net · ↑23 · 4 months ago

    I’m actually pretty excited to go to work on Monday.

    We have spent the past few years hardening our security and simplifying our critical systems. One way of doing that was to move as much off Microsoft as possible.

    And since I’ve been on vacation for the past week, I’m either going to walk into a nightmare shit show or everyone is going to be cheering that we are fully operational since we don’t depend on Microsoft.

      • ByteOnBikes@slrpnk.net · ↑2 · edited · 4 months ago

        I’m not on that team. For all I know, our password manager might be on a random Windows server. Or some middleware.

        In a major company where each team does its own thing and communicates through endpoints, it’s impossible to know every configuration.

      • ByteOnBikes@slrpnk.net · ↑2 · 4 months ago

        Was thinking about this (a week later).

        Following up, our partner/affiliate sites were down. Each partner connects to us to submit data, and half were government contracts that were down. It didn’t affect our systems, but it affected how we provide services to them.

        So it was a mild shit show.