The CrowdStrike Windows outage that hit the world this week stems back to an EU-Microsoft deal from 2009 that meant Microsoft had to give antivirus vendors the same Windows API access it had.
We all hate Microsoft for turning Windows into an ad platform but they aren’t wrong.
They are legally required to give Crowdstrike or anyone complete low level access to the OS. They are legally required to let Crowdstrike crash your computer. Because anything else means Microsoft is in control and not the software you installed.
It’s no different than Linux in that way. If you install a buggy device driver on Linux, that’s your/the driver’s fault, not Linux.
You are not wrong, but people don’t want to hear it. Do we want to retain control over what goes into kernel space or not? If so, we have to accept that whatever we stuff in there can crash the entire thing. That’s why we have stuff like driver signatures. Which Crowdstrike apparently bypassed with a technical loophole from how I understand it.
The thing is, Microsoft’s virus-scanning API shouldn’t be able to BSOD anything, no matter what third-party software makes calls to it, or the nature of those calls. They should have implemented some kind of error handler for when the calls are malformed.
So this is really a case of both Crowdstrike and Microsoft fucking up. Crowdstrike shoulders most of the blame, of course, but Microsoft really needs to harden their API to appropriately catch errors, or this will happen again.
I’m an idiot. For some reason, I was thinking about the Windows Defender API, which can be called from third-party applications.
I don’t believe there was any specific API in use here, for virus scanning or not. I suppose maybe the device driver API? I am not a kernel developer so I don’t know if that’s the right term for it.
Crowdstrike’s driver was loaded at boot and caused a null pointer dereference error, inside the kernel. In userspace, when this happens, the kernel is there to catch it so only the application that caused it crashes. In kernelspace, you get a BSOD because there’s really nothing else to do.
I stand corrected. For some reason, I was thinking they used the actual Windows Defender API, which can be called programmatically from third-party applications, but you’re correct, it was a driver loaded at boot. Microsoft isn’t at all at fault, here.
Nope. It’s a lower level kernel API that has to be accessed at boot via a driver. The API I was thinking of - and I use the term “thinking” loosely, here - is an API that userspace applications can take advantage of to scan files after boot is already complete.
I actually agree, I own my computer / OS and I should be able to do what you’re saying (install and break things). But Microsoft is a trillion dollar multi national corporation and I am certainly going to give them grief about this because I owe them less than nothing, let alone any good will.
But what if Windows have something similar to eBPF in Linux, and CS opted to use it, will this disaster won’t happen at all or in a much smaller scale and less impactful?
If you load hacky shit into the kernel it can always find a way to make a nasty surprise. eBPF is a little bit better fence, not some miracle that automatically fixes shitty code.
We all hate Microsoft for turning Windows into an ad platform but they aren’t wrong.
They are legally required to give Crowdstrike or anyone complete low level access to the OS. They are legally required to let Crowdstrike crash your computer. Because anything else means Microsoft is in control and not the software you installed.
It’s no different than Linux in that way. If you install a buggy device driver on Linux, that’s your/the driver’s fault, not Linux.
You are not wrong, but people don’t want to hear it. Do we want to retain control over what goes into kernel space or not? If so, we have to accept that whatever we stuff in there can crash the entire thing. That’s why we have stuff like driver signatures. Which Crowdstrike apparently bypassed with a technical loophole from how I understand it.
The thing is, Microsoft’s virus-scanning API shouldn’t be able to BSOD anything, no matter what third-party software makes calls to it, or the nature of those calls. They should have implemented some kind of error handler for when the calls are malformed.So this is really a case of both Crowdstrike and Microsoft fucking up. Crowdstrike shoulders most of the blame, of course, but Microsoft really needs to harden their API to appropriately catch errors, or this will happen again.I’m an idiot. For some reason, I was thinking about the Windows Defender API, which can be called from third-party applications.
I don’t believe there was any specific API in use here, for virus scanning or not. I suppose maybe the device driver API? I am not a kernel developer so I don’t know if that’s the right term for it.
Crowdstrike’s driver was loaded at boot and caused a null pointer dereference error, inside the kernel. In userspace, when this happens, the kernel is there to catch it so only the application that caused it crashes. In kernelspace, you get a BSOD because there’s really nothing else to do.
https://youtube.com/watch?v=wAzEJxOo1ts
I stand corrected. For some reason, I was thinking they used the actual Windows Defender API, which can be called programmatically from third-party applications, but you’re correct, it was a driver loaded at boot. Microsoft isn’t at all at fault, here.
Isn’t that API what the article is talking about?
Nope. It’s a lower level kernel API that has to be accessed at boot via a driver. The API I was thinking of - and I use the term “thinking” loosely, here - is an API that userspace applications can take advantage of to scan files after boot is already complete.
I actually agree, I own my computer / OS and I should be able to do what you’re saying (install and break things). But Microsoft is a trillion dollar multi national corporation and I am certainly going to give them grief about this because I owe them less than nothing, let alone any good will.
You are going to give grief to Microsoft for allowing what you want?
???
That doesn’t make any sense. How does arguing against your position do anything but harm it?
Maybe just give them grief over the myriad negative things they do that don’t counter your position?
But what if Windows have something similar to eBPF in Linux, and CS opted to use it, will this disaster won’t happen at all or in a much smaller scale and less impactful?
Crowdstrike managed to fuck up Linux through eBPF just as well.
https://access.redhat.com/solutions/7068083
If you load hacky shit into the kernel it can always find a way to make a nasty surprise. eBPF is a little bit better fence, not some miracle that automatically fixes shitty code.
But these eBPF loader bugs are fixed now. Windows drivers are still causing BSODs and will continue to do so until Microsoft adopts eBPF.
🤦♂️
Sorry, how is that related to the stability of the kernel?
I explained in my second sentence.
“They are legally required to give Crowdstrike or anyone low level access to the OS.”
If you install a buggy driver into Linux and it crashes, that’s not a problem with the Linux kernel.
https://www.redhat.com/sysadmin/linux-kernel-panic
I fully agree with you on that front, but ads have nothing to do with kernel access, so how is that relevant to their legal requirements?
I was explaining why everyone hates on Microsoft but the Crowdstrike crash had nothing to do with the reasons people hate MS.
Gotcha.