
So what about the BSODs that device drivers cause? Are those not device driver issues, but OS issues, or are they unfixable?



They could be any of those.

In a sense, BSODs aren't anything special -- all a BSOD means is that some code running in kernel mode has crashed or raised some exception that went unhandled. The same thing, when it happens in a user-mode program, gets you the error dialog box 'Program has stopped working'.

So the causes of BSODs and user application crashes are the same. The reason Windows has BSODs is that it's dangerous to keep the system going when something in kernel mode crashes. Things running in kernel mode have access to everything (think - all memory) and are deemed important enough to the operation of the whole system that a crash in one of those is a significant event that's worthy of special logging and rebooting. You can't guarantee, for example, that a display driver crash hasn't corrupted other parts of memory, creating the potential for data loss if the system were to continue operating.

So, back to the original point. Device-driver BSODs from the big vendors are probably rare enough in general that you should suspect a hardware problem or glitch if you suddenly see one out of the blue. Graphics drivers, given their complexity, are a bit more prone to crashing though. Also, things running on the system can interact and cause the driver to crash.

Windows has lots of infrastructure in place for making sure device drivers behave safely. There are also good facilities for figuring out exactly what caused a BSOD beyond the usually cryptic-looking error code you see on the screen.

Resplendence WhoCrashed is handy: http://www.resplendence.com/whocrashed

Though if you really want to dig deep, the tools with the Windows SDK (particularly WinDbg) can let you achieve the same thing; they are developer tools though, so targeted more to that audience.

EDIT: Just to add in answer to your original comment, big-vendor graphics drivers are VERY often updated. I'd bet they're the most often updated drivers on a system. There are myriad reasons for this, both technical and competitive. That doesn't mean that long-standing problems are necessarily fixed, but both AMD and Nvidia have very regular releases with fixes and performance improvements.


Along a similar line, in my own experience, since around Windows 2000 (not ME) it's extremely rare to see a BSOD that isn't related to either bad hardware or drivers, and more often than not the culprit is the hardware behind a driver rather than the driver itself.


Another subtlety is that the term 'driver' on Windows tends to be used for any loadable module that runs in kernel mode. So a driver often isn't actually related to running a particular piece of hardware. Rather, it's a piece of software that needs kernel-mode access to the system.

Two examples that demonstrate this point well:

- There are various tools out there that you can use to perform a live memory capture on a Windows system; not just doing a memory dump of a single process, but doing a live memory dump of the whole system without having to halt or reboot. I've used one of these and it works by loading a 'driver' component when it is run that does the memory capture from kernel-mode (it requires Admin elevation to run, obviously).

(For examples, see: http://www.forensicswiki.org/wiki/Tools:Memory_Imaging -- I don't remember whether the one I tried is on this list, though.)

- Another example: A friend of mine had a system that would inexplicably BSOD if he left it running unattended for a long while (especially overnight). We initially suspected a heat issue (it was a small Intel NUC). After setting it up for full memory dumps and analyzing them in WinDbg after a BSOD occurred, we found out that the BSOD was being caused by a kernel-mode component of the anti-virus suite that he had installed -- I think at the time it was BitDefender, but I'm not sure. When he consulted the AV vendor's support website, I believe it turned out to be a known issue with a fix.
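If you're collecting dump files like this, a quick sanity check before pointing WinDbg at one is to look at the first few bytes. A rough Python sketch, assuming (as far as I know) that kernel crash dumps start with the ASCII signature PAGEDU64 (64-bit) or PAGEDUMP (32-bit) and user-mode minidumps start with MDMP. The function name is just something I made up for illustration:

```python
def classify_dump_header(header: bytes) -> str:
    """Guess the dump type from the first 8 bytes of a file.

    Assumption: kernel dumps begin with b"PAGEDU64" or b"PAGEDUMP",
    and user-mode minidumps begin with b"MDMP".
    """
    if header[:8] == b"PAGEDU64":
        return "64-bit kernel dump"
    if header[:8] == b"PAGEDUMP":
        return "32-bit kernel dump"
    if header[:4] == b"MDMP":
        return "user-mode minidump"
    return "unknown"
```

To use it on a real file, read the first 8 bytes in binary mode (open(path, "rb").read(8)) and pass them in. It only tells you what kind of file you have; the actual analysis still happens in WinDbg.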

On my own systems, by far the largest cause of BSODs (of the few that I've seen over the last couple of years) has been RAM going bad. These typically manifest as BSODs out of the blue that seem to come from different modules each time they happen, or they come from a module deep in the system that 'shouldn't' have crashes. My personal rule is, if I see one, be vigilant. If I see another one, reboot and run MemTest86.
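That rule of thumb is simple enough to write down. A minimal Python sketch, assuming you've already noted the faulting module name from each crash (e.g. from WhoCrashed or WinDbg output). The function name and the wording of the returned advice are my own, and the "same module" branch is an extra hint I added rather than part of the rule:

```python
def triage(faulting_modules):
    """Apply the 'one crash: watch; two crashes: test RAM' rule of thumb.

    faulting_modules: list of module names, one per observed BSOD.
    """
    crashes = [m.lower() for m in faulting_modules]
    if not crashes:
        return "nothing to do"
    if len(crashes) == 1:
        return "be vigilant"
    # Two or more crashes: test the RAM either way.
    if len(set(crashes)) > 1:
        # Different modules each time is the pattern bad RAM tends to produce.
        return "reboot and run MemTest86 (different modules: RAM is a strong suspect)"
    # Assumption: a repeat offender is also worth checking as a driver problem.
    return "reboot and run MemTest86 (same module each time: also check that driver)"
```

For example, triage(["ntoskrnl.exe", "win32k.sys"]) lands in the "different modules" branch, which is the case where I'd be most suspicious of memory.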


In practice, other than the Windows kernel, the only things running in kernel space tend to be drivers, AV software and malicious code.



