Nvidia GPUs can break Chrome's incognito mode (charliehorse55.wordpress.com)
474 points by charliehorse55 on Jan 9, 2016 | 166 comments



Previous discussion on the same subject, about a post written by me: https://news.ycombinator.com/item?id=9245980

Basically, this issue is not restricted to Nvidia GPUs or to specific operating systems - it can be reproduced on Windows, Linux, and OS X. The concept of memory safety essentially does not exist in GPU space, which is why the WebGL standard is so strict about always zeroing buffers. The problem of breaking privacy and privilege boundaries on a multiuser system is very real, and there is no workable solution. This seems to be one of those problems where a lot of people are aware of it, but no one is sure how to fix it, so it just stays the way it is.


This is an issue I raised with colleagues back in the late '90s. At the time only high-end graphics systems (SGI) had hardware graphics contexts, so one OpenGL app could crash or scribble all over another OpenGL application's canvas.

The truth is a GPU is an entire second computer attached via the PCIe bus. As far as security is concerned, this will continue to be a shit-show until we accept that fact and act accordingly.


How do we "act accordingly"?


By providing memory safety on the GPU in either software or hardware, or most likely some combination of both, just as we do with main memory now.


This isn't an issue anymore; GPUs now have MMUs that isolate shaders from memory they aren't supposed to touch.

OP's problem stems from the fact that a video buffer used by his browser wasn't cleared after deallocation. At some later time that buffer was either erroneously displayed instead of the game's buffer, or allocated to the game, which erroneously displayed it without filling it with new content.


You might be able to use the APIs designed for implementing DRM. There is a D3D11_RESOURCE_MISC_HW_PROTECTED flag you can pass when creating a surface.
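
For illustration, a minimal sketch of how that flag would be used, assuming a Windows 10-era SDK that exposes it (this is just the surface creation, not a complete protected-content pipeline):

    #include <d3d11.h>

    // Sketch: ask the driver for a hardware-protected 2D render target.
    // Protected surfaces are meant to be unreadable through ordinary
    // CPU/GPU paths, which is the property DRM pipelines rely on.
    HRESULT CreateProtectedSurface(ID3D11Device* device, UINT width, UINT height,
                                   ID3D11Texture2D** outTex)
    {
        D3D11_TEXTURE2D_DESC desc = {};
        desc.Width            = width;
        desc.Height           = height;
        desc.MipLevels        = 1;
        desc.ArraySize        = 1;
        desc.Format           = DXGI_FORMAT_B8G8R8A8_UNORM;
        desc.SampleDesc.Count = 1;
        desc.Usage            = D3D11_USAGE_DEFAULT;
        desc.BindFlags        = D3D11_BIND_RENDER_TARGET;
        desc.MiscFlags        = D3D11_RESOURCE_MISC_HW_PROTECTED;
        return device->CreateTexture2D(&desc, nullptr, outTex);
    }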


> there is no workable solution

Forgive my ignorance (not a graphics programmer), but why can't the drivers simply clear the buffer before handing it off to another application?


Pretty sure it has to do with benchmarks and the cut-throat competitive environment GPU manufacturers exist in, where you cut every corner you can to proclaim "We're the fastest!"

Not zeroing a buffer cuts a big constant out of the overhead. If you know which benchmarks will fail if you don't zero the buffer, you code in an "exception" so the benchmark doesn't fail, while other applications act wonky. This isn't the first time Nvidia has been caught doing this; see:

http://www.geek.com/games/is-nvidia-cheating-on-benchmarks-5...


Apparently, AMD also partakes in benchmark-specific "optimizations" [1]. Transparency is why many of us push for open source drivers.

http://www.cdrinfo.com/Sections/News/Details.aspx?NewsId=288...


Both AMD and Nvidia drivers have special code paths for different applications. I don't think it's anything sinister, since these are mostly fixes for game bugs and the rest are resolutions of API ambiguities.

To give an example, consider the difference between memcpy() and memmove(). On most systems memcpy() behaves the same as memmove(), in the sense that it works even when the source and destination overlap. Then you decide to optimize memcpy(), and to prevent bugs like this https://bugzilla.redhat.com/show_bug.cgi?id=638477 you need to set a flag like USE_MEMMOVE_INSTEAD_MEMCPY for every app you know memcpy()s between overlapping regions. You could call this "cheating", or you could be a reasonable person and say something like this https://bugzilla.redhat.com/show_bug.cgi?id=638477#c129 instead.
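
A toy illustration of the kind of app bug such a flag papers over - an overlapping copy, where only memmove() is defined to work:

    #include <cstdio>
    #include <cstring>

    int main() {
        char buf[] = "abcdefgh";
        // Shift the string left by two bytes. Source and destination overlap,
        // so only memmove() is guaranteed to handle this; an "optimized" memcpy()
        // that copies backwards or in wide chunks can silently corrupt the data.
        memmove(buf, buf + 2, strlen(buf + 2) + 1);   // prints "cdefgh"
        // memcpy(buf, buf + 2, strlen(buf + 2) + 1) here would be undefined behaviour.
        printf("%s\n", buf);
        return 0;
    }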

As for the original question: I am not an expert on the Windows driver model, but I have written some GPU drivers and can say that (a) memory release is asynchronous, i.e. you cannot reuse the memory until the GPU finishes using it, and (b) clearing graphics memory from the CPU over PCIe is slow, and drivers, in general, do not program the GPU on their own. Taking these into account, it seems the driver is not well positioned to do this, and it is a task for the OS instead.


"I don't think it's anything sinister since these are mostly fixes for the game bugs and the rest are resolutions for API ambiguities." I think it is big problem. It is the same as forcing intel to change their CPU to workaround bugs in your application.


This analogy is pretty spot on, actually. It's a result of a long process of software evolution that went awry and this is a big reason why we need new APIs like Mantle, Vulkan and DirectX 12.

See this fascinating post:

http://www.gamedev.net/topic/666419-what-are-your-opinions-o...


You mean something like this https://en.wikipedia.org/wiki/A20_line ?


One could argue that address overflow above 1MB was not a bug, but a feature of the early real-mode CPUs and hence (ab)using it wasn't really a bug either.

Intel probably didn't anticipate protected mode with its 24-bit address bus when designing the 8086. 1 MB was enough for everyone at the time.


Exactly my point. Intel was "forced" to fix a bug in software by changing its hardware. The A20 gate was not there to prevent programs from accessing the HMA; it was there to accommodate programs that generated addresses above 0xFFFFF and expected them to wrap around.


This probably won't happen, but it seems that game programmers are a large cause of problems for driver writers. Having to work around bugs in games is bad for everyone.

Game studios, IMO, should be made to fix their bugs themselves. They all have patching mechanisms these days, so it's not like it's impossible, or even infeasible.


Fixing this class of problems at the API level is currently being worked on with DX12 and Vulkan. The point is to remove a huge chunk of the abstraction provided by DX/OpenGL and thus force the developer to write more sensible code.

Currently the engine developer in graphics programming writes something and in reality has no way of knowing what actually happens on the hardware (the API is just too high level to be able to really know much). From there it is the hardware vendor's job to break out their own debugging tools and make sure the correct things happen by adding a custom code path in the driver.


It's a bit of the opposite, actually. There was a great article posted here (titled "Why I'm excited for Vulkan") where they explain how the proprietary "tricks" GPU vendors use account for much of the need for game-specific driver updates and optimizations. Game patches are to game bugs what driver updates (or "game profiles") are to what?

Lower level APIs like DX12 and Vulkan remove the competitive advantage vendor dependent performance creates, so well-coded games can perform consistently with lower overhead across ranges of hardware without having to rely on vendors to patch in the shortcuts through their drivers.

Currently, it's like filming a movie with IMAX specifications, then finding out that at different cinema chains it played with quality aberrations because their projectors didn't truly follow IMAX spec. The chains can fix it, but you're already getting blamed for the movie's issues. However, for a little money, on your next film they offer to work closely with you to ensure it shows the way you intended in their theaters. And no, they can't just tell you how to fix it-- their projection technology is a trade secret.


Eh, given the constraints as you spell them out, it seems a clear at release time, driven by a shader, could work.


This is probably because my explanation is very brief. I don't see how a shader (a program running on the GPU) can detect that the OS has killed a process and initiate a clear.


A shader is a program executed by the GPU and can manipulate memory; the driver can create a fake surface out of the freed memory and run a clearing shader on it (which would avoid the need to zero the memory from the CPU through PCIe).


Well, this is the whole point - how does the driver know which memory has been freed, and how does the driver run a shader by itself?


When you do a release on a texture object, when the context is destroyed, when glDeleteTextures is called... you just have to enumerate them all; eventually every call is passed to the graphics driver to be translated into GPU operations.
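
For illustration, an application-level sketch of that idea using OpenGL 4.4's glClearTexImage - a driver-side version would do something equivalent internally whenever memory is returned to the free pool (this assumes a 4.4+ context and a loader such as GLEW):

    #include <GL/glew.h>

    // Zero a texture's storage, then release it. Passing nullptr as the data
    // pointer fills the image with zeros.
    void scrub_and_delete(GLuint tex)
    {
        glClearTexImage(tex, /*level=*/0, GL_RGBA, GL_UNSIGNED_BYTE, nullptr);
        glDeleteTextures(1, &tex);
    }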


It's the same as saying that an HDD driver can zero deleted files and delete temporary files when a process is killed, because it translates API calls into HDD controller commands.


The comparison is apt because it would be exactly like the TRIM operation, retrofitted into the protocol for supporting drivers.


So, in your opinion the driver issues TRIM, not the OS? Then it would be possible to get it on, say, Vista with a driver update, wouldn't it?


Yes, the idea is that the manufacturer is in the best position to know the most efficient way to talk to its GPU, and the driver knows everything that's happening memory-wise. It'd be interesting to have a prototype done in some open source Linux driver. Tbh, I'm probably not good enough for that.


So why is there no TRIM support in Vista, or in Win7 for NVMe? Vista and Win7 use the same drivers, FYI.


Sorry, I missed 'So, in your opinion the driver issues TRIM, not the OS' in the previous reply. I never said that, and that was not my point.

My point was that an optional post-delete cleanup feature was added to the protocol, ready to be used, which is a perfect example of how to evolve long-term features. Then I said the post-delete cleanup feature for the GPU should sit in the driver, since the GPU driver is the one that knows how to talk to the hardware (there is no shared protocol between boards, except VGA modes etc., but those contexts are memory-mapped and OS-managed) and it knows when a release is performed, since all operations go through it.


As I said, the driver does not know when to clear, just as an HDD driver does not know when a file is deleted.


I don't get it, though. How can the reason be due to benchmarks / performance seeking?

The driver simply has to zero the buffer when the new OpenGL / graphics context is established. It's once per application establishing a context, not per-frame (the application is responsible for per-frame buffer clearing and the associated costs). At worst this would lengthen the amount of time a GPU-using application takes to start up and open new viewports, but that hardly seems like it would matter or even register on any benchmarks.


The thing is, it's probably not once per application. I'd imagine using multiple framebuffers in an application is actually quite common, and they could change quite often while an application is running, especially in complex applications like games. It's probably not enough of a hit to really justify not clearing the buffers, but it's enough to be noticeable in the benchmark race.


Across an entire computer system? For all applications? Even games? Sharing data across process boundaries is undesirable, but is something most computer users would accept if the alternative was reduced performance.

Why not just fix this in the browser? The real issue here is that this data isn't just being shared across processes but potentially with websites through malicious webgl.


If you can waste the time on allocating a buffer, you can waste the time on zeroing it. If you're in a hot loop you shouldn't be allocating giant chunks of memory.


It would be interesting to know how the cost of allocating a new fbo would compare to the cost of zeroing it out. My guess is that the cost of getting into the kernel to do the allocation in the first place would dominate, but by how much would be something neat to measure.
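
A crude way to measure it, assuming a current GL context and a loader such as GLEW are already set up (glFinish forces the GPU work to complete before timing, which is not how you'd benchmark in production, but it gives a ballpark):

    #include <GL/glew.h>
    #include <chrono>
    #include <cstdio>

    // Rough timing of renderbuffer allocation vs. clearing it.
    void time_alloc_vs_clear(int w, int h)
    {
        using clk = std::chrono::steady_clock;
        GLuint fbo, rb;

        auto t0 = clk::now();
        glGenFramebuffers(1, &fbo);
        glGenRenderbuffers(1, &rb);
        glBindRenderbuffer(GL_RENDERBUFFER, rb);
        glRenderbufferStorage(GL_RENDERBUFFER, GL_RGBA8, w, h);
        glBindFramebuffer(GL_FRAMEBUFFER, fbo);
        glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
                                  GL_RENDERBUFFER, rb);
        glFinish();
        auto t1 = clk::now();

        glClearColor(0, 0, 0, 0);
        glClear(GL_COLOR_BUFFER_BIT);
        glFinish();
        auto t2 = clk::now();

        auto us = [](auto a, auto b) {
            return (long long)std::chrono::duration_cast<std::chrono::microseconds>(b - a).count();
        };
        printf("alloc: %lld us, clear: %lld us\n", us(t0, t1), us(t1, t2));
    }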


If it's costing a lot of time to clear a buffer, doesn't that tend to indicate that's something the video card manufacturer should design an enhancement or fix for?


> This seems to be one of those problems where a lot of people are aware of it, but no one is sure how to fix it, so it just stays the way it is.

Well, MMUs on GPUs have been standard for a while. They just need to use them properly and at least have an opt-in mechanism to enforce zeroing of newly allocated pages.


If your mechanism is to be opt-in then it should clear deallocated pages :)


Indeed, you'd think each "app" (set of interacting processes; container; whatever) could have its own virtual GPU. But how would a compositing window manager work in that setup?


That _already_ requires sharing ownership of those GPU mappings, and you need to keep track of ownership at that level, or how else will the compositor know that the app is done drawing its window? If you don't wait, you can end up with tearing or even garbage in windows.


The same way shared memory works on general purpose computers.

Your porn browser declares it wants to share its window buffer with the WM, the kernel maps the buffer into WM's GPU address space and now WM's shaders can read from this buffer.


Indeed, depending on the drivers used, the VRAM contents can even survive across reboots. I saw this with some drivers on Linux some years ago; I'm not sure whether they still leave VRAM uncleared on boot.


It is still possible to inspect previous framebuffers after rebooting.


Oddly enough, this used to be a feature—in the system architectures of 30-40 years ago.

A lot of consoles had no concept of the CPU being the final arbiter of whether the system was running or halted—things like the PPU and SPU and so forth would just continue to merrily loop on even if the CPU halted, or was executing a CPU-reboot instruction.

You can see this in many NES and SNES games, where the game will "soft-lock": the CPU crashes, but the music (being a program running on the SPU) keeps playing, and the animations on the screen (being programs running on a PPU, or dedicated mappers feeding it) keep animating.

But this isolation can also be used deliberately, especially where the framebuffer is concerned. Since systems up until the 1990s-or-so had extremely small address-spaces ("8-bit" and "16-bit" are scary terms when you're trying to write a complex program), and since console games were effectively monolithic unikernels (even the ones "running on" OSes like DOS: DOS would basically kexec() the game), frequently a game's ROM or size-on-disk would exceed the capacity of the address space to represent it.

The solution to this was frequently to actually have several switchable ROM banks or several on-disk binaries, and to effectively transparently restart the system to switch between them. This isn't equivalent to anything as "soft" as kexec(); you wanted the CPU's state reset and main memory cleared, so your newly-loaded module could immediately begin to use it. Any state you wanted to preserve between these restarts would be stored on disk, or in battery-backed-RAM on a cartridge.

This is how C64 games managed to fit a rich-looking splash screen into their game: the splash screen was one program, and the game was another, and the splash screen would stay on the framebuffer while the C64 was rebooting into the game.

This is also the architecture of games like Final Fantasy 6 and 7: when the credits list developers' roles as something like "menu program", "battle program", or "overworld program", those aren't mistranslations of "programming"—those were literally separate programs that the console rebooted between, hopefully finishing in the time it took for the console to finish executing a fade-out on the PPU. When a battle starts in a Final Fantasy game, the CPU has been reset and main memory has been entirely cleared; everything the game knows about to run the battle is coming from SRAM. (And the reason the Chrono Trigger PSX port feels so laggy is that CT has this architecture too, but reading a binary from a CD takes a lot longer than switching a ROM bank. Games designed in the PSX era took that into consideration, but ports generally didn't.)

I've always thought it'd be cool to re-introduce this idea to game development, through a kind of abstract machine in the same sense as Löve or Haxe. You'd have a thin "graphical terminal emulator" that would contain PPU and SPU units and the SRAM, and would be controlled through an exposed pipe/socket (sort of like a richer kind of X server); and then you'd write a series of small programs that interact with that socket, none of them keeping any persistent state (only what they can read out of the viewer's SRAM), all of them passing control using exec().

(There's another thing you'd get from that, too: the ability to write strictly-single-threaded, "blocking" programs that nevertheless seemed not to block anything important like frame-rendering or music playing. You know how "Pause" screens worked in most games? They just threw up an overlay onto the PPU and then stuck the CPU into a busy-wait loop looking for an 'unpause' input. The game's logic wouldn't continue, but the game would still "be there", which was just perfect. This also allowed for "synchronous" animations—like Kirby's transformations in Kirby Super Star, or summoning spells in the FF games, or finishers in fighting games—to just run as a bit of blocking code on top of whatever was currently on the screen, without worrying that something would change state out from under them.)


The C64 showed a similar effect to the one described in the original post. To do a full reset you had to turn it off, wait about 5 seconds, and turn it on again. If you did not wait, the RAM was at least partly preserved. The C64 had no dedicated video RAM; the VIC just read from regular RAM. So if you did the power cycle quickly enough, the screen was preserved.

Another effect of the screen RAM being regular RAM was that you could actually run programs in it while it was being displayed. You could watch a program run in the literal sense. This was often used by unpackers. The unpacker ran in the screen RAM, filling the rest of the RAM. After its job was done, the game started and filled the screen RAM with graphics, destroying the unpacker.

EDIT: Found a video showing activity in the screen RAM of a C64. I'm not sure if this is the unpacker or really code execution, but it looks similar to how I remember it.

https://www.youtube.com/watch?v=5nDzFsCEZT8&feature=youtu.be...


That's neat. But did that only work then with games that had writable storage media in the cartridge? I know that was rare on NES games.

Or was there secondary memory besides the RAM and disk that allowed for data to be passed between resets?


Yes, there was battery-backed SRAM, and then there was regular SRAM. Regular SRAM was volatile: it would be guaranteed to survive the CPU reboot instruction, but wouldn't survive poweroffs. Only some consoles had it; these were usually the same ones with a really slow bus speed for the battery-backed SRAM. (And other consoles, like the C64, actually didn't clear main memory on CPU reboot; on these consoles, you'd instead manually cycle through clearing everything except what you wanted to keep, and then reboot.)

In later consoles that had their own MMUs, like the PSX, this wasn't a full hardware feature anymore, but rather a simulated convention. You'd "reboot" by dropping all your virtual-memory mappings except one, then asking the disk to async-fill some buffers from the binary you wanted to launch and then mapping those pages and jumping into them on completion. (Basically like unloading one DLL and then loading a different one, except you're also forcefully dropping all the heap allocations the old DLL made when you unload it.)

In both the hard and soft implementations, the "volatile SRAM" page could be thought of as basically a writeback cache for the state in the actual battery-backed SRAM. You wouldn't want to do individual byte-level writes to SRAM (writes to SRAM were slowwww), so when the game booted, you'd mirror SRAM to your state-page, and then update the state-page whenever you had something you wanted to persist—finally dumping it out to battery-backed SRAM when the player hit "Save". Basically, most games were "auto-saving" from the beginning—but they were auto-saving to volatile memory.

But even games that had true "auto-saving", like Yoshi's Island, still kept an "SRAM buffer" like this; the write-to-SRAM event was just managed as a sequence of smaller bursts of memory-mapped IO done by modules that sat there playing music+animations and not doing any logic, like YI's "field/title card" submodule when re-entered from a loss-of-life event, or its "field/score card" submodule entered by completing a stage. If there is ever what seems to be a "pointlessly long" animation in an auto-saving game of that era near a state-transition, it's probably by design, to cover for an SRAM cache-flush. (The fact that flashy "rewarding" animations turned out to also be good game design, favored by slot-machine and casual-game designers the world over, is mostly coincidence.)

---

ETA: when you had neither kind of SRAM, but still wanted to preserve some state across a reboot, what could you do? Well, write it to video memory, of course!

On a system with a PPU, the PPU owned the VRAM; it wasn't the CPU's job to reset it, but rather the PPU's. On a system with only a framebuffer, nothing owned the framebuffer (or rather, the framebuffer was an inherited abstraction from the character buffers of teletypes: the "client" owned the framebuffer, so it was up to the "client" to erase it. Restarting a mainframe shouldn't forcibly clear all its connected teletypes; disconnecting from an SSH session shouldn't forcibly clear your terminal, but rather optionally clear your terminal as a way your TTY driver is set to respond to the signal; etc.)

Either way, if you could get your data into VRAM generally or the framebuffer specifically, you could very likely read it back after reboot.

VRAM is also where many "online" development suites—the BASICs and Pascals of the time—expected you to write exception traces. Rather than trying to "break into" a debugger (i.e. cram a debugger into the same address space as your software), you'd simply have your trap-handler persist the stack-trace to VRAM, switch your tape drive over to the devtools disk, and reboot. The "monitor" would load, notice that there's a stack-trace in VRAM, parse it, read the pages it mentions from your program (now on the slave tape) and display them.


Thank you, that was incredibly interesting!


Scanning the comments, it seems broken that there's no way for an end user to clear GPU memory. A way that works across platforms, or at least across drivers, would be ideal.

If it can persist across restarts... does a shutdown do anything differently? I'm not too familiar with GPU programming, but there has to be something we can do.


Yeah, I remember reading that when it was first posted. Glad to see other people are aware, but it's disappointing nothing has been done.

The fix is pretty simple: the GPU manufacturer just needs to update their driver to zero the VRAM like an OS would with RAM.


Or should the window manager zero things out? It seems more consistent for the OS to be responsible for this than, say, the driver, which also does things like supporting OpenGL and GPGPU use cases.


You can generally allocate GPU buffers without the window manager, so to be effective this would have to be in the kernel driver. OpenGL/GPGPU is generally handled in the user-space driver, so it would still be separate.


When graphics card vendors want to protect their competitive edge and not do OS integration beyond dropping horrible and opaque binary blobs every now and then, they'll also have to mop up the security issues such as this on their own.

Technically this should be an OS responsibility, but practically the vendors have made that all but impossible.


> The fix is pretty simple: the GPU manufacturer just needs to update their driver to zero the VRAM

You seem to be using OSX (judging by the screenshots).

You should be aware that OSX's GPU drivers are written by Apple (or at least they act as the gatekeeper). You need to send the bug report to them. And perhaps update the title of your post to "Apple breaks..."

I've seen this exact same behavior on OSX with an Intel GPU.

As mentioned elsewhere in this post, Windows WDDM drivers, for example, require memory to be zeroed out.


Are device drivers ever updated, much less for security issues?

Seems like an obvious way to increase the take, in any event: "Chip X has a security flaw. Chip Y does not. Buy chip Y now or Evil People will own-zor all your cash-zors."


Since we're talking about GPUs... search the NVIDIA archive [1], and look for the WHQL drivers:

    Version: 359.00 - Release Date: Thu Nov 19, 2015
    Version: 358.91 - Release Date: Mon Nov 09, 2015
    Version: 358.87 - Release Date: Wed Nov 04, 2015
    Version: 358.50 - Release Date: Wed Oct 07, 2015
[1] http://www.nvidia.com/drivers/beta


So what about the BSODs that device drivers cause? Are those not device driver issues, but OS issues, or are they unfixable?


They could be any of those.

In a sense, BSODs aren't anything special -- all a BSOD means is that some code running in kernel mode has crashed or raised some exception that went unhandled. The same thing, when it happens in a user-mode program, gets you the error dialog box 'Program has stopped working'.

So the causes of BSODs and user application crashes are the same. The reason Windows has BSODs is that it's dangerous to keep the system going when something in kernel mode crashes. Things running in kernel mode have access to everything (think - all memory) and are deemed important enough to the operation of the whole system that a crash in one of those is a significant event that's worthy of special logging and rebooting. You can't guarantee, for example, that a display driver crash hasn't corrupted other parts of memory, causing potential for data loss if the system were to continue operating.

So, back to the original point. Device-driver BSODs from the big vendors are probably rare enough in general that you should suspect a hardware problem or glitch if you suddenly see one out of the blue. Graphics drivers, given their complexity, are a bit more prone to crashing though. Also, things running on the system can interact and cause the driver to crash.

Windows has lots of infrastructure in place for making sure device drivers behave safely. There's also good facilities for figuring out exactly what caused a BSOD beyond the usually cryptic-looking error code you see on the screen.

Resplendence WhoCrashed is handy: http://www.resplendence.com/whocrashed

Though if you really want to dig deep, the tools with the Windows SDK (particularly WinDbg) can let you achieve the same thing; they are developer tools though, so targeted more to that audience.

EDIT: Just to add in answer to your original comment, big-vendor graphics drivers are VERY often updated. I'd bet they're the most often updated drivers on a system. There are myriad reasons for this, both technical and competitive. That doesn't mean that long-standing problems are necessarily fixed, but both AMD and Nvidia have very regular releases with fixes and performance improvements.


Along a similar line, in my own experience, since around Windows 2000 (not ME) it's extremely rare to see a BSOD that isn't related to either bad hardware or drivers - and more often than not it's the hardware behind a driver rather than the driver itself.


Another subtlety is that the term 'driver' on Windows tends to be used for any loadable module that runs in kernel mode. So a driver often isn't actually related to running a particular piece of hardware. Rather, it's a piece of software that needs kernel-mode access to the system.

Two examples that demonstrate this point well:

- There are various tools out there that you can use to perform a live memory capture on a Windows system; not just doing a memory dump of a single process, but doing a live memory dump of the whole system without having to halt or reboot. I've used one of these and it works by loading a 'driver' component when it is run that does the memory capture from kernel-mode (it requires Admin elevation to run, obviously).

(For examples, see: http://www.forensicswiki.org/wiki/Tools:Memory_Imaging

I don't remember if it was one from this list that I tried, though.)

Another example: A friend of mine had a system that would inexplicably BSOD if he left it running for a long while, unattended (especially overnight). We initially suspected perhaps a heating issue (it was a small Intel NUC). After setting up for full memory dumps and then analyzing them after a BSOD occurred using WinDbg, we actually found out that the BSOD was being caused by a kernel-mode component of the anti-virus suite that he had installed -- I think at the time it was BitDefender, but not sure. When he consulted the AV vendor support website, I believe it turned out to be a known issue with a fix.

On my own systems, by far the largest cause of BSODs (of the few that I've seen over the last couple of years) has been RAM going bad. These typically manifest as BSODs out of the blue that seem to come from different modules each time they happen, or they come from a module deep in the system that 'shouldn't' have crashes. My personal rule is, if I see one, be vigilant. If I see another one, reboot and run MemTest86.


In practice, other than the Windows Kernel, the only things running in kernel space tend to be drivers, a/v software and malicious code.


> Are device drivers ever updated, much less for security issues?

Yep, all the time. http://i.imgur.com/8k3ffLa.png


People who often play 3D games always update their graphics drivers :)


Open source GPU drivers on Linux clear all allocations the kernel driver hands to userspace. And where it exists, different clients are also isolated from each other through the GPU MMU. On top of that, all drivers guarantee that no GPU client can escape the GPU sandbox into general system memory, and on chips where the hardware engineers didn't provision any useful hardware support to enforce this, it is done by pretty costly GPU command stream parsing in the kernel.

You can't opt out of these security features on upstream/open source linux drivers either.

Now of course this won't insulate different tabs in Chrome, since Chrome uses just one process for all 3D rendering. But GL_ARB_robustness guarantees, plus WebGL requiring that you clear textures before handing them to webpages, mean that should work too. On top of that, WebGL uses GL contexts (if available), and on most hardware/driver combos that support GPU MMUs even different GL contexts from the same process are isolated.

This really is a big problem with binary drivers, and has been known for years.


WebGL always seemed too low-level for reasons like this - the same way Google Maps crashes my Firefox all the time.


Back in 2000, a very similar problem was my first lesson in frame buffers.

I was watching some adult material using Quicktime on Windows 98. A few hours later, I wanted to show my mom something on my computer. As it loaded the new video in Quicktime, the last frame of the porno sat there in inverted colours until the new video began to play.

I had closed Quicktime hours ago... what was that still doing there in memory?

Needless to say it was very awkward.


I was a Linux user in the early 2000s and learnt about framebuffers in a similar but less embarrassing way: sometimes X would crash, and before the root weave was drawn, the last thing you were doing when X crashed appeared for a moment.

X crashed a lot back then, so everyone learnt pretty quickly.


I dual booted Win 98 and various Debian flavors. Once I got OpenGL going, I would sometimes see the "last" frame of a Windows session flash as X started up.


Similar happened to me, except it was the entire family staring at a 50" TV waiting for me to start a movie Christmas eve.


I remember the first time I played porn on my TV. I felt like I had tainted it in an irreversible way. Moreover, I had a nagging fear that somehow it would show up later when I didn't expect it (you know, like a file you forgot in your Downloads directory). Thanks for letting me know my fears weren't entirely unfounded...


From bash.org (which appears to be down, so linking to Google cache):

http://webcache.googleusercontent.com/search?q=cache:_JGpv1r...


Ouch. The only thing I could think of to fix this is to burn a different image overtop and hope the resultant burn is unrecognizable.


I learned about it by accident too, but fortunately in a less embarrassing way - when I was first learning OpenGL I noticed that displaying an uninitialized framebuffer could sometimes cause "random noise" to show up. "Random noise" that, when the window size was just right, suddenly turned into a screenshot of the game I had been playing moments earlier. :)


Serious kudos to the author for posting this even though he mentions viewing pornography.

I had a similar problem on iOS. When I load Safari, there's usually a flash of the previous screen (probably cached as a PNG), then the page loads. I think it looks junky; I'd prefer a "loading" screen. It would flash the previous screen whether I was in private mode or not. So porn would flash on my screen. I didn't file a bug report or mention it on my twitter because I'm a little afraid of the reception. So, again, thank you charliehorse55.

edit: i said "cached as a PNG" but that's just what I thought prior to reading this article. it could be many things, including this bug.


That is indeed "cached as a PNG." iOS caches previews for the multitasking switcher. Sometimes I could see screenshots of pages that were opened a month ago.

Apps like 1Password make their screen blank when they go into background to prevent this… Firefox for iOS does this too!


I've seen that too - the tab previews in the tab overview often show outdated thumbnails even if you wait a while; it never settles on the correct thumbnail.


This seems fixed in iOS 9 - if Safari is in private mode, the 'screenshot' for multitasking/restore is of an empty page.

My e-banking app does the same thing - since it requires a password when switching back to it, it shows just a big bank logo to hide your account history from the multitasking switcher.


Believe it or not everybody watches porn.


It isn't that everyone watches porn, it's that watching porn is just as moral as watching any other form of entertainment.

Anti-porn crusaders aren't necessarily hypocrites. They also don't need to be hypocrites to be wrong.


Morals of watching pornography? I thought we were past that point already.


Careful, you're showing your hand.


I also wear gray shirts and occasionally listen to ska.

Nothing makes people quite so alien as differing moral codes. I'm "showing my hand" as regards something that simply does not matter to the majority of people on this website. Trying to make a big deal out of it simply makes you look strange.


So are you.


This is totally not true. :/


Of course excluding the blind


Does believing that unsubstantiated (and irrelevant) claim make you feel less of a failure?


This sounds like fun, especially if webpages can use WebGL to read old buffers back into javascript variables - and then AJAX them out silently in the background. (preserveDrawingBuffer + canvas.toDataURL() + ajax ?)

Edit: Also, "google chrome incognito mode is apparently not designed to protect you against other users on the same computer".. what? Isn't that the only thing it can and should protect against? It's not like it can protect against non-local users (i.e. HTTP network interceptions)


This is exactly the reason why the WebGL standard strictly forbids allocating buffers without clearing them first. Otherwise anything the user looked at since the last power cycle - including emails, passwords, private keys, ... - could be extracted by visiting a website.


How long until we see the first infoleak bug where some combination of OS+driver+browser+webgl-command-sequence misses a buffer to clear - or "optimizes" it away - or fails to bounds-check a texture coordinate - etc? :)


We've already had these kinds of issues with webgl. Here's one that I found through some googling: http://www.cvedetails.com/cve/CVE-2014-3173/

You don't need WebGL for this kind of infoleak either; regular good old 2D canvas also supports allocating memory. It also supports reading the current state of all of the pixels in the buffer through JavaScript, so if you have an exploit that gets you an uninitialized canvas, you can easily send whatever memory contents you got back to your server for later analysis.


Not always. I was trying to write an Android application to serve as a frontend to a site by launching a background WebView, drawing the elements I'm interested in to a canvas, and sending the pixels back to the application. (Un)fortunately, after you draw an HTML DOM element to a canvas, you're forbidden from reading the canvas pixels back, and there are no flags you can set on the WebView to let you do it.


Indeed, if the data in a page's canvas has a different origin, you're not allowed to read pixels back (http://www.w3.org/TR/html/scripting-1.html#security-with-can...).

If the DOM element you draw has the same origin as your canvas it seems like (from my reading of the spec) you should be allowed to do what you describe.


It's called a bug. It gets fixed. There's nothing more special about this type of bug than any other bug in the browser.


Remote screenshotting of hours-old content across distinct local user accounts is perhaps more serious than many other bugs, especially when there seems to be a blame game going on between the app/OS/GPU vendors.


And in the meantime you share your porn with the NSA for a few years.


Well, the problem is that it isn't a single bug; WebGL is an entire minefield of bugs. OpenGL drivers were generally never written with security in mind, and now all of a sudden we've got untrusted code able to poke away at them.

WebGL being enabled by default is insanity in my opinion.


RE your edit, there is a reason incognito mode is also known as "porn mode". Its primary use case - the one that obviously can't be stated officially - is to let you browse porn without fear that your mom / girlfriend / boss will find out by checking your browser history or having the site's URL show up as a suggestion when typing something in the address bar. It has never been a serious security tool.


Obviously. It does have another use case too: for developers, it's an easy way to run parallel login sessions on web apps without stomping on cookies. :)


True; I often do exactly that :).


And it would be great to have more than one incognito mode for precisely that purpose (to be able to run more than two sessions).


Chrome has "profiles" you can use to log into multiple sessions simultaneously. Not sure if you can combine it with incognito mode.


I use QupZilla because this is its default configuration (every window is a separate multi-tab [incognito] session). Not so great last time I checked on the Mac but one of my favorites on the PC.

https://www.qupzilla.com/


AFAIR on Chrome the session does not propagate between the tabs that don't share history (i.e. separately created, not one spawned by the other).


The session is shared across all incognito tabs. Firefox behaves the same way. Even if you close all incognito tabs and then open a new one, it's still the same session. You need to completely restart the browser (or explicitly clear history) to reset it.


Are you sure about that? When I have an Amazon tab with a full basket, then create a new tab and go to Amazon, I see the same basket.


I don't have Chrome on the machine I'm using now so someone else needs to check it. But I vaguely remember running at work several sessions to the same service in separate incognito windows and maybe even in multiple tabs of the same window. Now that I think of it, I'm not sure about the latter scenario.


Both scenarios use the same session.


Shares them across all incognito tabs for me.


Yeah. I wish all apps had an incognito mode for this reason.

Imagine being able to open an incognito terminal to type commands that won't get saved to history or pollute what you already have.


In bash (and probably other shells) commands prefixed with a space are kept out of history. (It's a setting - HISTCONTROL=ignorespace, or pattern-based via HISTIGNORE.)


You can:

    unset HISTFILE


I thought incognito mode was introduced as a “solution” to a security problem, namely the browser history leak via the :visited link style, which is nearly impossible to fix without violating one of the oldest specifications of the Web. At least that's what it looked like then.

People considered it profile switching for dummies; now no one remembers that browsers can have multiple profiles.


> read old buffers back into javascript variables

As I understand it, this is not how framebuffers work. All they contain is the rasterized data to drive a given frame.


> Google marked the bug as won’t fix because google chrome incognito mode is apparently not designed to protect you against other users on the same computer.

That's nonsense. For most users, that's exactly what it's used for.

I really think Google is dropping the ball here. I know it's not their bug, and they shouldn't have to work around it in an ideal world, but this is a pretty clear leak of data outside of private mode. It wouldn't impact performance in any noticeable way (you're closing the window anyway at this point), and would just be an extra safeguard.

Very short-sighted of them to ignore this bug. Perhaps we could ask distro maintainers to add patches for this to their builds of Chromium.


Interesting. I've never thought of it as a security issue, but it's something that's been around forever. I've seen old framebuffers containing stills from games or videos showing up when resizing OpenGL applications 15 years ago. Video cards don't clear memory for the same reason nothing is ever deleted by default - it's a waste of time.


> Video cards don't clear memory for the same reason nothing is ever deleted by default

Modern operating systems zero memory pages all the time. It is a security measure, and ensuring security is by no means a waste of time.


WDDM 2.0's GpuMmu model is supposed to ensure that memory is zeroed between different applications that use virtual GPU memory.

If that is the case, there might be a compliance issue on Nvidia's side, which makes me wonder whether WebGL is vulnerable as well.

WebGL was amended to request a zero when provisioning or disposing of a buffer, but it relies on the API, which is handled by the driver; if Nvidia is taking shortcuts to save time, it might be possible to leech stale memory this way.


> WDDM 2.0's GpuMmu model is supposed to ensure that memory is zeroed between different applications that use virtual GPU memory.

Which Windows version introduced this WDDM version? Could it be that OP is running an older version?

> WebGL was amended to request a zero when provisioning or disposing of a buffer, but it relies on the API

This is indeed a tricky situation. All modern GPUs do "zero bandwidth clears", which means that upon clearing, nothing gets written to the actual framebuffer; the memory is just marked "cleared" (by writing some special bits to the L2 cache, for example). This makes it difficult to reason about whether there's any sensitive content left in the framebuffer.

edit: nevermind, the OP seems to be using OSX, so it's not WDDM. Additionally, the OSX GPU drivers are written by Apple.


Yeah, this was confusing - he said it was an Nvidia issue, which is why I thought it was on Windows.

As far as WDDM goes, 2.0 requires that for sure. I'm pretty sure this was part of the original WDDM GpuMmu spec as well, but I can't really find those details on MSDN anymore since most of the pages refer to 2.0 at the moment.


Sometime around 1995 I had a brand new 80486 Dx-100 I was putting together, and just for kicks I decided to pull the CPU out while it was running. I expected some kind of epic C-64 style rainbow gibberish crash, but it actually just froze, with the Windows 95 desktop still on the screen.


Reminds me of when network card drivers would use random bits of memory to pad out minimum Ethernet frames. Oh hey, there's your sensitive data going out in an ICMP ping request.


Do you have more info about this? Sounds crazy!


For example, http://www.securitytracker.com/id/1008910

It's basically Heartbleed in your Ethernet driver, in 2003.


Out to the boundary of your (wired) LAN.


That didn't excuse it then, and it doesn't excuse it now. Sniffing the keys to the kingdom in the clear is bad news bears.


> Google marked the bug as won’t fix because google chrome incognito mode is apparently not designed to protect you against other users on the same computer (despite nearly everyone using it for that exact purpose).

What's the purpose of incognito mode then? It doesn't protect you from your ISP, websites, or users on the same computer. I'm not sure what other use case there is.


Saves you from clearing your history after browsing porn.


It could trivially also clear the framebuffer history. However they marked it as wontfix.


I'm not so sure this shouldn't happen according to the WDDM spec.


OP, it looks like this is on OS X (from the screenshots), in which case you should probably report it to Apple as well. The driver stack on OS X is a mix of Apple and NVIDIA code.


This should be easy to fix at the driver level. Window close and GPU resource release are not operations that occur often enough that memory clearing would affect performance.


> Of course, it doesn’t always work perfectly, sometimes the images are rearranged.

I've seen the same behaviour on OS X with an Intel GPU: https://i.imgur.com/3fagsYx.jpg (screenshot of the contents of a browser tab - pretty sure it was Chrome. The Rooster Teeth page you can see parts of had been closed hours prior)


Googling shows that this bug previously received a $1000 bounty in 2012 and should already have been fixed: https://code.google.com/p/chromium/issues/detail?id=152746


This is a hack that is caused by hardware, due to the way it's currently designed. As such, I doubt software patches would be able to fix it. A while back I read an article on Ars Technica that described stealing encryption keys just by touching the exposed metal parts of a laptop.

http://arstechnica.com/security/2014/08/stealing-encryption-...


Check out this paper if you're interested in a solution to this and other problems (somewhat amusingly, using the GPU): https://www.cs.utexas.edu/~sangmank/pubs/lacuna.pdf

In particular, they refute via counterexample the arguments that VMs or secure deallocation alone are sufficient.


And this is why Linux folks hate proprietary drivers. At least the developers of open source drivers would fix this. Probably very promptly!

There's a reason Linus Torvalds flipped the bird at Nvidia. Here is a perfect example of why closed source drivers suck.


It would be nice if software existed that allocated lots of GPU memory pages and cleared them, so we could be safe after viewing p-something.


awesome -- also I'm glad you censored the porn but not the porn title ;)


Same issue with Samsung GS5... on Snapchat... with disastrous results...


Is this issue specific to Google Chrome? Does Safari have the same problem? In principle, I'd expect so, but if Safari surprises me, maybe I will jump ship.


They could just fill the buffer with images of the Nvidia logo - free commercials for themselves?


There's performance overhead to clearing memory.

If you were them would you take the performance hit?


GPUs have loads of memory bandwidth. I can't imagine a framebuffer taking more than a few microseconds to clear.

For example, Nvidia claims that the GTX 980 has a memory bandwidth of 223 GB/s. (1920 * 1080 * 3) / 223e9 ≈ 28 us. Clearing all 4 GB of VRAM would take 4/223 ≈ 18 ms. This would have a negligible impact on user experience in most cases.

I guess the driver could also erase memory in the background as soon as it is deallocated, with zero user impact.


Performance is important because people use benchmarks to choose between vendors.

And even if it's <0.1%, there is a strong "optimization mentality" in those companies (because perf matters), so it's unlikely to happen in the current climate.


It's high bandwidth but also high latency. If a page was cleared every time it was allocated, it would cause very unpredictable performance because the CPU would have to tell the GPU to clear memory and then wait for the GPU to finish before any other operations on the buffer could be done. This definitely isn't something that should happen for every allocated page. It might be acceptable if this happened only to pages previously used by other processes. But it would still be unpredictable and could cause unwanted stalls in the middle of a game session, for example.

Also note that memory bandwidth is typically the bottleneck in modern games.

The best place to do this would be in the browser, clearing out any textures and buffers before deallocating them if the contents are deemed private.


> because the CPU would have to tell the GPU to clear memory and then wait for the GPU to finish before any other operations on the buffer could be done

If you're doing write-only operations, the CPU can queue them behind the clear. If you do a read, then the CPU has to wait whether you clear or not.

Latency doesn't matter. Clearing can be slotted in with other operations, such as first use.


As the article points out, every modern OS clears (main) memory before handing it over to a new process. The cost is often mitigated a bit by using spare CPU cycles to zero out free pages, and by keeping a buffer of such pages. You only need to pause to zero pages if you have sustained 100% CPU usage for a long time - and that's pretty rare on most machines. For GPUs, probably even more so.

GPUs generally have less memory, many fewer (but larger) allocations, and way higher memory bandwidth than CPUs, so it shouldn't be a problem for them to do this.
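
For comparison, the CPU-side guarantee is easy to demonstrate on Linux - anonymous pages handed out by the kernel always arrive zero-filled, whatever the previous owner left in that physical memory (minimal sketch):

    #include <sys/mman.h>
    #include <cassert>
    #include <cstdio>

    int main() {
        const size_t len = 1 << 20;  // 1 MiB
        unsigned char* p = static_cast<unsigned char*>(
            mmap(nullptr, len, PROT_READ | PROT_WRITE,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0));
        assert(p != MAP_FAILED);
        for (size_t i = 0; i < len; ++i)
            assert(p[i] == 0);       // never fires: the kernel scrubbed the pages
        puts("all pages arrived zeroed");
        munmap(p, len);
        return 0;
    }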


That's a loaded question.

If I (with training in how to design an OS and the risks of handing nonzeroed pages to another process) were them? It'd be part of my standard process for designing a memory repurposing library. But I can 100% understand how this mistake gets made; I wouldn't be surprised if it wasn't an explicit performance decision.


There's performance overhead to doing everything, but one way to simulate zeroed virtual memory is to simply map pages to a zero page; when a full-page write occurs, just write the page, and when a partial write occurs, zero the rest.

I am not familiar enough with GPU internals, but my understanding is that the GPU should be smart enough to know that a given texture or framebuffer will occupy n full pages, and so when either is written in its entirety, the zeroing only has to occur at the edges. (I would assume that the write would start on a page boundary, but I don't know anything about GPU internals.)

Caveat emptor: I will reiterate I know very little about memory internals. It seems like a bigger issue is that GPU memory is not virtualized and all users get access to the same memory. It's as if three decades of understanding the utility of virtual memory were forgotten.


> It seems like a bigger issue is that GPU memory is not virtualized and all users get access to the same memory. It's as if three decades of understanding the utility of virtual memory were forgotten.

I think you're forgetting the most important thing here - GPUs are meant to be fast. Virtualization will add like what, an order of magnitude to the access times?


GPUs have had MMUs for a while (though they don't recover from page faults the same way CPUs do, I don't believe).


Last I checked, they pretty much don't recover from page faults, they just abort whatever "program" you're trying to run, so you can't really use the MMU for clever things like demand paging. But that's not the point of the GPU's MMU in the first place.


> Last I checked, they pretty much don't recover from page faults, they just abort whatever "program" you're trying to run

Correct. When the GPU page faults, it causes a CPU interrupt and the driver will handle the interrupt. It's not possible to resume execution on a GPU in a timely manner so the only option is to terminate the process that caused the page fault.

> so you can't really use the MMU for clever things like demand paging

Recent GPU generations support "sparse" or "tiled" memory where the GPU can detect if a load or a store would access non-resident memory and then act accordingly. This requires a specialized shader and some CPU-side logic to actually stream in the memory. This can be used to on-demand paging for textures and buffers as well as implement workarounds to reduce visual artifacts from streaming.


GPUs don't have a page fault handler; when there's a page fault, it's an unrecoverable crash. Accordingly, zero-on-allocate (or potentially zero-on-free, but that makes assumptions about startup and teardown that may not be true) is the only way to do it.


Yes. Security and correctness should always come before performance. Performance first is the thinking that got us security vulnerabilities everywhere.


It's also the thinking that got us Doom. ;) There are use cases where performance trumps security; the only issue here is that "multi-app semi-trusted computing environment" isn't one of them.


What security compromises do you claim Doom made to achieve greater performance?


I don't think the parent is claiming there are; the point is that Doom wouldn't have been possible without coding with performance as a priority, but that's "ok" because in a lot of applications there isn't a permissions differential to worry about.


I'm curious about this as well.


Interested in details about this; where were Doom's major security issues? I love reading about Carmack/Doom development in general, maybe I've missed the part where caution was thrown to the wind and security was ignored.


It seems like there never were security issues - at least, none that were talked about widely. I think that should be expected for a single-player game. Almost all games are hackable in some way, of course, but hacking a single-player game is mostly an exercise in replayability.

On a related note: Doom apparently does contribute to security proof of concepts though. http://www.techtimes.com/articles/15606/20140916/security-ex...

Which makes me wonder if the non-clearing memory issue exists for the printer's video driver and whether that could be used to retrieve something like a saved password or ssh key.


[citation needed]


It only needs to do it when first handing out a particular bit of memory to a particular process. The vast majority of the time, a process will be receiving memory it's had before (and no clearing is required). When not, it will be initializing the memory as part of the creation step (and no clearing is required), or it will be doing something it's not going to be doing all that often, such as creating a whole new frame buffer (and the clearing isn't a problem). I'm not convinced this would be a huge performance hit. Modern GPUs are not exactly slow at clearing memory either.

Maybe the system doesn't pass enough information through to the driver to let it determine this, though...


Uh, anyone else think this title is a bit sensationalist? I was expecting something a bit more along the lines of actually leaking usable private data, not just displaying a rastered frame.

Furthermore, this has very little to do with Chrome. The only way Chrome could actually fix this issue would be to nuke the frame buffer when it released it. That's a fine idea, but if I were a dev in that context I would assume the OS would make stronger guarantees than that??

If anything, this is an edge case Chrome devs (and other developers) could protect themselves against if they were so inclined, but I'm not surprised they didn't assume they needed to protect against this.


The primary use case for Incognito Mode, as far as I know, is so a user can casually use a browser without leaving inadvertent artifacts of their usage on the machine. Having a page you visited in Incognito Mode be visible in Chrome after Incognito Mode is closed seems to be precisely the kind of thing users expect the feature to prevent.


If you associate the Chrome browser with your Google account (not just logging in to a Google site, but going to settings and putting the information in there), the history will be synced across several devices. Incognito can be used to prevent pages viewed on your phone or laptop from going to your desktop's history or vice-versa.



