Has anyone tried to simply point a camera at an old screen, say a CRT, take a photo of one pixel at a time at a few different brightness levels, then add all the photos together to render arbitrary images? As far as I can imagine, it should capture the behavior of colors and the fuzziness between pixels very well. You could even set up a small still life around the monitor and get accurate reflections and ambient light in the room.
Stuff like the smoothed out flickering that you mention would still need to be emulated, of course.
It's actually a great idea, assuming that background light levels are properly taken care of and the camera is well calibrated to a linear colorspace. Unfortunately "pixels" don't really blend linearly when transferred over a composite cable and there are some games that use NTSC artifacts to achieve certain colors.
Brightness is a non-local effect: if you change one pixel, the rendering of the adjacent pixels can change.
Also, the rendering is different depending on how the signal made it into the CRT; in particular, composite video does not preserve certain things about the image, and in some cases developers took advantage of this.
Most famously, the Genesis had a poor composite encoder, which was taken advantage of by many games; Sonic 3 in particular looks bad even on a CRT with a real Genesis if you use RGB or component output.
Even color is a non-local effect. Turning on two colored pixels side-by-side on an Apple II makes them white. I can give you an unbounded amount of detail on why :-)
There are shaders to render such 'artifacts' programmatically at about a thousandth of the cost of what you describe. It could potentially be more accurate in the way you describe it, but then what kind of blending mode do you use?
I think you would first find the two closest brightness levels for each pixel, blend appropriately between them, then simply use additive blending to composite all the pixels together. A naive approach would require compositing as many images as there are pixels in the input signal, which seems extremely expensive, but in practice I'm sure you could optimize it quite a bit, as most of the area of each photo would be almost completely black.
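A minimal sketch of that compositing step in NumPy, assuming you've already captured one photo per screen pixel at a few known calibration levels (all the names and data structures here are made up for illustration):

```python
import numpy as np

def render(frame, captures, levels):
    """Additively composite per-pixel calibration photos.

    frame:    2D array of target brightness per source pixel.
    captures: dict mapping (y, x) -> array of shape (n_levels, H, W),
              one photo of that pixel per calibration level.
    levels:   sorted 1D array of the calibration brightness levels.
    """
    out = None
    for (y, x), photos in captures.items():
        b = frame[y, x]
        # Find the two calibration levels bracketing b, blend linearly.
        i = int(np.clip(np.searchsorted(levels, b), 1, len(levels) - 1))
        lo, hi = levels[i - 1], levels[i]
        t = 0.0 if hi == lo else (b - lo) / (hi - lo)
        contribution = (1 - t) * photos[i - 1] + t * photos[i]
        # Additive blending: light from independent pixels just sums.
        out = contribution if out is None else out + contribution
    return out
```

This only works if the camera is calibrated to a linear colorspace, as mentioned above, since additive blending assumes light sums linearly in the captured values.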
If you can arrange lighting perfectly in your CRT studio, you could take only the difference against the all-off state. You could extract the pixel islands' positions and composite only these smaller chunks afterward.
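The difference-and-crop optimization above might look something like this sketch (function names and the threshold are my own assumptions, not an established technique):

```python
import numpy as np

def extract_island(photo, dark_frame, threshold=0.01):
    """Subtract the all-off capture, then keep only the bounding box
    of the region where this pixel actually contributes light."""
    diff = photo - dark_frame
    mask = diff > threshold
    ys, xs = np.nonzero(mask)
    y0, y1 = ys.min(), ys.max() + 1
    x0, x1 = xs.min(), xs.max() + 1
    return diff[y0:y1, x0:x1], (y0, x0)

def composite(islands, shape):
    """Additively paste the cropped chunks back into a full frame."""
    out = np.zeros(shape)
    for chunk, (y0, x0) in islands:
        h, w = chunk.shape[:2]
        out[y0:y0 + h, x0:x0 + w] += chunk
    return out
```

Since each pixel's island is tiny compared to the full photo, this cuts both storage and the per-frame compositing cost dramatically.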
It could work out, but I'm just not sure it's worth all the hassle. It seems akin to the way you would do a stop-motion video, I suppose.
You would need to be in a pitch black room with only one pixel at a time lighting the scene. If you want additional light sources in the room they could be photographed and blended in with the same technique.
I think you would need to get the reflections in the same pass. It's basically ray tracing in real life: emit light from a bunch of points (screen pixels), "calculate" how it bounces through the room (i.e. adjust the brightness of the corresponding photo), and combine them all into a single image.