A Pixel Is Not a Little Square (1995) [pdf] (alvyray.com)
74 points by dceddia on March 9, 2023 | 69 comments



If you’re wondering what it’s talking about when it says the triads of red/green/blue rectangles on a CRT display don’t correspond to pixels, here’s a Technology Connections video that explains it:

https://youtu.be/Ea6tw-gulnQ

But on an LCD, the display really is made up of a bunch of solid-colored rectangle-ish shapes, and if the LCD uses a standard RGB pattern, each red/green/blue triplet does correspond to one pixel. So if you zoom in, “a pixel is a little square” is very close to the truth. In other words, the article shows its age…


Except that even on an LCD, the boundaries between individual pixels (RGB triplets) are imaginary. An RGB triplet forming an individual pixel is just a useful abstraction. Physically, when looking at the screen, an RGB triplet is no more special than a BRG triplet on the same display (ignoring the edges of the display, and that some displays do have a slightly larger gap between subpixels belonging to different pixels).


And this is not just a theoretical concern. Exploiting that fact has practical applications for sharpness, aliasing, etc. Subpixel smoothing is based on that idea, and I think it's under-exploited in graphics in general because shading pipelines are, as far as I know, stuck addressing full pixels.
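
As a rough sketch of the idea (plain Python; the shade_row helper and the subpixel offsets are hypothetical, assuming a standard RGB stripe layout):

  # Rough sketch, not any real library's API: because an LCD pixel's R, G
  # and B elements sit at different horizontal positions, a horizontal
  # coverage function can be sampled once per *subpixel* (3x the pixel
  # resolution) instead of once per pixel.

  def shade_row(coverage, width):
      """coverage(x) -> ink coverage 0..1 at horizontal position x (in pixel units)."""
      row = []
      for px in range(width):
          # hypothetical offsets for a standard RGB stripe layout
          r = coverage(px + 1/6)   # red element in the left third of the pixel
          g = coverage(px + 3/6)   # green element in the middle third
          b = coverage(px + 5/6)   # blue element in the right third
          row.append((r, g, b))
      return row

  # A hard edge at x = 2.5 lands inside pixel 2: its subpixels get different
  # values instead of one averaged grey, which is where the extra resolution comes from.
  edge = lambda x: 1.0 if x < 2.5 else 0.0
  print(shade_row(edge, 4))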



Except in games, where it seems common to run at resolutions other than native, which means you have a choice between some pixels being a different shape than others (nearest-neighbor) or even overlapping (interpolated).

And when you dig into LCD displays, you'll discover that treating pixels as uniform squares can still get you in trouble, because a vertical line that changes from blue to red actually shifts horizontally (the blue and red subpixels sit at different horizontal positions within the pixel).

And if you look even deeper, you'll find specialized displays (like on cameras) which don't use a square grid at all.

Oh, and that only covers the display side. Cameras also have pixels, but they are different from little uniform squares in new, exciting ways. Typically Bayer.

To make matters worse, designers decided to hijack "pixel" to mean something else in the context of scaling, but don't be fooled. Those measures are not pixels.


>Except in games, where it seems common to run games at resolutions other than native

Not just games. Same for desktops and apps, OS-wide with a compositor and whatever scaling factor ("Retina") etc.


Is there a true 'native' resolution for 3D games? Seems the display resolution is arbitrary since the content is projected and filtered regardless.

I'm surprised you consider retina 'non native', since it originally was intended to exactly double the resolution so old applications could still render pixel-perfect without changes.


>since it originally was intended to exactly double the resolution so old application could still render pixel-perfect without changes

Originally, because it just did pixel-doubling (or resolution-halving if you prefer).

Later (since about 3-4 OS releases ago), macOS has also done its rendering at higher resolutions and downscaled to non-pixel-doubled sizes. This non-pixel-doubled, slightly-higher-than-native/2 "Retina" mode has even been the default on Mac laptops for at least the last 2-3 years.


IIRC some Retina displays use fractional scaling like 2.1. I can't find any confirmation though.


And if you go all the way down the rabbit hole, oh shit the rods and cones in my eyes have different spacing!


Well, that's the part of the transform on the receiving end. Up to this point the discussion has been about the signal from the emitting surface.


Yes, but eventually you will run into weird artifacts or limitations which, no matter how hard you debug, you can never seem to resolve. It takes profound insight to notice that the error is on the receiving end.


You can even take it further.

Did my dad ever even love me?


Pixels always being perfect squares is of course not true, but the square model is still a lot closer than the point-samples-on-a-grid model - that is, each pixel models the average color over an area. Cameras work that way because sampling light at a point would get you no signal (if you could sample a point, which you can't), and displays work that way because you can't physically emit light from a single point. In most cases that area is squarish, or at least as square as the technology allows (e.g. subpixels and the Bayer pattern are used because you can't have the colors cover the same area, at least not without creating bigger issues).

So yes, if you know the details of your display or sensor then you can use a better model to eke out more signal. But if you don't, the little-squares model is close enough.
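
As a toy illustration of the average-over-an-area model (plain Python, made-up numbers), compare area-averaging a row 4x against keeping every 4th point sample:

  # Toy illustration (made-up numbers): downscale a row of brightness values
  # 4x, once by averaging each group of 4 (roughly what a physical sensor or
  # emitter does over its area) and once by keeping every 4th point sample.
  row = [0, 0, 0, 0, 255, 255, 255, 255, 0, 255, 0, 255, 0, 0, 0, 0]

  area_average = [sum(row[i:i + 4]) / 4 for i in range(0, len(row), 4)]  # [0.0, 255.0, 127.5, 0.0]
  point_sample = row[::4]                                                # [0, 255, 0, 0]

  # The point samples miss the fine 0/255 stripes entirely (aliasing); the
  # area average keeps their energy as mid-grey, closer to what a real sensor
  # or emitter would do.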


I am keenly interested in everything you wrote here. Any links or even search terms for further reading?



I understand his point from a computer graphics perspective, even if that didn't stand up to the test of time. But why is it written with so much bluster and claims of generality? Pixels are also a data modeling concept (without a graphical representation per se) and his bold proclamations on pixels in general just don't hold water.


Well, "pixel is 3 rectangles in a trench coat" would be closer approximation.

And there are other layouts than 3 rectangles:

https://crast.net/21193/the-first-qd-oled-screens-hide-a-pec...


Or predicted we would now be using OLEDs


This is about pixels being essentially point samples and not geometric shapes. In my opinion the geometric shape model is still a very useful mental image when dealing with pixels and does not contradict the point sample model.

When dealing with pixels as shapes the shape is not necessarily a square though. Pixels can be rectangles and this was common in the home-computer era. The article even acknowledges this (in a footnote):

"In general, a little rectangle, but I will normalize to the little square here. The little rectangle model is the same mistake."


Many of you will be too young to know this, but back in the 80s and 90s, pixels weren't even incorrectly "little squares", they were very often "little rectangles". Your display was almost always 4:3, but the resolution of your screen was very often not 4:3, and so if you needed to display squares or perfect circles, you needed to compensate for that.

This is why the Utah teapot looks squashed, compared to its real-life counterpart: https://en.wikipedia.org/wiki/Utah_teapot


> the Utah teapot looks squashed, compared to its real-life counterpart

That’s quite interesting indeed!

I was wondering whether it's still possible to buy a teapot similar to the original real-life one.

According to someone on Reddit, it’s still being sold

https://www.reddit.com/r/computergraphics/comments/8nmhky/co...

The website where it is being sold:

https://frieslandversand.de/teekanne-1-4l-weiss-utah-teapot

Wikipedia also says:

> The original teapot the Utah teapot was based on is still available from Friesland Porzellan, once part of the German Melitta group. Originally it was given the rather plain name Haushaltsteekanne ('household teapot'); the company only found out about their product's reputation in 2017, whereupon they officially renamed it "Utah Teapot". It is available in three different sizes and various colors; the one Martin Newell had used is the white "1,4L Utah Teapot".

https://en.wikipedia.org/wiki/Utah_teapot#Original_teapot_mo...


> This is why the Utah teapot looks squashed

Although it says on that page:

"The real teapot is 33% taller (ratio 4:3) than the computer model. Jim Blinn stated that he scaled the model on the vertical axis during a demo in the lab to demonstrate that they could manipulate it. They preferred the appearance of this new version and decided to save the file out of that preference."


Had a flashback to QBasic screen mode 8, which had a resolution of 640x200. Mode 9, 640x350, was also useful because it was the highest resolution you could get while having more than one back buffer.


I had the same thought. I really yearn for a simple real-time OS machine to play with and relive my early DOS experiences.


> Some programming libraries, such as the OpenGL Utility Toolkit,[2] even have functions dedicated to drawing teapots.

The joys of having a decent library environment, makes setting up a Hello World example really easy.


The same is also true if you drop down to a single dimension - audio data is a discrete sampling of a continuous signal, and again, it is commonly misrepresented as a step function (a staircase) which is very misleading and leads to many mistakes when considering signal processing.


As I understand it, the popular Audacity software fixed this: if you zoom in you'll see that nope, it's always a sine wave. But yes, lots of older software acted as though there was "really" a step function here.


Really good video on the topic from xiph.org

https://youtu.be/cIQ9IXSUzuM


I recall reading (again, a hard-to-google topic!) that before Edison invented the (cylinder) phonograph, there was a debate over whether sound could even be meaningfully mapped to one wave, as it was known that in any real environment there are actually multiple independent sound waves overlapping from different angles.


If you try to sample continuously and eliminate that discretization error, I have some bad news for you about the nature of reality itself...


I think you're confusing 'discrete' with 'stepped': reality can only be sampled at a bunch of discrete points; however, we can decide how to interpolate 'in-between' those samples in whatever way we like.

We could use a 'stepped' interpolation (i.e. nearest-neighbour/voronoi cells); however, the result will sound pretty crap. That's because 'steps' are a very unrealistic model: sound is made of pressure waves, which vary smoothly (at least, at the resolutions we tend to sample at); sound waves do not instantaneously jump between flat levels (again, ignoring microscopic effects like phonons, etc.).

A better approximation is to interpolate smoothly between the points, e.g. using sine functions (i.e. Fourier series). That's a much better approximation of the way air actually moves (and also ear-drums, loudspeakers, etc.); whilst it's still completely discrete. As the parent says: "discrete" does not mean "stepped".
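
For the curious, here's a minimal sketch of that "interpolate with sine functions" idea (assumes NumPy; Whittaker-Shannon sinc reconstruction, variable names are mine):

  # Minimal sketch (assumes NumPy): Whittaker-Shannon ("sinc") reconstruction.
  # Eight discrete samples define a perfectly smooth curve that can be
  # evaluated anywhere in between - discrete, but not stepped.
  import numpy as np

  def sinc_reconstruct(samples, times):
      n = np.arange(len(samples))
      # x(t) = sum_k x[k] * sinc(t - k), with the sample period normalised to 1
      return np.array([np.sum(samples * np.sinc(t - n)) for t in times])

  samples = np.array([0.0, 0.7, 1.0, 0.7, 0.0, -0.7, -1.0, -0.7])  # roughly one sine cycle
  fine_times = np.linspace(0, 7, 200)              # 200 points "in between" the 8 samples
  smooth = sinc_reconstruct(samples, fine_times)   # smooth, not a staircase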


I think you're not familiar with quantum mechanics. If you try to sample to infinite precision, reality itself is stepped and in fact does instantaneously jump between flat levels. It's called wave function collapse. Your implicit premise that there's a continuous, real-valued sound waveform which can be sampled to arbitrary precision is false. Discretization isn't just an artifact of the machine; it's present in the underlying reality too!


> reality itself is stepped and in fact does instantaneously jump between flat levels

That's why I said "at least, at the resolutions we tend to sample at", and "ignoring microscopic effects like phonons"

> reality itself is stepped and in fact does instantaneously jump between flat levels. Its called wave function collapse.

That's pretty-much right, but that phenomenon is "quantisation". "Wave-function collapse" is a different (but related) thing.

Values like energy are "quantised", meaning they can only occur in certain amounts/steps (for energy these are called "energy levels"). A classic example is a "particle in a box" ( https://en.wikipedia.org/wiki/Particle_in_a_box ), which acts a bit like a guitar string: each "energy level" is like a resonant frequency of a string. These are discrete, since (a) frequency is related to wavelength, and (b) resonance only occurs when an integer number of half-wavelengths fits inside-the-box/along-the-string (i.e. half a wavelength; a full wavelength; one and a half wavelengths; etc.).
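
For concreteness, the standard textbook result for an infinite square well of width L (just the usual formula, not something from the article) is:

  \lambda_n = \frac{2L}{n}, \qquad E_n = \frac{n^2 \pi^2 \hbar^2}{2 m L^2}, \qquad n = 1, 2, 3, \ldots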

Guitar strings can also vibrate in more complicated ways; which can be described as adding together several of the resonant frequencies (possibly with different amplitudes). This "adding together" creates a "superposition", where the different waves can interfere, to produce the complicated vibrations we see/hear. The same is true for quantum systems: their behaviour can be a complicated adding-together and interference-between multiple energy levels (possibly with different amplitudes).

SPOILER ALERT: Describing a complicated wave by adding together a bunch of simple, discrete waves is called a Fourier series; and it's exactly what the parent was talking about for audio sampling!

Wave-function collapse is a separate thing: this adding-up and interference perfectly describes the behaviour of quantum systems; but we've never actually measured a system to be in such a mixed-state. Instead, every measurement shows a particular energy level (or, more generally, "eigenstate of the measurement operator"). We don't know why, but one explanation is that mixtures "collapse" when measured, with the probability of each outcome being the square of its amplitude. (Although that can't be the full picture, since it's observer-dependent; e.g. see https://en.wikipedia.org/wiki/Wigner%27s_friend )

> I think you're not familiar with quantum mechanics

I have a Masters degree in Physics ;)

---

Regarding the issue of discretely sampling a sound: you still seem to be conflating "discrete" with "stepped". Let's stick with the "particle in a box" example above, which is actually a good model of sound waves in a solid (where the particles are called "phonons" https://en.wikipedia.org/wiki/Phonon ). This system has discrete energy levels; meaning that two neighbouring states have no states "in-between". Yet each of those energy levels is a smooth, continuous function over space; in fact, they're perfect sine waves!

This is the heart of the confusion: if we sample a system at time/position 0 (let's call that sample S0), and we sample it at time/position 1 (to get S1), we can model that system using any function f we like, as long as f(0) = S0 and f(1) = S1.

(NOTICE: I'm only using two samples, not "sampling to infinite precision")

A "stepped" function (AKA piece-wise interpolation, AKA nearest neighbour interpolation, AKA Voronoi cells) would be something like:

  f(t) = (t > 0.5)? S1 : S0

That fits the constraints, but it's a terrible model of a sound wave (and a terrible model of a quantum state), since it doesn't vary smoothly like the real thing. Its 2D equivalent, f(x, y) = ..., is the "little squares" model of pixels.

A better model would approximate the smoothly-varying shape of the sound wave (or quantum system). For example, we can eliminate the sudden jumps using linear interpolation:

  f(t) = (S1-S0)*t + S0

Notice that both of these definitions only use two samples (S0 and S1). Yet we can sample these functions at any point t we like. This is how we can go from "input pixels" (point samples S0 and S1) to "output pixels". Say we want to drive a loudspeaker, whose position can be adjusted at evenly spaced times like -1.0, -0.9, -0.8, ..., 1.9, 2.0. We can choose the position of the loudspeaker at each time by using our function f, i.e. f(-1.0), f(-0.9), ..., f(1.9), f(2.0)

NOTE: The values of t will almost-always be translated/shifted by some amount. For example, if we record some audio then play it back, the time 't=0' of the recorded samples is different to that of the playback samples, since the playback occurs at a later time. Likewise, if we sample some light at position (x, y) on our camera sensor in Hawaii, those will be different to the (x, y) positions on our computer display in New York. We may also decide to stretch/compress the scale, although that's less common for audio!

If we use the "stepped" definition of f, our loudspeaker will be stuck at position S0 for a while, then quickly move to position S1, then stay stuck there for a while.

If we use the "linear" definition of f, our loudspeaker will start at position -S1 + 2S0, then gradually move to position 2S1 - S0 (passing through position S0 at time t=0, and passing through position S1 when time t=1).

The linear definition only takes two samples into account, so it doesn't work well if we have many samples; we hit a "sharp corner" as we pass through each sampled point. In the loudspeaker example, its speed will jump (AKA it has a discontinuous derivative). We can smooth-out such corners by using an equation involving a few more samples, e.g. higher-degree polynomials, or sine/cosine waves (AKA Fourier series, which the parent was alluding to), etc. For nice visuals see https://en.wikipedia.org/wiki/Interpolation#Example
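
If it helps, here are those two definitions of f as runnable code (plain Python; the values of S0 and S1 are made up), evaluated at the evenly spaced "loudspeaker" times:

  # The two interpolators above, evaluated at times -1.0, -0.9, ..., 2.0.
  S0, S1 = 0.2, 0.8    # arbitrary example samples

  def f_stepped(t):                 # nearest-neighbour: the 1D "little squares" model
      return S1 if t > 0.5 else S0

  def f_linear(t):                  # linear interpolation through both samples
      return (S1 - S0) * t + S0

  times = [i / 10 for i in range(-10, 21)]
  stepped_positions = [f_stepped(t) for t in times]   # stuck at S0, jumps to S1
  linear_positions  = [f_linear(t)  for t in times]   # glides from -S1+2*S0 to 2*S1-S0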


SIDE NOTE: In normal Quantum Mechanics, mixed states can have any Real numbers as their amplitudes. Hence we can transition from one energy state to another in a smooth, continuous way; e.g. starting at 100% A + 0% B, smoothly decreasing/increasing the amplitudes of A/B, and ending up at 0% A + 100% B. That's unrelated to this discussion though. Also it requires that no measurements occur during the process; e.g. see https://en.wikipedia.org/wiki/Quantum_Zeno_effect


I also have a Masters in physics. You really don't need to explain the FT to me. Or the QFT for that matter. Or any of this stuff. I already know.

You understand it all, yet you somehow overlooked it in my original comment. The first comment that started this all was that if you look at the waveforms on your computer (say, in Audacity), it turns out those are discrete, the implication being that the computer representation is a discrete approximation of a continuous reality. I originally replied with

"If you try to sample continuously and eliminate that discretization error, I have some bad news for you about the nature of reality itself..."

The eventual breakdown into discretely quantized values isn't just an artifact of the digital medium; reality itself is quantized. There is no truly analog tape recorder. Fine, you can sample in the Fourier basis rather than the position basis to keep your "pure sine waves"* in position space, but you're still always going to generate discrete samples and you'll still need the same number of them. You are free to shift the representation around, but the same amount of quantized information is present and stored somewhere. The initial observation that the sound-wave files on your computer are "1D pixels" is unavoidable. Every quantity, when viewed with sufficient precision, is "pixelized".

*really what you're doing is measuring in the momentum/energy basis, causing the function to be stepped there, but leaving the superposition intact in the position/time basis... until the listener's ear goes and measures in the position/time basis anyway. In principle, if you've measured the sound wave as an ordinary time series down to the accuracy of quantized eardrum positions, it shouldn't matter whether you're playing that back directly as a "step function time series" or as a superposition of sine waves via the Fourier trick. In practice of course this never happens.


Related:

A Pixel Is Not a Little Square (1995) [pdf] - https://news.ycombinator.com/item?id=26950455 - April 2021 (70 comments)

A Pixel Is Not A Little Square (1995) [pdf] - https://news.ycombinator.com/item?id=20535984 - July 2019 (78 comments)

A Pixel Is Not a Little Square (1995) [pdf] - https://news.ycombinator.com/item?id=8614159 - Nov 2014 (64 comments)

A pixel is not a little square, a pixel is not a little square - https://news.ycombinator.com/item?id=1472175 - June 2010 (20 comments)


Black holes probably don’t have infinite density but that doesn’t stop us from finding utility in the theory of relativity.

Even if pixels are not little squares it’s a useful abstraction.


> Even if pixels are not little squares it’s a useful abstraction.

Zero-size points are the abstraction; "little squares" are more complicated and hardware-dependent, e.g. What's their area? What's their aspect ratio (are they actually square)? How are the sub-pixels arranged? etc.

When it comes to e.g. pixel art, etc. I think it's more precise to say: nearest-neighbour interpolation is a useful simplification.


But pixels aren't just points. By sensible arguments, they are usually squares in the sense that a 10*10 pixel area will be square. Otherwise they are non-square rectangles. After all, pixels are actually arranged in a rectangular fashion. They could be arranged like hexagons, but we settled on a square filling. They could have also been arranged randomly, like ink droplets on paper, but no, they are pretty square.


I think a better way of putting it may be: pixels are samples taken at points, rather than actual point samples. Meaning, while they are sampled at a specific location, these are not exact values, as that would require ideal sensor technology that will probably never be achieved. So we have to account for all kinds of diffusion, reflection, sensor surfaces, shapes of optical channels, fall-off thereof, etc – and may wonder (probably to no definitive end) at what shape and to what extent to model them best.

Edit: Moreover, even when it comes to simply filling an area, things aren't that simple. E.g., when this was written (1995), it was still the era of CRTs, and CRTs are analog devices with an analog response. A sole pixel may never reach full phosphor activation, while a stretch of them will, and there are also the signal's rising and falling flanks.

If we wanted to represent a scanline information as

  0.0 1.0 1.0 1.0 0.5 0.5

we may have to actually write for a true representation (letting go of the idea of full phosphor activation),

  0.0 1.0 0.8 0.8 0.0 0.4

So, for this specific purpose, not every pixel is the same, but rather relies on context…


> while they are sampled at a specific location, these are not actual values, as this would require ideal sensor technology that will probably never be achieved

Indeed; hence why they're a useful abstraction

> So we have to account for all kinds of diffusion, reflection, sensor surfaces, shapes of light channels, fall-off thereof, etc – and may wonder at what shape and to what extent to model them best.

Whilst 'shapes' can give particularly simple models (like nearest-neighbour voronoi cells), that's still quite a limiting assumption. For example, models which take multiple neighbours into account, like bicubic interpolation (and pretty much everything except pixel-art editors), don't fit the concept of "shape" very well. In particular, the influence/extent of the pixels "overlap", which requires a separate notion of composition/interaction; and some areas may even have negative influence!


> In particular, the influence/extent of the pixels "overlap", which requires a separate notion of composition/interaction; and some areas may even have negative influence!

Indeed! Compare the edit. (Sorry for this other overlap in the temporal domain.)


This entire conversation (not just picking on you!) is missing the point. Pixels are a context-sensitive abstraction that mean different things depending on what exactly you’re trying to accomplish. They can be point samples in a grid, they can be rectangular uniformly illuminated regions, they can be bytes in an array. We don’t need one single definition, which is good because there isn’t one.


> By sensible arguments, they are usually squares in the sense that a 10*10 pixel area will be square

My first computers were Amigas, whose display elements ("screen pixels") were rectangles, but certainly non-square ;)


Over here in Europe (PAL) land, Amiga low resolution (320x256) was actually a mode with square pixels!

It seems [1] the aspect ratio for the NTSC low-res was 44:52 = 22:26 = 11:13 which is a bit off but teenage demo-coder me would probably not have cared much (not that I ever saw an NTSC Amiga). :)

[1]: http://amigadev.elowar.com/read/ADCD_2.1/AmigaMail_Vol2_guid...


Yeah, that's why I continued

> Otherwise they are non-square rectangles.


If you want to read much, much, much more about this, read his book "A Biography of the Pixel".

Good, but very dense history of the development of computer graphics. And pixels.


Speaking of drawing canvases not corresponding to the hardware, I have an incomplete memory and I wonder if any of you can help:

I recall that the original PlayStation, or WebTV, or both, had virtual canvas that was higher-resolution than a typical NTSC television of the time. And I vaguely recall that a technique was used whereby the pixel(s) that were shown on the NTSC TV display alternated between two full-screen frames rapidly (30 frames per second? I actually think it might have been much slower; I think you could actually perceive the flickering alternating when motion was frozen, but this was not really an issue when games were being played or TV shows being watched). Through some trick of human perception, the alternating of frames yielded the perception of a higher-resolution display. I recall Microsoft's early Interactive Television set-top-boxes not using this technique (I was a usability specialist working on it at the time), and the quality of text rendering in particular was noticeably poorer. Anyway, I have been unable to find documentation for how this technique worked, and I find it hard to google. Any links would be appreciated!


As mentioned by the sibling post, I think you're describing interlacing, which was the way broadcast TV worked but was not usually used by video game consoles. However, the PlayStation did support interlaced (or high-resolution) modes.

It was a trade-off. NTSC CRT TVs displayed 60 "fields" per second (50 for PAL), with every other field being offset vertically by a tiny amount, creating a single frame 30 (25 for PAL) times per second with double the vertical resolution of an individual field. This did create artifacts, most noticeable with fast-moving objects; in particular, it was one of the effects that led to the really stupid idea that creatures called "rods" existed which were invisible to human eyes but visible to cameras[0].

Video game consoles typically skipped the extra half-scanline at the end of every field that would normally be used to offset the next field and treated each field as a full frame. Lower vertical resolution, but higher FPS and simpler video hardware.

Given that WebTV would have had a lot of text and not a lot of motion, I would guess that it used an interlaced mode for the extra resolution.

Earlier consoles, like the NES, had flickering sprites sometimes purely for the reason that the hardware could only display so many (8? in NES case) sprites per scanline, so rendering was sometimes juggled on a frame-by-frame basis to fit more.

Also an interesting note: the Atari 2600 (VCS) totally had the ability to do interlacing due to the fine-grained control over sync offered (read: forced) by the rather spartan Television Interface Adapter. It was never used by a commercial game, but homebrewers have created games using it. It is also worth noting that VCS games frequently created out-of-spec sync signals, which can really screw with capture hardware.

[0] https://en.wikipedia.org/wiki/Rod_(optical_phenomenon)


Are you perchance trying to describe interlaced video: https://en.wikipedia.org/wiki/Interlaced_video


This is wrong. Pixels may not be perfect little squares but they are not perfect point samples either. They're normally somewhere in-between.

Cameras do not sample points, and even though with sufficient filtering they can be equivalent to point samples of a low passed image, plenty of cameras don't have any filtering.

Similarly displays basically never have filtering to make pixels point samples. If they did then anti-aliasing wouldn't be a thing!

Hell, most projectors literally project little squares! (Even in 1995 IIRC.)

This must have been written by someone that just learnt about sampling theory and thought "aha! Pixels are samples!" without actually thinking about real life.


I think Alvy Ray Smith understands both the fundamentals of sampling theory and the practicalities of real-world image synthesis and display, given his role in the creation of computer graphics.

Re "If they [displays] did [filtering] then anti-aliasing wouldn't be a thing" - that doesn't match my understanding of sampling theory. If the computer graphics process of image synthesis has folded down high frequencies in the hypothetical source signal, well above (half) the image sampling resolution, so that they now overlap with lower frequencies of the image -- i.e. the image generation suffers aliasing -- then no "filtering" downstream (at least in the traditional signal processing sense) can undo that, and the shape the display elements certainly won't save you either. The job of anti-aliasing is to prevent high-frequency energy from landing in the band-pass in the first place.

Re "plenty of cameras don't have any filtering": the physical act of counting photons that fall onto the areal extent of a CCD element (with an the associated directional distribution created by the prior optics) is absolutely a filtering of the incoming optical signal. That is independent of any subsequent digital filtering of the pixel data.



Ok I stand corrected! Still he should know better.


This may be one of the most incorrect and arrogant things I've ever read.


> This must have been written by someone that just learnt about sampling theory and thought "aha! Pixels are samples!" without actually thinking about real life.

This was written by one of the fathers of computer graphics.

chriswarbo was completely right, but was downvoted anyway. You are conflating displays of pixels with calculating the color of each pixel that goes into an image.

Each pixel in a rendered image is a weighted average, using a specific filter, over an area around the pixel. If you treat that area as a 1x1 square with uniform weight, you get a box filter, which is terrible for anti-aliasing. You can try this by playing with renderer settings so you understand what he's talking about, in the context he was talking about.
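
As a toy sketch of what that means (plain Python; the shade function, sample offsets and filter choices here are all made up for illustration, not any particular renderer's method): a rendered pixel is the filter-weighted average of shading samples taken around its centre.

  def filtered_pixel(px, py, shade, offsets, weight):
      # Weighted average of shading samples around the pixel centre (px, py).
      total_w, total_c = 0.0, 0.0
      for dx, dy in offsets:
          w = weight(dx, dy)
          total_c += w * shade(px + dx, py + dy)
          total_w += w
      return total_c / total_w

  box  = lambda dx, dy: 1.0                                             # 1x1 box filter
  tent = lambda dx, dy: max(0.0, 1 - abs(dx)) * max(0.0, 1 - abs(dy))   # triangle filter

  # 4x4 stratified sample positions inside the pixel footprint
  offsets = [((i + 0.5) / 4 - 0.5, (j + 0.5) / 4 - 0.5)
             for i in range(4) for j in range(4)]

  # Example "scene": a hard vertical edge at x = 10.3
  shade = lambda x, y: 1.0 if x < 10.3 else 0.0
  print(filtered_pixel(10, 0, shade, offsets, box),
        filtered_pixel(10, 0, shade, offsets, tent))

(A real renderer would usually let the tent, Gaussian, etc. filter extend past the 1x1 footprint; the point is just that the filter weights, not a solid square, define the pixel's value.)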


> If they did then anti-aliasing wouldn't be a thing!

Anti-aliasing makes perfect sense, even if you assume that individual pixels are points.


You're conflating many different things, all under the name "pixel". I think that's where your confusion is coming from:

> Cameras do not sample points

Cameras use sensor-elements (silver-iodide grains, CCD arrays, etc.). If you want to use the p-word, we could call them "sensor pixels", or suchlike.

> displays basically never have filtering to make pixels point samples

I don't understand what this means; displays are physical devices, which must have 3D extent.

In any case, we can call the display-elements of such devices "display pixels", to avoid confusion.

> most projectors literally project little squares

Again, these are display-elements (or "display pixels", if you must).

> This must have been written by someone that just learnt about sampling theory and thought "aha! Pixels are samples!" without actually thinking about real life.

The author is using the term "pixel" to refer to image data. That is not the same thing as display-elements, or sensor components, etc.

If you read it carefully, you'll see those examples are actually mentioned explicitly; along with e.g. the tone patterns used by printers.


Image data is recorded from sensor pixels (or rendered pixels) and displayed on screen pixels or printed using ink pixels, without additional filtering unless the image is scaled or rotated. None of these are perfect pixels due to physical limitations but they all try to be mostly that: little squares.

It is convenient to think of image data as point samples because that lets you use sampling theory but that does not mean image pixels are actually point samples.

Or to put it another way, if it's captured like a little square and displayed like a little square, it is a little square.


You are making the exact same mistake even though this person explained it to you.

This isn't about what a pixel in a display looks like, it is about what a pixel in an image buffer represents and how to render it without aliasing.

To anti-alias a pixel you have to realize that it is the weighted average of multiple samples over an area.


For most, this is a distinction without a difference. References to CRT phosphors are now irrelevant.

Use of direct-emission (not backlit) mini-LED does make this relevant, but only at the highest fidelity levels, for which I have trouble conceiving a case, e.g. VR with lower-resolution mini-LEDs where you want very square box corners at the cost of crispness -- effectively you'd render extra-pointy corners to make them appear as normal 90° corners, given the natural rounding to be expected from pixel-centered light sources.


Where this space does (or may) get interesting again is scanning fiber displays and other experimental optics for HMDs: https://blair-neal.gitbook.io/survey-of-alternative-displays....

Not only does this break from the display being a rectilinear grid, but it can also vary over time. This enables the actual physical display surface to be foveated, rather than lensing between a foreground/background layer, or just having foveated rendering that still renders on a uniform-resolution display.


QD-OLEDs have a triangle layout, so it's relevant


This seems like subpixel anti-aliasing, which I thought was the greatest thing for LCDs. Then Apple came along with Retina displays and used grayscale anti-aliasing and super-resolutions. The only time I find that it matters is on those rare occasions where I attach an HD LCD rather than a 4k+ one. It would likely matter more at a larger FOV than a TV or monitor, such as VR. Densities keep going up, so it will matter less and less.


A pixel is what you define it to be. This is a good thing, as long as your definition is a useful definition for the purposes of the discussion in which it is used.

It only becomes a problem if you're using one definition but your audience is using another, and neither side attempts to clarify what definition they're using, leading to disagreements that sound like disagreements of fact, but are only disagreements of definition.

I.e. just like most debates.


Pixels are points with a (usually) square footprint. Whether the point-ness or square-ness is emphasized depends on context. "Pixel" = "picture element" = "quantum of a picture", nothing more.


Way back in first year university, all full of enthusiasm for learning, a friend and I gatecrashed a 4th year computer graphics lecture.

It started with the declaration that “An image is made of ‘PIK-SULLs’. A ‘pik-sull’ is a littul square” and continued in this vein. It was terrible.

A while later we found a printout of this paper in the CS lab.


It is hard to understand what this article actually advocates. All the images have the filters aligned with the input pixels, with no grid of output pixels overlaid, as one would need to actually scale or rotate the image. It seems like the article actually suggests that we should use a filter in order to display an unscaled image on screen, even adding a border of filter garbage from outside the confines of the original image.

We are told that the square filter depicted is bad; no actual reason is given.

If we actually add in some scaling and rotation then it is really bad, but then again if we scale it down really far, then so are all the other filters, horrible aliasing all the way round.

Conceptually, a filter operation can be thought of as two steps: first we apply an input filter to produce an intermediate, infinite-resolution version of the input image; then, for each output pixel, we use an output filter to sample the intermediate version and produce a single colour value. In practice of course there is no intermediate image; the input and output filters are combined into a single formula that deals with a finite amount of data.

The reason that this model is not often mentioned is that the output filter is commonly just sampling a single point, so the combined filter and the input filter become the same thing. This is often a very poor choice, leading to uneven sampling distribution and the aforementioned aliasing. It is possible to mostly avoid these issues by picking the exact properties of the input filter to match the desired level of scaling, but that is not something I have generally seen applied outside of one specific context.

That context is trilinear filtering used in 3D graphics. This input filter produces an intermediate that is exactly as blurry as it needs to be to avoid the aliasing that results from heavy downscaling. It is still visibly not perfect, and therefore we also use an output filter called an anisotropic filter. Rather than a single point sample, it picks multiple point samples in a circle, typically 16. I think there is an argument to be made that an integral over the pixel-shaped square would produce slightly better results. But an exact integral is really expensive to compute, and the circle complements the trilinear filtering better than a 4x4 grid of samples, leading to a more even sampling of the input in cases where a texture is viewed at a sharp angle.

So modern filtering in 3D games doesn't use any of the fancy filters you see in papers like this. Not because a modern video card couldn't, but because they don't solve the problems video games care about, like aliasing.

For offline filters, like the ones you would apply in an image processing program, I do think squares have some merit. But we have to think of it as two filter steps. One approach would be to simply render each input pixel as a square in the intermediate image, and then for each output pixel sample a square from this image. The input and the output squares are possibly a different size, and oriented differently, but with a bit of maths we can still compute the exact result in finite time. The resulting image is decent, with each input pixel contributing the same amount to the output image, but the blurriness is possibly uneven, which in some cases can look jarring.

An alternative is to use a bilinear input filter, while retaining sampling a square for the output. This is a bit blurrier, but the blur is much more even.

Okay, but what if we use a circular output filter? Or also use bilinear for the output, throw in some bicubic, or sinc, or some other thing? The real-world result is that you are now staring at a bunch of similar images trying to deduce which one looks best, and they are all kind of the same. The only noticeable difference is that some of them are a bit blurrier and the others artefact a bit more, and ultimately that tradeoff is the main concern when choosing filters.
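
To make the two-step (input filter, then output filter) picture above concrete, here's a rough 1D sketch (plain Python, all names are mine): linear reconstruction as the input filter, and a single point sample per output pixel as the output filter.

  def reconstruct_linear(samples):
      def f(x):                                  # continuous reconstruction of the input
          x = min(max(x, 0.0), len(samples) - 1.0)
          i = min(int(x), len(samples) - 2)
          frac = x - i
          return (1 - frac) * samples[i] + frac * samples[i + 1]
      return f

  def resample(samples, out_len):
      f = reconstruct_linear(samples)
      scale = (len(samples) - 1) / (out_len - 1)
      return [f(i * scale) for i in range(out_len)]   # point-sample at each output pixel

  print(resample([0.0, 1.0, 1.0, 0.0], 7))   # [0.0, 0.5, 1.0, 1.0, 1.0, 0.5, 0.0]

With heavy downscaling, that single point sample per output pixel is exactly where the uneven sampling and aliasing described above come from; averaging over an output-pixel-sized square (or circle) of the intermediate is the other side of the trade-off.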


But what if the pixel identifies itself as a little square? Is the author going to promote hate against things that identify themselves as a little square?



