I have spotted similar bugs too; they become evident very quickly when working with smaller resolutions, and likewise with a smaller number of gray levels.
> We choose a H×W rectangular grid of points, from which we will draw samples.
An additional thing to keep in mind is how a camera capturing the image operates. It is not sampling in the true theoretical sense of picking points from a continuous signal: the pixels have finite size, which makes grid (2) the closer match to reality. There are additional complications for color images, where the red, green, and blue channels integrate over different regions within the pixel area (see [A] for an example). This makes the real grid differ from even (2) across the color channels. Still, the math suggested by the author should not change.
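To make the distinction concrete, here is a small sketch (with a made-up signal function, not anything from the article) contrasting ideal point sampling at pixel centers with what a camera roughly does, i.e. averaging the signal over each pixel's finite area:

```python
import numpy as np

# Hypothetical continuous signal: brightness varying over the image plane.
def signal(x, y):
    return np.sin(x) * np.cos(y)

H, W = 4, 4

# Ideal point sampling at pixel centers (the (i + 0.5, j + 0.5) convention).
ys, xs = np.mgrid[0:H, 0:W]
point_samples = signal(xs + 0.5, ys + 0.5)

# A camera instead integrates over the finite pixel area; approximate each
# pixel's value by averaging a fine sub-grid of point samples inside it.
sub = 8  # sub-samples per pixel edge
fy, fx = np.mgrid[0:H * sub, 0:W * sub]
fine = signal((fx + 0.5) / sub, (fy + 0.5) / sub)
box_samples = fine.reshape(H, sub, W, sub).mean(axis=(1, 3))

# The two agree only where the signal varies slowly within a pixel.
print(np.abs(point_samples - box_samples).max())
```

The gap between the two grows with the signal's curvature inside a pixel, which is why the finite-aperture effect matters at low resolutions.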
> It seems the mess is unique in the deep learning world.
The title, "Where Are Pixels? -- a Deep Learning Perspective", looks unjustified. What's presented is not a deep learning perspective. It applies generically.
Digital elevation models typically represent terrain elevations in a grid of square cells of some cell size, e.g. 1 meter by 1 meter. Such a model is rarely useful unless it also has georeferencing information, i.e. coordinates for the bounding box of the grid. Such coordinates can of course be arbitrary, but for large-scale mapping some convention is typically used so that adjacent models of the same cell size "line up" without small gaps or overlaps. For example, this can be achieved by making sure the grid corners are on integer coordinates — in my experience, that is the convention in most European countries. In Poland, they instead use the convention that the centers of the corner cells are on integer coordinates, meaning everything is "off" by 0.5 meter (half a cell). This has caused me quite some difficulties at $WORK...
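The half-cell offset between the two conventions reduces to a simple shift. A minimal sketch (function name and coordinate values are hypothetical, assuming 1 m cells):

```python
CELL = 1.0  # cell size in meters

def corner_from_center_aligned(center_x, center_y, cell=CELL):
    """Given the integer coordinates of the corner cell's CENTER (the
    Polish convention described above), return the grid's outer corner
    (the corners-on-integers convention used elsewhere)."""
    return (center_x - cell / 2, center_y - cell / 2)

# A tile whose corner cell is centered at (500000, 250000) actually
# begins at (499999.5, 249999.5) in corner-aligned terms:
print(corner_from_center_aligned(500000.0, 250000.0))
```

Forgetting this shift misregisters every cell by half a cell, which is exactly the kind of off-by-0.5 bug the article is about.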
TL;DR: always think of the upper left pixel's position as (0.5, 0.5).
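That TL;DR can be captured in two tiny helpers (a sketch, assuming the convention that pixel (i, j) in row/column order occupies the unit square with corners (j, i) and (j+1, i+1)):

```python
import math

def pixel_center(i, j):
    """Continuous (x, y) coordinate of the center of pixel (i, j)."""
    return (j + 0.5, i + 0.5)

def containing_pixel(x, y):
    """Pixel (i, j) whose square area contains the continuous point (x, y)."""
    return (math.floor(y), math.floor(x))

print(pixel_center(0, 0))          # the upper-left pixel's center: (0.5, 0.5)
print(containing_pixel(0.9, 0.2))  # still inside the upper-left pixel
```

Keeping both conversions explicit, instead of mentally equating pixel index and position, avoids the half-pixel bugs discussed above.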