Introduction to the Math of Computer Graphics (codeofthedamned.com)
145 points by ingve on Dec 30, 2015 | 32 comments



Small thing, but some of the examples have some pretty bad undefined behavior bugs. For instance:

    Matrix& operator*( const double scalar,
                       const Matrix& rhs)
    {
      Matrix result(rhs);
      result *= scalar;
 
      return result;
    }

(Returning a reference to a stack variable is undefined behavior, but realistically you're probably looking at either a hard crash or a really hard-to-track-down bug.)
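
A minimal sketch of the fix, assuming a small stand-in `Matrix` type (the 2x2 struct below is hypothetical and is not the article's class). The only change that matters is the return type: `Matrix` instead of `Matrix&`. The `constexpr` is there only so the result can be checked at compile time.

```cpp
// Hypothetical stand-in for the article's Matrix class: 2x2, doubles only.
struct Matrix {
  double m[2][2];

  // Scale every element in place.
  constexpr Matrix& operator*=(double scalar) {
    for (auto& row : m)
      for (auto& value : row)
        value *= scalar;
    return *this;  // safe: *this outlives the call
  }
};

// Return by value, not by reference. The local `result` is destroyed when
// the function returns, so the caller must receive its own object.
constexpr Matrix operator*(double scalar, const Matrix& rhs) {
  Matrix result(rhs);
  result *= scalar;
  return result;
}
```

With this signature `2.0 * m` yields a fresh matrix. The compiler is allowed to construct `result` directly in the caller's object, and otherwise moves or copies it out.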


I wonder why they didn't simply nuke the reference and get move semantics? Or better yet, use expression templates.


I have corrected the code by removing the reference.

The way the code is now structured will allow for the RVO (Return-value optimization) to occur, so the new instance will be constructed in the location of the receiving object.

Returning an rvalue reference (Matrix&&) to get move semantics would not work in this case. It would suffer from the same problem as the version that returned a reference: the result would refer to a local that has already been destroyed, so even if the program did not crash, the return value would contain garbage.

(edit: added a link to details on how to use move semantics) http://codeofthedamned.com/index.php/c-r-value-references

I did not use expression templates, or try to optimize any of the implementation because my intent is to demonstrate "how" to get started with the math. I also wanted to make the code accessible to programmers that do not use C++. For example, someone trying to learn WebGL.


Thanks for pointing out my mistake. I have corrected the code to simply return by value.


that should be a compile error, surely? 'returning a reference or address of a temporary' or similar... i guess warnings as errors and having warnings to the max has spoiled me.


What I'm failing to understand is how do matrices relate to 3D graphics. What is a matrix representing?

I think I get the idea of a vertex, as that can be used to represent a point on a shape, like a 3d model.


Matrices are a way of accumulating a bunch of separate transformations you might want to perform on a point (in the lingo, points are vectors because each coordinate of the point is an element of the vector).

Suppose figuring out where a world point (in a game map, for example) should show up on screen (in pixels) looked a bit like this function call sequence (this is simplified):

    screenPoint = perspective(move(rotate(move(worldPoint))))

Think of a camera moving around the world; in virtual reality, this is exactly equivalent to the camera staying put and the world moving around instead. That's more or less how 3D graphics works. You have a virtual box that logically lives behind your screen, with the same x and y coordinates as your screen resolution, and z coordinates handled by a Z buffer that's used to determine what overlaps what. A big chunk of the matrix work is about moving, rotating and squishing the whole virtual world until the bit you want to look at fits into the virtual box that lives behind the screen.

Each function (perspective, move (aka translate), rotate, scale, etc.) can be expressed in the form of multiplication by a matrix. See [1], for example. That turns it into something like this:

    screenPoint = perspectiveMatrix * moveMatrix * rotateMatrix * moveMatrix * worldPoint

But because matrix multiplication is associative, we can calculate a single matrix to do all the work:

    transformMatrix = perspectiveMatrix * moveMatrix * rotateMatrix * moveMatrix
    screenPoint = transformMatrix * worldPoint

And this is much more efficient because there are fewer calculations to perform per point.
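
The precomputation can be sketched in C++ as follows. This is an illustration under stated assumptions: `Vec4`, `Mat4`, `translation` and `rotateZ90` are hypothetical names, points are column vectors (matrix times point), the perspective step is left out, and the rotation is a fixed quarter turn so that every entry is exact.

```cpp
// Hypothetical minimal types; column-vector convention (matrix * point).
struct Vec4 { double x, y, z, w; };
struct Mat4 { double m[4][4]; };  // m[row][column]

// Matrix * matrix: element (r, c) is row r of a dotted with column c of b.
constexpr Mat4 operator*(const Mat4& a, const Mat4& b) {
  Mat4 out{};
  for (int r = 0; r < 4; ++r)
    for (int c = 0; c < 4; ++c)
      for (int k = 0; k < 4; ++k)
        out.m[r][c] += a.m[r][k] * b.m[k][c];
  return out;
}

// Matrix * point.
constexpr Vec4 operator*(const Mat4& a, const Vec4& p) {
  return {
    a.m[0][0] * p.x + a.m[0][1] * p.y + a.m[0][2] * p.z + a.m[0][3] * p.w,
    a.m[1][0] * p.x + a.m[1][1] * p.y + a.m[1][2] * p.z + a.m[1][3] * p.w,
    a.m[2][0] * p.x + a.m[2][1] * p.y + a.m[2][2] * p.z + a.m[2][3] * p.w,
    a.m[3][0] * p.x + a.m[3][1] * p.y + a.m[3][2] * p.z + a.m[3][3] * p.w
  };
}

// Move by (tx, ty, tz).
constexpr Mat4 translation(double tx, double ty, double tz) {
  return {{{1, 0, 0, tx}, {0, 1, 0, ty}, {0, 0, 1, tz}, {0, 0, 0, 1}}};
}

// Quarter turn about the Z axis (cos = 0, sin = 1).
constexpr Mat4 rotateZ90() {
  return {{{0, -1, 0, 0}, {1, 0, 0, 0}, {0, 0, 1, 0}, {0, 0, 0, 1}}};
}

// Built once, then reused for every point.
constexpr Mat4 transformMatrix() {
  return translation(5, 0, 0) * rotateZ90();
}
```

Applying `transformMatrix()` to the point (1, 0, 0, 1) gives (5, 1, 0, 1), the same result as rotating first and then moving, but each point now costs one matrix-vector product instead of one per transformation.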

[1] https://en.wikipedia.org/wiki/Rotation_matrix


Wonderful explanation, thanks


A transformation from world coordinates (x, y, z) to screen coordinates (pixels).

A matrix corresponds to a linear mapping of some vector space to another. In case of graphics you usually start with local coordinates of your object. These have to be mapped to world coordinates, camera coordinates, and finally a perspective projection to get the screen coordinates. All these mappings can be expressed as matrices and applying mapping A after B is the same as applying the matrix product A*B to your initial coordinate vector. Precomputing the entire transformation matrix in this way causes quite a speedup for long transformation pipelines.


Check out this article! http://www.senocular.com/flash/tutorials/transformmatrix/

It helped me understand them quickly, with easy to understand, practical examples.

Don't mind the "Flash 8" context, there really isn't anything flash-specific in there.


That's a helpful site, especially the interactive examples. Thanks for the link


Without getting too mathy - matrices let you express useful transformations (e.g. translation, rotation, scale) and multiplying two matrices combines those two transformations - so you can have matrix transform = rotationA * translation * rotationB

and now multiplying a vector (~ a 3d point) by that transform matrix is the same as doing each individual transformation (in the reverse order they appear in matrix multiplication) on a point to get the point after transform
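
The point about order can be checked with a small sketch. It is done in 2D with 3x3 homogeneous matrices only to keep it short; `Point2`, `Mat3`, `translate` and `quarterTurn` are hypothetical names, and points are treated as column vectors, so the rightmost matrix is applied first.

```cpp
// Hypothetical 2D types in homogeneous form (w = 1 for points).
struct Point2 { double x, y, w; };
struct Mat3 { double m[3][3]; };  // m[row][column]

// Matrix * matrix.
constexpr Mat3 operator*(const Mat3& a, const Mat3& b) {
  Mat3 out{};
  for (int r = 0; r < 3; ++r)
    for (int c = 0; c < 3; ++c)
      for (int k = 0; k < 3; ++k)
        out.m[r][c] += a.m[r][k] * b.m[k][c];
  return out;
}

// Matrix * point.
constexpr Point2 operator*(const Mat3& a, const Point2& p) {
  return {
    a.m[0][0] * p.x + a.m[0][1] * p.y + a.m[0][2] * p.w,
    a.m[1][0] * p.x + a.m[1][1] * p.y + a.m[1][2] * p.w,
    a.m[2][0] * p.x + a.m[2][1] * p.y + a.m[2][2] * p.w
  };
}

// Move by (tx, ty).
constexpr Mat3 translate(double tx, double ty) {
  return {{{1, 0, tx}, {0, 1, ty}, {0, 0, 1}}};
}

// Counter-clockwise quarter turn about the origin (cos = 0, sin = 1).
constexpr Mat3 quarterTurn() {
  return {{{0, -1, 0}, {1, 0, 0}, {0, 0, 1}}};
}

// Reads right to left: rotate first, then move.
constexpr Mat3 rotateThenMove() { return translate(5, 0) * quarterTurn(); }

// Reads right to left: move first, then rotate.
constexpr Mat3 moveThenRotate() { return quarterTurn() * translate(5, 0); }
```

For the point (1, 0), `rotateThenMove()` gives (5, 1) while `moveThenRotate()` gives (0, 6), so swapping the factors changes the result.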


It represents the axes of a space. Suppose you have a 3d space. It has 3 axes (traditionally referred to as X, Y and Z), usually all pointing in orthogonal directions, in some other space. The matrix holds the directions of these axes, one per column.

Here's an example 3-d matrix:

    [ a d g ]
    [ b e h ]
    [ c f i ]

(a,b,c) is the vector representing the direction of the X axis; (d,e,f), Y; (g,h,i), Z.
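
One way to see this is to multiply the matrix by the unit vectors (1,0,0), (0,1,0) and (0,0,1): each product picks out one column. A small sketch, assuming hypothetical `Vec3` and `Mat3` types and with the letters a..i replaced by the numbers 1..9:

```cpp
// Hypothetical minimal types; column-vector convention (matrix * vector).
struct Vec3 { double x, y, z; };
struct Mat3 { double m[3][3]; };  // m[row][column]

// Matrix * vector.
constexpr Vec3 operator*(const Mat3& a, const Vec3& v) {
  return {
    a.m[0][0] * v.x + a.m[0][1] * v.y + a.m[0][2] * v.z,
    a.m[1][0] * v.x + a.m[1][1] * v.y + a.m[1][2] * v.z,
    a.m[2][0] * v.x + a.m[2][1] * v.y + a.m[2][2] * v.z
  };
}

// The example matrix with a=1, b=2, ..., i=9 (filled column by column).
constexpr Mat3 example() {
  return {{{1, 4, 7},
           {2, 5, 8},
           {3, 6, 9}}};
}

// Where the unit X, Y and Z directions end up after the transformation.
constexpr Vec3 xAxis() { return example() * Vec3{1, 0, 0}; }  // (1, 2, 3) = (a, b, c)
constexpr Vec3 yAxis() { return example() * Vec3{0, 1, 0}; }  // (4, 5, 6) = (d, e, f)
constexpr Vec3 zAxis() { return example() * Vec3{0, 0, 1}; }  // (7, 8, 9) = (g, h, i)
```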


Matrices can be thought of as the coefficients to linear systems of equations. Things that linear systems of equations can do to their input variables are rotations, translations, scaling etc. Perspective distortion is another linear transform so it can be represented by a matrix.


Yes. Imagine you have a cube plotted in XYZ coordinates. The eight vertex points are (0,0,0), (0,0,10), (0,10,0), (0,10,10), (10,0,0), (10,0,10), (10,10,0) and (10,10,10).

Now let us say you want to rotate that cube 30 degrees on the X axis. You take the last vertex and make it a matrix, adding 1 at the end like you will do with the others, i.e. (10 10 10 1). Then you multiply it by the X rotation matrix ( https://open.gl/transformations ) using the appropriate angle. You now have the new correct position of that point. Now do the same for the other vertexes on the cube. The cube is still the same size and shape but its position has shifted.

You can do other easy transformations with matrix math. You can scale the cube with a scaling matrix - doubling it, halving it, or whatnot. Just multiply each vertex by the same scaling matrix. The new coordinates describe a cube of the same shape at the new size.

You can have a translation matrix too, where each vertex is multiplied by the same translation matrix, so that the cube is translated by x, y, z from its original position.

You can also chain these matrix multiplications, so that the cube size is doubled, then rotated 60 degrees on the Y axis, then moved 20 X, -30 Y, 15 Z via translation. So that is multiplying three matrices in a row, then multiplying them against each vertex.
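
The chained cube example can be sketched like this. Assumptions: `Vec4`, `Mat4`, `Cube` and the helper functions are hypothetical names, vertices are column vectors with 1 appended, and the rotation is 90 degrees about Y instead of 60 so that every matrix entry is exact.

```cpp
// Hypothetical minimal types; column-vector convention (matrix * vertex).
struct Vec4 { double x, y, z, w; };
struct Mat4 { double m[4][4]; };  // m[row][column]
struct Cube { Vec4 v[8]; };

// Matrix * matrix.
constexpr Mat4 operator*(const Mat4& a, const Mat4& b) {
  Mat4 out{};
  for (int r = 0; r < 4; ++r)
    for (int c = 0; c < 4; ++c)
      for (int k = 0; k < 4; ++k)
        out.m[r][c] += a.m[r][k] * b.m[k][c];
  return out;
}

// Matrix * vertex.
constexpr Vec4 operator*(const Mat4& a, const Vec4& p) {
  return {
    a.m[0][0] * p.x + a.m[0][1] * p.y + a.m[0][2] * p.z + a.m[0][3] * p.w,
    a.m[1][0] * p.x + a.m[1][1] * p.y + a.m[1][2] * p.z + a.m[1][3] * p.w,
    a.m[2][0] * p.x + a.m[2][1] * p.y + a.m[2][2] * p.z + a.m[2][3] * p.w,
    a.m[3][0] * p.x + a.m[3][1] * p.y + a.m[3][2] * p.z + a.m[3][3] * p.w
  };
}

// Uniform scale by s.
constexpr Mat4 scaling(double s) {
  return {{{s, 0, 0, 0}, {0, s, 0, 0}, {0, 0, s, 0}, {0, 0, 0, 1}}};
}

// Quarter turn about the Y axis (cos = 0, sin = 1).
constexpr Mat4 rotateY90() {
  return {{{0, 0, 1, 0}, {0, 1, 0, 0}, {-1, 0, 0, 0}, {0, 0, 0, 1}}};
}

// Move by (tx, ty, tz).
constexpr Mat4 translation(double tx, double ty, double tz) {
  return {{{1, 0, 0, tx}, {0, 1, 0, ty}, {0, 0, 1, tz}, {0, 0, 0, 1}}};
}

// The eight vertices from the comment above, each with 1 appended.
constexpr Cube cube10() {
  return {{{0, 0, 0, 1},  {0, 0, 10, 1},  {0, 10, 0, 1},  {0, 10, 10, 1},
           {10, 0, 0, 1}, {10, 0, 10, 1}, {10, 10, 0, 1}, {10, 10, 10, 1}}};
}

// Scale first, then rotate, then translate: written right to left.
constexpr Mat4 combined() {
  return translation(20, -30, 15) * rotateY90() * scaling(2);
}

// Multiply every vertex by the same matrix.
constexpr Cube transform(const Mat4& m, const Cube& c) {
  Cube out{};
  for (int i = 0; i < 8; ++i)
    out.v[i] = m * c.v[i];
  return out;
}
```

With these values the vertex (10, 10, 10) ends up at (40, -10, -5), and the corner at the origin ends up at (20, -30, 15).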


My intention with this first post was to show how to work with the mathematical constructs.

I have already started on the next article, which will provide the context for what these mathematical constructs provide.

To briefly answer your question, for 3d graphics the Matrix provides an efficient mechanism to translate, scale, rotate, shear and convert between different coordinate systems.


Here's a set of articles I wrote that might help?

http://webglfundamentals.org/webgl/lessons/webgl-2d-matrices...

That's 2d but if you follow them forward it will go to 3d


A 4x4 matrix can represent a position and a rotation, where the 3x3 matrix (called the transform) is the rotation and the 4th row (called the translation; it is the 4th column in some conventions) is the location in space.

The neat thing is that when you multiply matrices together you update the original matrix. E.g. if you want to rotate something 90 degrees to the right, you make a transform matrix with a zero translation and multiply it with the original matrix. The resulting matrix will be an object in the same place rotated by 90 degrees.

Most modeling applications since about 2000 though use quaternions and a point to represent rotation and location. That's because it uses less memory and is easier/faster to compute.
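
For readers who have not seen it, a rotation by a unit quaternion can be sketched as below. A quaternion stores 4 numbers where a 3x3 rotation matrix stores 9. The names `Vec3`, `Quat`, `conjugate`, `rotate` and `turnAboutDiagonal` are hypothetical, and the sample rotation (120 degrees about the diagonal (1,1,1)) is chosen only because its components are all exactly 0.5.

```cpp
// Hypothetical minimal types.
struct Vec3 { double x, y, z; };
struct Quat { double w, x, y, z; };  // w + x*i + y*j + z*k

// Hamilton product of two quaternions.
constexpr Quat operator*(const Quat& a, const Quat& b) {
  return {
    a.w * b.w - a.x * b.x - a.y * b.y - a.z * b.z,
    a.w * b.x + a.x * b.w + a.y * b.z - a.z * b.y,
    a.w * b.y - a.x * b.z + a.y * b.w + a.z * b.x,
    a.w * b.z + a.x * b.y - a.y * b.x + a.z * b.w
  };
}

// For a unit quaternion the conjugate is also the inverse.
constexpr Quat conjugate(const Quat& q) {
  return {q.w, -q.x, -q.y, -q.z};
}

// Rotate v by the unit quaternion q: v' = q * (0, v) * conjugate(q).
constexpr Vec3 rotate(const Quat& q, const Vec3& v) {
  Quat r = q * Quat{0, v.x, v.y, v.z} * conjugate(q);
  return {r.x, r.y, r.z};
}

// 120 degree turn about the normalized axis (1,1,1)/sqrt(3):
// w = cos(60 deg) = 0.5, and sin(60 deg) times each axis component = 0.5.
constexpr Quat turnAboutDiagonal() {
  return {0.5, 0.5, 0.5, 0.5};
}
```

This particular rotation cycles the axes: it sends the X axis to Y, Y to Z, and Z to X.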


Matrices and Linear Transformations by Cullen (Dover Books on Mathematics) will answer all your questions and more. It's written in plain English, doesn't require a mathematical background, and you really only need the first three chapters.


Yeah, while being informative the article is not necessarily inspiring. It just gives a rundown of linear algebra without talking about how it relates to the graphics.


"The Matrix and Quaternions FAQ" is also worth reading: http://www.j3d.org/matrix_faq/matrfaq_latest.html


I've been using quaternions for 15 years now and I still don't have a good mental image of why they work. I've just learned to accept the operations and what they do.


Visualizing Quaternions is a book which goes into much beautiful depth about what they represent and how they behave, if you fancy digging deeper than the everyday practicality of them.


I had never heard of quaternions so I Googled the book title you suggested. In addition to the Amazon link [1], I also found a question on the gamedev SE where they talk about building a mental model of quaternions [2], which I found informative.

[1]: http://www.amazon.com/Visualizing-Quaternions-Kaufmann-Inter...

[2]: http://gamedev.stackexchange.com/questions/4801/how-can-you-...


This post is mistitled, and I really wish I knew where to find a thing that is what this post claims to be.

From the title, I was hoping this would cover coordinate systems, the Bresenham line algorithm and midpoint circle algorithm, Lambertian reflection, perspective transformation, quaternions, Gouraud and Phong shading, rotation, metaball approximation, parametric vs. implicit vs. explicit function representation, splines, ray tracing, alpha compositing, and some basic filtering. Instead, it consists only of basic linear algebra — it doesn't even cover how to generate a rotation matrix! However, it does cover a lot of stuff that basically doesn't come up at all in computer graphics, like multiplying nonsquare matrices together, and as pandaman points out, adding matrices together.

(Also, as overgard points out, the code contains basic novice errors: https://news.ycombinator.com/item?id=10812881)

(pandaman correctly points out that matrix-vector multiply is a special case of multiplying nonsquare matrices. I still don't think you need to deal with that in an introduction; it's easy to learn as a special case, and that's probably how you'll have to code it anyway!)

Unfortunately, I don't know where to point people for a real introduction to the math of computer graphics. This is a shame; even without libraries and hardware, you can do rotating 3-D shapes in just 15 lines of code http://canonical.org/~kragen/sw/netbook-misc-devel/rotcube.p... or ray-tracing in 186: http://canonical.org/~kragen/sw/aspmisc/my-very-first-raytra... or basic VR with your cellphone accelerometer in 114: http://canonical.org/~kragen/sw/81hacks/topopt-ar/

(That last one is cheating a little bit since I didn't write the circle-drawing algorithm myself, so it isn't included in the line count.)

Math is super powerful when you apply it to computer graphics. It empowers you to make magic happen in only a few hours and a few dozen lines of code, and the magic is visible even to people who can't program. You don't have to understand all that math stuff I mentioned in the first paragraph in order to do these things.


Matrix multiplication by vector and vice versa is a multiplication by at least one non-square matrix. But matrix addition, on the other hand, is something I have never seen come up in graphics.


Translation itself is not a linear transformation. To simply translate a vertex is a matrix addition (subtraction) problem.

The only reason that it is possible in the transformation matrix is because of the addition of the fourth parameter in a vertex [x y z 1] to create a homogeneous linear system. This allows the translation to become a part of the transformation matrix.
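
A short sketch of that fourth coordinate at work, assuming hypothetical `Vec4` and `Mat4` types and a column-vector layout (the vertex written above as the row [x y z 1] becomes a column here, so the offsets sit in the last column of the matrix):

```cpp
// Hypothetical minimal types; column-vector convention (matrix * vertex).
struct Vec4 { double x, y, z, w; };
struct Mat4 { double m[4][4]; };  // m[row][column]

// Matrix * vertex.
constexpr Vec4 operator*(const Mat4& a, const Vec4& p) {
  return {
    a.m[0][0] * p.x + a.m[0][1] * p.y + a.m[0][2] * p.z + a.m[0][3] * p.w,
    a.m[1][0] * p.x + a.m[1][1] * p.y + a.m[1][2] * p.z + a.m[1][3] * p.w,
    a.m[2][0] * p.x + a.m[2][1] * p.y + a.m[2][2] * p.z + a.m[2][3] * p.w,
    a.m[3][0] * p.x + a.m[3][1] * p.y + a.m[3][2] * p.z + a.m[3][3] * p.w
  };
}

// Identity matrix with the offsets in the last column. Each offset is
// multiplied by the vertex's fourth coordinate, so it is added when w = 1.
constexpr Mat4 translation(double tx, double ty, double tz) {
  return {{{1, 0, 0, tx}, {0, 1, 0, ty}, {0, 0, 1, tz}, {0, 0, 0, 1}}};
}

// A position (w = 1) picks up the offset: (30, -20, 25, 1).
constexpr Vec4 movedPoint() {
  return translation(20, -30, 15) * Vec4{10, 10, 10, 1};
}

// A direction (w = 0) is left unchanged: (10, 10, 10, 0).
constexpr Vec4 movedDirection() {
  return translation(20, -30, 15) * Vec4{10, 10, 10, 0};
}
```

The same 4x4 multiply therefore adds the offset to positions and ignores it for directions, depending only on the fourth coordinate.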


I am not sure what you mean. If by "matrix addition" you meant adding vectors then it's obvious but adding actual matrices won't give you any translation operators because no matter how many matrices you have added together you still end up with a matrix, which, indeed, can only represent linear operators.


I believe we are splitting hairs, with the added confusion of two mathematical meanings for the word vector in this discussion.

There is the single-row matrix, called a vector, and the Euclidean vector, which we use to represent the points in the geometry and happen to store in a vector (matrix).

I do agree with your statement.


[deleted]


I was never a fan of special cases in Math. Especially when it's not a special case. Saying that an element in the product matrix is a product of a row taken from one matrix and a column from another does not require matrices to be square and fits all possible applications of matrix multiplication already.
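
That row-times-column definition can be written once for every shape. A minimal sketch, assuming a hypothetical fixed-size `Matrix<R, C>` template (not the article's class):

```cpp
#include <cstddef>

// Hypothetical fixed-size matrix with R rows and C columns.
template <std::size_t R, std::size_t C>
struct Matrix { double m[R][C]; };  // m[row][column]

// (R x N) * (N x C) -> (R x C). Element (r, c) of the product is row r of a
// dotted with column c of b. Nothing here requires R, N and C to be equal.
template <std::size_t R, std::size_t N, std::size_t C>
constexpr Matrix<R, C> operator*(const Matrix<R, N>& a, const Matrix<N, C>& b) {
  Matrix<R, C> out{};
  for (std::size_t r = 0; r < R; ++r)
    for (std::size_t c = 0; c < C; ++c)
      for (std::size_t k = 0; k < N; ++k)
        out.m[r][c] += a.m[r][k] * b.m[k][c];
  return out;
}

// A 2x3 matrix times a 3x1 column vector gives a 2x1 column vector: (4, 10).
constexpr Matrix<2, 1> sample() {
  return Matrix<2, 3>{{{1, 2, 3}, {4, 5, 6}}} * Matrix<3, 1>{{{1}, {0}, {1}}};
}
```

Matrix-vector multiplication is then just the case C = 1, and a mismatch in the inner dimension is rejected at compile time because no overload matches.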



too much maths, not enough understanding.

i worked out a lot of 3d stuff for myself early on, and the linear algebra approach is quite confusing vs. taking an intuitive approach and noticing that some things are the same or special cases of...

... maybe i should write something about this.



