Here is a totally intuitive and probably wrong way to think about it, but it's the mental model I have in my head. Imagine that a matrix transformation is like a wind blowing in a particular direction. If you throw a ball, it gets blown by the wind, and its direction gets changed. If you throw it obliquely it ends up a certain distance from you. Which direction would you throw it in to make it go as far as possible? You'd line up the throw with the direction of the wind. Effectively that's finding your largest eigenvalue / eigenvector: the eigenvector is the direction to throw in, and the eigenvalue is how much the wind stretches the throw.
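To make the "throw it in every direction and see which goes furthest" idea concrete, here is a rough sketch in plain Python. The matrix `A` is a made-up symmetric 2x2 example, and `apply`, `stretch` and the step count are just names and values I picked for illustration. One caveat: the match between "furthest throw" and "largest eigenvector" holds for symmetric matrices like this one (and like a covariance matrix); for a general matrix the furthest-throw direction is the top singular vector instead.

```python
import math

# A hypothetical symmetric 2x2 "wind" matrix, chosen only for illustration.
A = [[3.0, 1.0],
     [1.0, 2.0]]

def apply(M, v):
    """Multiply a 2x2 matrix by a 2-vector."""
    return [M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1]]

def stretch(theta):
    """Length of A applied to the unit vector pointing at angle theta."""
    v = [math.cos(theta), math.sin(theta)]
    return math.hypot(*apply(A, v))

# Try throw directions over half a circle (v and -v are stretched equally).
steps = 18000
best_theta = max((math.pi * k / steps for k in range(steps)), key=stretch)
best_stretch = stretch(best_theta)

# Closed-form largest eigenvalue and its direction for a symmetric 2x2 matrix.
a, b, c = A[0][0], A[0][1], A[1][1]
top_eigenvalue = (a + c) / 2 + math.hypot((a - c) / 2, b)
eigen_theta = 0.5 * math.atan2(2 * b, a - c)

print("best throw angle (deg):", math.degrees(best_theta), "stretch:", best_stretch)
print("eigenvector angle (deg):", math.degrees(eigen_theta), "eigenvalue:", top_eigenvalue)
```

The brute-force best angle and the eigenvector angle should agree to within the scan resolution, and the stretch in that direction should equal the largest eigenvalue.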

Now, for finding the first principal component we do something special. We create a wind that blows in the direction of variation of the data. We do it by getting the covariance between every pair of our variables and placing each one in its position in a matrix, so that they make up vectors where each dimension is weighted according to the size of the variation in that direction. This makes a "wind" in the direction of the variation of the data. The first principal component is the vector that most aligns with this covariance "wind".
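Here is a small sketch of that covariance "wind", again in plain Python. The data points are invented, and I'm using power iteration (keep throwing the vector into the wind and rescaling it) as one simple way to find the direction the wind favours; real PCA code would normally call a library eigen-solver or SVD instead. `cov`, `C` and `v` are just illustrative names.

```python
import math

# Hypothetical 2-D data: points scattered roughly along the line y = x.
data = [(1.0, 1.2), (2.0, 1.9), (3.0, 3.1), (4.0, 3.8), (5.0, 5.0)]

n = len(data)
means = (sum(x for x, _ in data) / n, sum(y for _, y in data) / n)

def cov(i, j):
    """Sample covariance between coordinates i and j of the data."""
    return sum((p[i] - means[i]) * (p[j] - means[j]) for p in data) / (n - 1)

# The covariance "wind": variances on the diagonal, covariances off it.
C = [[cov(0, 0), cov(0, 1)],
     [cov(1, 0), cov(1, 1)]]

# Power iteration: apply the wind to a vector repeatedly and renormalise.
# The vector drifts toward the eigenvector with the largest eigenvalue.
v = [1.0, 0.0]
for _ in range(100):
    w = [C[0][0] * v[0] + C[0][1] * v[1],
         C[1][0] * v[0] + C[1][1] * v[1]]
    norm = math.hypot(w[0], w[1])
    v = [w[0] / norm, w[1] / norm]

# Variance of the data along v, computed as v^T C v.
variance_along_v = sum(v[i] * C[i][j] * v[j] for i in range(2) for j in range(2))

print("first principal direction:", v)
print("variance along it:", variance_along_v)
```

For this made-up data the direction comes out close to the diagonal, and the variance along it is at least as large as the variance along either original axis, which is the sense in which the first principal component "lines up with the wind".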

I am sure this is technically wrong in essential ways and I'd love to hear it be corrected!