This textbook helps readers understand the steps that lead to deep learning. Linear algebra comes first, especially singular values, least squares, and matrix factorizations. Often the goal is a low-rank approximation A = CR (column-row) to a large matrix of data, in order to see its most important part. This uses the full array of applied linear algebra, including randomization for very large matrices. Then deep learning creates a large-scale optimization problem for the weights, solved by gradient descent or, better, stochastic gradient descent. Finally, the book develops the architectures of fully connected neural nets and of Convolutional Neural Nets (CNNs) to find patterns in data. Audience: This book is for anyone who wants to understand matrix methods and learn how they reduce and interpret data. Based on the second linear algebra course taught by Professor Strang, whose lectures on the training data are widely known, it starts from scratch (the four fundamental subspaces) and is fully accessible without the first text.