Quaternions are an extension of the idea of complex numbers. Complex numbers have a real part and an imaginary part, while quaternions have a real part and three imaginary parts. So the basic idea is that building a network out of these richer number types, instead of plain real numbers, has benefits.
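To make "a real part and three imaginary parts" concrete, here is a minimal sketch of quaternion multiplication (the Hamilton product) in plain Python. The names `Quaternion` and `hamilton_product` are made up for illustration and are not taken from the paper or from any library.

```python
from typing import NamedTuple


class Quaternion(NamedTuple):
    """Hypothetical container for q = w + x*i + y*j + z*k."""
    w: float  # real part
    x: float  # coefficient of i
    y: float  # coefficient of j
    z: float  # coefficient of k


def hamilton_product(a: Quaternion, b: Quaternion) -> Quaternion:
    """Multiply two quaternions using i^2 = j^2 = k^2 = ijk = -1."""
    return Quaternion(
        a.w * b.w - a.x * b.x - a.y * b.y - a.z * b.z,
        a.w * b.x + a.x * b.w + a.y * b.z - a.z * b.y,
        a.w * b.y - a.x * b.z + a.y * b.w + a.z * b.x,
        a.w * b.z + a.x * b.y - a.y * b.x + a.z * b.w,
    )


# The three imaginary units.
i = Quaternion(0, 1, 0, 0)
j = Quaternion(0, 0, 1, 0)
k = Quaternion(0, 0, 0, 1)

# Unlike real or complex multiplication, the order matters:
# i*j gives k, while j*i gives -k.
ij = hamilton_product(i, j)
ji = hamilton_product(j, i)
```

The non-commutativity is the main thing that separates quaternions from a plain 4-vector of reals.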
So to get started with reading this paper you just need to learn about deep learning, and then also the very basics of quaternions, which would be taught in, for example, a first course on abstract algebra.
Unit quaternions are used to describe rotations in 3D space. The three imaginary components point along the rotation axis, scaled by sin(θ/2), and the real component is cos(θ/2), where θ is the rotation angle. They are used instead of Euler angles because they don't have a singularity problem (gimbal lock). I don't know if this plays a role here. Could these neural networks represent 3D transformations really well?
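For reference, a minimal sketch of the usual convention for rotating a 3D vector with a unit quaternion, q = (cos(θ/2), sin(θ/2)·axis), in plain Python. The function names are hypothetical, and this says nothing about how the paper's networks use quaternions.

```python
import math


def axis_angle_to_quaternion(axis, angle):
    """Build the unit quaternion (w, x, y, z) for a rotation of
    `angle` radians about `axis` (the axis need not be normalized)."""
    ax, ay, az = axis
    norm = math.sqrt(ax * ax + ay * ay + az * az)
    s = math.sin(angle / 2) / norm
    return (math.cos(angle / 2), ax * s, ay * s, az * s)


def rotate(q, v):
    """Rotate vector v by unit quaternion q = (w, u).

    This is q v q* expanded as v' = v + w*t + u x t, with t = 2*(u x v).
    """
    w, x, y, z = q
    vx, vy, vz = v
    # t = 2 * (u cross v)
    tx = 2 * (y * vz - z * vy)
    ty = 2 * (z * vx - x * vz)
    tz = 2 * (x * vy - y * vx)
    # v' = v + w*t + (u cross t)
    return (
        vx + w * tx + (y * tz - z * ty),
        vy + w * ty + (z * tx - x * tz),
        vz + w * tz + (x * ty - y * tx),
    )


# Rotate the x unit vector by 90 degrees about the z axis;
# the result should be approximately the y unit vector.
q = axis_angle_to_quaternion((0.0, 0.0, 1.0), math.pi / 2)
rotated = rotate(q, (1.0, 0.0, 0.0))
```

Note that the quaternion always has magnitude 1 here; the angle is encoded in the split between the real and imaginary parts, not in the overall length.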
Geoffrey Hinton's idea of "capsules", which I don't really know anything concrete about, tries to address the recognition of objects subject to rotations, etc. That's a topological/structural strategy within the neural network, though, so quite removed from an idea like quaternions.
It's worth observing that what distinguishes complex numbers from 2-vectors like (x,y) is that there's a multiplication rule that corresponds to rotation (and scaling) around the origin. Similarly with quaternions. But you can also just use them as glorified vectors of 2 or 4 elements.
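A quick sketch of that point in Python, using the built-in `complex` type: treating (x, y) as x + yi and multiplying by a unit complex number rotates the point around the origin. The helper name `rotate_2d` is just for illustration.

```python
import cmath
import math


def rotate_2d(point, angle):
    """Rotate the 2-vector `point` by `angle` radians around the origin
    by treating it as a complex number and multiplying by e^(i*angle)."""
    x, y = point
    z = complex(x, y) * cmath.exp(1j * angle)
    return (z.real, z.imag)


# Rotating (1, 0) by 90 degrees should give approximately (0, 1).
quarter_turn = rotate_2d((1.0, 0.0), math.pi / 2)
```

If you only ever add the pairs and scale them by reals, you never touch this multiplication rule, which is the "glorified vector" usage.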
I believe you missed a step there. After learning deep learning, GP would want to learn about how complex numbers are an improvement. Only then would one want to consider how quaternions are an improvement.
I don’t think quaternions would be taught in a typical first course of abstract algebra. Do you know of a textbook where they are featured prominently?
I don't know of any book which features them "prominently", but I also don't think you'd really need one. They are taught in various abstract algebra books, just in the fashion of, "Here's an exercise that introduces a peripheral topic it's useful to know about." For example, groups and rings of quaternions show up in MacLane & Birkhoff's Algebra (62, 426; 282) and Lang's Algebra (9, 545, 723, 758).
Edit: In an effort to find more applied information I put down my math books and picked up the information-theoretic ones. You can find more information about the use of quaternions in the two-volume Handbook of Digital Signal Processing and Salomon's Data Compression. More generally, when quaternions aren't explicitly referred to, it's helpful to look up the coverage of complex rotations, especially with respect to the Discrete Fourier Transform.
I am from Dublin, where quaternions were invented, so they get mentioned a lot by mathematicians and physicists here, maybe getting a higher billing than they do elsewhere. Computer graphics is obviously a place to go for introductions also, but it is typically going to be a more applied and less rigorous treatment.
When I took abstract algebra at Berkeley years ago, they were taught but they weren't the focus of the course. Basically, they're an example of a skew field (division ring), so they have some interesting properties that were briefly studied. But obviously, one has to study more to understand their applications.