Does use of complex numbers really provide improvement? How does this work? (other than cramming 2 numbers into 1... which itself is suspect... the complex plane has the same cardinality as the reals...)
Cardinality is a complete red herring here; we don't care about the set-theoretic structure, and ultimately we're taking finite approximations anyway. The structures we care about are the metric, which tells us which solutions (neural nets in this case) are near each other, and the algebra, which tells us how to compose solutions.
The algebra of real numbers is simply less structured than the complex numbers. One of the key properties of the complex numbers is that they naturally have both a magnitude and a phase. This lets them capture phenomena that have a notion of superposition and interference.
As you correctly pointed out, you can simulate a complex number with two real numbers. The key is to exploit the particular geometric and algebraic properties of the complexes. One example in neural networks is the phenomenon of synchronization, where the outputs of neurons depending on the presence of a particular stimulus all have the same phase. This can be exploited for applications such as object segmentation.
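To make the synchronization idea concrete, here is a minimal sketch, assuming a hypothetical complex-valued layer whose neurons each emit one complex activation: neurons driven by the same object are assumed to share a phase, with magnitude encoding feature strength, so segmentation reduces to grouping activations by angle. The phases, magnitudes, and group assignment below are all illustrative assumptions, not any particular published model.

```python
import numpy as np

rng = np.random.default_rng(0)

# One phase per object (assumption); magnitudes vary per neuron.
phase_a, phase_b = 0.3, 2.1
object_a = rng.uniform(0.5, 1.0, 5) * np.exp(1j * phase_a)
object_b = rng.uniform(0.5, 1.0, 5) * np.exp(1j * phase_b)
activations = np.concatenate([object_a, object_b])

# Segment by phase: neurons whose activations point the same way
# in the complex plane get grouped together.
phases = np.angle(activations)
labels = (np.abs(phases - phase_a) > np.abs(phases - phase_b)).astype(int)
print(labels.tolist())  # first five neurons in group 0, last five in group 1
```

Magnitude is untouched by the grouping, which is the point: phase carries the "which object" information as a separate degree of freedom.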
So the widest possible view of this line of research is that putting more algebraic structure on your parameters can improve the behavior of your learning algorithms. My extremely hot take on how far this can go is a full-fledged integration of harmonic analysis and representation theory into the theory of deep learning.
I'm coming from a signal processing background, so thinking in terms of magnitude and phase is comfortable to me. Does synchronization, in the sense you're describing, really happen in deep learning (ANN) systems? I'd love a link or reference.
There aren't really any introductions, just research papers. If you have some understanding of real valued neural networks you'll probably be able to work your way through the literature.
Cardinality has nothing to do with this discussion. Quaternions have a different structure than real numbers. Over every set you can find any (reasonable) structure you want, but that's not a very interesting question. For example, Q and Z have the same cardinality but they look nothing like each other, apart from the fact that Q is the fraction field of the Euclidean domain Z.
Are quaternions smoother and more connected than complex numbers? My understanding was that higher-dimensional hypercomplex numbers tend to lose useful structure. I'm also curious what being connected means in this context.
Maybe it would help to think of the Turing machine as analogous. Many programming languages are Turing complete, so you can express any computation in any of them, but some languages are more expressive than others and let you reach and work with ideas you wouldn't conceive of in a less expressive language.
Lots of things in math are similar. Simon Altmann's Icons and Symmetries makes a case that using representations with insufficient symmetry impeded our learning of the laws of magnetism.
Complex numbers are a particular 2d slice of 2x2 matrices that happen to capture rotation and other periodic phenomena very well. If you are trying to solve some problem that you suspect to involve periodicity, focusing on complex numbers helps you get there faster.
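That "2d slice of 2x2 matrices" can be checked directly: embedding a + bi as [[a, -b], [b, a]] turns complex multiplication into matrix multiplication, and a unit complex number e^(i*theta) becomes exactly the rotation matrix by theta. A quick sketch:

```python
import numpy as np

def as_matrix(z):
    """Embed the complex number a + bi as the real matrix [[a, -b], [b, a]]."""
    return np.array([[z.real, -z.imag], [z.imag, z.real]])

z, w = 1 + 2j, 3 - 1j

# Complex multiplication agrees with matrix multiplication of the embeddings.
assert np.allclose(as_matrix(z * w), as_matrix(z) @ as_matrix(w))

# A unit complex number e^{i*theta} embeds as rotation by theta.
theta = 0.7
assert np.allclose(as_matrix(np.exp(1j * theta)),
                   [[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta),  np.cos(theta)]])
print("embedding respects multiplication and rotation")
```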
You can use hypercomplex numbers to represent higher-dimensional objects using only primitive operations, scalar values, and an imaginary unit for each dimension. However, computing with these values is significantly more challenging than with real vectors.
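The quaternions are the standard example of adding more imaginary units: three units i, j, k extend the complexes to 4 dimensions, at the cost of commutativity, which is one concrete way computing with them gets harder than with real vectors. A minimal hand-rolled sketch (not any particular library's API):

```python
def qmul(p, q):
    """Hamilton product of quaternions given as (w, x, y, z) tuples."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (pw*qw - px*qx - py*qy - pz*qz,
            pw*qx + px*qw + py*qz - pz*qy,
            pw*qy - px*qz + py*qw + pz*qx,
            pw*qz + px*qy - py*qx + pz*qw)

i = (0, 1, 0, 0)
j = (0, 0, 1, 0)
print(qmul(i, j))  # (0, 0, 0, 1)  == k
print(qmul(j, i))  # (0, 0, 0, -1) == -k: order matters
```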
This book on 'Geometric Algebra' starts to explain: http://www2.montgomerycollege.edu/departments/planet/planet/...
They have some pretty useful properties. For example, every polynomial of degree n has exactly n complex roots (counted with multiplicity), and if a complex-valued function is differentiable with respect to a complex variable, then it's also infinitely differentiable and analytic.
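The root-counting property is easy to see numerically: a degree-5 polynomial with real coefficients still has exactly 5 roots once you allow complex ones, e.g. x^5 - 1 with `numpy.roots`:

```python
import numpy as np

coeffs = [1, 0, 0, 0, 0, -1]   # x^5 - 1
roots = np.roots(coeffs)

print(len(roots))  # 5 roots, as the fundamental theorem of algebra promises
# Only one root (x = 1) is real; the other four are the complex fifth
# roots of unity, which come in conjugate pairs.
print(int(np.sum(np.isclose(roots.imag, 0))))  # 1
```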