Thanks for sharing this. There's a lot to digest in there, but a few highlights stood out as possibly relevant to the OP paper.
> Theorem 6.4 ([2]) Complex FCMLPs having (6.9) as activation function are only universal approximators in L∞ for the class of analytic functions, but not for the class of complex continuous functions.
> ... the complex numbers (C_{0,1}) are a subalgebra of the quaternions (C_{0,2}). Hence the quaternionic logistic function is also unbounded. Neither could it give rise to universal approximation (w.r.t. L∞), since this does not hold for the complex case. One may argue that such things become less and less important when proceeding to higher-dimensional algebras, since fewer and fewer components are affected. This is somewhat true, but it hardly justifies the effort.
> ... Summarising all of the above, the case of Complex FCMLPs looks settled in a negative way. ... Hence Complex FCMLPs remain not very promising.
Unless I'm misreading, it seems already known that you _can_ use complex numbers (or quaternions) in neural networks...but you don't really gain anything from doing it.
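To see concretely what "unbounded" means here: the logistic function extended to C, sigma(z) = 1/(1 + e^(-z)), has poles wherever e^(-z) = -1, i.e. at z = iπ(2k+1), so no bound can exist on all of C. A quick stdlib-only check (my own illustration, not code from the quoted text):

```python
# Quick check (my own illustration, not from the quoted text) that the
# complex logistic sigma(z) = 1/(1 + exp(-z)) is unbounded: it has poles
# wherever exp(-z) = -1, i.e. at z = i*pi*(2k+1).
import cmath
import math

def complex_logistic(z):
    """The logistic function extended to a complex argument."""
    return 1.0 / (1.0 + cmath.exp(-z))

# Approach the pole at z = i*pi along the imaginary axis.
for eps in (1e-1, 1e-3, 1e-6):
    z = 1j * (math.pi - eps)
    print(f"eps = {eps:.0e}  ->  |sigma(z)| = {abs(complex_logistic(z)):.3e}")
# |sigma| grows like 1/eps, so the function has no bound on C.
```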
One of the authors here. One thing about quaternion convolution is that you can embed a color image in quaternion space by treating each channel as an imaginary axis. This lets the convolution act on the entire color space jointly, in a different way than real-valued networks do, which may make it do better at things like segmentation, where you need to be more sensitive to changes in the color space.
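Roughly, the encoding looks like this (a toy sketch, not our actual network code):

```python
# Toy sketch of the RGB -> quaternion encoding (not the actual network code):
# a pixel (r, g, b) becomes the pure quaternion r*i + g*j + b*k, and one
# quaternion weight mixes all three channels at once via the Hamilton product.

def hamilton(p, q):
    """Hamilton product of quaternions given as (w, x, y, z) tuples."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (pw*qw - px*qx - py*qy - pz*qz,
            pw*qx + px*qw + py*qz - pz*qy,
            pw*qy - px*qz + py*qw + pz*qx,
            pw*qz + px*qy - py*qx + pz*qw)

pixel  = (0.0, 0.8, 0.2, 0.1)   # r, g, b on the i, j, k axes; real part 0
weight = (0.9, 0.1, 0.3, 0.2)   # one hypothetical quaternion filter tap

print(hamilton(weight, pixel))  # every output channel depends on every input channel
```

A real-valued weight per channel can only scale r, g, b independently; here a single multiplication couples all three.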
In my opinion, GA is the Haskell of mathematics. Just as Haskell subsumes Rust as a library (OCaml does [1], so Haskell does too), GA subsumes quaternions (quaternions being the Rust here). Using the subpar tool is not good practice when the better tool is readily available.
And it is that better tooling I am trying to advertise here.
But quaternions are themselves generalized by Geometric Algebra. And there is plenty of information about the use of GA in the field of neural computing: https://arxiv.org/pdf/1305.5663.pdf (page 3). For example, a universal approximation theorem for GA networks is presented at https://www.informatik.uni-kiel.de/inf/Sommer/doc/Dissertati...
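To make "GA subsumes quaternions" concrete: in Cl(0,2) (the C_{0,2} from the quote above), with basis {1, e1, e2, e1e2} and e1² = e2² = −1, setting i = e1, j = e2, k = e1e2 reproduces the quaternion relations. A small self-contained sketch (my own illustration):

```python
# Minimal sketch (my own illustration) that the Clifford algebra Cl(0,2)
# -- the C_{0,2} from the quote above -- is exactly the quaternions:
# with e1^2 = e2^2 = -1, take i = e1, j = e2, k = e1*e2.
from itertools import product

def gp(a, b, sig=-1):
    """Geometric product in Cl(0,q): multivectors are dicts mapping
    sorted index tuples (blades) to coefficients; each e_i squares to sig."""
    out = {}
    for (ba, ca), (bb, cb) in product(a.items(), b.items()):
        coeff, blade = ca * cb, list(ba)
        for e in bb:                            # fold each e_i of b into the blade
            pos = len(blade)
            while pos > 0 and blade[pos - 1] > e:
                pos -= 1
            coeff *= (-1) ** (len(blade) - pos)  # each basis-vector swap flips the sign
            if pos > 0 and blade[pos - 1] == e:  # e_i * e_i = sig: contract the pair
                coeff *= sig
                blade.pop(pos - 1)
            else:
                blade.insert(pos, e)
        key = tuple(blade)
        out[key] = out.get(key, 0) + coeff
    return {key: c for key, c in out.items() if c}

i, j = {(1,): 1}, {(2,): 1}
k = gp(i, j)                                 # k = e1*e2, a bivector
print(gp(i, i), gp(j, j), gp(k, k))          # {(): -1} three times
print(gp(gp(i, j), k))                       # i*j*k = {(): -1}
```

So the quaternions sit inside GA as one particular low-dimensional Clifford algebra, which is the sense in which the more general tool contains the special-purpose one.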
I think that fine article is a step back.