It is not directly in the article, but in a linked tweet by djb, the creator of ChaCha8. He believes the cycles-per-byte (cpb) figure listed for it in the Randen comparison is off.
He suggests that the ChaCha8 implementation used for the benchmark may have been written by hand and left unoptimized. From what I have seen, that is plausible: a lot of benchmarks use ChaCha8 implementations with none of the tweaks that make it fast.