A quine is closer to (though not quite the same as) an uncompressed image containing a QR code that holds a compressed copy of that same image, which has its own challenges. A cryptographic hash function (such as SHA-2), though, is explicitly designed to make this kind of thing hard. If you had chosen a simple checksum instead, it would be trivial to create an image containing the text of its own checksum. A compression function (particularly a tiny custom one designed for this one-time use, which removes barely any entropy at all) is predictable enough to be manipulated. (edit: I hadn't looked at this one, though; now that I see its code, it is apparently just cheating by accessing its own code via the DOM. It doesn't need to do that, and with this implementation it doesn't even really feel like a "quine" to me.)
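To make the checksum point concrete, here is a rough Python sketch using text instead of an image. Everything in it is made up for illustration: the toy `checksum` (sum of bytes mod 256), the `self_describing_text` helper, and the message format. Because the checksum is linear, one free padding byte is enough to solve for a value that makes the text's claim about itself true; with SHA-2 there is no comparable shortcut.

```python
def checksum(data: bytes) -> int:
    """Toy checksum (an assumption for this sketch): sum of all bytes mod 256."""
    return sum(data) % 256


def self_describing_text(prefix: str = "The checksum of this text is 0x") -> bytes:
    """Return ASCII text that states its own checksum, using one free padding byte."""
    for value in range(256):
        # Text that claims `value` as its checksum, still missing the padding byte.
        body = f"{prefix}{value:02x} ".encode("ascii")
        # Solve for the single byte that makes the claim true:
        # (checksum(body) + pad) % 256 == value
        pad = (value - checksum(body)) % 256
        if 33 <= pad <= 126:  # accept only a printable ASCII padding byte
            return body + bytes([pad])
    raise ValueError("no printable padding byte works for this prefix")


text = self_describing_text()
# Read back the two hex digits the text claims, then compare with the real checksum.
claimed = int(text.split(b"0x")[1][:2], 16)
print(text.decode("ascii"), "->", checksum(text) == claimed)
```

The same trick extends to an image: leave a few pixels free, render the checksum digits, and solve for the free pixels. It works only because the effect of each byte on the checksum is predictable.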
For anyone wondering how this is possible: I did some quick research, and the method is explained in the description of this YouTube video: https://www.youtube.com/watch?v=ufq2Eb78kSU
In short, the YouTube API's resumable upload feature can be abused to achieve this.