Yup, the pigeonhole principle really does put a hard limit on compression: it is impossible to make ALL files smaller.
The way compression algorithms get around that is by exploiting the fact that we rarely want to send around arbitrary data. Real-world data has lots of redundancy in it, so we can design algorithms where real-world data gets mapped to compressed files smaller than the original (by finding and removing that redundancy), while purely random data gets mapped to files that are actually slightly longer than the input.
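Here's a quick sketch with Python's zlib that shows both sides of this trade-off (the exact byte counts are illustrative and will vary by compressor and version):

```python
import os
import zlib

# Highly repetitive, "real world"-ish data vs. incompressible random bytes.
redundant = b"the quick brown fox " * 1000
random_data = os.urandom(20000)

for name, data in [("redundant", redundant), ("random", random_data)]:
    compressed = zlib.compress(data, level=9)
    print(f"{name}: {len(data)} bytes -> {len(compressed)} bytes")

# Typical result: the redundant input shrinks to a tiny fraction of its size,
# while the random input comes out a handful of bytes LARGER than it went in,
# because the compressor still has to add its own header/framing overhead.
```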