It's not limited to commercial code... Lots of codebases for complex applications are very large. Often the translation files full of strings will account for 50% of it on their own.
Well, the Linux kernel source code is around 71MB compressed; I'd guess maybe 200MB uncompressed. That's quite a difference in source code size, and I think the same is true of most OSS projects. The WINE project, for instance, is also < 100MB for the full source (huge and extensive, including translations).
I think commercial codebases just end up with a lot of cruft that nobody ever feels like cleaning up (plus, there is an incentive to keep things a bit clunky, since it buys slack-off time and/or extra hourly pay). As above, I also think they use other commercial/crappy components, like third-party widgets, that got the same treatment, so it all snowballs into a huge, unwieldy thing.