I've sometimes heard Wikipedia described as a "large scale static site plus a medium scale social network". The caching is a bit more complex than for a naive static site because of the churn rate and freshness requirements, but fundamentally you are right: without frontend Varnish caching, Wikipedia would be very different in terms of hosting requirements and scaling complexity.
I'm also wondering if the caching strategy they are using is a naive one (i.e. the cache is valid for a fixed duration, like 5 minutes) or a more active one (like Stack Overflow), with cache invalidations each time a page is modified or commented on.
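
To make the two options concrete, here's a rough Python sketch of what I mean. This is not how Wikipedia or Stack Overflow actually implement it; the class names (`TTLCache`, `PurgingCache`) and the `render` callback are made up for illustration, and a real setup would sit in an HTTP cache rather than in-process dicts.

```python
import time
from typing import Callable, Dict, Tuple


class TTLCache:
    """Naive strategy: serve a cached page until it is older than `ttl` seconds."""

    def __init__(self, render: Callable[[str], str], ttl: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.render = render      # hypothetical function that builds the page
        self.ttl = ttl
        self.clock = clock        # injectable so the behaviour is easy to test
        self.entries: Dict[str, Tuple[float, str]] = {}

    def get(self, page: str) -> str:
        now = self.clock()
        hit = self.entries.get(page)
        if hit is not None and now - hit[0] < self.ttl:
            # Can be stale for up to `ttl` seconds after an edit.
            return hit[1]
        body = self.render(page)
        self.entries[page] = (now, body)
        return body


class PurgingCache:
    """Active strategy: keep a page cached until an edit explicitly purges it."""

    def __init__(self, render: Callable[[str], str]) -> None:
        self.render = render
        self.entries: Dict[str, str] = {}

    def get(self, page: str) -> str:
        if page not in self.entries:
            self.entries[page] = self.render(page)
        return self.entries[page]

    def purge(self, page: str) -> None:
        # Called by the write path whenever the page is modified.
        self.entries.pop(page, None)
```

The trade-off: with the TTL version the write path knows nothing about the cache, but readers can see old content for up to the TTL. With the purging version readers see an edit on the next request, but every write has to know which cached pages it affects.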