The top line in a gprof profile of an evented C web server had better be select() or read() (or their equivalents). So I don't follow you here.
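Concretely, the loop I mean looks something like this minimal sketch (the port is arbitrary and the echo write() stands in for real request handling). Under load, nearly all of the wall-clock time is spent parked inside the one select() call, which is why it sits at the top:

    /* Minimal evented server sketch built on select(). */
    #include <unistd.h>
    #include <sys/select.h>
    #include <sys/socket.h>
    #include <netinet/in.h>

    int main(void) {
        int listener = socket(AF_INET, SOCK_STREAM, 0);
        struct sockaddr_in addr = { 0 };
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port = htons(8080);          /* arbitrary port */
        bind(listener, (struct sockaddr *)&addr, sizeof(addr));
        listen(listener, SOMAXCONN);

        fd_set master;
        FD_ZERO(&master);
        FD_SET(listener, &master);
        int maxfd = listener;

        for (;;) {
            fd_set readable = master;         /* select() mutates its argument */
            select(maxfd + 1, &readable, NULL, NULL, NULL);   /* parked here */

            for (int fd = 0; fd <= maxfd; fd++) {
                if (!FD_ISSET(fd, &readable))
                    continue;
                if (fd == listener) {
                    int conn = accept(listener, NULL, NULL);
                    if (conn >= 0) {
                        FD_SET(conn, &master);
                        if (conn > maxfd) maxfd = conn;
                    }
                } else {
                    char buf[4096];
                    ssize_t n = read(fd, buf, sizeof(buf));
                    if (n <= 0) {             /* EOF or error: drop it */
                        close(fd);
                        FD_CLR(fd, &master);
                    } else {
                        write(fd, buf, (size_t)n);   /* stand-in for real work */
                    }
                }
            }
        }
    }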
Yes, the average server thread is going to spend far more time blocked on I/O than not. But the server should be able to handle more than one request simultaneously.
(Or you can use an event model; the point is, if your server is spending all of its time blocked on I/O while under load, then something is wrong.)
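By "more than one request simultaneously" I mean something like this minimal thread-per-connection sketch (pthreads, compile with -pthread; the echo loop again stands in for real request handling). Each blocked read() parks only its own thread, so other requests keep moving:

    #include <pthread.h>
    #include <unistd.h>
    #include <sys/socket.h>
    #include <netinet/in.h>

    static void *serve(void *arg) {
        int fd = (int)(long)arg;
        char buf[4096];
        ssize_t n;
        while ((n = read(fd, buf, sizeof(buf))) > 0)   /* blocks this thread only */
            write(fd, buf, (size_t)n);                 /* stand-in for real work */
        close(fd);
        return NULL;
    }

    int main(void) {
        int listener = socket(AF_INET, SOCK_STREAM, 0);
        struct sockaddr_in addr = { 0 };
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port = htons(8080);                   /* arbitrary port */
        bind(listener, (struct sockaddr *)&addr, sizeof(addr));
        listen(listener, SOMAXCONN);
        for (;;) {
            int conn = accept(listener, NULL, NULL);
            if (conn < 0)
                continue;
            pthread_t t;                               /* one thread per connection */
            pthread_create(&t, NULL, serve, (void *)(long)conn);
            pthread_detach(t);
        }
    }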
Now you're not following me, timr. An evented server can handle 10,000 connections simultaneously. Run it under load and profile it: select() is at the top of the profile. The program spends all its time in the kernel, waiting on I/O.
That evented server is also faster than the threaded version of the same program.
What compute task do you think the program is bottlenecked on, in an application that does nothing but shovel data out of a database into a socket based on a btree key?
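For the record, the entire per-request "compute" in that kind of app is roughly this sketch (the names are mine, and a binary search over sorted rows stands in for the real btree). The lookup costs microseconds of CPU; the write() is where the time actually goes:

    #include <string.h>
    #include <unistd.h>

    struct row { const char *key; const char *val; };

    /* rows must be sorted by key; binary search stands in for the btree */
    static const struct row *lookup(const struct row *rows, size_t n,
                                    const char *key) {
        size_t lo = 0, hi = n;
        while (lo < hi) {
            size_t mid = lo + (hi - lo) / 2;
            int c = strcmp(key, rows[mid].key);
            if (c == 0) return &rows[mid];
            if (c < 0) hi = mid; else lo = mid + 1;
        }
        return NULL;
    }

    void handle_request(const struct row *rows, size_t n, int fd,
                        const char *key) {
        const struct row *r = lookup(rows, n, key);    /* CPU-trivial */
        if (r)
            write(fd, r->val, strlen(r->val));         /* this is the slow part */
    }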
You're overgeneralizing. Where your server spends its time depends entirely on the profile of your app. If your pages render (on average) in less time than they wait for I/O, then your server will spend most of its time blocked on I/O.
I suspect that you're thinking of something like a lightweight proxy server that waits for incoming requests, then quickly hands them off to an app server that does most of the work. That's trivial. If you profile just the webserver process, then yes, you would expect most of its time to be spent waiting. But while the proxy server is blocked on I/O, the app server is doing a ton of other things for any non-trivial page rendering.