I actually worked on a face tracker using Python and a webcam mounted on servos as a research project, running on a quad-core machine. I highly doubt Python added much overhead (we were using OpenCV, which is compiled), and it still wasn't very elegant; however, you make an excellent point. Once we can push web technologies far enough to make JavaScript capable of computer vision, or use other technologies/plugins that enable computer vision in the browser, simple applications like video surveillance will get much more interesting.
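For anyone curious what the servo side of a tracker like that can look like, here is a minimal sketch of one control step. It is not the code from that project: the function name `servo_step`, the gain, the deadband, the 0-180 degree servo range, and the sign conventions are all assumptions for illustration, and the face box is assumed to come from some detector (for example an OpenCV cascade) as `(x, y, w, h)` in pixels.

```python
def servo_step(face_box, frame_size, angles,
               gain=0.05, deadband=10, limits=(0.0, 180.0)):
    """Return new (pan, tilt) angles that nudge the camera toward the face.

    face_box   -- (x, y, w, h) of the detected face, in pixels
    frame_size -- (width, height) of the video frame, in pixels
    angles     -- current (pan, tilt) servo angles, in degrees
    All defaults are illustrative assumptions, not tuned values.
    """
    x, y, w, h = face_box
    frame_w, frame_h = frame_size
    pan, tilt = angles

    # Pixel offset of the face centre from the frame centre.
    err_x = (x + w / 2) - frame_w / 2
    err_y = (y + h / 2) - frame_h / 2

    # Proportional correction; ignore small errors so the servos don't jitter.
    # Sign convention is assumed: larger pan turns the camera left,
    # larger tilt points it down. Flip the signs for a different mount.
    if abs(err_x) > deadband:
        pan -= gain * err_x
    if abs(err_y) > deadband:
        tilt += gain * err_y

    # Clamp to the assumed mechanical range of the servos.
    lo, hi = limits
    pan = min(max(pan, lo), hi)
    tilt = min(max(tilt, lo), hi)
    return pan, tilt


if __name__ == "__main__":
    # Face to the right of and slightly above centre in a 640x480 frame.
    print(servo_step((500, 200, 40, 40), (640, 480), (90.0, 90.0)))
```

In a real loop you would call this once per frame with the latest detection and write the returned angles to the servo controller; the detection itself is where nearly all the CPU time goes, which fits the point about Python not being the bottleneck.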
EDIT:
Just a thought: wouldn't it be cool if we didn't have to ring doorbells anymore? There'd be a webcam instead of a lens on the door, and a small computer that notifies you when someone's at your door. It could even tell you who they are, if we develop adequate recognition technology.
Actually, I've worked on a hand gesture recognition project for the web for quite some time (http://api.alii.tv/). The unfortunate part is that it is a plugin, and it is really hard to make a plugin compatible with every machine in the wild.
Combine this with something like http://nicklothian.com/blog/2009/12/15/using-flash-to-shim-a... and you might be able to do face tracking in JavaScript.