I'm getting the feeling that with all the unique server setups in use, monitoring and metrics systems are going to be just as unique and specific.
There are some interesting process-monitoring projects out there, like god, monit and bluepill, as well as EC2/cloud-specific tools from ylastic, rightscale and librato silverline. Have you ever used any of those tools?
Fitting all these together for my setup is trial and error, but it really does force me to think hard about my tools and assumptions even before I get hard data.
I hack on the aforementioned Silverline at http://librato.com, where we provide system-level monitoring at process/application granularity as a service. (We also have a bunch of features around active workload management controls, but that's out of scope here.) It actually works on any server running one of the supported versions of Linux, not just EC2. The benefits of going with a service-based offering are the same as in any other vertical: you don't need to install and manage your own software or hardware for monitoring.