If I understand this correctly, the huge improvement in latency (from 200ms to 3ms) comes from not having to deal with slow clients directly. Traffic to your front-end servers now comes only from the ELB, and the ELB is "spoon-feeding" the slow web clients. This is true only if you are running the ELB in "http-mode".
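For anyone unfamiliar with "http-mode": it means the listener speaks HTTP on both sides, so the ELB terminates the client connection and buffers the request before your server ever sees it. A minimal boto3 sketch of such a listener, with the load balancer name and subnet ID made up for illustration:

    import boto3

    elb = boto3.client("elb")

    # HTTP on both the client-facing and instance-facing side: the ELB
    # terminates the client connection and forwards complete requests,
    # so slow clients tie up the ELB rather than your app servers.
    elb.create_load_balancer(
        LoadBalancerName="my-frontend-elb",     # hypothetical name
        Listeners=[{
            "Protocol": "HTTP",
            "LoadBalancerPort": 80,
            "InstanceProtocol": "HTTP",
            "InstancePort": 80,
        }],
        Subnets=["subnet-0123456789abcdef0"],   # hypothetical subnet
    )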
This also explains why you can cut the front-end servers by 20%: each request is handled more efficiently, and lower per-request latency means higher throughput per server. Connection reuse is also more effective, since the set of servers in the ELB pool is far more limited than the set of web clients.
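The throughput point falls out of Little's law (connections in flight = arrival rate x time per request). Plugging in the latencies from upthread with a made-up request rate:

    rate = 1000              # requests/second, a made-up figure

    # in-flight connections = rate * per-request latency (Little's law)
    print(rate * 0.200)      # 200 ms/request -> 200.0 connections held open
    print(rate * 0.003)      #   3 ms/request ->   3.0 connections held open

With two orders of magnitude fewer connections held open at any moment, the same fleet has plenty of slack, hence being able to shrink it.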
ELBs have terrible TLS support... Cipher suite choice and ordering support is abysmal, and they only recently started supporting newer TLS versions. OCSP stapling isn't supported either.
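You can check what an ELB actually negotiates with a few lines of stdlib Python (the hostname below is a placeholder). This shows the protocol version and cipher suite; OCSP stapling isn't observable through the ssl module, so you'd need something like openssl s_client -status for that part:

    import socket
    import ssl

    host = "elb.example.com"  # placeholder for your ELB's DNS name

    ctx = ssl.create_default_context()
    with socket.create_connection((host, 443)) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            print(tls.version())  # negotiated protocol, e.g. 'TLSv1.2'
            print(tls.cipher())   # (cipher name, protocol, secret bits)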
It does, but it's not very useful, since you cannot have a VIP span multiple AZs, and a spontaneous instance failure is typically correlated with an AZ-wide problem. ENIs (and their IPs) are tied to a subnet, which is tied to an AZ; even the example uses only a single availability zone. Moving an ENI between instances may add a tiny bit of redundancy, but you still need a way to fail over to another availability zone (see the sketch below).
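In practice that cross-AZ failover ends up at the DNS layer. A hedged boto3 sketch using Route 53 failover records, where the zone ID, health check ID, hostname, and IP are all placeholders:

    import boto3

    r53 = boto3.client("route53")

    # PRIMARY record points at the instance in AZ 1; a matching SECONDARY
    # record (not shown) would point at a standby in another AZ. Route 53
    # serves the secondary once the primary's health check fails.
    r53.change_resource_record_sets(
        HostedZoneId="Z0000000000000",  # placeholder zone ID
        ChangeBatch={"Changes": [{
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": "app.example.com",
                "Type": "A",
                "SetIdentifier": "az1-primary",
                "Failover": "PRIMARY",
                "TTL": 60,
                "HealthCheckId": "11111111-2222-3333-4444-555555555555",
                "ResourceRecords": [{"Value": "203.0.113.10"}],
            },
        }]},
    )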
I see this issue a lot with the blogs of tracking companies (Mixpanel, for example). They serve their blog's CSS from the same hostname as their tracking code, so Ghostery blocks the CSS along with the tracker.