I think I might have 2c to add to clarify the OP's stance, having worked on a mobile XMPP client before. This is with regard to his point about bandwidth and battery usage on mobile.
XMPP is a connection-oriented stateful protocol. The connection setup process is very expensive: Connect, authenticate, advertise the list of supported client features, get the roster (contacts), and IIRC presence exchange, etc. Therefore, once you've established a connection, you really don't want it to break. Even just reconnecting is very expensive.
However, in the mobile scenario, regular connection breakage is the norm. What most mobile-based clients then do is have the XMPP connection proxied on the server side, and connect to that proxy over HTTP instead. This is well defined in the BOSH spec (XEP-0124). BOSH was originally designed for web-based clients, but happens to meet the needs of the mobile use case.
However, when it comes down to implementation, there are several optimizations that can be applied:
1. The roster exchange at connection time is very expensive, and usually only sends the client data that it probably already has stored locally from a previous session. Connection overhead can be reduced drastically if there is some way of versioning the roster and only transferring deltas. This is what the OP was referring to. (Roster versioning, XEP-0237.)
2. Presence packets are one of the most chatty aspects of the protocol (x went offline, x is now online, etc). I've heard someone say that they constitute about 80% of the packets on a typical connection, but this is anecdotal. In cases where the user isn't actively interacting with the client, having presence packets sent all the time seems wasteful. One way to tackle this is to buffer up and de-duplicate the presence packets on the server, and send them in batches only when necessary. (For example, if x goes online, then offline, then online again, and then goes 'busy', the client only needs to know that x is busy.) I think this is what the OP was referring to when he mentioned the google:queue for delayed presence delivery. This alone can be a huge win for battery and bandwidth, since both the data transferred and the processing it requires can be reduced drastically.
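To make the two optimizations above a bit more concrete, here's a rough sketch in Python. It is not how any real server or XEP implements this; the names (apply_roster_delta, PresenceBuffer) and the dict-based data formats are made up for illustration. The first function patches a locally cached roster with a delta instead of re-downloading it, and the class keeps only the latest presence per contact until it is flushed.

```python
from collections import OrderedDict


def apply_roster_delta(cached, delta):
    """Patch a cached roster with a delta (hypothetical format).

    cached: dict of jid -> display name, stored from a previous session
    delta:  dict of jid -> display name, or None meaning "contact removed"
    """
    roster = dict(cached)  # leave the caller's cached copy untouched
    for jid, name in delta.items():
        if name is None:
            roster.pop(jid, None)
        else:
            roster[jid] = name
    return roster


class PresenceBuffer:
    """Hypothetical server-side buffer: one pending presence per contact."""

    def __init__(self):
        self._latest = OrderedDict()

    def push(self, jid, status):
        # A newer update replaces any older pending one for the same contact.
        self._latest.pop(jid, None)
        self._latest[jid] = status

    def flush(self):
        # Return everything pending as one batch and empty the buffer.
        batch = list(self._latest.items())
        self._latest.clear()
        return batch


if __name__ == "__main__":
    buf = PresenceBuffer()
    for status in ("online", "offline", "online", "busy"):
        buf.push("x@example.com", status)
    print(buf.flush())  # a single entry for x, carrying the last status
```

With the example from point 2, four presence changes for x collapse into one entry in the flushed batch. When to flush (on a timer, or when the user wakes the app) is a separate policy decision that this sketch leaves out.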
Aside: Presence is so chatty that some mobile clients simply don't handle presence at all. Take WhatsApp for example. It doesn't show whether every person in your list of contacts is online. The assumption is that every user is online by the nature of the device, and this assumption works in the common cases. The edge cases are where you'd need to buffer up messages. This reduces chattiness drastically, making the protocol seem much more lightweight. (WhatsApp is based on XMPP, but has spun it off into its own thing.)
Just thought I'd add these points to the discussion here, since there seemed to be some lack of understanding of the protocol. IMHO, the XML, as verbose as it is, isn't really the problem. The protocol itself is verbose and whether the transport uses XML or whatever else is of comparatively little consequence. This verbosity of the protocol (and not so much the format of data transfer) has its effect on battery life and bandwidth usage.
>Aside: Presence is so chatty, that some mobile clients simply don't handle presence at all. Take WhatsApp for example.
One could also take Hangouts for example. :)
anyways, very informative post and good anecdotes, thank you for sharing.
my understanding of the XMPP spec is that (assuming the server follows the server-client spec, anyways) there's no way to unsubscribe from your contacts' presence notifications and keep them on your roster [and therefore able to be contacted without reauthorization, usually.]
Presence subscriptions don't have to be symmetrical in XMPP, and indeed you can even have contacts on your roster that don't share presence in either direction. As for exchanging messages, as far as I know Google's implementation is the only one that doesn't allow that without presence subscription. All others do.
In fact, this made presence subscription requests the only attack vector for spam in Google Talk, which some argue makes it harder to combat, mostly because there is less data available to detect malicious behaviour.