It's not really complex on its own. It's just that a lot of developers used ad-hoc regexp-based parsers and made assumptions when processing it (like hardcoded namespace prefixes: some software really expected an "x:" prefix and didn't care about the "xmlns" declarations). I absolutely admit that the framing scheme is bad, and I don't like it (it would've been much better as TLVs of structured data rather than as an XML stream), but that's manageable with streaming parsers, and otherwise I don't see why XMPP's core is super complex.
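To make the prefix point concrete, here is a minimal sketch (not taken from any real client) that feeds arbitrary-sized chunks to Python's standard `xml.etree.ElementTree.XMLPullParser` and matches elements by namespace URI instead of by prefix. The `urn:example:hats` namespace, the `hat` element and the helper names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# The same stanza written with two different prefixes. A regexp that looks
# for "x:hat" would only match the first one.
STANZA_A = '<message xmlns="jabber:client"><x:hat xmlns:x="urn:example:hats" kind="fez"/></message>'
STANZA_B = '<message xmlns="jabber:client"><h:hat xmlns:h="urn:example:hats" kind="fez"/></message>'

# Hypothetical extension element, in ElementTree's {namespace-uri}local-name form.
HAT_TAG = "{urn:example:hats}hat"


def hats_from_chunks(chunks):
    """Feed partial reads to a pull parser and collect the hat kinds seen."""
    parser = ET.XMLPullParser(events=("end",))
    kinds = []
    for chunk in chunks:
        parser.feed(chunk)  # a chunk may end in the middle of a tag
        for _event, elem in parser.read_events():
            if elem.tag == HAT_TAG:  # compares the namespace URI, not the prefix
                kinds.append(elem.get("kind"))
    return kinds


def split(text, size):
    """Cut a string into fixed-size pieces to imitate network reads."""
    return [text[i:i + size] for i in range(0, len(text), size)]
```

Both `hats_from_chunks(split(STANZA_A, 7))` and `hats_from_chunks(split(STANZA_B, 7))` find the same hat, because the parser resolves whatever prefix the sender picked to the same URI.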
When I think of something super complex, that's SOAP (where you can literally write the same call in multiple different and server-incompatible ways), and XMPP is nowhere close.
As much as I don't like XML (and I do), it allowed XMPP a great amount of extensibility that's just not possible with simpler/non-extensible formats like JSON (although possible with YAML or ASN.1, and doable by implementing some format on top of JSON). Most chat protocols just don't allow implementing arbitrary extensions. E.g. say you want your chat service to have a concept of hats¹ for participants: with IRC the best you can do is run a HatsBot or make a custom DCC protocol. With XMPP you just decide on a schema and put your stuff in a custom namespace (then, if it's something useful for everybody, file a XEP). So, while XML may not be the best format, I think it absolutely made sense.
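A rough sketch of what "put your stuff in a custom namespace" could look like, using only Python's standard `xml.etree.ElementTree`. The `urn:example:hats` namespace, the `hat` element and its `kind` attribute are invented for the example and are not from any XEP; `jabber:client` is the usual namespace for client stanzas.

```python
import xml.etree.ElementTree as ET

CLIENT_NS = "jabber:client"
HATS_NS = "urn:example:hats"  # hypothetical extension namespace


def message_with_hat(to, body, hat_kind):
    """Build a groupchat message carrying a made-up <hat/> extension."""
    msg = ET.Element(f"{{{CLIENT_NS}}}message", {"to": to, "type": "groupchat"})
    ET.SubElement(msg, f"{{{CLIENT_NS}}}body").text = body
    ET.SubElement(msg, f"{{{HATS_NS}}}hat", {"kind": hat_kind})
    return msg


def body_of(msg):
    """What a client that knows nothing about hats still gets out of it."""
    return msg.findtext(f"{{{CLIENT_NS}}}body")


def hat_of(msg):
    """What a hat-aware client additionally reads; None if there is no hat."""
    hat = msg.find(f"{{{HATS_NS}}}hat")
    return None if hat is None else hat.get("kind")
```

A client that has never heard of hats only looks for elements in namespaces it understands and skips the rest, so the extension does not get in its way.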
___
¹) I just wanted something silly. So, I thought of Team Fortress-style hats.
So even ignoring XMPP's intrinsic complexity, it's based on XML. XML is impossible to use securely out of the box without severely tweaking the parser: XML entity bombs, a gazillion different charsets you might encounter, stack exhaustion due to nesting of namespaces or elements in places you don't expect, etc.
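As an illustration of the kind of tweaking meant here, this is a minimal sketch on top of Python's standard `xml.parsers.expat` that refuses any DOCTYPE (so entity declarations are never processed) and caps element nesting. The `ForbiddenXML` exception, the `parse_strict` name and the depth limit of 32 are arbitrary choices for the example, and it does nothing about charsets.

```python
import xml.parsers.expat


class ForbiddenXML(Exception):
    """Raised when the input uses a construct this sketch refuses to handle."""


def parse_strict(data, max_depth=32):
    """Parse bytes, rejecting DTDs and overly deep nesting; return tag names."""
    parser = xml.parsers.expat.ParserCreate()
    depth = 0
    tags = []

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # Fires at the start of <!DOCTYPE ...>, before any <!ENTITY ...>
        # declaration inside it is read, so an entity bomb never expands.
        raise ForbiddenXML("DOCTYPE declarations are not allowed")

    def on_start(name, attrs):
        nonlocal depth
        depth += 1
        if depth > max_depth:
            raise ForbiddenXML("element nesting is too deep")
        tags.append(name)

    def on_end(name):
        nonlocal depth
        depth -= 1

    parser.StartDoctypeDeclHandler = on_doctype
    parser.StartElementHandler = on_start
    parser.EndElementHandler = on_end
    parser.Parse(data, True)  # exceptions raised in handlers propagate from here
    return tags
```

Ordinary input such as `parse_strict(b"<a><b/></a>")` goes through, while a document that opens with `<!DOCTYPE ... [<!ENTITY ...>]>` or nests more than `max_depth` elements raises `ForbiddenXML`.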
The main reason many of these issues are not better known is that it did not get popular enough for people to bother exploiting it.
> As much as I don't like XML (and I do), it allowed XMPP a great amount of extensibility that's just not possible with simpler/non-extensible formats like JSON
You do not need schemas and namespaces to be extensible.
> The main reason many of these issues are not better known
Nope, they're all well known and were exploited (I saw all of this stuff in vivo). Except for charsets: XMPP had limited charsets right from the beginning, so that one isn't applicable.
XML absolutely has a lot of issues, I'm with you on this. I've just pointed out that it made some sense.
> You do not need schemas and namespaces to be extensible.
Schemas - no. Namespaces - I'm sure they're necessary. Even if the namespaces are just prefixes of JSON dictionary keys or custom ASN.1 OIDs, they're still namespaces (and XML namespaces aren't really different here, just URL-to-short-prefix maps and colon-separated prefixes). Otherwise it will end up very messy.
I mean, nobody wants logic like "that's a message from a Foobar 1.x client, so the 'x-typing' property here means typing state, not use of a monospaced typewriter-style font like Bazbaz 2.x software does." And that's inevitable without namespaces.
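A small sketch of the "namespaces as prefixes of JSON dictionary keys" idea, using the Foobar/Bazbaz example. The key names, the `ext` container and the handlers are all hypothetical; the point is only that each vendor's meaning of "typing" lives under its own key, so nobody has to sniff client versions.

```python
import json

# Hypothetical namespaced keys: one per vendor-specific meaning of "typing".
TYPING_STATE = "urn:example:foobar:typing"    # peer is composing / paused
MONOSPACE_FONT = "urn:example:bazbaz:typing"  # typewriter-style font on/off

HANDLERS = {
    TYPING_STATE: lambda value: f"peer is {value}",
    MONOSPACE_FONT: lambda value: f"monospace font {'on' if value else 'off'}",
}


def interpret(raw):
    """Dispatch on the full namespaced key; unknown extensions are skipped."""
    message = json.loads(raw)
    notes = []
    for key, value in message.get("ext", {}).items():
        handler = HANDLERS.get(key)
        if handler is not None:
            notes.append(handler(value))
    return notes
```

With a bare `x-typing` key, a message carrying both meanings could not even be represented; here the two coexist, and an extension the receiver does not know is simply ignored.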