Original comment below but I decided instead to go for the following:
I am extremely proud of what work is being done online today to secure communications.
While we have companies telemetrying our native stacks[1], web browsers[2], and messaging platforms[3], we also have people working on software that doesn't do those things and still tries to empower the user to get what they need done without being a double agent for a 3rd party.
The contact social graph in Signal is stored inside of SGX using a service called contact discovery.
Signal is attempting to design the system such that Signal, as a service provider, can never know whose contacts are in your phone. They deal with side-channel leakage of lookups from the contact DB into the enclave using a technique called linear scan: every lookup touches every contact and compares it with a constant-time bitwise XOR, so the memory access pattern does not reveal which entry matched. This is the most brute-force version of a class of techniques known as oblivious RAM (ORAM), which are increasingly being used to manage data loads into secure enclaves.
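To make the linear scan idea concrete, here is a minimal sketch. It is not Signal's code: the function name, the fixed `RECORD_LEN`, and the byte-string records are all assumptions for illustration. Python also gives no real constant-time guarantees, so this only shows the shape of the technique (touch every record, XOR-compare, no early exit).

```python
RECORD_LEN = 8  # hypothetical fixed record width in bytes (assumption)


def linear_scan_contains(records, query):
    """Report whether `query` is among `records`, visiting every record.

    Every record is assumed to be exactly RECORD_LEN bytes long.
    The loop never exits early and never branches on a comparison
    result, so the sequence of records touched is the same for any query.
    """
    if len(query) != RECORD_LEN:
        raise ValueError("query must be RECORD_LEN bytes")
    found = 0
    for record in records:
        diff = 0
        for a, b in zip(record, query):
            diff |= a ^ b  # stays 0 only if every byte is equal
        # Map diff == 0 to 1 and anything else to 0 without an `if`:
        # (0 - 1) >> 8 is -1 (lowest bit set); for diff in 1..255 it is 0.
        found |= ((diff - 1) >> 8) & 1
    return found == 1


contacts = [b"\x00" * 8, b"12345678", b"abcdefgh"]
print(linear_scan_contains(contacts, b"abcdefgh"))
print(linear_scan_contains(contacts, b"zzzzzzzz"))
```

The cost is the obvious one: each lookup is O(N) in the size of the contact DB, which is why this counts as the brute-force end of the ORAM family.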
Obvious caveat: if SGX gets broken then these contact lookups are vulnerable to side-channel analysis until the enclave is patched. I think this is a strictly better security property than not having the enclave, but it's far from perfect (no security model is perfect FWIW).
In short, Signal is doing everything they can to avoid having access to your social graph. If you still don't think what Signal is doing is enough, you can run your own Signal (or Matrix) server, but then you are running a very, very valuable server from a graph analysis perspective. At present, I believe the only way to make the metadata in these services less interesting is to put it inside of an enclave, in the hope that this will reduce the value of attacking the servers which manage the graphs for these comms networks.
Source: I work on MobileCoin which uses similar techniques for managing a side-channel resistant ledger.
> If you still don't think what Signal is doing is enough, you can run your own signal (or matrix) server, but then you are running a very, very valuable server from a graph analysis perspective.
...which is precisely why we’re working on P2P Matrix. No servers; nowhere for metadata to accumulate (other than the clients, of course).
I understand the goal. I don't understand how a P2P system is going to route large groups at scale. I don't think phones are powerful enough to deal with the kinds of routing you need for large-scale comms services.
In general, as the number of nodes in a comms graph increases, the overhead of synchronizing the comms between those nodes increases to the point that noise dominates signal, which is why most comms networks end up being a hub and spoke system instead of a mesh. I think that mesh can potentially work in the ~10k node range, but I don't think you can have multi-million node mesh networks, which is what a P2P system will need to function at scale.
I would be thrilled to be proven wrong, but my knowledge of networking suggests that in a high node count network, the overhead of synchronizing node state dominates network traffic.
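As a rough back-of-the-envelope illustration of the scaling concern above (a sketch, not a model of any real network): if every pair of nodes in a full mesh has to keep state in sync, the number of pairwise links grows quadratically, while a hub-and-spoke layout needs one link per endpoint. The function names are made up for this example.

```python
def full_mesh_links(n):
    """Pairwise links when every node talks directly to every other node."""
    return n * (n - 1) // 2


def hub_and_spoke_links(n):
    """Links when each of n endpoints connects to a single central hub."""
    return n


for n in (10_000, 1_000_000):
    print(n, full_mesh_links(n), hub_and_spoke_links(n))
```

At 10k nodes the full mesh is already at roughly 50 million links; at 1M nodes it is roughly 500 billion, versus 1M spokes into a hub.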
Edit: In summary, I think that in most big comms networks, the endpoints do the encryption and the servers do the heavy-lifting of routing. Again, I would love to be proven wrong.
Edit 2: Just to be clear, I am excited for the way Matrix pushes the envelope on comms tech. I think it's cool to see active development in these systems wherever it comes from. My caution is only about what I've seen in pure mesh-networking systems (and the dangers of self-hosted systems becoming the very centralized systems users thought they were escaping from).
So it's true that P2P Matrix is currently full mesh (which is why it's staggering a bit as everyone trying it from HN piles on). However, you can absolutely do better than full mesh without going straight back to hub-and-spoke: you can use spanning trees (like Yggdrasil), or gossiped segmentation as libp2p's Gossipsub does (https://blog.ipfs.io/2020-05-20-gossipsub-v1.1).
Also, you don't need to scale larger than the number of nodes in a single room (or worst case, the number of nodes visible to your account). For context, the largest rooms in today's non-P2P Matrix have about 100K users in them, and a typical poweruser account sees about 400K other users at any given point. Obviously as Matrix grows this will increase, but I strongly suspect rooms will then move into "celebrity" or "auditorium" modes - much as Facebook & Twitter etc have a separate class of routing algorithms for handling traffic for accounts with millions of followers.
The problem with hub and spoke is that the hubs hold the social graph and then become the target for censorship. I don't think there are any techniques employed by Matrix to mitigate that threat at this time (please correct me if I'm wrong).
In short, if you have a small network (under 10k nodes), I think P2P can work, but for networks of large scale (>10k nodes) I think you need hub and spoke, at which point the routing nodes in the center are the lynchpin. You can use some kind of mixnet tech to try to get around this, but that increases latency and computational overhead, thus lowering throughput. You can go the Signal route and throw the graph in an enclave, but there's still side-channel analysis (which is the thing that mixnets are trying to deal with, albeit I can't comment on their efficacy). Harry over at Nym has some cool ideas here.
I am not sure that spanning trees or gossip protocols solve the problem I'm describing, but, if they do, I'd appreciate if you could elucidate further.
Edit: Yes, I agree that N to N routing networks don't scale well. I think a K of N broadcast network can scale, but it's a tricky UX tradeoff.
Matrix mitigates the threat of a hub & spoke model by not being hub & spoke (particularly in P2P!) :)
> I am not sure that spanning trees or gossip protocols solve the problem I'm describing, but, if they do, I'd appreciate if you could elucidate further.
Perhaps the confusion here is the expression "hub and spoke" which sounds to me like a static centralised star topology routing model.
My point is that if you have 1M nodes trying to share data (e.g. follow a celebrity's personal pubsub topic), you clearly need a smarter routing algorithm than full mesh (where the celeb's node would have to do 1M parallel pokes to send its messages). You're completely right that one solution is a hub-and-spoke model (where the celeb's node would poke a big centralised hub somewhere, which would then relay the poke out to 1M followers on spokes) - but as you point out, the hub becomes a centralised chokepoint of failure/control/privacy-violation etc.
So, the main other options I'm aware of are either to arrange your 1M nodes into a spanning tree (or overlapping spanning trees) of some kind, as Yggdrasil does (their current one looks like https://yggdrasil-map.cwinfo.org/#, although only ~300 nodes are live atm)... or you let the nodes self-organise into some kind of hierarchy based on gossiping and fan out the messages that way (as per https://blog.ipfs.io/2020-05-20-gossipsub-v1.1, and some of the experimental routing stuff we've been doing with Matrix).
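For intuition on why a tree or gossip fan-out helps with the 1M-follower case, here is an idealized sketch. It assumes a perfectly balanced tree where every node that receives a message relays it to `fanout` fresh nodes, with no duplicates, churn, or offline peers. That is not how Yggdrasil or Gossipsub actually behave (Gossipsub deliberately keeps redundant links); `hops_to_reach` is a hypothetical helper.

```python
def hops_to_reach(n_nodes, fanout):
    """Relay rounds needed for one sender to reach n_nodes in total,
    if each newly reached node forwards to `fanout` new nodes per round."""
    if fanout < 2:
        raise ValueError("fanout must be at least 2")
    reached = 1   # the sender already has the message
    frontier = 1  # nodes that received it in the latest round
    hops = 0
    while reached < n_nodes:
        frontier *= fanout
        reached += frontier
        hops += 1
    return hops


print(hops_to_reach(1_000_000, 8))  # 7
print(hops_to_reach(1_000_000, 2))  # 19
```

So under these assumptions no single node sends more than `fanout` messages, and the celeb's post reaches 1M nodes in 7 relay rounds at a fan-out of 8, instead of the celeb's node doing 1M parallel pokes. The price is extra latency per hop and the work of keeping the tree or gossip mesh healthy.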
The good thing with Matrix is that it's room-based, and I haven't really seen rooms with over 10k users. Presumably most of these nodes would be offline most of the time.
I can easily see routing being delegated to more powerful homeservers. That said, you do not need perfect routing to be better than hub and spoke at using the network effectively, and it seems to be what most current mesh networks aim for.
> No servers; nowhere for metadata to accumulate (other than the clients, of course).
If the network itself is controlled by an adversary, you leak more metadata by passing data directly between peers. If the data goes through an honest server, the attacker needs to do timing correlation analysis.
Do you know anyone who was able to successfully compile & run the server as well as re-compile the mobile clients to specify a different server address? Smells a bit like vendor lock-in to me if you aren't going to bother adding such a UI widget on a FOSS app.
Although I think it's still a long way from usable and interoperable P2P, Matrix does support voice+video chat (in 1:1 chats P2P with signaling via Matrix; otherwise via Jitsi Meet integration).
3rd party identifiers like phone numbers and email addresses can be registered and discovered via a centralized service. At least for phone numbers, this is not really possible in a decentralized way.
Please check out Firefox's about:telemetry. I believe that is how telemetry should be implemented so users don't freak out. There is nothing about me there. It is a simple way to help the project.
I am extremely proud of what work is being done online today to secure communications.
While we have companies telemetrying our native stacks[1], web browsers[2], and messaging platforms[3], we also have people working on software that doesn't do those things and still tries to empower the user to get what they need done without being a double agent for a 3rd party.
1 (windows 10, chrome OS, GMS android)
2 (cookies, pixels, fingerprinting, CDNs, chrome itself)
3 (whatsapp, FB messenger, telegram)
From OS[4] to browser[5] to messaging platform[6].
4 (debian, qubes)
5 (...maybe not? konqueror probably doesn't do telemetry?)
6 (irc probably, matrix, delta chat)
Matrix is IMHO in competition for mindshare not (directly) with WhatsApp, but with Signal.
Matrix | Signal
Increasingly decentralizable | Centralized
E2EE but not quite for metadata | E2EE and metadata-free mostly except recently requiring PINs and server-side storage
Federated with its costs (slower development etc) | Nonfederated with its costs (outages etc)
Temptingly close to P2P or CS | only CS
No voice comms | Voice and video comms
No built-in social graph | social graph via phone (being worked on?)
OSS in practice | OSS in law but hard to contribute to