> because now all of your microservices can query Zanzibar at any time
This sounds a bit like a chokepoint. Is the important point here that Zanzibar is distributed, and is therefore a good thing to be querying from all over the system (as opposed to one centralised application)?
Contrary to microservice cargo cult, it's possible to build a relative monolith that scales infinitely. The bottleneck is the db, but if you have a schema where data is easily sharded you can scale it infinitely.
There are plenty of giant monoliths that scale fine, like Google's analytics and Gmail. If you have a database that can scale, microservices are more about isolating code between different teams than about any performance advantage.
The novel aspect of the Zanzibar paper is its application of distributed systems principles to avoid such a chokepoint. This includes not only the design of the service itself, but also the consistency model exposed by the APIs that applications consume, which makes many operations cacheable.
As someone who’s not the founder of an authorization provider, I’d tend to agree with you. Sure looks and sounds and quacks like a choke point!
But it’s also fundamentally hard to avoid isn’t it?
The challenge is that authn is so easy to implement statelessly, since you can verify a token anywhere you have a public key. But authz is far more complicated, since it requires an ACL list of arbitrary length along with the token. It’s not like GitHub can stuff a list of every repository I can access into my access token.
>But authz is far more complicated, since it requires an ACL list of arbitrary length along with the token. It’s not like GitHub can stuff a list of every repository I can access into my access token.
This is exactly the problem that Zanzibar solves that makes it exciting!
I've written about why giant lists of claims are not a good way to structure permission systems[0] and Zanzibar-inspired services do not function this way.
Instead they ask you to query the API server when you need to check access to an item.
All API calls return a response along with a revision.
The response will always be the same at a given revision, which means you can cache the response.
If Zanzibar disappears, your app can keep functioning so long as content is not modified; a modification would force you to invalidate the revision.
And that's only if you want consistency in your permission system -- a feature that not all permission systems even support.
Most applications can tolerate just using the cached response regardless and relying on eventual consistency.
All of this is also ignoring the global availability of the Zanzibar service itself which it gets from using a distributed database like Spanner and replicating into data centers in every region in the world (which is why you want someone else to run it for you).
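To make the revision-based caching concrete, here is a rough Python sketch of a client-side cache keyed by revision. Everything in it is an assumption for illustration: `CheckResult`, `RevisionCache`, and the `check_remote` callable are hypothetical names, not the API of Zanzibar or of any particular Zanzibar-inspired service, and the per-resource revision tracking is just one simple way to do it.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class CheckResult:
    """Hypothetical response shape: a decision plus the revision it is valid at."""
    allowed: bool
    revision: str  # opaque token returned by the permission service


# Cache key: (subject, permission, resource, revision)
CacheKey = Tuple[str, str, str, str]


class RevisionCache:
    """Caches permission checks, relying on answers being fixed per revision."""

    def __init__(self, check_remote: Callable[[str, str, str], CheckResult]):
        # check_remote stands in for the network call to the permission service.
        self._check_remote = check_remote
        self._results: Dict[CacheKey, bool] = {}
        self._last_revision: Dict[str, str] = {}  # resource -> last revision seen

    def check(self, subject: str, permission: str, resource: str) -> bool:
        revision = self._last_revision.get(resource)
        if revision is not None:
            key = (subject, permission, resource, revision)
            if key in self._results:
                # Same question at the same revision: the answer cannot change.
                return self._results[key]
        # Cache miss, or the revision was invalidated: ask the service.
        result = self._check_remote(subject, permission, resource)
        self._last_revision[resource] = result.revision
        self._results[(subject, permission, resource, result.revision)] = result.allowed
        return result.allowed

    def invalidate(self, resource: str) -> None:
        """Call when the resource's content or permissions are modified."""
        self._last_revision.pop(resource, None)
```

An app that only needs eventual consistency could simply never call `invalidate` and expire entries on a timer instead. The sketch never evicts old entries, so a real cache would also need a size bound.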