How is encapsulation thrown out the window in sizeable codebases?? It is the sin...

usrbinbash · on Jan 25, 2022

Could I see some examples of this evidence?

Let's consider a really simple object graph:

    A -> B
    C

A holds a reference to B, C has no reference to other objects. A is responsible for Bs state. By the principles of encapsulation, B is part of As state.

What if it turns out later, that C has business with B? It cannot pass a message to A, or B. So, we do this?

    A -> B <- C

Wait, no, we can't do that, because then B would be part of Cs state, and we violate the encapsulation. So, we have 2 options:

    1. C -> A -> B
    2. C <- X -> A -> B

Either we make C the god-object for A, or we introduce an abstract object X which holds references to A and C (but not to B, because, encapsulation). Both these implementations are problematic: in 1) A now becomes part of Cs state despite C having no business with A, and in 2) we introduce another entity in the codebase that serves no purpose other than as a mediator. And of course, A needs to be changed to accomodate passing the message through to B.

And now a new requirement comes along, and suddenly B needs to be able to pass a message back to C without a prior call from C. B has no reference to A, X or C (because then these would become part of its state). So now we need a mechanism for B to mutate its own state being observed by A, which then mutates its own state to relay the message up to X, which then passes a message to C.

And we haven't even started to talk about error handling yet.

Very quickly, such code becomes incredibly complex, so what often happens in the wild, is: People simply do this:

    A -> B <-> C

And at that point, there is no more encapsulation to speak of. B, and by extension B is part of A's state, and B is part of Cs state.

Jtsummers · on Jan 25, 2022

Encapsulation (to lock down an object like this) is used when there's a notion of coherence that makes sense. If B could be standalone, but is used by both A and C, then there's no reason for it to be "owned" by either A or C, only referenced by them (something else controls its lifecycle). Consider an HTTP server embedded in a more complex program. Where should it "live"?

If at the start of the program you decide that the server should be hidden like this:

  main
    ui
      http
    db

And main handles the mediation between db and ui (or they are aware of each other since they're at the same level), but later on you end up with something like this:

  main
    ui
      http
    db
    admin-ui

And the admin-ui has to push all interactions and receive all its interactions via the ui module, then it may make less sense (hypothetical and off the cuff so not a stellar example, I'll confess, but this is the concept). So you move the http portion up a level (or into yet-another-module so that access is still not totally unconstrained):

  main
    ui
    db
    admin-ui
    http
    -- or --
    web-interface
      http

Where `web-interface` or whatever it gets called offers a sufficiently constrained interface to make sense for your application. This movement happens in applications as they evolve over time, an insistence that once-written encapsulation is permanent is foolish. Examine the system, determine the nature of the interaction and relationship between components, and move them as necessary.

Arbitrary encapsulation is incoherent, I've seen plenty of programs that suffer from that. But that doesn't mean that encapsulation itself is an issue (something to be negotiated, but not a problem on its own).

throwaway889900 · on Jan 25, 2022

Those are called callbacks, we use those everywhere. If you don't want to use callbacks, you can use something like a publish-subscribe pattern instead so that X doesn't need to be indirectly linked to B through A and can publish directly to X.