Better Software Design with Clean Architecture (fullstackmark.com)
277 points by lgclrd on July 3, 2017 | 139 comments



This design puts entities in the middle - everything else pivots around entities. Additionally, it uses the OO approach of implementing logic as methods on the entities.

After a couple of decades of experience, I've come to the conclusion that this isn't right. Most business rules involve processes, which are inherently procedural, or, from another perspective, functional - functions of the whole state of the system to a new state of the system. Most processes don't logically belong on an object, and when you stick them to one end, you create lots of problems for yourself.

Just take that example Student entity class; what stops you from writing:

    student.RegisteredCourses.Add(course);
Moreover, when you have a relation between two entities, it's not uncommon for the relation to be modelled at both ends; that is, a student has a list of courses, and a course has a list of students. Do you implement the same validation at both ends? Make one end (choose one) read-only? Do edits done on one end automatically turn up on the other end? How do you protect the visibility of methods that keep either end in sync, without exposing invariant-violating APIs to other code?
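
Concretely, the hazard is something like this (hypothetical C#, not from the article):

    student.RegisteredCourses.Add(course);  // does course.Students now contain student?
    course.Students.Add(student);           // or must every caller remember to update both ends?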

Invariants and validations for fields are trivial with an OO approach; for entities, they're reasonably easy, though sometimes you need partially invalid versions while editing or constructing an entity. But once you bring in multiple entities and relations, everything starts getting hairy pretty quickly. Assertions and validations that would be trivial to write [1] in a relational language like SQL aren't possible in an object-oriented language without a lot of system-building.

So I've come around to the idea that the database is a better thing to put at the centre; that encapsulation and hiding of the main fact store is harmful to the architecture of a system, especially in a heterogeneous environment where your entities are represented in different languages, all backed by the same fact store.

[1] Trivial to write, but not necessarily cheap to evaluate. I'm not advocating writing all global validations in SQL.


This is solved by DDD and aggregates. Here's how I would do it.

   Course - Aggregate Root

   Student - Aggregate Root

       RegisteredCourses (RegisteredCourse[]) - Sub-entity collection, with an id reference to the course and metadata like the date/time when the registration occurred.

       RegisterForCourse method
There should be no mutable public properties, and internal methods should be private. Everything should go through an aggregate root method.
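
A minimal C# sketch of that shape (names and details are mine, not from the article):

    using System;
    using System.Collections.Generic;
    using System.Linq;

    public record RegisteredCourse(Guid CourseId, DateTime RegisteredAt);

    public class Student  // Aggregate root
    {
        private readonly List<RegisteredCourse> _registeredCourses = new();

        // Read-only view: no way to mutate the collection from outside.
        public IReadOnlyList<RegisteredCourse> RegisteredCourses => _registeredCourses;

        // The only way in - every registration goes through the root.
        public void RegisterForCourse(Guid courseId)
        {
            if (_registeredCourses.Any(rc => rc.CourseId == courseId))
                throw new InvalidOperationException("Already registered for this course.");
            _registeredCourses.Add(new RegisteredCourse(courseId, DateTime.UtcNow));
        }
    }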


CourseRegistration.Register(course, student);

CourseRegistration.ViewCoursesFor(student);

I think giving either entity "ownership" of the relation is a disaster waiting to happen. I just have this vision of someone wanting data about students accidentally loading all the course information. In the DAL, of course, you end up modelling it however you like, but seeing courses inside a student just makes me ill.


In this case it isn't that bad, since it's only loading the registered courses for that specific student, which would be a limited number (10?). In fact, it should probably be a business rule inside RegisterForCourse that a student can only register for a specific number of courses per semester.

You normally need to load an entire aggregate so that you can maintain your invariants. If you separate into three aggregates such as Course, Student, and CourseRegistration, then it starts to become harder to maintain business rules. For example, if you wanted the RegisterForCourse method to limit student course registration to 10, it would now have to perform another query.

If you start doing queries that don't match up with your domain model, then you should probably do CQRS, which would allow you to build a model optimized for queries.


I just think it's bad because it's asking the student to be involved in the process of registration. I don't think that way of thinking scales. Unless you're making a util (which enables any architectural practice, tbh) you're gonna eventually get stung if you pollute your currency with logic. Currencies, just like graphical interfaces, should be as dumb as possible.

One day someone is going to want slightly different rules for student course registration and then they'll realise this rule is baked into the core and that's sad.


"One day someone is going to want slightly different rules for student course registration and then they'll realise this rule is baked into the core and that's sad."

This is a desirable trait, and one of the main points of DDD. If you want to change registration, you change the core business logic. It then applies to all applications using that business logic. If you have a real business case for different registration methods, then you simply model that on your entities.

If you don't do this, you're going to get business rules applied inconsistently, as programmers will interpret requirements slightly differently.


It's already been deployed and some customers are relying on the old behaviour. Old clients might need to call new server code. If you've baked your logic into the objects you pass around, you've done something stupid, as the client definition could give a different answer than the server definition. ENJOY YOUR "CONSISTENCY"!

So what's easier to change? Giving customers versions that give them all different types of the core base type Student, or JUST changing the type of CourseRegistration that they visit? (CourseRegistration is a service, as opposed to a currency.)

You keep your currency CLEAN.


You identify what customers want to customize, and provide a way of customizing that. Some customers want 10 courses per student, others 5 courses. You could easily model that in a domain model.

If you want something radically different, then are you even developing the same application anymore?

By the way, I've never heard anyone call the core entities of an application "currency" before; where did you even learn that? I imagine that could be confusing.

I also imagine that having an application where you have to build custom methods for each customer who wants something slightly different is terrible for codebase maintainability.


They're currency because when you buy something from a shop your money doesn't start telling you what you can and can't do, and it travels all the way through many layers of worlds. It is the core, and it's a good word; use it.

I make a career out of coming to companies that should have (according to you) made radically different systems but didn't. Most companies don't find it economically justifiable to re-write based on slightly different customer requirements, and most companies have many deployments, not just "one perfect one" that most modellers appear to imagine. If your design doesn't have flexibility for change designed into it, you're going to struggle when you get a second client.

> I also imagine having an application where you have build custom methods for each customer who wants something slightly different is terrible for codebase maintainability.

Yeah, of course it's a massive pain in the ass, but it enables you to sell more.


"Most companies don't find it economically justifiable to re-write based on slightly different customer requirements and most companies have many deployments,"

Then don't. Make the domain model customisable with various options.

Having specific methods for specific companies seems a lot worse.

I build SaaS applications which have various customization options customers can use. I don't really have a problem with it. I've designed a clean way to customise.


I do more enterprisey stuff, so our customers are a little less accepting of model restrictions. I find a service-based architecture more flexible, as it's pretty easy to swap out services that just work off a common currency to define different or customisable behaviour. The sort of situation where an entity is performing business logic makes that harder to do, as opposed to swapping out the service.

I made the same mistakes as OP and learnt the hard way. I encourage you to explore a greater purity in your models in the future because I genuinely believe it leads to better code.



How did you decide that the registration is a property of the student and not the course?


Registration is definitely not a property of the student.

It's always a good idea in these situations to appeal to real life. The actual business will point the way for the business simulation. In the real world we don't ask students to register for a course. Instead, students ask the Registrar to register them for a course. When the Registrar makes her decision she considers far more than just the internal state of the Student; she considers: (1) does the course have any available seats? (2) has the Student met all the course pre-requisites? and, most importantly, (3) has the student paid his tuition and is he even a valid member of the university community?

What the Student does have is a history and a context -- that is, a state -- that must be considered when registering for courses. The student may also have preferences -- courses he wants to register for.

The language of the business should always guide these decisions. A student submits a request for a course, and it is the office of the Registrar that accepts or denies this request.
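
As a rough C# sketch of that shape (all names are mine, not from the thread; the checks mirror the three questions above):

    using System;
    using System.Collections.Generic;
    using System.Linq;

    public enum RegistrationResult { Registered, CourseFull, PrerequisitesNotMet, NotInGoodStanding }

    public class Course
    {
        private readonly HashSet<Guid> _enrolled = new();
        public int Capacity { get; init; }
        public List<string> Prerequisites { get; } = new();
        public bool HasAvailableSeats => _enrolled.Count < Capacity;
        public void Enroll(Guid studentId) => _enrolled.Add(studentId);
    }

    public class Student
    {
        public Guid Id { get; } = Guid.NewGuid();
        public HashSet<string> CompletedCourses { get; } = new();
        public bool IsInGoodStanding { get; set; }
        public bool HasCompleted(IEnumerable<string> prereqs) => prereqs.All(CompletedCourses.Contains);
    }

    public class Registrar
    {
        // The Registrar owns the decision; Student and Course only answer questions.
        public RegistrationResult Register(Student student, Course course)
        {
            if (!course.HasAvailableSeats) return RegistrationResult.CourseFull;
            if (!student.HasCompleted(course.Prerequisites)) return RegistrationResult.PrerequisitesNotMet;
            if (!student.IsInGoodStanding) return RegistrationResult.NotInGoodStanding;

            course.Enroll(student.Id);
            return RegistrationResult.Registered;
        }
    }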


This guy has been thinking about this properly.


Take this thinking to the end and realize that it leads to freestanding functions. In general, all the context of the program is needed to execute a piece of functionality. It's not like the registration office owns all the students. It's not like a registration wouldn't change the student's context. Students are both an independent and a related concept. The proper object to call most things on is a "Global" object. Now instead of

    Global.do_some_thing(foo, bar)
just

    do_some_thing(foo, bar)
There you have it. OO is an unsound approach which survived so long mainly due to the perceived real world "analogy" and because the Object-Verb-Predicate syntax simplifies code completion.


I like the simplicity of this. If the app is of appreciable size, do_some_thing depends on databases, webservers, external processes, filesystems and configuration. How do you test/debug/explore its functionality without setting all of that up?


I would actually say these all make good "objects", i.e. isolated pieces. On the other hand, no runtime polymorphism is needed. To avoid OOP at a syntactic level, what do you think of the following?

- For tests at a smaller level, decompose the application such that most parts are easily testable in isolation (without external "hard" dependencies).

- For mid-sized tests with external dependencies but mostly unidirectional dataflow, set up a global virtual table with all the mock methods (and instances) that are needed (see the sketch after this list). Alternatively, traditional linking methods.

- For larger "integration" tests there is no substitute to testing the real thing. At some scale and level of interactivity with the database, you just have to talk to the true filesystem, the true database etc. You can still setup a test instance for most cases where there is no interaction with external services.


I believe OO is good for exactly two things: abstract data containers and state machines.

In the former, access hiding cleans up the API and prevents unsafe usages of the container. In the latter, OO enforces a protocol to keep the state machine sealed off and only aware of key inputs and outputs.

And that's it. I've found nothing else. Data itself is better off when strategized to fit in a database, whether off-the-shelf or a custom-tuned, in-memory design. The state machines may need to query a part or all of the database, as well, so their ability to restrict scope only goes so far.
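
For the state-machine case, a toy example of what that sealing-off buys you (hypothetical C#):

    public class Turnstile
    {
        private enum State { Locked, Unlocked }
        private State _state = State.Locked;  // hidden: callers can't corrupt it

        // Only key inputs are exposed; the outputs say whether an input was accepted.
        public bool InsertCoin()
        {
            if (_state == State.Unlocked) return false;  // coin rejected
            _state = State.Unlocked;
            return true;
        }

        public bool Push()
        {
            if (_state == State.Locked) return false;  // still locked
            _state = State.Locked;
            return true;
        }
    }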


You nailed why I think many object-oriented designs fall flat. People presuppose both that the objects within their domain encompass all objects (just students and courses) and that the objects within a domain will not change.

When #1 is missed, I usually see a design that doesn't mimic its domain, and developers and users thus lose the ability to have clear, concise communication. At that point OO is a disservice.

When #2 is missed you end up with IFilteredCourseAdapterProcessor as people attempt to bolt on components to solve future needs.

The addition of the "Registrar" to the domain immediately demonstrates how the naive interpretation is missing core components and I bet the users and devs fundamentally aren't speaking the same language.

This, imo, leads to conversations like, "Of course so and so approves all the registrations! Otherwise it would be madness!"


I've come to avoid OO and use only freestanding functions where possible, mostly because of this problem. So often it ends up as a syntactic distinction that is otherwise absolutely meaningless.

I use some OO to make abstract datatypes in languages that are inherently OO, but I think the explicit virtual table approach in C or the type classes approach in Haskell are much cleaner.

Lately I had to make a REST API, which is basically a distributed object-oriented interface. I think I managed to get it done with compromises, but I'm not happy. Another idea would be to make a procedural interface first and then put a REST API on top if needed. But I have some doubts that it can work out practically.


Some would call your procedural interfaces services and map your REST API directly to publicly available service methods.


Ah, REST.

An API is there because someone in the other end wants to do something. If you don't design it but just slap REST on top of your data, then you're not doing that person a favour.


Well you can still "design REST" on top, right? But it could mean some duplicated efforts.

Actually, after having tried a few times, I'm pretty sure I don't want to put REST ideas at the center of my architecture. You just happen to need a network transport, and not even in all cases (debugging for example). And an existing model is never going to be able to represent all the concepts of your application domain. This means that you need to build your own representations.

In theory there is a point to using a standardized object protocol which can represent some CRUD use cases. But from my limited experience I think it breaks down pretty quickly, to the point where I can use only GET and POST.

Using protocol layers of increasing specificity may make it easier to use existing tooling with data exchanges. For example, many APIs use HTTP status codes as a more coarse-grained version of the return codes in the body (in an ad-hoc format). Caching is also often brought up as an example where you want to buy in to a standardized protocol.

But some APIs don't buy in, like Facebook, which reportedly returns 200 OK always. It seems like a lot of best-effort work with little returns to me as well. But I don't know - I'm not a professional business software guy.


This is actually a part of understanding the specific domain. How do the educational practitioners think about the problem for the application you're working on? Is the application student-centric or course-centric? This is a big part of DDD, and involves working with domain experts and getting inside each other's heads.

You could definitely model it as a Register method on the course. But this example shows how you can stop people messing around with internals once you've modeled it. There is only one way of registering for a course.


I wonder how to even formulate the question to the practitioners -- it seems like an artificial choice imposed by the computer formalism (single-dispatch OO).

In other words, I don't see how it is a domain question, and it seems likely that the domain practitioners just think "there are courses, and there are students, and students can be registered for courses".

My suspicion is that DDD would basically be better without the focus on single dispatch OO.


See my comment below. The language of the business will almost always guide the design in the right direction. There are rare cases where the business is unaware of a more "essential truth." This isn't about OO it's about faithfully capturing the model of the domain which is all DDD is.


"students can be registered for course" - This is already implying an order.


If I say "students can be registered for courses in the school" then you might think the OO model should be

    school.register(student, course)
which does actually seem reasonable to me. Here the school object would basically be the system's logic as a whole, and both the course and the student could just be IDs.

I really think single dispatch OO is a huge distraction and I haven't seen any strong arguments for its use as a general paradigm.


Nothing about DDD says you can't use multiple dispatch. If this is how you think about the problem, then this is your solution, provided your language is capable of expressing it.


Yeah, I'm quite a big fan of Eric Evans and DDD. Especially the more high level parts about ubiquitous language, bounded contexts, anti-corruption layers, and many of the other patterns. Yet I think that the focus on single-dispatch OO kind of dates the book and makes it less general and beautiful, much like how the GoF's "Design Patterns" book is more banal than Christopher Alexander's work on pattern languages because of the overemphasis on specific implementations of single dispatch OO patterns (the notorious visitor pattern, for example).


If the pairings of courses and students are registered in the school object, then that's basically an OO implementation of a many-to-many relationship between students and courses.

That's why I like Django. You just tell it that you want a many-to-many between students and courses and it does that for you, symmetrically, without forcing you to introduce a class such as "school".


Bi-directional models come with their own problems.


That's wishful thinking.


All you're trying to do is match their language, so it's easier to develop against a description of a use case. Technically, either way allows you to do what you want to do.


I know there are ways to solve the problem; but I believe these are accidental complexities, not essential complexities. That is, these design solutions complect the problem, sacrificing simplicity for a dubious principle.


Actually I think this is a major advantage. They are not complexities, they are putting down into code how you think about your application in your head.

If your design is some data, which you can mutate however you want to get the job done, you don't really have a model of the application.

Your business rules can become inconsistent because different developers are implementing different but similar methods in your business logic layer, all applying slightly different rules. If the entities themselves enforce these rules, it becomes a lot less likely.


If you are subscribing to the DDD snake oil, you usually weasel out of this decision by claiming that it belongs to the student in one Bounded Context™ and to the course in another.


Thank you for mentioning this. I was hoping that someone would. Usually DDD solves these issues because of the rules in the construction of the domain models. I usually make the distinction entity != domain model... An entity is the representation of the storage unit, and a domain model goes beyond that by including domain rules...


I've come to the exact same conclusion after spending time with these same types of systems.

Practically speaking, I find that abstracting a system into a set of commands and a set of queries (i.e. CQRS) is often the "right" solution. Each command/query encapsulates an individual use-case, and all of the business rules and database access required.


CQRS and entity-central solutions normally go together like peas in a pod.

The command side has a domain model, which would be your business objects in this case. A command simply loads the entity, performs a method on it, and saves it.

The query side has a query model designed for fast reads, normally built from events emitted by the domain model.
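
In code, that load-act-save loop is about this small (a sketch with assumed names; the Student stub stands in for a real aggregate like the one sketched upthread):

    using System;

    // Minimal stub; the real aggregate enforces the business rules.
    public class Student
    {
        public void RegisterForCourse(Guid courseId) { /* validate + record */ }
    }

    public record RegisterForCourseCommand(Guid StudentId, Guid CourseId);

    public interface IStudentRepository
    {
        Student Load(Guid id);
        void Save(Student student);  // persists state, or appends emitted events
    }

    public class RegisterForCourseHandler
    {
        private readonly IStudentRepository _students;
        public RegisterForCourseHandler(IStudentRepository students) => _students = students;

        public void Handle(RegisterForCourseCommand cmd)
        {
            var student = _students.Load(cmd.StudentId);  // load the aggregate
            student.RegisterForCourse(cmd.CourseId);      // apply the business method
            _students.Save(student);                      // save
        }
    }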


I agree. After 20 years of experience I realize "Clean Architecture" will break down on larger systems. It's easy to create a clean looking architecture with a limited set of use cases. You can do it using any design methodology.

Where systems break down is when other viewpoints get added. For this system it might be the following:

  - Pricing/Invoicing

  - Prerequisites

  - Academic Status

  - Professor assignment

  - Course Reviews / Social Media
All of these viewpoints are close enough to the registration system that there is a bias towards reusing as much existing code as possible.

The problem is that at a high level two viewpoints look 95% similar, but in the code it equates to creating interdependencies between all of the viewpoints.

Different teams might be successful in keeping the separation in the system, but in most systems I've seen the entanglement starts in the database with the entities. Columns that become nothing more than status flags for different viewpoints. Columns with near identical names that mean almost the same thing, but are handled differently because the viewpoints treat them differently.

When the system gets big enough, a developer cannot mentally map the whole thing. When implementing a feature they will look for what is available vs what the architect had in mind.

This is how you wind up with 10 different getCourse calls all of which are building off one another with various parameters. The code will have a lot of if/thens checking the parameters to make it work for a particular viewpoint and avoids bugs for the others.

The more separation you have between the viewpoints the better. I now prefer separate entities/databases for every viewpoint. There is a set of entities common to all, but these are fact level entities. A course, A student, A professor, An admin, A TA. The entities should contain no status.

Where the viewpoints need information from each other they should just query the appropriate viewpoint, or have the source viewpoint send out updates (great place for event sourcing).

It might sound like I'm describing micro services. I wouldn't argue that, but I would say that a viewpoint is a higher level concept than a service. A viewpoint could be a collection of micro-services, or a single system (using this Clean Architecture).


I think a big part of that is this mistaken idea that, if you duplicate a single line of code, you've done a terrible, terrible thing.

Sandi Metz did a pretty good talk about this, basically saying we need to think more before we abstract things away because "code duplication". https://www.sandimetz.com/blog/2016/1/20/the-wrong-abstracti...


I think what you call viewpoints are called Bounded Contexts (https://martinfowler.com/bliki/BoundedContext.html) in DDD parlance.


>I've come around to the idea that the database is a better thing to put at the centre

I've seen systems that take this to the extreme of not allowing a single piece of logic into the domain. Domain objects were essentially data containers only. So, if you had a Person object with firstName and lastName properties that represented those DB columns, then even a getFullName() that concatenated the two was verboten.

Instead, all logic had to be in a service. This led to lots of duplication and a super-massive service layer in a system that was decidedly procedural, even if it was implemented in an OO-language.


> So I've come around to the idea that the database is a better thing to put at the centre

In my admittedly limited experience, applications come and go, but data lasts forever. So I find it unusual when a database is not at the heart of business software.


Data lives forever. Schemas live short lives.


If only you'd gone one step further and discussed the (imho) cleanest solution - an event store and a query mechanism (aka event sourcing and CQRS, https://martinfowler.com/bliki/CQRS.html).

When you want to ask questions about the state of the app, you make queries. These queries could be SQL - in which case your app is directly dependent on the type of storage, and is almost un-abstractable. Or they could be in a custom/bespoke query language (where a set of hardcoded API calls to the DB/datastore counts as an API).

The event-store system is responsible only for storing facts. Therefore, the "problem" of where to validate students against courses doesn't exist, because that relationship is a "fact" in the event store (there must have been a registration at some point for the fact to exist). Therefore, a programmer _cannot_ make the mistake of accidentally adding a course to a student who didn't register, unless they do it maliciously.


Most CQRS/event store systems simply build up OO-style domain objects by applying events to them.

Then they apply a command to the domain object and save the emitted events.

They are still entity centric.
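
A skeletal version of that apply/emit cycle (hypothetical C#; names are mine):

    using System;
    using System.Collections.Generic;

    public record CourseRegistered(Guid CourseId, DateTime At);

    public class Student
    {
        private readonly HashSet<Guid> _registeredCourseIds = new();
        private readonly List<object> _pendingEvents = new();
        public IReadOnlyList<object> PendingEvents => _pendingEvents;  // saved by the repository

        // Replayed from the event store to rebuild current state.
        public void Apply(CourseRegistered e) => _registeredCourseIds.Add(e.CourseId);

        // Command: validate against current state, then emit and apply a new event.
        public void RegisterForCourse(Guid courseId)
        {
            if (_registeredCourseIds.Contains(courseId))
                throw new InvalidOperationException("Already registered.");
            var e = new CourseRegistered(courseId, DateTime.UtcNow);
            Apply(e);
            _pendingEvents.Add(e);
        }
    }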


    I've come around to the idea that the database
    is a better thing to put at the centre
Same for me.

    I'm not advocating writing all global validations
    in SQL.
What do you mean? Can you give an example?


I have also concluded that a lot of business processes can't easily be modeled by objects. There are almost always a bunch of exceptions that need to change data directly.


I've seen this a few times before, and it led me to believe that I was stupid. Comments like "...business rules simply don't know anything at all about the outside world" left me wondering: what do the components that DO know something about the outside world actually know about it?

Turns out that the answer is that a data access component might know where to find a database, and a client app might know where to find a web service. These things were rather obvious to me, so I failed by trying to find loftier answers. Which probably says a lot about how I think.

Also, the concentric circles did nothing for my way of thinking, as messages are linear - they start from a place, and they end in a place. Thus a stack was a lot easier for me to digest than these concentric circles.

And finally, the ambiguous language. Why should I care that there are enterprise business rules, and application business rules? They're both rules. Rules go into a component in the middle, or business layer.

And thus I found it a lot more useful to think of app design in terms of vertical tiers (hardware abstractions), and layers (software abstractions). Tiers include perimeter, DMZ and corpnet, and layers include (but not limited to) user interface, façade/service gateway, business layer, and data layer. With some cross-cutting concerns like comms, security and operational management (logging, exceptions and so on).

I don't think either is better than the other. It was, however, the first time I really understood that two people can be very knowledgeable about the same thing, and yet speak using a completely different vocabulary.


>Also, the concentric circles did nothing for my way of thinking, as messages are linear - they start from a place, and they end in a place. Thus a stack was a lot easier for me to digest than these concentric circles.

Messages might be linear, but scopes encompass inner scopes. Encapsulation is not usually depicted as a stack.

>And finally, the ambiguous language. Why should I care that there are enterprise business rules, and application business rules? They're both rules.

Obviously because different scopes apply to the former than to the latter. TFA even clarifies that: "The key is that they ("enterprise business rules") contain rules that are not application specific - so basically any global or shareable logic that could be reused in other applications should be encapsulated in an entity".

In general, the similarities ("they're both rules") between two things don't say much (if anything at all) without considering the differences. Shiitake and Amanita Muscaria are "just mushrooms", but one can kill you.


> Messages might be linear but scopes are encompassing inner scopes. Encapsulation is not usually depicted as a stack.

It can be, as concentric circles and stacks are isomorphic. What can be represented by one can be represented by the other. Examples of dealing with scopes as stacks would probably be familiar to anyone working with C, C++ and other languages with stack-based local variables - your scopes there literally correspond to what's on the stack. Similarly, lexical scoping can be seen as a sequence (i.e. a stack) of associative containers.

(Hell, you can see concentric circles as a Tower of Hanoi viewed from the top - i.e. a stack (albeit inverted in this case - I'd put the innermost circle at the bottom of the stack).)

I understand GP's pain because I too would prefer the same diagram as a simple stack of abstraction layers. But then again, such diagrams almost never work without corresponding textual explanation anyways, so it doesn't matter much how the diagram looks as long as the companion text is OK :).


As an aside: according to wikipedia, there are no recorded deaths from ingestion of Amanita Muscaria. It's not really a poisonous mushroom, though it can have intense psychoactive effects when taken in large quantities.


>As an aside: according to wikipedia, there are no recorded deaths from ingestion of Amanita Muscaria.

Well, the Wikipedia lemma puts it this way:

  Although classified as poisonous, reports of human deaths 
  resulting from its ingestion are extremely rare. A fatal 
  dose has been calculated as 15 caps.[55] Deaths from this 
  fungus A. muscaria have been reported in historical journal 
  articles and newspaper reports,[56][57][58] but with modern 
  medical treatment, fatal poisoning from ingesting this 
  mushroom is extremely rare.[59] Many older books list 
  Amanita muscaria as "deadly", but this is an error that 
  implies the mushroom is more toxic than it is.[60] The 
  North American Mycological Association has stated that 
  there were: "no reliably documented cases of death from 
  toxins in these mushrooms in the past 100 years".


Oh look, more kludgy over-designed enterprise gumf.

When are people going to learn that if you just start at the entry point with granular components, letting each define the interface for its dependencies, you get a much nicer, looser, more flexible structure than these enterprise "patterns" that ultimately all just turn into a big ball of mud?

Stop pretending you can design codebases.


People forgot at some point that design patterns are for solving problems. The key being that you should only be solving problems you actually have, and stop making up imaginary problems in your head before you even start coding.

My code is typically some dumb data objects, static functions to apply rules to that data, and a bunch of interfaces for abstracting external dependencies. The code that ties everything together lives at the entry points, and is hopefully written like one is reading a requirement.

To take the course registration example, the "register" entry point might look something like:

    register(studentID, courseID):
        // Load the student via the data store interface.
        Student student = this.datastore.getStudent(studentID)

        // Business rules for whether a student can still register for classes.
        // This would check things like course overload and duplicate registrations.
        if (!Students.canRegister(student, courseID)):
            return CannotRegisterError

        // Record the registration.
        this.datastore.addRegistration(studentID, courseID)
        return Success
Advantages:

* Data objects are dumb. No need to even test them.

* Rules are all static functions, so you can call them anywhere, anytime, including for testing (see the sketch after this list). Extremely flexible and allows easy remixing of rules for new requirements.

* All external dependencies live behind an interface, so they can all be mocked away for testing.
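
For example, a rule in that style might be a pure static function over dumb data (my sketch, with assumed names):

    using System.Collections.Generic;

    public class Student  // dumb data object
    {
        public HashSet<string> RegisteredCourseIds { get; } = new();
    }

    public static class Students
    {
        public const int MaxCoursesPerSemester = 10;

        // Pure function of its inputs: trivially testable, no mocks needed.
        public static bool CanRegister(Student student, string courseId) =>
            !student.RegisteredCourseIds.Contains(courseId)
            && student.RegisteredCourseIds.Count < MaxCoursesPerSemester;
    }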


I know this is written as OO, but it is very functional in style. Keeping IO as far away from entities and business rules as possible goes a very long way in easing maintenance. However, sometimes you need to do IO in your business rules, which necessitates interfaces for mocking.

IMO, this general idea is very powerful for most code. I believe algebraic effect systems will popularize it further by making it a natural pattern. While OO should be written in this way, we are currently in this weird period of programming history where we don't talk about practices and design as seriously as we should.

But, I'm glad you mentioned this approach. It is not hard to understand, it is elegant, and it is as simple as can be. Just requires a tiny amount of glue and forethought.


What kind of design is this? Is there a name for it? I'd like to read more (both out of interest and being skeptical of how this scales)


I should come up with some catchy name and a blog post for it. It's basically my own custom mix of procedural (entry points), functional (static validations), and design by contract (external abstractions) in the noun worlds of Java and C#. I arrived at it after seeing too many instances of:

* Architecture / design pattern lasagna, where breaking through abstraction layers felt like Inception. And adding a data field meant going through all 5 layers of code...

* Inheritance for functionality instead of typing.

* Also, breaking Liskov substitution willy nilly then wondering why things are hard to reason about. (Because your code becomes entirely reliant on the types actually present at runtime, since you can't rely on semantic equivalence.)

* Tacking methods into data objects because that's where the data is. See other comments on this post advocating a "register" function on the Student object, or should it be on the Course class... Answer is neither. Encapsulation is about maintaining invariants. It's not about putting all functions related to students in the Student class.

* Zero unit tests because external dependencies weren't isolated and all the logic is hard coded into exact use cases, which eventually end in hard coded external dependencies...


I've seen that particular type of method/function be called a use case, interactor, or service object.


What are the disadvantages?


A great question. Honestly it's a very natural fit for me, so I'm probably not very well qualified to answer.

There can be a lot of code repetition. To me, that's not a bad thing. I've seen people tie themselves in knots trying to DRY some common code, just to later have to undo it all because a requirement changed for only one of the code paths. And, as I said, I like my procedures to read like use cases.

I'll see if I can think of anything else...


You need to write more structure than if you had simply put all the code in the web handler or whatever. You also need to be able to abstract all DB/API calls if you want to test easily, as mocking things you don't own just leads to pain later on.


This, plus checking the codebase at regular intervals for:

* KISS?

* Bundled components?

* Dependency injection useful?

* Duplicated components / functions?

* Too many/few abstract classes? Interfaces useful?

In my case that looks like: write code for 6h, review and refactor for 2h. The result is that less code is produced (due to refactoring/removing/etc.) and the codebase stays simple. And it's easier to write tests.

No need to use complex enterprise patterns. Most of the time, simple facades and delegators are enough. Consider writing small, simple components instead of using heavy patterns with a lot of boilerplate code.


It's just human nature. People come up with geocentric ideas for everything. It's certainly not limited to software designers or even particular modes of software design. Ever talk to Semantic Web people? Or, ontologists? People tend to naturally assume that there is one set of "Enterprise Entities". Heck, it even took physicists a long time to come up with general relativity. And, they only did it after their absolute models broke down.

So, yes, each granular component should have its own perspective of "the Enterprise". But contexts can be very different. For a simple example, a "car" may have a completely different interface depending on whether its user is a "car designer", "car assembler", "car salesperson", "car driver", etc. So someone trying to define a universal interface for a car will make themselves and their teams crazy. Worse, the idea of "a car" will change through time, so even perfect interfaces will have to change.

This isn't limited to object oriented design, by the way. Functional programming paradigms have the same issues.


I always advocate building up a walking skeleton of the system, because only then does the architecture begin to emerge. I want to see how the data moves through the system to handle the core functionality; the actual goal of the system. I've sat through too many long meetings where someone with their architect hat on spends hours going over this wonderful architecture for Core System Rewrite(tm) and not once mentions the actual CORE functionality the system is trying to model. It's all gibberish about layers and entities and buses, etc. We're building a tire inflater, and you have not once mentioned how it actually inflates tires!


I find your extreme as problematic as the article's.

What you describe, at least in my experience, more often than not leads to a big ball of mud as easily as the methodology in the article.

Software can be designed. It's just there's little appetite for an actual rigorous design process in this industry. Instead there is a lot of bandwagoning and looking for one size fits all solutions (like the article's), mixed thoroughly with people who haven't ever really grown beyond thinking of textbook CS as the solution to every problem.


Which is essentially what this is.


My objection to Uncle Bob is that it seems really heavy on process, with lots of indirection via adapters, abstract base classes, etc.

I get that they're useful for taming a certain amount and kind of complexity, but it's not clear to me that it's always going to be apparent at the beginning that it's going to need taming in that specific way. I've found that starting with the concrete cases, I only sometimes have to go up a level of abstraction and indirection. Conversely, I've built the wrong abstraction many times by starting too high.

I don't read or write Java so I'm sure I'm missing a lot of context. That's what I'm looking for. Uncle Bob's an eloquent speaker and his talks make a lot of sense, but I have trouble reconciling that with the code samples I see.


I had been quite skeptical of Clean Architecture when I first came across it. I don't find Uncle Bob's post on it particularly insightful; for me it's not vocational enough.

Then a few years ago I had to maintain a software stack written by contractors from Pivotal (https://pivotal.io/) in a Clean Architecture style - it was truly a revelation for me; akin to that "aha" moment of fully grokking homoiconicity in LISPs.

Now, up until this point, all code I'd encountered at Twitch heavily reflected the domain of the problem it was solving and the technology in which it was written. That is, if you were looking at a certain piece of architecture you had to really fully understand not only exactly the intent of that code base but also the details of the framework in which it was written. As codebases increased in number and size, it became much harder to scale as an engineer. Jumping from a Rails monolith, to a highly concurrent Golang HTTP CRUD-API, to a highly asynchronous Twisted-Python request routing system (broadly the main three backend techs) had a very large cognitive load. The eng org cleaved along these lines and maintaining velocity in that world was very hard; attempts to introduce new tech or join chunks of the org took on a "religious" tone.

So initially coming across this clean architecture stack felt very similar to that. It had a lot of "weird new things" in it, but once I understood that much of it was routing (tagging inbound requests in a manner that a deeper layer could understand the intent of the request and pass it to the correct interactor, which would then work on the appropriate entities) it suddenly became incredibly easy to hop around the code base and update the important aspects of it.

I asked the authors who had been contracted to build this system where they got their inspiration and they cited many lunchtime discussions and pair programming sessions influenced heavily by Uncle Bob's clean architecture.

I would have really enjoyed seeing more systems built like this because, to the maintenance programmer, it was very clear where things had to go. However only encountering one Clean Architected system didn't really give me a solid idea of how well it would scale across various domains.


That codebase sounds interesting; is there a chance some parts of it were open sourced?


Unfortunately no :(


Oh wow. I can't be sure, but I'm pretty sure I'm the engineer at Pivotal that you paired with. I definitely remember you making reference to homoiconicity in lisp when we were pairing the day that Clean Architecture really clicked in your head. It was a really cool moment.

There's a lot of engineers at Pivotal, and we have a lot of projects, so maybe I can prove I worked on that codebase by referencing obscure details from it? I remember writing a LOT OF TESTS that used quotes from Wutang Clan and 36 Chambers as strings for names of entities, and a lot of references to 90s hip hop. Was this the same project?

Also, I took some time after our project (and several others) to start a small example of applying Clean Architecture in a toy Go Project, which is open sourced. Maybe check it out?

http://github.com/tjarratt/go-best-practices


Yes, it was you! Nice to reconnect :)


Surprising this has been written in 2017. It looks like java written a decade ago.

This can be a kind of organizational solution much like microservices to separate concerns. This is important when collaborating with a large amount of people. Clear boundaries and all that.

You pay for those boundaries with a convoluted mess of classes that describe the design pattern rather than the business logic. You end up with AbstractRequestInterfaceFactory type stuff.

Trade offs. Decide when they're worth it.


> Surprising this has been written in 2017. It looks like java written a decade ago.

Clean Architecture™ has been around for a while now; I wouldn't be surprised if it was more than a decade already. This post is a guide to an existing concept, not a presentation of something new.

EDIT: This is Uncle Bob, 5 years ago: https://8thlight.com/blog/uncle-bob/2012/08/13/the-clean-arc.... So maybe it's just 5 years old - but then there were similar things (like Hexagonal Architecture) before.


"You pay for those boundaries with a convoluted mess of classes that describe the design pattern rather than the business logic. You end up with AbstractRequestInterfaceFactory type stuff."

I disagree. If a design (call it an architecture) is clean and coherent, it's obvious.

Because obviously a client must send the order to a server, which is protected by a gateway. And of course that gateway validates the client request and the data coming in. And of course the server must have business rules to validate and process the order. And clearly the business components must call a data access component to persist the orders to a database. Simple, no?


> Simple, no?

what happens when the business rule clashes with the database's constraint implementation, because different people misinterpreted it, or due to incompetence or miscommunication?

I think splitting things up sounds great in theory but has lots of pitfalls in practice. That's not to say it's not a good idea, but one mustn't look at it with rose-tinted glasses.


Well, you'd test, which would catch that scenario. Assuming you're not testing and you discover it after releasing to production, you'd just use your change process.

Either way, this is just part of building a system. Refactor, re-test and release (or re-release). It's hardly going to be the only bug you find.


Clean architecture is just repackaged DDD. I sympathize with what Bob is trying to do though -- for some reason this stuff often doesn't "click" with many developers until they see it and then it seems "obvious."

I think it's hard to really explain why software should be built like this. This is why architects and senior devs often end up ruling by diktat and simply laying down rules like "messages only in the ACL, no messaging in the domain." The problems that Clean Architecture and DDD are trying to solve are architectural and ultimately organizational, and these problems are not clear to most developers who are just given a story and told to implement some new feature. The only hope is to make these architectural problems everybody's problem. Spread the pain.

Microservices btw are no escape hatch. Microservices that don't get the essential system boundaries right are in fact going to become a horrible mess. If anything microservices make it more important to think carefully about the boundaries and the interfaces of the system.


>Surprising this has been written in 2017. It looks like java written a decade ago.

Because good architecture has somehow evolved?

>This can be a kind of organizational solution much like microservices to separate concerns.

This is orthogonal to microservices. And microservices ain't but one model of creating applications, not "THE 2017 model".


> Because good architecture has somehow evolved?

Of course. Every year gives us greater insight into what approaches work and don't work. Which is why we have seen Java technologies like EJB, JCA, JTA, OSGI, JSP etc generally fall by the wayside in favour of less convoluted and less architecturally heavy approaches.

> And microservices aint but one model of creating applications, not "THE 2017 model".

Microservices are unquestionably the de facto standard for most larger software development projects, especially since they are intrinsically linked to the containerised deployment approach, which is a modern-day concept.


> Microservices are unquestionably the defacto standard

There's a wide world of software out there that you have yet to explore.


>Microservices are unquestionably the defacto standard for most larger software development projects.

In the HN echo chamber maybe -- but not sure even there.


I am very interested. What makes you think that microservices are the current standard? Is there a peer-reviewed study?


I agreed with the first part of your answer a lot more than the second part.


The 'AbstractRequestInterfaceFactory' is something that derives from Java's limitations, not from clean architecture. Clean architecture in Ruby, for example, is very elegant and does not have unnecessary boilerplate abstraction methods.


More specifically, I'd say I only see those kinds of classes together with Spring.

Can't remember ever seeing them in plain or modern Java EE code.


I've been using a very similar pattern, Ports and Adapters (http://blog.ploeh.dk/2016/03/18/functional-architecture-is-p...), since moving to F#, paired with Railway Oriented Programming (https://fsharpforfunandprofit.com/rop/) for the 'Adapter' level, and using DDD for modelling the 'Entities' and 'Use Case' layers. It's been working well for me so far.

Scott Wlaschin has recently released an ebook on the topic (https://pragprog.com/book/swdddf/domain-modeling-made-functi...) which goes into more detail.


These are all architecture ideas I've been interested in, I'll have to check out the book.

Overall architecture in development is still early on in maturity.


You know someone is certified when you have to scroll code sideways because the names don't fit the screen. I thought we already agreed that process is the opposite of progress? I mean come on, RequestCourseRegistrationInteractor, who are you trying to impress here? These processes were designed to turn humans into machines; to decrease the dependence on creativity and skill at the cost of additional effort and complexity; to enable large groups of unmotivated developers to deliver mediocre software reliably; which makes them sub-optimal for any other use case.


I think you are opposing it for the wrong reason.

There should not be a lot of room for creativity when implementing specific business rules. The goal should be clarity and readability.

These convoluted patterns obscure the logic and confuse the reader with their pointless abstractions.


I think you are confused. The whole point of software is to create something that didn't exist before, building the same software over and over again doesn't make any sense. I'm not talking about creativity in interpreting business rules, I'm talking about creativity in building software.


Who have you agreed with? Software has to be predictable and easy to maintain. It is great if you can predict what a class does based on its name. It is also great if you know what to search for based on the name of a pattern. Software is there not to express the creativity of a given programmer, but rather to meet the requirements of the technical task.


We are all here to express our creativity, that's priority #1 and the only reason we ever went anywhere but in circles. I don't mind descriptive names; these names are not descriptive, these names are part of the process. This is how you really do it: 1) start from the problem you are trying to solve, 2) solve the actual problem in the easiest way possible to get experience, 3) improve the solution until it says exactly what you mean. Bottom up, not top down; skills and creativity, not rigid rules; that's how you build great software.


Most of the programming problems have been solved before. Some of the solutions turned out to be solid programming patterns. Before you write a single line of code, you check if there is a "standard" way of solving the given problem. That way the code is more maintainable and more people will be able to work with your code.


So we've been told. Part of my point here is that code like this is the opposite of more maintainable. Writing code for clueless people to read and understand is a losing game for everyone; that energy is better used to cultivate skills collectively. This is about not having to differentiate between individuals, about being able to treat a bunch of coders as a code factory; which is very convenient but neither efficient nor humane.


IMHO, the most important things in architecture are the concerns and goals of the architecture, not the particular architecture itself. A single architecture can't check all the boxes, or you end up with a monster. The job of an architect is to identify the main risks and pain points of a particular system, and adapt the structure of the project to address them best.

"Enforcing separation of concerns" is a good one. But so would be "identify the various runtime threads of your system easily", or "minimize code surface responsible for mutation of shared data", or "decouple configuration primitives from the components themselves".

Those concerns are more or less important depending on your use case. MMORPG server code, a regular desktop app, or mobile webapps sold as templates all have very, very different concerns, so I don't think it would make sense to use a single architecture as a template for every problem.


It's interesting to note that a layered dependency architecture is nothing new. As a matter of fact, it was one of the fundamental design decisions of the 1968 THE Operating System [1], on which Edsger Dijkstra had been programming and designing extensively.

[1] https://en.wikipedia.org/wiki/THE_multiprogramming_system


Oh god no. DON'T POLLUTE THE CURRENCY.

This is a massive fuck up and it won't scale or be modular. The idea is fine, but you need to keep the entities/currency in a smaller core, with your DAL and relationship objects above that layer. The only functions your currency should have are accessors or read-only convenience calculations. Other operations should be handled by another parent, because let me tell you, that Course collection is going to end up being null or empty A LOT when it "shouldn't" be.

You'll want your use cases to return shallow object graphs (e.g. GetCourses => courses.Enrolled => StudentName, StudentId) from the data layer/services, and then if you want to look up a student's details you make that another call. The alternative is to use some sort of "scope" object that defines the depth of graph you want when accessing a general API. The thing you're looking to avoid, though, is returning more information than is necessary.

Also, who taught this guy to model? Students are in courses; in this model the courses are inside students, and that's silly.


For shallow vs deep, what I decided for my code base was that I'd pick an appropriate depth as a general rule for each object and not worry about I/O micromanagement.

If it was cheap to pull the additional data and it made sense to expect that data when using the object, I'd simply embed the representation. If it was expensive or not commonly used, I'd include a reference to the data (usually the ID(s) of the data).

Then in the repository, I'd just get everything necessary to return a full object as designated.

An onion architecture gets funky if you don't know if the object you are working with has all its data, especially because the object itself shouldn't know how to fetch more data about itself.

In this particular case, the two objects probably shouldn't have ANY domain connection, and there should just be a method on the course repository to "GetCoursesForStudent" that accepts a student ID. It's perfectly ok to model relationships in the database that don't have parallel relationships in the domain objects.

Or, one could create a "SemesterEnrollment" object that contains a student object and a collection of courses. Which would probably make more sense, as that's the object that should be referenced to generate bills, report cards, etc.
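
Something like this, say (hypothetical C#; Student and Course stand in for whatever the real domain objects are):

    using System;
    using System.Collections.Generic;

    public class Student { public Guid Id { get; init; } }
    public class Course { public Guid Id { get; init; } }

    public interface ICourseRepository
    {
        // The student-course relationship lives in the database,
        // not as a navigation property on either domain object.
        IReadOnlyList<Course> GetCoursesForStudent(Guid studentId);
    }

    // Alternatively, a first-class object for the relationship itself:
    public class SemesterEnrollment
    {
        public Student Student { get; init; }
        public IReadOnlyList<Course> Courses { get; init; }
        // Bills, report cards, etc. would be generated from here.
    }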


I was studying this topic yesterday; it's the same idea under another name: functional core, imperative shell.

Came across this collection of articles and talks, for anyone interested: https://gist.github.com/kbilsted/abdc017858cad68c3e7926b0364...


As a teenager I would sit in the offices of grey haired old men at the software company I had no business working for after dropping out of high school. These men would hand me a photocopy of an OOPSLA paper, or something from a journal. I was told to read it, then come back and discuss. This was my intro to many areas of software architecture.

I became familiar with the names of people like Booch, Rumbaugh, Jacobson and others. Over the years I learned various object oriented programming languages and patterns. Each new thing was like discovering a horcrux, at first magical and powerful, but ultimately evil. Later I began to learn functional programming and that's what I try to use most these days but it too has its promises and lies. I have built and helped others reason about many many complex systems and all I can say is this:

The only system that is well ordered internally is the one that accreted complexity in increments, and was continually refactored along the way. Best practices be damned.

Those systems might not look how you'd design them were you Uncle Bob. They work, can be understood, and can be modified without much consternation. We all can recite examples of masterpiece turned morass. Conversely some of us can recite an example of a frog, a weird complex beast but ultimately very well adapted to its environment and quite resilient. Frogs are not beautiful but great at eating bugs.

One fact of complex systems is: humans cannot know the "right" design until AFTER they have arrived at it.

If a human can know the design a priori then the problem is NOT COMPLEX and if it is not complex then it is not modern software.

This is humbling knowledge for someone who wants to believe there can be a language and pattern of order for all systems. One that can be expressed in anything less precise than code itself.

Given a fanciful machine that could assemble subatomic particles in any fashion the user desires, none of us, having never seen one, could design a frog.


I've been using a very similar architecture on an application for about 2 years now, and it's been incredibly smooth to work with, especially in Go (implicitly satisfied interfaces and an enforced lack of inheritance), though I do feel like the article overcomplicates the design (also recommend looking up onion architecture).

Basically, for me, it boils down to the following:

1. Create domain package(s) with the basic business entities (users, vendors, products, etc.) - can't import anything.

2. Create usecases package(s) that act on these objects and can import anything in the domain package, but nothing else (though they can accept repository store interfaces).

3. Create infrastructure packages for talking to databases, caches, etc. - can't import from the rest of the application.

4. Build repository stores to translate between repositories and domain objects (can import domain and usecases, and accept interfaces for infrastructure).

5. Add controllers/main packages that build repository dependencies with infrastructure config, and pass those interfaces into usecases that do the necessary work and spit back results (imports everything).
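
The same dependency direction, sketched in the article's C# rather than Go (a rough sketch; all names are hypothetical):

    // Domain: imports nothing from the rest of the application.
    public class Product
    {
        public int Id { get; set; }
        public decimal Price { get; set; }
    }

    // Usecases: sees only domain types plus store interfaces it declares itself.
    public interface IProductStore
    {
        Product GetById(int id);
    }

    public class PriceQuoter
    {
        private readonly IProductStore _store;
        public PriceQuoter(IProductStore store) { _store = store; }
        public decimal Quote(int productId) => _store.GetById(productId).Price;
    }

    // Infrastructure/repository: implements the interface; main wires it in,
    // so the usecase never touches the database directly.
    public class SqlProductStore : IProductStore
    {
        public Product GetById(int id)
        {
            // SQL/cache plumbing would live here.
            return new Product { Id = id, Price = 0m };
        }
    }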

The end result is a bit topsy turvy to get used to at first, but then it's a dream. Extremely easy to test (very little dependency coupling in each package), and the most complex parts of the application logic are totally isolated from the complex parts of the infrastructure, so you end up with less cognitive load when dealing with either.

After my experiences so far, I can't see ever switching back to "top down" architecture instead of "inner out" when given a choice.


Also relevant to this discussion is 'Domain Driven Design and Onion Architecture in Scala' by Wade Waldron from Scala Days 2016

https://www.youtube.com/watch?v=MnNeDXg3Qao


The entire time I read this I kept thinking of the fundamental theorem of software engineering ("we can solve any problem by introducing an extra level of indirection") and its corollary ("...except for the problem of too many levels of indirection"). Also, treating this as an all-solving hammer carries the risk, as with everything, of premature optimization, methinks.


It's so refreshing to see vibrant discussion of how a domain layer should be implemented, when I have practically been run off from teams for suggesting that a domain layer pattern, any domain layer pattern, should be used.

I've seen enormously complex domains implemented as DTOs, with the same logic acted out in duplicate in ORM queries, in PDF rendering, in CSV exporting, and then again in CSV importing.


Which language is he using? Is that C#?

It looks like this only deals with in-memory data structures. How does stuff get read from / written to the DB?


Actually, very similar, since Entity Framework is usually used. Sources implement the IQueryable interface, which lets you write queries with method chaining or LINQ. To be honest, it kinda looks better than it is. In reality I find it produces sub-optimal performance, and a lot of things cannot be done without spilling query logic into the app. Not to mention a lot of LINQ/method-chaining queries cannot be translated to SQL (by the database provider).
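
For illustration, a sketch of the style (assuming EF Core and the article's Student type; whether the provider can translate a given chain is exactly the pain point):

    using System.Collections.Generic;
    using System.Linq;
    using Microsoft.EntityFrameworkCore;

    public class SchoolContext : DbContext
    {
        public DbSet<Student> Students { get; set; }
    }

    public static class StudentQueries
    {
        // The provider tries to turn this chain into SQL; shapes it can't
        // translate push evaluation (and query logic) back into the app.
        public static List<Student> HeavilyLoaded(SchoolContext db) =>
            db.Students
              .Where(s => s.RegisteredCourses.Count >= 10)
              .ToList();
    }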


You can persist this model using anything you want.

You would have some kind of adapter/library that handles the persistence.

It would take these models and convert them into your datastore's representation.


How would that work in practice? Would this line stay the same:

    RegisteredCourses.Add(course);
And magically trigger an insert in the database?


The repository pattern in the original example would allow you to serialise your objects to whatever datastore you want. His object design isn't that great for being persistence-ignorant, though. I would instead do something like this.

    Course - Aggregate Root

    Student - Aggregate Root
        RegisteredCourse[] RegisteredCourses - An array of objects, each with the date/time of registration and an ID reference to the course.
The method for adding a course would simply create a new RegisteredCourse and place it in the registered courses array.

_studentRepository.save(student) would then be a simple matter of converting this in-memory representation into SQL.

I'm not a big fan of lazy loading, or of direct references to other aggregates inside aggregates. That makes mapping these models without an ORM difficult.
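
Roughly, in code (a sketch; everything beyond Student is a hypothetical name):

    using System;
    using System.Collections.Generic;

    // Sub-entity: an ID reference to the course plus registration
    // metadata -- never the Course aggregate itself.
    public class RegisteredCourse
    {
        public Guid CourseId { get; }
        public DateTime RegisteredAt { get; }

        public RegisteredCourse(Guid courseId, DateTime registeredAt)
        {
            CourseId = courseId;
            RegisteredAt = registeredAt;
        }
    }

    public class Student
    {
        private readonly List<RegisteredCourse> _registeredCourses = new List<RegisteredCourse>();

        public IReadOnlyList<RegisteredCourse> RegisteredCourses => _registeredCourses;

        public void RegisterForCourse(Guid courseId)
        {
            // Business rules (e.g. a per-semester limit) belong here,
            // behind the aggregate root, not on a public collection.
            _registeredCourses.Add(new RegisteredCourse(courseId, DateTime.UtcNow));
        }
    }

The repository then only has to walk RegisteredCourses and emit plain inserts; no ORM lazy-loading machinery is required.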


Can you show, in real code, what you mean? I can't grok it from the text alone.


If you see an architecture that has a box or a ring with the word "controller" in it - the design is wrong.


It's a shameless plug, but I just have to throw in http://manuel.kiessling.net/2012/09/28/applying-the-clean-ar...


Every new system starts out fairly clean, at least in certain dimensions. Then the real world intervenes.


At what point should basic validation be completed?


There is no easy answer here. Where and when validation happens requires some careful analysis.

I find it helps a lot to think carefully about (1) message validation -- is this a valid message? (2) entity validation -- is this entity in a valid state? and (3) enterprise validation -- is the business system as a whole now in a valid state? These three questions map onto what is often an ACL, domain, and domain-services layer. At the end of the day there are going to be corner cases, though, in which case, as a rule of thumb, there's something to be said for doing validation at the edges and moving it in closer to the center as needed. In my experience, when it's not clear where a validation belongs, that often means nobody really understands that validation (or the validation is even wrong and is not what the business wants -- it happens!), so there's something to be said for keeping it out of the core domain.
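
To make the three questions concrete, a hypothetical sketch (all names invented):

    using System;

    public class RegisterForCourseCommand
    {
        public Guid StudentId { get; set; }
        public Guid CourseId { get; set; }
    }

    public static class Checks
    {
        // (1) Message validation: is this a well-formed request at all?
        public static bool ValidMessage(RegisterForCourseCommand cmd) =>
            cmd != null && cmd.StudentId != Guid.Empty && cmd.CourseId != Guid.Empty;

        // (2) Entity validation: is this one aggregate in a valid state?
        public static bool ValidEntity(Student student) =>
            student.RegisteredCourses.Count <= 10;

        // (3) Enterprise validation: does the change leave the wider system
        // consistent? This spans aggregates, so it usually lives in a
        // domain service rather than on any one entity.
        public static bool ValidEnterprise(int seatsTaken, int capacity) =>
            seatsTaken < capacity;
    }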


Superficial validation on things like form fields still occurs up in the UI layer before a Use Case is invoked. Additional validation is likely to happen in the Use Case as well. Validation is a ubiquitous thing, so just like in any other app it should happen at a few different points, but the stuff validated in the Use Case is going to be related to the business rules it's trying to execute.


My approach has always been to validate anything that crosses a trust boundary. This might be user->app; client->gateway; or app->database.


>My approach has always been to validate anything that crosses a trust boundary.

That's a good approach, unless a validation requires a DB call with an inner join (for example). It becomes too costly to do that five times (say) just to get something written into the DB.


It remains a good approach, even in your scenario - because that perf problem is solved using a temporal cache.
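
A hand-rolled sketch of such a cache, kept self-contained (names hypothetical):

    using System;
    using System.Collections.Concurrent;

    // Remembers each lookup's result for a fixed window so the expensive
    // joined query isn't re-run on every boundary crossing.
    public class TemporalCache<TKey, TValue>
    {
        private readonly TimeSpan _ttl;
        private readonly ConcurrentDictionary<TKey, (DateTime At, TValue Value)> _entries =
            new ConcurrentDictionary<TKey, (DateTime At, TValue Value)>();

        public TemporalCache(TimeSpan ttl)
        {
            _ttl = ttl;
        }

        public TValue GetOrAdd(TKey key, Func<TKey, TValue> load)
        {
            if (_entries.TryGetValue(key, out var entry) &&
                DateTime.UtcNow - entry.At < _ttl)
                return entry.Value;

            var value = load(key); // the costly DB call happens here
            _entries[key] = (DateTime.UtcNow, value);
            return value;
        }
    }

Usage would be along the lines of new TemporalCache<Guid, bool>(TimeSpan.FromSeconds(30)) wrapped around the join, with the obvious trade-off that results can now be up to 30 seconds stale.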


Caches are an excellent way to introduce path-dependent, time-sensitive bugs.


The safest, most bug-free code is no code at all.


I see it now...

An interface to define the validation aspect. Three separate implementations of said interface: one for DB lookup, one for cache lookup, and the last for unit tests. One factory method which returns the instance to use depending on context. Another temporal cache that holds the instance of said interface (after all, you can never have enough caches).

And all of a sudden, validating one simple thing takes 300+ lines of code (plus tests). Welcome to enterprise, clean, better-designed, greatly architected software, everyone....


Why do an interface at all? Is the validation definitely being shared across business functions?


This is a great/simple rule to keep in mind.


Hmm, how do you define "superficial validation"? Form-field optionality, cross-field consistency, range acceptance, presence of SQL escape characters... they can all be considered superficial, but also core business rules, IMHO.


"presence of sql escape characters" - isn't a business rule. That's technical.

Range acceptance - could defiantly be a business rule. Cross-field consistency - Could also be a business rule.

I implement these rules both sides. Ideally i don't want an invalid command sent off in the first place. I also don't want badly implemented UI corrupting my business data.

This way i can provide instant feedback to the user, and also protect the data in case the UI is badly implemented.

The main point of these architectures is that you can use them from many user interfaces.


"I implement these rules both sides" ... and the maintenance overhead?


Still easier than having separate domain models.

Adhering to DRY in all cases is not always the best solution, as the software community has discovered over the past few years, especially in relation to microservices.

In my software there's a bit of a disconnect between the application and the interactors, as there is a message queue between them.


This is stuff like a JavaScript regex that checks for an @ character in an email, with the same check in the back-end code. There is no reason not to do both.
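
For instance, the back-end twin of that regex, sketched:

    public static class EmailChecks
    {
        // Deliberately shallow, mirroring the UI's JavaScript check:
        // a sanity test, not full RFC 5322 parsing.
        public static bool LooksLikeEmail(string email) =>
            !string.IsNullOrWhiteSpace(email) && email.Contains("@");
    }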


>they can all be considered superficial, but also core business rule IMHO

"Business rules" doesn't mean whatever the business (the bosses) dictates the app should have.

Business rules are the rules of the business domain.

Whether you should escape SQL (or even use SQL) is a technical/application concern, not a business rule.

A business rule would be something like "customers must get a 10% discount if they buy more than 100 units" or "no employee may have a salary bigger than their manager's", etc.
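
The latter rule, as a guard inside the domain (a hypothetical sketch):

    using System;

    public class Employee
    {
        public Employee Manager { get; set; }
        public decimal Salary { get; private set; }

        public void SetSalary(decimal salary)
        {
            // A pure business rule: nothing here knows or cares about SQL.
            if (Manager != null && salary > Manager.Salary)
                throw new InvalidOperationException(
                    "An employee's salary cannot exceed their manager's.");
            Salary = salary;
        }
    }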


Field-level validation (is this email in a valid format?), i.e. checking that input is sane, is done at the UI level.

Business-rule validation (e.g. a student can't be assigned more than 10 courses) is done on the entities themselves.

You put it on both sides to get the best of both: instant feedback for the user, and last-resort protection on the entities against a badly implemented interface.


> Field level validation(Is this email in a valid format) checking that input is sane is done at the UI level.

In two places in UI: as close to the user as possible (which improves ergonomics) and at the system's border (which prevents entering invalid data to the system at all). While the former is somewhat optional, the latter is absolutely necessary and cannot be left to client-side JavaScript.


looks like onion architecture, no?


Basically "Uncle Bob" took the onion architecture, changed it slightly, and took ownership of it.

But yeah, regardless of the lame appropriation, a solid concept.



