Analysis of Software Architectures (firatatagun.com)
111 points by fatagun on Jan 9, 2016 | 37 comments



This analysis brings a personal frustration to the fore, and I imagine it's not unique to me.

This article discusses some architecture techniques, but moreover, it buys into a Culture of Architecture that we don't have a word for yet. This CoA is predominant among Java engineers who worked at large, slowly-dying companies (e.g. IBM).

The issue with the CoA is that it cares more about buzzwords than quantifiable claims or objective measures. CoA is currently addicted to microservices; when that model fails, it'll be "all about" another one, and so on. It's rife with good-sounding academic CoA words that nobody can argue with, like "SOLID" and such.

To be a little more concrete, "Broker topology" is a 25-cent word for a 1-cent idea (and it's not the only one; see every keyword in bold on the page, it's almost like a college textbook). There seems to be an implicit assumption in CoA that there is a finite number of architecture patterns and that they must be learned by name.

Again, to be more concrete, "layered" architecture is an oversimplification of a fundamental idea (abstraction) in engineering.

This textbook-esque presentation also really misses the spirit of coding. It presents a UML diagram (another CoA must) of "Event-Driven" architecture, then one of "Layered" architecture, without really explaining whether these can be combined or are mutually exclusive (of course they can be combined). By treating them as named nouns it creates a misleading and complex language around fundamental ideas.


I violently agree, and I suspect most of HN does.

There's an associated danger though: I find that many startup hackers actively dismiss good ideas (and even entire programming languages) simply because they smell a bit of CoA.

Even the CoA crowd occasionally comes up with truly good ideas that have way more substance to them than the meaningless patterns mentioned in this article.

As an example: My personal pet peeve is the startup world's ignorance of Domain-Driven Design and CQRS (Command-Query Responsibility Segregation). If you want to make scalable backends, I really recommend you read up on CQRS, it's practical and it fits modern ideas. Most examples are in C#, but you'll survive.

For frontenders, CQRS is basically Flux on the backend, kinda sorta. It ought to be super hip right now but it isn't, and I suspect that's only because it comes from C#-o-world and has a ridiculous unpronounceable acronym.


At work I've been implementing some rough CQRS/DDD stuff for "the 2.0 system", and I'd summarize them as:

CQRS: Writes/commands and Reads/queries are fundamentally asymmetrical. Accept it and take advantage of it by having separate pipelines with different features.

DDD: Collaborate with the business-folk to discover their mental-model, and treat it and the code-model as similar co-evolving pieces. Focus on behavior and rules more than pigeonholes for data. Use Repositories to handle how RAM is not infinite, and Services for stuff you truly can't fit into meaningful objects.
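To make that a bit more concrete, here's a minimal sketch of what the two halves can look like (TypeScript-ish; every name here is made up for illustration, not our actual code):

    // Hypothetical domain object: behaviour and rules, not just a bag of data.
    class Customer {
      constructor(public readonly id: string, private name: string) {}
      rename(newName: string): void {
        if (!newName.trim()) throw new Error('name must not be empty');
        this.name = newName;
      }
    }

    // DDD-style repository: hides persistence ("RAM is not infinite").
    interface CustomerRepository {
      findById(id: string): Promise<Customer>;
      save(customer: Customer): Promise<void>;
    }

    // Write pipeline: a command handler mutates state through the domain model.
    class RenameCustomerHandler {
      constructor(private repo: CustomerRepository) {}
      async handle(cmd: { customerId: string; newName: string }): Promise<void> {
        const customer = await this.repo.findById(cmd.customerId);
        customer.rename(cmd.newName);
        await this.repo.save(customer);
      }
    }

    // Read pipeline: queries skip the domain model and hit a read-optimized view.
    interface ReadDb { query(sql: string, params: unknown[]): Promise<any[]>; }
    class CustomerSearch {
      constructor(private db: ReadDb) {}
      byNamePrefix(prefix: string): Promise<any[]> {
        return this.db.query(
          'SELECT id, name FROM customer_view WHERE name LIKE ?', [prefix + '%']);
      }
    }

The write side gets the invariants and validation; the read side gets whatever shape is cheapest to query.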


> As an example: My personal pet peeve is the startup world's ignorance of Domain-Driven Design and CQRS (Command-Query Responsibility Segregation). If you want to make scalable backends, I really recommend you read up on CQRS, it's practical and it fits modern ideas. Most examples are in C#, but you'll survive.

Most startups never become successful enough to have to worry about scalability; in fact, if they do, it's almost a luxury.

I don't have experience with CQRS and event sourcing, but it does seem to come with its own share of problems, for example dealing with changes to the data model, which I suspect would happen often in young businesses.


It's very easy to do CQRS without Event-Sourcing. ES is really a separate kind of architecture-piece that's concerned with how you mutate your persisted data.

For example, at work I've got a CQRS system where the MySQL database looks pretty much like you'd expect from any other system. (Customers table, one row per customer, etc.) It's still CQRS because writes and reads go through different paths, and writes can have side effects that proactively build readable data. For example, "Top 100 Happiest Customers" might be its own table, managed by application code, rather than a view or a query.
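Roughly like this, to sketch the write path (the table and class names are invented for illustration):

    interface Db { query(sql: string, params?: unknown[]): Promise<any[]>; }

    // Hypothetical command handler: does the normal relational write, then,
    // as a side effect, rebuilds the denormalized table the homepage reads.
    class RecordSatisfactionSurvey {
      constructor(private db: Db) {}

      async handle(cmd: { customerId: string; score: number }): Promise<void> {
        await this.db.query(
          'INSERT INTO survey_results (customer_id, score) VALUES (?, ?)',
          [cmd.customerId, cmd.score]);

        // Keep "Top 100 Happiest Customers" up to date at write time,
        // so the read path never has to aggregate anything.
        await this.db.query('DELETE FROM happiest_customers');
        await this.db.query(
          `INSERT INTO happiest_customers (customer_id, avg_score)
           SELECT customer_id, AVG(score) FROM survey_results
           GROUP BY customer_id ORDER BY AVG(score) DESC LIMIT 100`);
      }
    }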


If you do it that way, you don't get the famed scalability; it's still limited by the scalability of the MySQL backend.


Not sure exactly what capability you're referring to with the "famed" scalability, but my point is that CQRS in no way requires Event Sourcing, and offers its own set of benefits. (It's also a helluva lot easier to do, or un-do.)

In particular, CQRS helps you make a sane architecture for "read models", where certain features (say, a homepage showing a hard-to-calculate leaderboard) are simple queries against a data-source designed specifically for that feature. The backing data-source is kept up to date by application code as a side-effect of your "real work".

This means you can substantially reduce the runtime load on the database as well as reducing how much application-logic "leaks across" in triggers/procedures/views. Since your database is often the "bottleneck of last resort", this means better scalability.
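So the feature's read path reduces to something like this (continuing the made-up leaderboard example):

    interface Db { query(sql: string, params?: unknown[]): Promise<any[]>; }

    // The homepage just reads the pre-built table: no joins, no aggregation,
    // and no business logic leaking into views or stored procedures.
    function topOfLeaderboard(db: Db): Promise<any[]> {
      return db.query(
        'SELECT player_id, score FROM leaderboard ORDER BY score DESC LIMIT 100');
    }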


The codebase is, however, very well structured to make that change later on.

You don't need to do a lot of CQRS to make future scalability a relatively simple redesign.


This frustrates me as well.

It's not just limited to the Culture of Architecture; a bunch of concepts in software have similar issues (MVC vs MVVM, REST, etc.). Unlike in a bunch of other disciplines, there is a distinct lack of thought about what should be applied to each problem. Electronics textbooks generally mention that a circuit is best for a specific set of applications and talk about some of the limitations of applying it (e.g. very accurate voltage regulation, but it does not tolerate fluctuations in temperature).


Huh, I didn't think about it until now but I'll anecdotally confirm your EE vs SWE observation. Then again, maybe I'm just far enough from the center of the EE scene that I don't get hit by as much dogma. Hard to tell.


OK, help me understand:

Is there any difference between CoA and design patterns? Sure, "Broker topology" is probably trivial, but so are Chain of Responsibility and Decorator and everything else the Gang of Four touches on.

So, are you just as opposed to design patterns as CoA?


I want to be clear on this, I am not at all opposed to design patterns. I am not opposed to architecture. The part of CoA that bothers me is the C (culture).

That is to say: there are many great problems to solve and some reusable strategies that can be employed to solve them. However, in my experience (such as this article), many people who talk most loudly/frequently about architecture are parroting this complex and unquestioned set of assumptions and behaviors (read: culture) in a counterproductive way. Again, those behaviors include: reinventing complicated words for simple ideas (and using them in front of audiences without defining them), creating UML diagrams even for audiences they don't expect to know UML, and focusing on adding a lot of new named "patterns" to their mental book of patterns.

For example, since somebody brought it up, take CQRS. Firstly, the way Fowler explains it and the way Wikipedia explains it are vastly different, if not contradictory (indicating an issue). Secondly, it can be explained (the Wikipedia version) as [to quote Wikipedia] "methods should return a value only if they are referentially transparent and hence possess no side effects." This has so many exceptions I wonder why it justifies its own acronym.

And perhaps the worst part of this culture is that its members then feel the need to explicitly use all of these patterns in the most verbose ways. The more obscure and complex the pattern, the better they feel, like a 15-year-old who just discovered a thesaurus. Pretty soon they have 3 classes named things like MessagingLoopInterpreterInterface to achieve very basic things.


I feel the same way much of the time.

However, I think much of the value of having design patterns and fancy vocabulary for architecture boils down to having simple, short names for complicated, yet common, patterns and principles.

For example, if you're being onboarded into a team of twelve to develop an already somewhat large product, the phrase "it's an n-tier application, with many of our older models' data access layers using some kind of CQRS" is immediately meaningful and useful to you, in the same way referring to 1000+ line objects as "adapters" or "aggregates" is meaningful and useful to you.

Having this vocabulary enables communication of an intended purpose and responsibility for arbitrarily large swaths of code in a way that boilerplate documentation and tests can't.

(Naturally, if you're writing something from the ground-up, or you're one of a small handful of developers on the project, the value of this is somewhere between negative and negligible, which might explain the polarization of opinion on this topic...)


Unfortunately, the Wikipedia article [1] doesn't distinguish between CQRS (command-query responsibility segregation) and CQS (command-query separation). They are entirely different concepts. Compare Fowler's CQRS article [2] to his CQS article [3].

CQS is an old OO design principle suggesting that an object's behavior is easier to reason about if every method either changes the state of the object or interrogates its current state. That's not always possible, but it's a helpful habit.
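In code the habit looks like this (toy example):

    class BankAccount {
      private balance = 0;

      // Command: changes state, returns nothing.
      deposit(amount: number): void {
        if (amount <= 0) throw new Error('amount must be positive');
        this.balance += amount;
      }

      // Query: answers a question, changes nothing.
      getBalance(): number {
        return this.balance;
      }

      // A CQS violation would be a method like withdrawAndReturnNewBalance(),
      // which both mutates the account and reports on it in one call.
    }

(Stack.pop() is the classic "not always possible" case: it both removes and returns the top element.)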

CQRS is the idea that use cases involving searching and reporting across many entities are fundamentally different from use cases involving interactions with particular entities. It is therefore sometimes worthwhile to develop separate data models for searching and reporting that are updated asynchronously.

[1] https://en.wikipedia.org/wiki/Command–query_separation

[2] http://martinfowler.com/bliki/CQRS.html

[3] http://martinfowler.com/bliki/CommandQuerySeparation.html


> it can be explained (the wikipedia version) as [to quote wikipedia] "methods should return a value only if they are referentially transparent and hence possess no side effects"

This simply isn't even accurate at all; that "it can be explained" as such is just plain false.


"Architecture astronauts"


This article makes the assumption that features are easier to develop if subsystems are broken up into separate components, with separate data sources and deployments.

But experience shows that this is only sometimes the case. Monolithic applications with a big SQL engine as the backend have the advantage that you can do joins very easily. If the data is split across three microservices, you spend much more time thinking about how to do these cross-service joins efficiently, narrowing down the amount of data you have to fetch from the other services, and so on.

People argue that cross-service joins should not be necessary, but that's also easier said than achieved; in reality, they tend to show up, no matter how carefully you partition the services.


I'd guess that it's not just joins that become more difficult in a partitioned system, but especially transactions. Maintaining transactional consistency across multiple services seems to me like a nightmare. Even writing tests that address consistency could be very difficult.


Often you don't worry about transactions across services. I know that someone will say we have to have them. In such cases, fine, worry about them. In everything else, either isolate the write side of a transaction in one service or don't require them.


"don't require them".... Transactionlity is principally dicated by the requirements to the architecture design, not the other way around.

" isolate to one service " Yep, a single service with a single database, as mentioned.

My point here, like the earlier one, is that there are a lot of situations where multiple services don't fit the domain. In fact, very many, given the extreme complexity around transactions.


And it's not a coincidence NoSQL systems often punt on joins and transactions in order to scale. Both are really hard to make both correct and efficient in a distributed environment.


Also, this article isn't as good as it could be: too much enthusiasm about microservices. Microservices make more sense in a bigger team than in a small one. A small team can't really work easily with a codebase that is split up too much (and by small I mean fewer than 10).

A startup should ALWAYS start with a dumb monolith


Does anyone find this a bit contrived? When you build a complex piece of code, there's always more than one way to draw a diagram of it. For instance, I've written trading systems.

- It's event based because there are publishers and subscribers.

- It's layered because there's a UI that draws stuff from the business layer, which takes data from a database

- It's plugin-based because it uses a DI container, which decides exactly which services to run on which machines.


As someone who worked briefly on "enterprise" software (in Java), this article brings back memories of preposterously complex and uncomfortably bureaucratic monster-systems that were a nightmare to work on. It also reminds me of http://www.joelonsoftware.com/articles/fog0000000018.html


Good for you. I keep working on more or less similar kinds of systems, where a 2-line change in a method turns into 20 code changes plus 5 config file changes.


Regarding microservices: they sound nice in theory, and certainly work for some cases. But what is the common pattern for handling a situation like this (in a performant and "scalable" way)?

Say you have two (completely separate micro-) services:

- Order service

- Customer service

Now when a customer buys something, it is stored in the order service. But how is the customer information referenced in the order service? Do you store just some "customer id" and let the clients of this service worry about what it means? Or do you also store some customer information in the order service (an in-memory cache or the service's own datastore, both of which then need to synchronize this data somehow)?

E.g. how about a situation where you want to display a sortable list of orders that includes columns for the customer's first and last names? How can you sort on those columns if the order service doesn't know anything about the customer other than some "identifier"? Especially if the list is paged, so you can't look up all the needed customers' information.

Just thought about this because this article (and many others) tout microservices as some kind of silver bullet.


Denormalization is often a practical answer; the order service keeps an (immutable) copy of the customer's name and delivery address.

If the customer changes the address, it will only matter for new orders.

One can argue that this is actually a sensible behavior, because there's a point after which you can't change the delivery address anyway (once it has been handed over to the delivery service, for example), and this behavior of copying the address just moves the point a bit.

Likewise, when a customer buys a product, a copy of the product description and price is included in the order; if the price changes between checkout and delivery, it's good if that isn't reflected in the amount that is invoiced. After all, the checkout established a contract that cannot easily be changed retroactively.
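In other words, the order the order service persists ends up looking roughly like this (field names invented for illustration):

    // The order keeps an immutable snapshot of everything it agreed to at
    // checkout; only customerId points back to the customer service.
    interface OrderLine {
      productId: string;
      description: string;   // copy of the product description at checkout
      unitPrice: number;     // the price the customer actually agreed to
      quantity: number;
    }

    interface Order {
      orderId: string;
      customerId: string;       // reference, for lookups and reconciliation
      customerName: string;     // copied at checkout, never updated afterwards
      deliveryAddress: string;  // copied at checkout, never updated afterwards
      lines: OrderLine[];
    }

It also happens to answer the sorting question above: a paged, name-sorted order list can be served by the order service alone, because the name is sitting right there in the order row.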


Right, I think this is an example of the importance of good requirements and scenarios before coding.

While programmers may prefer to create a normalized "model of everything at once", some business processes require copies and snapshots. Not as implementation details, but directly.


Good point. In this case at least it is actually preferable to copy the customer data over to the order.


The cross-service join problem is definitely a weak point of microservices. Generally you'd try to avoid designing service boundaries that you plan to do joins or transactions across. If you do need joins/transactions, though, it's not impossible to coordinate them across services.

Your reporting feature that wants to provide a sortable list of orders plus customer information, whether it lives in the order service, the customer service, or its own service, would have to collect order information including customer ids, then use those ids to collect customer information, join these together in a working area of some kind if they're large (an in-memory cache, as you've mentioned, is a decent candidate), and sort/filter that as needed. A cache plus an invalidation strategy appropriate to the application could definitely be needed for acceptable performance.
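A rough sketch of that kind of application-level join (the service clients and record shapes here are invented):

    interface OrderSummary { orderId: string; customerId: string; total: number; }
    interface CustomerInfo { id: string; firstName: string; lastName: string; }

    // Hypothetical clients for the two services.
    interface OrderService { listOrders(page: number): Promise<OrderSummary[]>; }
    interface CustomerService { getCustomers(ids: string[]): Promise<CustomerInfo[]>; }

    async function ordersWithCustomers(
        ordersSvc: OrderService, customersSvc: CustomerService, page: number) {
      const orders = await ordersSvc.listOrders(page);

      // Narrow the second fetch to only the customers this page actually needs.
      const ids = [...new Set(orders.map(o => o.customerId))];
      const byId = new Map(
        (await customersSvc.getCustomers(ids)).map(c => [c.id, c] as const));

      // The "join" happens in application code.
      return orders.map(o => ({ ...o, customer: byId.get(o.customerId) }));
    }

Sorting the whole result set on customer name across pages is the part this doesn't solve; that's where a denormalized copy or a dedicated reporting view earns its keep.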

Depending on the characteristics of the team (size, planning style, communication style, etc.), the cost of having to solve cases like these can be well worth paying in exchange for smaller projects that can be reviewed more easily, with clearer ownership and responsibility. Sometimes the tradeoff is a dramatic win, and those cases are where the microservices advocates come from.

Everything is trade-offs. With billions of humans, the worst design pattern/methodology/technology you can think of still probably has an appropriate application somewhere.


Agreed.

It's almost always the "user" object that cuts across services. I was wondering if there is some common pattern for handling this.

For the scenario I outlined here, some kind of "third service" / aggregated view of the data could also be a solution. I.e. store the data needed for queries in some queryable datastore (NoSQL, SQL); pull the info from the relevant services and refresh the view when needed.


You can do a cross-service join in a single request. Combine that with good caching and a normalized architecture is (almost) as performant as a denormalized one.

Check out https://github.com/facebook/dataloader

We use it to do cross service joins in the form of bulk requests in a very transparent way.
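For anyone who hasn't used it, the basic shape is roughly this (the customer-service endpoint and types are invented; the DataLoader API itself is real):

    import DataLoader from 'dataloader';

    interface Customer { id: string; firstName: string; lastName: string; }

    // Batch function: DataLoader collects all ids requested in one tick and
    // makes a single bulk request, returning results in the same order as the ids.
    const customerLoader = new DataLoader<string, Customer>(async (ids) => {
      const res = await fetch('http://customer-service/customers?ids=' + ids.join(','));
      const customers: Customer[] = await res.json();
      const byId = new Map(customers.map(c => [c.id, c] as const));
      return ids.map(id => byId.get(id) ?? new Error('customer ' + id + ' not found'));
    });

    // Call sites just .load() individual ids; the batching (and per-request
    // caching of repeated ids) is invisible to them.
    async function customerName(customerId: string): Promise<string> {
      const c = await customerLoader.load(customerId);
      return c.firstName + ' ' + c.lastName;
    }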


I think the article is not bad. It just mentions some patterns and compares them. Of course they can be composed.

While I agree an "architectural astronaut" approach to software engineering is wrong, we still have to understand the main underpinnings of software engineering: data, separation of concerns, composability, stateful vs. stateless.

Recognizing patterns in granular code as well as in system composition is important, and naming them makes communication easier. PubSub, Visitor, MVC, etc. are still useful names.

What I think is funny is that the Java implementations are often big compared to the same solution in functional languages.

Static OO languages often need "software patterns" such as wrapper objects that are nonexistent in functional languages, imho.


Related: "The Architecture of Open Source Applications" (http://aosabook.org/)

They've recently released new chapters of the upcoming fourth volume.


This analysis is a badly regurgitated copy of the free O'Reilly e-book about software architectures by Mark Richards located at http://www.oreilly.com/programming/free/software-architectur...

It's a lot worse than the original.


This website badly needs editing. There are a lot of bad sounding and awkward sentences.

"For several years now, many enterprises and companies employed this architecture in their projects and it almost became the de facto standard therefore it is widely known by most architects, developers and designers."

"Layers don’t have to know about what other layers do, in example: business layer doesn’t have to know how data layer is querying the database, instead business layer expects some data or not at all when it invokes certain method in data layer."

"Some of the features and functionality doesn’t need to go through all the layers"

"If you have 20 percent of requests just passes through layers and 80 percent of requests does real processing, it is fine, however if the ratio is different then you are having sinkhole anti-pattern syndrome."

"Event mediator doesn’t do or knows any business logic, it just orchestrates the events."


The chart at the bottom of the article is incorrect in at least a few spots:

- The article does not identify difficulties with testability or development of the MicroService architecture.

- The face shown for Layered architecture's "Development" is colored incorrectly (since it is a smiley-face and the article text supports that).



