Maintain a clean architecture in Python with dependency rules

hbrn · on Dec 15, 2022

This (plus Law of Demeter) is the right way to handle medium-to-large projects, though I'm not completely sold on the tooling. I mostly do it manually (yes, it is still doable with dozens of modules since the dependency hierarchy doesn't change often).

One recommendation I have is to present the hierarchy as a DAG. The existing image (https://sourcery.ai/static/05300f06cb847360719e2aa31dc5a31b/...) doesn't make it very obvious that api is the highest-level module, even though it is clearly stated in the rules.

AlphaSite · on Dec 15, 2022

Import Linter is probably a better choice for Python since it’s free https://pypi.org/project/import-linter/

hbrn · on Dec 15, 2022

I thought about using it at a small scale, but frankly I find more value in a visual representation, and once I have that I don't want to explicitly blacklist imports: those rules can already be derived from the graph (i.e. any import that introduces a cycle is a violation).

Doing it manually allows me the following:

1. I get to define what are the namespaces (domains) that matter, irrespective of the package structure. E.g. import from stripe.api_resources is still a dependency on stripe, not on stripe.api_resources.

2. Work around a bunch of dependency caveats (frameworks like Django do runtime imports and mix high and low level concepts in settings, db foreign keys might invert your dependencies, etc.)

3. Violations are very easy to see: they are cycles in the graph, i.e. arrows pointing upwards. Those are typically design flaws. Though I still allow certain violations because practicality beats purity.

4. Since some violations are allowed, I get to decide how to arrange the graph so that it is clearer what the flaw is and how to address it.

I haven't found a good tool that allows me to get all of these. One day I'll have to build it myself.
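A minimal sketch of points 1 and 3, assuming the import edges have already been collected somehow. The `NAMESPACES` list, the example edges and the helper names are all hypothetical and not taken from any tool mentioned in this thread. Each module is collapsed to the namespace that matters, and any cycle in the resulting graph is reported as a violation:

```python
from collections import defaultdict

# Hypothetical namespaces (domains) that matter, irrespective of package layout.
NAMESPACES = ["stripe", "app.api", "app.billing", "app.db"]


def to_namespace(module: str) -> str:
    """Collapse a dotted module path to the longest matching namespace."""
    matches = [ns for ns in NAMESPACES if module == ns or module.startswith(ns + ".")]
    return max(matches, key=len) if matches else module


def build_graph(import_edges):
    """Turn (importer, imported) module pairs into a namespace-level graph."""
    graph = defaultdict(set)
    for importer, imported in import_edges:
        src, dst = to_namespace(importer), to_namespace(imported)
        if src != dst:  # imports inside one namespace are not dependencies
            graph[src].add(dst)
    return graph


def find_cycle(graph):
    """Return one dependency cycle as a list of namespaces, or None."""
    visiting, done = [], set()

    def visit(node):
        if node in done:
            return None
        if node in visiting:
            # Found a back edge: slice out the loop and close it.
            return visiting[visiting.index(node):] + [node]
        visiting.append(node)
        for neighbour in sorted(graph.get(node, ())):
            cycle = visit(neighbour)
            if cycle:
                return cycle
        visiting.pop()
        done.add(node)
        return None

    for start in sorted(graph):
        cycle = visit(start)
        if cycle:
            return cycle
    return None


# Made-up example edges: the last one points "upwards" and closes a cycle.
edges = [
    ("app.api.views", "app.billing.invoices"),
    ("app.billing.invoices", "stripe.api_resources"),
    ("app.billing.invoices", "app.db.models"),
    ("app.db.models", "app.api.serializers"),
]
```

With these edges, `find_cycle(build_graph(edges))` reports the loop through `app.api`, `app.billing` and `app.db`. Allowing certain violations would need an extra allow-list, which this sketch leaves out.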

rekahrv · on Dec 15, 2022

Great points. My ideal choice would be having both: visualization & rules as code. :-)

* Visualization gives a better overview.

* Rules as code allows more fine-tuning. E.g. explicitly allowing those few exceptions you mentioned in point 3.

To your points:

1. I absolutely agree. No dependency to Stripe also means no dependency to any of Stripe's subpackages. (The article should probably emphasize this more.)

2. This is a good point. The rules generated for Sourcery only check `import` and `from ... import` statements. Runtime imports are (for now) out of scope.
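To illustrate the difference between static and runtime imports, here is a rough sketch using the standard-library `ast` module. It is not Sourcery's implementation, and the `SOURCE` snippet and the `static_imports` helper are made up for the example. A checker that only looks at import statements sees the first three imports but not the `importlib.import_module` call:

```python
import ast

# Made-up source file mixing static imports with one runtime import.
SOURCE = '''
import os
from stripe.api_resources import Charge
import importlib

payments = importlib.import_module("stripe.checkout")  # runtime import
'''


def static_imports(source: str) -> set[str]:
    """Collect module names from `import` and `from ... import` statements."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            # Relative imports without a module name (`from . import x`) are skipped.
            found.add(node.module)
    return found
```

Here `static_imports(SOURCE)` contains `stripe.api_resources` but not `stripe.checkout`, since the latter only appears as a string argument.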

rekahrv · on Dec 15, 2022

Yes, Law of Demeter is exactly what these rules are trying to achieve. :-) Thanks, a DAG is a great recommendation.

andrew_eu · on Dec 15, 2022

Before clicking on this, I expected to see import-linter [0] which achieves something very similar but with, in my opinion, a bit less magic. Another solution in a similar spirit is Pants [1], though this is actually a build system which allows you to constrain dependencies between different artifacts (e.g. which modules are allowed to depend on which modules).

To Sourcery's credit, their product looks much more in the realm of "developer experience" -- closer to Copilot (or what I understand of it) than to import-linter. Props to them for at least having a page about security [2] and building a solution which doesn't inherently require all of your source code to be shared with a vendor's server.

[0] https://github.com/seddonym/import-linter

[1] https://www.pantsbuild.org/

[2] https://docs.sourcery.ai/Product/Permissions-and-Security/

memco · on Dec 15, 2022

Thanks for the additional tools to tackle this problem. We usually don’t have problems with this at work, but I just so happened to discover one today and was dreading the work it will take to sort out how to fix it.

hakanderyal · on Dec 15, 2022

After dealing with that problem and enduring the pain of it for years, I finally switched to C#/.NET. It has the necessary tooling to achieve this and more.

Rewriting a lot of things was time well spent rather than trying to tame the dynamic nature of Python and my tendency to overuse it.

And I can't believe I'm writing this after all these years evangelizing Python and dynamically typed languages.

vyrotek · on Dec 15, 2022

Pleasantly surprised to see .NET shared on HN! I've had a lot of success with it in my career building several SaaS platforms from the ground-up. The tooling is great. It's wild how productive a small startup team can be on the .NET stack using a clean architecture.

whiskey14 · on Dec 15, 2022

Can you please give a short summary of why I should give C#/.NET a go for my backend services?

I've been fighting battles in Python backend services to get nicely decoupled API, logic and DB layers for a while...but sqlalchemy, alembic and flask/django/fastAPI are my safety blankets.

electroly · on Dec 15, 2022

Other commenters have good specific points but I'll add one overarching theme: .NET is developed by a well-funded corporation that is incentivized to bring all the popular innovations from other ecosystems back to the .NET world in a cohesive form. If something becomes popular in another programming ecosystem and people want it, we'll get it in .NET and it'll be done in the same style as everything else we have. It's pretty refreshing working with a system that was designed to work together rather than cobbling bits together.

dfee · on Dec 15, 2022

My journey went through mypy, then typescript for frontend, then typescript on node. The story here was the type system is so much better that it allowed better prototyping, larger codebases and confidence.

I’ve done a lot of C# and Java now over the last few years, and I don’t love their type system, esp compared to typescript, but they scale much better against large codebases - especially with tooling like bazel.

I’ve been looking at Haskell and Rust a lot to fill this intermediary: code that’s performant, with a very expressive type system.

I maintain(ed) a number of popular python packages, and that journey lasted for nearly a decade.

daxfohl · on Dec 15, 2022

This is my experience, having gone the route you're looking. Haskell (~6 month trial) was unproductive for me. Primarily the ecosystem is full of abandonware. Secondarily it lures you into spending way too much time refactoring stuff into the most concise possible form, which you can no longer understand (and frequently needs rewritten completely because the tiniest change to the most concise possible form invariably explodes through several layers when you have to make changes later). Rust (~3 month trial) may be great for codebases where you'd legitimately consider C / C++, but too much work otherwise; I personally wasn't doing anything that I'd use C for, so it was not worth it.

I ended up being very happy with F# as a middle ground for several years, but eventually migrated back to C# as they started adding more and more F# features. The primary challenge with F# was the parity mismatch with the underlying runtime, so you end up having to write a fair amount of non-idiomatic F# to interop with common libraries. But otherwise it's great. (I also tried Scala for a year and hated it: too many ways to do any one thing).

hakanderyal · on Dec 15, 2022

I was in the same boat. I wanted to switch a few years ago actually, but EF core was missing core features I took for granted in Sqlalchemy.

As for the reasons:

- Static typing and C# projects make code organization and refactoring dead easy.

- Modern C# doesn’t require that much boilerplate, and has features that allow a developer to speed up development, like Python.

- EF Core covers everything I need from SqlAlchemy/Alembic.

- LINQ is an awesome way to work with collections. Type safe DB queries come in handy both when developing and refactoring.

- ASP.NET covers everything I need from flask/Fastapi.

- More speed and lower resource usage is nice.

- Being able to use an IDE with its full power is nice.

My main reason to switch was static typing, and my only requirement was a good ORM comparable to SqlAlchemy. The rest is just bonus.

danuker · on Dec 15, 2022

> - More speed and lower resource usage is nice.

> - Being able to use an IDE with its full power is nice.

In my experience, Visual Studio is much slower when developing. My Python workflow affords a 200ms red-green-refactor loop, while VS is on the order of several seconds.

This might not seem like much, but it has a great impact on my engagement, flow, and satisfaction.

hakanderyal · on Dec 15, 2022

Rebuilding the project to run the tests adds a bit of time, yes. I see this as a cost of the static typing, runtime speed etc.

It’s worth it in the end. YMMV.

kcartlidge · on Dec 15, 2022

> Rebuilding the project to run the tests adds a bit of time, yes.

It costs money, but I've paid for NCrunch for a fair few years and find it invaluable for this reason. It doesn't even need you to save changes before it spots them and runs affected tests in the background.

If cost is an issue you can also start `dotnet watch test` going in a terminal/command prompt for non-interactive live-reload testing.

danuker · on Dec 15, 2022

I am not so sure it's worth it. In general, developer time is much more valuable than machine time.

Maybe if you're building a large high-performance server, you should invest in performance. But otherwise, if you only look at computational complexity, and batch up/avoid I/O when possible, you're fine.

lowbloodsugar · on Dec 15, 2022

I grew up with BASIC and made it to Java by way of assembly, C, C++ and C#. This year I put some Rust into production tooling. I've used python along the way, but usually as a scripting tool. I've never worked at a company whose codebase involves a lot of python. So beware of confirmation bias in my thinking.

What follows is my opinion, I am aware it is my opinion, but in my sphere of influence, it is not up for debate when it comes to writing code. It might occasionally be a conversation over lunch.

I put python in the same bucket as BASIC. It's not a production language. "developer time is much more valuable than machine time." Yes. Absolutely. Iteration speed is vital. But it is vital in more than just the test loop. It is important in the "minor refactoring of various classes" up to the "major refactoring of entire systems" loop too. And python just doesn't make that easy. It actively makes it difficult. It makes comprehension difficult. It's difficult to look at python code in a code-review and have a good idea of what the classes involved are. I don't even write scripts in it any more. I've found that any script that is worth writing is likely to grow and evolve over time, and if it is not written in something like C# or Java, then it will become an intolerable mess. I've seen entire organizations that are basically cargo culting.

I encourage you to learn a statically typed language and its tooling.

yCombLinks · on Dec 15, 2022

All of the benefits of static typing save a ton of developer time in other places. No one is talking about saving machine time as the primary benefit of static typing.

gmueckl · on Dec 15, 2022

Static typing in .NET saves developer time on a massive scale. Sure, the compile times and startup times may be longer (and tests may take longer to run for that reason), but the languages also allow for editor/IDE tooling that boosts developer productivity massively. Visual Studio with Resharper or Rider may seem expensive, but if you work with these tools full time, they pay for their cost multiple times over in almost no time.

hakanderyal · on Dec 15, 2022

For me, runtime performance is only a tiny percent of the advantages. It could have been slower than Python, I would still make the switch.

roflyear · on Dec 15, 2022

This is true. Almost all C# projects will take a while for you to run. It is unfortunate.

The upside is hopefully you "don't need to run it as many times" but ... eh. No thanks.

kcartlidge · on Dec 15, 2022

There is a delay, true. Inevitable with the compilation phase, and I do find it irritating that my Go stuff builds so much faster. That said, there's reasonable (not perfect) live reloading happening these days which helps somewhat.

camdenreslink · on Dec 15, 2022

One reason is that entity framework is the best ORM out there. It blows sqlalchemy and alembic out of the water imo (I’ve used both a bunch).

Another reason is that decoupling and adding layers to your code is more part of the culture. Look up “domain driven design C#” or “onion architecture C#” and there will be a lot of resources on how to achieve it. There is stuff out there for Python as well (and the concepts translate between languages), but not nearly as much.

mattgreenrocks · on Dec 15, 2022

The .NET ecosystem is great.

It feels a lot more professional than other ecosystems. For example, they actually talk about layering/coupling as professionals should! People actually seem to talk about architecture as well rather than blindly believing that the conventions forced on them by a framework are sufficient for all use cases.

I especially like the gradient in the .NET world from micro ORMs to full-fledged ORMS. Most ecosystems seem to develop a big ORM that constantly accrues features (and bugs) and eventually becomes enshrined as a "best practice" because it acts as a kitchen sink.

daxfohl · on Dec 15, 2022

+1 I was much happier using Dapper compared to EF. I figure if it's good enough to run stackoverflow, it's probably good enough for whatever I happen to be doing.

The amount of open source in dotnet is great. (I think more than Java? My impression of that is dominated by Apache etc., though my experience in the Java ecosystem is limited. Presumably people in Java land would expect the same of dotnet being dominated by Microsoft, but that's really not the case).

megaman821 · on Dec 15, 2022

I haven't played with SQLAlchemy in a while, but I was comparing EF core to the Django ORM, and EF core seemed to be lacking in features. There were a few things missing but the two that pop to mind are window functions and CASE statements.

elforce002 · on Dec 15, 2022

Interesting. What about using mypy to have some sort of static typing a la typescript?

wiseowise · on Dec 15, 2022

Lipstick on a pig. Unlike TypeScript, mypy feels really clunky.

w_t_payne · on Dec 15, 2022

I do exactly this in my side project. I have a set of rules which put restrictions on which packages and modules can be included from other packages and modules. For example, a high maturity package is not allowed to depend upon a low maturity package. Similarly, a core library package is not allowed to rely on a package that is specific to a particular product or a particular piece of bespoke development. In this way, much of the potential for circular dependencies is eliminated, and the purpose and intent are clearly communicated.

(I don't do this using sourcery though ... I have my own set of rules)

cjohnson318 · on Dec 15, 2022

I do a similar thing. Here's the style-guide I learned from: https://phalt.github.io/django-api-domains/styleguide/

Basically, you have an api class for each Django app, and you use this class for all external interactions. The api class calls the service class, and the service class deals with the Django ORM. I added a view class, which is my DjangoRestFramework layer; so when a request comes in, it's caught by my view class, and passed onto the api class. I have DRF serializers for outgoing data, and pydantic schemas for incoming data. I also have a selector class for read-only views of my data.

It's a lot of typing, but I know exactly where everything is when something goes wrong, or I need to add a small adjustment somewhere. It's also easy for new devs to learn and use. One downside is that an api change requires you to touch a dozen files.

rekahrv · on Dec 15, 2022

Thanks a lot. The Django API Domains Styleguide looks great. Do you perhaps know some open source projects that follow this structure?

cjohnson318 · on Dec 16, 2022

Not really, no. But it is pretty straightforward. My projects have:

- apis.py - the "external" surface of the app

- views.py - DRF layer on top of apis.py

- services.py - the layer that writes to the Django ORM

- selectors.py - provides "read-only" views or filters of data

- serializers.py - serialized outputs, using DRF

- schemas.py - pydantic classes that control incoming JSON types

- models.py - Django model declaration

- urls.py - url endpoints, pointing to views.py

- core.py - maybe another file with more business logic, used by apis.py

I have had trouble using Django API Domains interfaces.py, so I left them out. The main point is, figure out the right balance of concise code and separation of concerns for your taste, and your stack. Good luck!

rekahrv · on Dec 16, 2022

Thank you, that's very helpful. Do I understand correctly that the REST component uses 2 different data structures? schemas.py for the incoming JSON and serializers.py for the return data?

cjohnson318 · on Dec 16, 2022

Yes, according to the way I did it. You could put DRF serializers and Pydantic schema into the same file and call that "serializers.py", or you could just use DRF for incoming form validation.

Similarly, you could collapse "selectors.py" into "services.py". I put read-only operations into "selectors.py" and write operations into "services.py", but you don't have to. I got that idea from this styleguide: https://github.com/HackSoftware/Django-Styleguide which is in the Appendix of the Django Domain API docs.
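To make the split concrete, here is a framework-free sketch of that layout squeezed into one file. Plain dataclasses and an in-memory dict stand in for Django models, DRF serializers and pydantic schemas, and every class and function name (`Customer`, `CustomerIn`, `CustomerAPI`, and so on) is made up for illustration, so treat it as a rough picture of the call direction rather than real Django code:

```python
from dataclasses import dataclass


# models.py stand-in: a dataclass plus an in-memory table instead of the Django ORM.
@dataclass
class Customer:
    id: int
    name: str
    active: bool = True


_TABLE: dict[int, Customer] = {}


# schemas.py stand-in: validates incoming data (pydantic in the real setup).
@dataclass
class CustomerIn:
    name: str

    def __post_init__(self):
        if not isinstance(self.name, str) or not self.name.strip():
            raise ValueError("name must be a non-empty string")


# serializers.py stand-in: shapes outgoing data (DRF serializers in the real setup).
def serialize_customer(customer: Customer) -> dict:
    return {"id": customer.id, "name": customer.name}


# services.py: the only layer that writes.
def create_customer(name: str) -> Customer:
    customer = Customer(id=len(_TABLE) + 1, name=name)
    _TABLE[customer.id] = customer
    return customer


# selectors.py: read-only queries.
def active_customers() -> list[Customer]:
    return [c for c in _TABLE.values() if c.active]


# apis.py: the "external" surface other apps are allowed to call.
class CustomerAPI:
    @staticmethod
    def create(payload: CustomerIn) -> Customer:
        return create_customer(payload.name)

    @staticmethod
    def list_active() -> list[Customer]:
        return active_customers()


# views.py: request handling on top of apis.py.
def post_customer_view(body: dict) -> dict:
    payload = CustomerIn(**body)  # incoming data goes through the schema
    return serialize_customer(CustomerAPI.create(payload))  # outgoing through the serializer
```

The view only talks to the api class, the api class only talks to services and selectors, and only those two touch the storage.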

rekahrv · on Dec 15, 2022

That's very cool. Can you tell a bit more about this set of rules?

"For example, a high maturity package is not allowed to depend upon a low maturity package. Similarly, a core library package is not allowed to rely on a package that is specific to a particular product or a particular piece of bespoke development." I really like these.

w_t_payne · on Dec 15, 2022

I have a general system for representing metadata in source files. (I use YAML documents embedded in block comments).

Some of this metadata gives traceability information for requirements, tests etc.. while other metadata enables me to associate a maturity level with each file.

My build system understands this metadata and uses it to inform e.g. the minimum test coverage that it expects on a file-by-file basis.

The same metadata is used to ensure that all of the other components that a file references are at the same level of maturity or higher.

I also have metadata for each file (partly derived from location in the repository) that gives each file a number which defines its position in a hierarchy of design elements.

The position in the hierarchy helps to indicate what the purpose of the file is. I use this to make a distinction between those core, foundational, stable design elements upon which other design elements may build, and those more peripheral, ephemeral and 'agile' design elements which can be quickly tailored to meet the needs of a client or partner.

This means that a (hopefully stable) core API component can be prevented from relying upon a (perhaps less stable) bespoke customer-specific component. It also means that there's more freedom in changing and adapting peripheral designs as you can have confidence that their stability is not something that is going to be relied upon.
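A simplified sketch of the maturity rule, assuming the metadata has already been parsed out of the comments into a dict. The level names, file paths and the `maturity_violations` helper are hypothetical, not the actual build system described above:

```python
# Hypothetical maturity levels; higher means more mature.
MATURITY = {"experimental": 0, "beta": 1, "stable": 2}

# Hypothetical per-file metadata, as it might look once parsed from the comments.
FILE_METADATA = {
    "core/api.py": {"maturity": "stable", "imports": ["core/util.py"]},
    "core/util.py": {"maturity": "stable", "imports": []},
    "clients/acme/report.py": {"maturity": "experimental", "imports": ["core/api.py"]},
    "core/billing.py": {"maturity": "stable", "imports": ["clients/acme/report.py"]},
}


def maturity_violations(metadata):
    """List (file, dependency) pairs where the dependency is less mature."""
    violations = []
    for path, info in metadata.items():
        own_level = MATURITY[info["maturity"]]
        for dep in info["imports"]:
            dep_level = MATURITY[metadata[dep]["maturity"]]
            if dep_level < own_level:
                violations.append((path, dep))
    return violations
```

In this made-up data set the only violation is `core/billing.py` importing the experimental `clients/acme/report.py`. The reverse direction, experimental code importing stable code, passes.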

rekahrv · on Dec 15, 2022

Thanks for the detailed description. That's a really sophisticated system with several cool features.

* minimum test coverage on a file-by-file basis

* various levels of maturity

"It also means that there's more freedom in changing and adapting peripheral designs as you can have confidence that it's stability is not something that is going to be relied upon." That's a big advantage, indeed.

I also like the concept of storing this metadata next to the code in structured comments.

falcor84 · on Dec 16, 2022

>While Python doesn't allow circular dependencies between modules, it won't stop you from introducing circular dependencies between packages.

Just to nitpick, while it is a very potent foot-gun, Python absolutely does allow circular dependencies between regular modules; here's a good write-up about this:

https://stackabuse.com/python-circular-imports/
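A small self-contained demonstration, written for this thread rather than taken from the linked write-up. It creates four throwaway modules (the names `circ_a`, `circ_b`, `bad_a` and `bad_b` are made up) in a temporary directory. The first pair imports each other with plain `import` and works. The second pair uses `from ... import` at module level and trips over the partially initialized module:

```python
import importlib
import sys
import tempfile
from pathlib import Path

MODULES = {
    # Plain `import x` cycle: attributes are looked up only when call_b() runs.
    "circ_a.py": "import circ_b\n\ndef call_b():\n    return circ_b.value()\n",
    "circ_b.py": "import circ_a\n\ndef value():\n    return 42\n",
    # `from x import name` cycle: the name does not exist yet when it is requested.
    "bad_a.py": "from bad_b import VALUE_B\nVALUE_A = 1\n",
    "bad_b.py": "from bad_a import VALUE_A\nVALUE_B = 2\n",
}


def demo():
    """Import both module pairs from a temp directory and report what happens."""
    with tempfile.TemporaryDirectory() as tmp:
        for name, source in MODULES.items():
            Path(tmp, name).write_text(source)
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            circ_a = importlib.import_module("circ_a")
            works = circ_a.call_b()
            try:
                importlib.import_module("bad_a")
                fails = False
            except ImportError:
                fails = True
        finally:
            sys.path.remove(tmp)
    return works, fails
```

`demo()` returns `(42, True)`: the first cycle imports fine, the second raises `ImportError`. That is the foot-gun part, since whether a cycle works depends on the import style and on when the names are first used.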

vasili111 · on Dec 15, 2022

Am I the only person who prefers raw SQL over SQLAlchemy? I do not see any real advantage in using SQLAlchemy over raw SQL if I do not plan to switch the DB engine for the application in the future (and I do not). Do you see any real advantage in using SQLAlchemy over raw SQL queries if you do not plan to switch the DB engine for your application?

gghhzzgghhzz · on Dec 15, 2022

I use it mostly for reverse engineering a model on top of a legacy database when working on projects to clean and migrate that data.

I've seen some very legacy database 'designs' and have never failed to model them with a combination of sqlalchemy join mapping, datatype mapping and some object properties in python for cases that are simpler to just express as list comprehensions.

You end up with some data quality rules / Transformation logic you can reasonably share with business users.

On the Load end I normally do that via sql bulk inserts as using an ORM just adds too much overhead and not enough control.

willseth · on Dec 15, 2022

It's nice to have all of your db operations in Python and automatically integrated with existing Python tooling. It also makes it easier to refactor, organize, etc. SQLAlchemy comes OOTB with a lot of nice convenience tools and functions, and there's an ecosystem built around it, e.g. Alembic for schema migration. There are some cases like really complex queries where it can get in the way, but overall I find the tradeoffs are easily worth it for the convenience

piafraus · on Dec 16, 2022

No, I feel raw SQL is much cleaner and you can see immediately what will be executed, which indexes will be used, etc. No black-box magic.

dontlaugh · on Dec 15, 2022

Alchemy makes it quite easy to compose queries, which isn’t possible with SQL. That’s about it.

rekahrv · on Dec 22, 2022

A follow-up post: https://sourcery.ai/blog/dependency-architecture/

neves · on Dec 15, 2022

I've already seen tools like this for other languages, but have never seen someone effectively using them. Does anyone here have good or bad experiences with these architecture rule systems?

mikeholler · on Dec 15, 2022

Is there anything similar to this for Java/Kotlin/Gradle?

Sankozi · on Dec 15, 2022

There is ArchUnit - https://www.archunit.org/

mikeholler · on Dec 15, 2022

Thanks, that looks like exactly what I was looking for.

naizarak · on Dec 15, 2022

JPMS allows for exactly this, but that entire project was so poorly implemented that no one uses it.

thundergolfer · on Dec 15, 2022

This is supported in Bazel with package visibility rules. Once you've got that feature as a way to tame a larger and expanding codebase, you'll wonder why it isn't a feature in more systems.

https://bazel.build/concepts/visibility

rekahrv · on Dec 15, 2022

Thanks a lot for sharing this link. I haven't used Bazel, but this concept of target and load visibility sounds cool.

"Once you've got that feature as a way to tame a larger and expanding codebase, you'll wonder why it isn't a feature in more systems." :-)

shankr · on Dec 15, 2022

This has also been recently integrated in pants.

https://github.com/pantsbuild/pants/issues/13393

lyu07282 · on Dec 15, 2022

Is there something like this for react/jsx? I always wished I could constrain component dependencies across the atom > molecule > organism layers.

revskill · on Dec 15, 2022

In Typescript, I normally just allow interfaces to be the dependencies between layers. (API, command line programs,..) -> (Services) -> (Database).

Thaxll · on Dec 15, 2022

Your DB / api layer should never touch the same models.

inwit · on Dec 15, 2022

It's these kinds of rules that mean I'm here wading through 5 layers of exquisitely decoupled nonsense that could be done in a few lines.

Thaxll · on Dec 15, 2022

A convert function between the API and DB models does not sound complicated.

Storing the API model in your DB is really a bad idea.
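For a sense of scale, a minimal sketch of such a convert function in Python, using dataclasses. The `UserRow` and `UserResponse` models and their fields are hypothetical:

```python
from dataclasses import dataclass


@dataclass
class UserRow:
    """Hypothetical DB model: mirrors the table, including internal columns."""
    id: int
    email: str
    password_hash: str
    is_deleted: bool


@dataclass
class UserResponse:
    """Hypothetical API model: only what clients are allowed to see."""
    id: int
    email: str


def to_response(row: UserRow) -> UserResponse:
    """Convert a DB row to its API representation."""
    return UserResponse(id=row.id, email=row.email)
```

Internal columns such as `password_hash` never reach the API model, and either side can change shape without forcing the other to follow.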

camgunz · on Dec 15, 2022

I just listened to the DHH/Kent Beck/Martin Fowler discussion about TDD "damage" and both sides still seemed unconvinced by the end of it, but this exact example came up. It seems like SOA (whether it's DDD or Hexagonal or Clean or w/e) and TDD really push you towards this kind of layer bloat for one reason or another.

I'm (maybe obviously) on the SOA-skeptic side, my arguments generally are:

- Most apps aren't that big and don't need multiple layers of abstraction (i.e. the ORM and its models are totally fine). If the app starts getting too big for its britches, probably the best thing to do is make it 2 apps (too big: 2 apps is a good slogan here).

- Dependency injection and mocks are pretty bad ideas that are only occasionally useful (DHH uses the example of a payments gateway), but mostly push IoC through your whole app and make control flow confusingly backwards. Mocks are always in disrepair, and almost never accurately reflect what they're trying to mock, and thus ironically are big vectors for bugs that make it through testing.

- Having tons of unit tests tends to slow eng velocity to a crawl, because they test the parts of the application that aren't the requirements (were these functions called, what's the call signature of this function, was this class instantiated, etc.). Unit tests create a super-fine-grained shadow spec about the lowest level details of your application, and mostly I think they shouldn't ever be committed to a repo. They help during individual development, but then the whole team is stuck maintaining them forever whenever they make changes. They also tend to slow down CI because they're slow and always flaky.

- You almost certainly will never need to switch databases, let alone abstract across a database, a message queue, and a web api. It's not worth doing a "repo" abstraction and encapsulating those details.

- There are (now) really good libraries for almost anything you want to do. ORMs literally map database entities to domain entities--they just abstract the persistence for you. Sounds like a repo to me! We also have good validation, logging, monitoring, auth/auth etc. built into frameworks and 3rd party services. A lot of the things you might put into other layers or even other services are now neatly packaged into libraries/frameworks you can just use and SaaS things you can just buy, leaving you free to just implement your business logic.

chao- · on Dec 15, 2022

I generally agree with the position that unit tests should be used with discretion, and that full coverage via unit tests often leads to thousands of low-utility or redundant tests, and so on. However I cannot agree with this:

>They also tend to slow down CI because they're slow and always flaky.

In my experience, unit tests are the most stable, the least flaky, because they touch the least code and often have very simple setup. An integration test might rely on four database tables being just-so, and go on to connect with two external services (and whether mocked, replayed, or live, flakiness may arise). That integration test is twenty times more valuable, but it is equally more likely to break for reasons tangential to its core assertions.

camgunz · on Dec 16, 2022

Oh, yeah I have experience with super flaky integration/UI tests too. I think a couple rules mostly keep things from getting out of hand in unit tests (never import `random`, use `freezegun`, etc.), but in my experience even this fails to corral the flakiness of 1000s of unit tests. I'm hopeful that property testing frameworks make a dent here, but I haven't had enough experience with them yet.

> An integration test might rely on four database tables being just-so, and go on to connect with two external services (and whether mocked, replayed, or live, flakiness may arise).

I actually feel most comfortable when my integration tests are essentially just API calls--or you could think of them as unit tests of the API. That way, if it's flaky in CI, it's flaky in prod too, and you know to fix it.

This is where mocks generally lead you astray, either they act like everything is fine, or they have some randomness/etc. built in and it causes flakiness. Any time you're testing a mock and not "real" code is a huge failure IMO; it debases the entire scientific process.

---

I guess I would summarize my testing position as "test all the API calls you support, pretty exhaustively". That's your spec. If you need some unit tests while you're developing something, definitely add them, but once you get things up to spec, just toss 'em. Otherwise you're binding future engineers to your implementation, which generally isn't helpful.

kortex · on Dec 15, 2022

Agree on the points that you should never need to abstract over your database, orm, message queue, etc.

Disagree on dependency injection. I came from the globals/patch everything school of python, to the Fastapi/Pytest DI flavor, and it's a breath of fresh air. It's just so much easier to abstract the IO providers and swap them out with objects tailored to the test suite - eg for database, I create db objects which roll back any transactions between tests.
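A plain-Python sketch of that idea, without FastAPI's or pytest's actual machinery. The `UserStore` protocol, the `InMemoryUserStore` double and `register_user` are hypothetical names, and the "rollback" is only a stand-in for rolling back a real database transaction:

```python
from typing import Protocol


class UserStore(Protocol):
    """The IO provider the application code depends on."""

    def add(self, name: str) -> None: ...

    def names(self) -> list[str]: ...


class InMemoryUserStore:
    """Test double with a rollback, standing in for a real database session."""

    def __init__(self) -> None:
        self._committed: list[str] = []
        self._pending: list[str] = []

    def add(self, name: str) -> None:
        self._pending.append(name)

    def names(self) -> list[str]:
        return self._committed + self._pending

    def rollback(self) -> None:
        # Discard everything written since the last commit.
        self._pending.clear()


def register_user(store: UserStore, name: str) -> int:
    """Application code: receives its store instead of reaching for a global."""
    store.add(name)
    return len(store.names())
```

A test can pass in an `InMemoryUserStore`, call `register_user`, then call `rollback()` so the next test starts clean, with no patching of globals involved.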

Hard disagree on unit tests. Maybe in other languages, but in Python, trying to develop even a moderately complex app without unit tests is a nightmare. I know, I've lived it. Even in an app with >85% unit test coverage, there was still a ton of friction around development on any of the interfaces which had low coverage.

Any gains in velocity of development almost always cost far more in debugging down the road.

I love python, but it is really prone to dumb footguns at runtime, NoneType errors in particular. You need to impose a lot of discipline to make large python apps enjoyable to develop on.

camgunz · on Dec 16, 2022

> Disagree on dependency injection.

I think DI makes a lot more sense in languages that are statically typed. In Python, the implementations I've seen use a lot of ABCs, "_in_mem" repos (for mocking/testing). I've been assailing mocks elsewhere, but my criticism of ABCs is that you write a bunch of boilerplate code, only to still just get a runtime error that your tests should catch anyway.

DI also pushes IoC... everywhere. There's not really anything inherently wrong with it, but the fact that it's backwards from typical control flow makes it confusing because the two always coexist. As a result, when trying to trace the behavior of code, you have to dig through lots of layers of configuration and/or implicit magic to discover what implementation is actually being called. Or, you're lucky and there's only ever a single implementation (this is most cases), but then why are you using DI in the first place?

FastAPI's handlers kind of blur the lines of DI. The way they've implemented route handlers conflates DI with callbacks. They could easily have done something like what Django does, and generated API docs by the type signatures of the handlers, then it wouldn't look so much like DI, just mapping URLs to handlers.

> eg for database, I create db objects which roll back any transactions between tests

I've experienced the pain of implementing this myself in Go, so I know it's not super trivial to set this up yourself, but that said, I get irritated when I see testing influencing design decisions in ways like this. Like, this is a big architecture decision made solely to support an additional database configuration. My opinion is that this conditional belongs in some kind of `if TESTING:...` block during app initialization, not literally injected into every route handler, etc.

> Hard disagree on unit tests. Maybe in other languages, but in Python, trying to develop even a moderately complex app without unit tests is a nightmare.

Oh I've had that pain too, but types have pretty much solved those issues for me. Or, weirdly I've only encountered:

- apps with (effectively) no tests

- apps with only unit tests

- apps with both unit and integration (and maybe UI) tests

I've never encountered an app with only integration or UI tests, but in my personal/contracting projects, I only ever write integration tests, and it's worked great. Coverage stats help a lot here too: you can see which code paths aren't taken and either fix the bug or delete the cruft.

No tests is clearly very bad, but I think integration tests are far, far better than unit tests.

yunohn · on Dec 15, 2022

> leaving you free to just implement your business logic.

Often, engineers (and HN) forget that code is a means to an end - not an artistic expression that provides value by its pure existence.

hbrn · on Dec 15, 2022

I mostly agree with you and DHH on that topic, however in my experience reasonably applied SOA/DDD actually shields me from this layering nonsense.

When your apps live as a service on the network or as a nicely isolated module in your repo, you no longer have a reason to over-engineer them. You don't need a grandiose architecture that solves every problem, instead you can make local decisions that are good enough in the specific context. Though, admittedly, I found it hard to sell such "inconsistencies" to other tech leaders; most folks aspire to those grandiosities.

> If the app starts getting too big for its britches, probably the best thing to do is make it 2 apps

That's the argument in favor of SOA, isn't it?

camgunz · on Dec 16, 2022

> When your apps live as a service on the network or as a nicely isolated module in your repo, you no longer have a reason to over-engineer them. You don't need a grandiose architecture that solves every problem, instead you can make local decisions that are good enough in the specific context.

Totally agree yeah, it's a huge boon to engineers to focus on their tickets rather than to have to constantly consider application architecture. I think as long as a framework exists (either an off the shelf one like Django or even--shudder--an in-house one) you get this benefit.

My problem (and maybe we agree here too) comes from the dynamic where the team decides on SOA, and embarks on this saga of implementing the "framework" themselves. That's a big loss in productivity as now you have 2 software projects: your framework and your business app.

> That's the argument in favor of SOA, isn't it?

Eh, not really. I don't think anyone disputes that big apps should be multiple services. SOA/DDD/Hex/etc. aren't novel for suggesting that, their novel claim is that you can tame the complexity of an enormous service by rigidly adhering to their principles when structuring and implementing it. My counterargument is that it's simpler and easier to split services before they become enormous.

aobdev · on Dec 15, 2022

I hate that these are called models (probably because they extend pydantic’s BaseModel), but if they were called Schemas or Serializers would this still be true? Typically what you see in a FastAPI project is a class that parses the request body, and the same or slightly modified class that serializes the response back out after touching the DB. And this isn’t a new idea, because Flask+Marshmallow and DRF do the exact same thing.

rekahrv · on Dec 15, 2022

I've used multiple names for similar packages incl. `models` and `schemas`. :-) Yes, for this example, I picked `models` to follow Pydantic's terminology.

IMO, the FastAPI approach you described makes a lot of sense: The "schema" stored in the db and the "schema" returned by the API aren't the same, but they are quite similar. They have many common properties => They can often have a common base class.
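
A small sketch of the common-base-class idea. It uses standard-library dataclasses instead of Pydantic's `BaseModel` to stay self-contained, and the class and field names are invented for illustration:

```python
from dataclasses import asdict, dataclass


@dataclass
class UserBase:
    """Fields shared by the stored record and the API response."""

    name: str
    email: str


@dataclass
class UserInDB(UserBase):
    """What the db stores (hypothetical fields)."""

    id: int
    password_hash: str


@dataclass
class UserOut(UserBase):
    """What the API returns: same shared fields, no password hash."""

    id: int


def to_response(row: UserInDB) -> UserOut:
    """Copy only the fields that are safe to expose."""
    return UserOut(name=row.name, email=row.email, id=row.id)
```

For example, `asdict(to_response(UserInDB("Ada", "ada@example.com", 1, "secret-hash")))` contains `name`, `email`, and `id`, but not `password_hash`.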

rekahrv · on Dec 15, 2022

Thanks, that's a good point and perhaps a good topic for a future post :-) How to ensure that the API and the db use different models even if those models are in the same package?

dangets · on Dec 15, 2022

I struggle with this also, I assume the answer is to not have them in the same package. You can also break the application into separate 'domain', 'infra', 'application' modules as documented in [0] with rules on what dependencies are allowed in each module (e.g. domain should not have db or serialization implementation). The problem is that this does create several adapter layers which adds to the mental complexity.

[0] https://learn.microsoft.com/en-us/dotnet/architecture/micros...
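
To make the "rules on what dependencies are allowed in each module" concrete, here is a toy checker. The `ALLOWED` map is one assumed arrangement of the three layers, not a rule set taken from [0] or from any tool, and the import pairs are supplied by hand rather than parsed from source:

```python
# Hypothetical layer rules: each layer lists the layers it may import from.
ALLOWED = {
    "domain": set(),
    "infra": {"domain"},
    "application": {"domain", "infra"},
}


def layer_of(module: str) -> str:
    """Top-level package name, e.g. 'infra.db.repo' -> 'infra'."""
    return module.split(".", 1)[0]


def violations(imports: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Return the (importer, imported) pairs that break the layer rules."""
    bad = []
    for importer, imported in imports:
        src, dst = layer_of(importer), layer_of(imported)
        # Only imports between two known layers are checked;
        # third-party and stdlib modules are ignored.
        if src in ALLOWED and dst in ALLOWED and src != dst:
            if dst not in ALLOWED[src]:
                bad.append((importer, imported))
    return bad
```

With these rules, `domain.user` importing `infra.db` is reported, while `infra.db` importing `domain.user` is not.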

marginalia_nu · on Dec 15, 2022

Why not?

yuppiepuppie · on Dec 15, 2022

Title is misleading, it should be "Maintain a clean architecture in FastAPI with dependency rules"

lyu07282 · on Dec 15, 2022

It has nothing to do with FastAPI, pretty sure sourcery would work with anything.

rekahrv · on Dec 15, 2022

Thanks, that's a good point. We thought that a small FastAPI project shows the general concept as well. Do you have suggestions for which other examples would be useful?