venvs are namespace isolation; they are like containers.
Even in huge monorepos you can just use something like a Makefile to produce a local venv as a .PHONY target, and add it to the clean target too.
This is how I actually test old versions of Python, with versioned build targets, Cython vs ...
You can set up almost any IDE to activate them automatically too.
The way to get your coworkers to quit complaining is to automate the build-env setup, not to fight dependency hell, which is a battle you will never win.
It really is one of the most expensive types of coupling.
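Roughly, the Makefile boils down to something like this (target names, paths, and the Python version are illustrative, and recipe lines must be tab-indented in a real Makefile):

    .PHONY: venv clean

    venv:   # create the local env and install pinned deps into it
        python3.12 -m venv .venv
        .venv/bin/pip install -r requirements.txt

    clean:
        rm -rf .venv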
I'd recommend installing pyenv [1]. It was very useful when my team had to update a lot of projects from <=3.10 to 3.11.
[1] https://github.com/pyenv/pyenv
I have multiple versions of Python built from source. If I want to test what my code will do on a given version, I spin up a new venv (near instantaneous using `--without-pip`, which I achieve via a small Bash wrapper) and try installing it (using the `--python` option to Pip, through another wrapper, allowing me to reuse a single global copy).
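Concretely, the commands behind those wrappers look something like this (the interpreter path and venv location are illustrative; `--python` needs Pip 22.3 or newer):

    # create a venv from a specific interpreter, skipping the bundled copy of pip
    ~/pythons/3.9.19/bin/python3.9 -m venv --without-pip /tmp/try-39

    # install the project into it using a single global pip via its --python option
    pip --python /tmp/try-39/bin/python install .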
No matter what tooling you have, that kind of test is really the only way to be sure anyway.
Some libraries break across different versions of Python for a variety of reasons.
Pinning the Python version with asdf (in conjunction with a venv) gets you reasonably far in ensuring a project works the same way for everyone on a team.
A venv does not actually install a different Python interpreter. It's bound to the Python version that created it. You cannot make a Python 2.7 venv using a Python 3 interpreter. You need Python 3.10 to create a Python 3.10 venv.
There are plenty of situations where the Python interpreter version matters. As a non-exhaustive list, you have libraries that compile code, non-Python languages that link to cpython, build scripts that do different things depending on wheel/setup/other-bundled-stuff, Python code that uses removed compat shims like importlib-metadata...
If you haven't run into one of those situations yet, congratulations. I've been through this already, and making a reproducible environment does require first installing a pinned version of the Python interpreter and THEN setting up a venv using that particular interpreter.
> You need Python 3.10 to create a Python 3.10 venv.
Yes, but this doesn't need to cause a problem for those of us using bare-bones tooling. Speaking for myself, I just... run venv with that version of Python. Because I have a global installation of everything from 3.5 to 3.13 inclusive, plus 2.7 (just so I can verify what I post online about how things used to work in 2.x). And I never install anything to my base Python versions, because my system Python is Apt's territory and the ones I built from source are only and specifically there to make venvs with.
This seems like a solved problem with pyenv which is very popular. You can also include a `.python-version` in your git repo root dir to automatically use the correct python interpreter version when in the scope of that repo.
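For example (the version is illustrative):

    pyenv install 3.11.9
    pyenv local 3.11.9     # writes .python-version in the repo root
    python -m venv .venv   # `python` now resolves to 3.11.9 via pyenv's shims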
How do you expect to support violating that contract, where the OS vendor and package maintainers issue updates and security fixes while you maintain your code, given that Python releases get two years of bug fixes and then three additional years of security fixes? Are you actually forced into this model, or are you just adding to code debt, the accumulation of poorly written or unmaintainable code that makes future changes even harder than they are today?
Using OS package managers helps you avoid accumulating so much code debt; making the move to newer versions the easy path actually results in better-written, more maintainable code.
If you listen to deprecation warnings, prioritize staying close to the current released versions, and insist that running on unsupported versions is an incident rather than something swept under the rug, things get better over time.
That said, the Python direct download paths are very stable.
I would still recommend creating packages from those downloads; while there are possibly better options, fpm has treated me well for well over a decade.
Looking at Homebrew, which is probably one of the bigger unknowns, they have formulas for everything going back to 3.8, all major Linux distros support versions farther back, and obviously VMs/containers are an option there too.
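For the fpm route, a minimal sketch of what that looks like (the prefix, package name, and version are illustrative):

    # build CPython from the official source tarball into its own prefix
    ./configure --prefix=/opt/python3.11 && make -j && sudo make altinstall

    # wrap that directory tree in a .deb so it installs and uninstalls cleanly
    fpm -s dir -t deb -n mycorp-python3.11 -v 3.11.9 /opt/python3.11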
I have run into the situations above, but the use of unsupported and/or end-of-life software is not something you intentionally help an organization do.
Especially with the actions by several agencies to bust EULAs.
While IANAL, we will see what happens in the courts when CISA labels something as "This dangerous practice is especially egregious in technologies accessible from the Internet".
There is a very big difference between sustainment needs and active development; one is a reality, the other could possibly be framed as "gross negligence", invalidating any EULA protections in many parts of the country.
Obviously if you are running other peoples software the calculus of trade-offs changes.
But I have done the Python 2.7 to 3.x migration for several open source and commercial projects in the past. I can promise you that the time you waste on using 2.7 with software you control is far more expensive than updating.
If you are stuck on 2.7 in a product under active development, which has been EOL for 5 years and was warned to be going away for almost a decade before that... the problem you have is not with any 3rd-party dependencies; the problem is with your organization.
No 3rd party vendor solution will fix those problems for you.
This is more or less my experience, but I think in part it took a while for pip to actually get into a usable position, hence some of the proliferation of other options.
That might be fine in your context. People's problems are real, though. What they're almost always missing is separating the source code from the compiled output ("lock files"). Pick a tool to help with that, commit both files to your ("one's") project, problem solved.
People end up committing either one or the other, not both, but:
- You need the source code, else your project is hard to update ("why did they pick these versions exactly?" - the answer is the source code).
- You need the compiled pinned versions in the lock file, else if dependencies are complicated or fast-moving or a project goes unmaintained, installing it becomes a huge mindless boring timesink (hello machine learning, all three counts).
Whenever I see people complaining about Python dependencies, most of the time it seems that somebody lacked this concept, didn't know how to do it with Python, or was put off by too many choices. That, plus that ML projects move quickly and may have heavy "system" dependencies (CUDA).
In the source code - e.g. requirements.in (in the case of pip-tools or uv's clone of that: uv pip compile + uv pip sync), one lists the names of the projects one's application depends on, with a few version constraints explained with comments (`someproject <= 5.3 # right now spamalyzer doesn't seem to work with 5.4`).
In the compiled output - i.e. the lock files (pip-tools, and uv pip compile/sync, use requirements.txt for this) - one makes sure every version is pinned to one specific version, forming a set of versions that work together. A tool (like uv pip compile) will generate the lock files from the source code, picking versions that are declared (in PyPI metadata) to work together.
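Concretely, the loop looks something like this (file names follow the pip-tools convention; the dependency names and the constraint comment are the illustrative example from above):

    # requirements.in - the hand-edited "source"
    cat > requirements.in <<'EOF'
    someproject <= 5.3   # right now spamalyzer doesn't seem to work with 5.4
    requests
    EOF

    # compile it into a fully pinned lock file (requirements.txt)
    uv pip compile requirements.in -o requirements.txt   # or: pip-compile requirements.in

    # make the venv match the lock file exactly (removes anything not listed)
    uv pip sync requirements.txt                          # or: pip-sync requirements.txt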
My advice: pip-tools (pip-compile + pip-sync) does this very nicely - even better, uv's clone of pip-tools (uv pip compile + uv pip sync), which runs faster. Goes nicely with:
- pyproject.toml (project config / metadata)
- plain old setuptools (works fine, doesn't change: great)
- requirements.in: the source for pip-tools (that's all pip-tools does: great! uv has a faster clone)
- pyenv to install python versions for you (that's all it does: great! again uv has a faster clone)
- virtualenv to make separate sandboxed sets of installed python libraries (that's all it does: great! again uv has a faster clone)
- maybe a few tiny bash scripts, maybe a Makefile or similar just as a way to list out some canned commands
- actually write down the commands you run in your README
PS: the point of `uv pip sync` over `uv pip install -r requirements.txt` is that the former will uninstall packages that aren't explicitly listed in requirements.txt.
uv also has a poetry-like do-everything 'managed' everything-is-glued-together framework (OK you can see my bias). Personally I don't understand the benefits of that over its nice re-implementations of existing unix-y tools, except I guess for popularizing python lockfiles - but can't we just market the idea "lock your versions"? The idea is the good part!
That's been my experience too. The main complaint I hear about this workflow is that venvs can't be moved without breaking. I just rebuild my venv in each new location, but that rebuild time can add up for projects with many large scientific packages. uv solved that pain point for me, since it provides a "pip install" implementation that runs in a fraction of the time.
Anyone remember the leftpad fiasco in the node ecosystem? That could happen in any dependency system that allows owners to unpublish dependencies and that's one risk users must weigh when adding them.
Yeah, I assume pinning the version is something everyone does? Or probably many just don't and will have those "python deps management is a mess drama".
TBH, I've seen tutorials or even some companies simply do `pip freeze > requirements.txt` :shrug: which is a mess.
I was reacting to a comment that said a dependencies.txt and a venv were enough, so in the model I criticized there is no pyproject.toml.
> Regardless, majority of the times, deployment is done via Docker.
What, generally? In your peer circle? I'd say [citation needed] — docker has the problematic habit of introducing more moving parts that can bite you, so I know many who, if given the choice, would rather deploy Python projects without it. Having deployed many Python applications on actual bare metal and VMs alike, I'd say the ratio of Docker vs. just Python is more like 1:8.
The script will run with uv and automatically create a venv and install all dependencies in it. It's fantastic.
The other alternative, if you want to be extra sure, is to create a pex. It'll even bundle the Python interpreter in the executable (or download it if it's not available on the target machine), and will run anywhere with no other dependency (maybe libc? I forget).
You can go a step further and have scripts runnable from anywhere without cloning any repo or setting up any venv, directly from PyPI: when you package your lib called `mylib`, define CLI scripts under [project.scripts] in your pyproject.toml,
[project.scripts]
mytool = "path.to.script:main"
and publish to PyPI. Then anyone can directly run your script via
uvx --from mylib mytool
As an example, for Langroid (an open-source agent-oriented LLM lib), I have a separate repo of example scripts `langroid-examples` where I've set up some specific scripts to be runnable this way:
Hm, I think you can just run something with 'uvx <name>' and it'll download and run it, am I misremembering? Maybe it's only when the tool and the lib have the same name, but I think I remember being able to just run 'uvx parachute'.
You're right, that's only when the tool and lib have the same name. In my case, I have several example scripts that I wanted to make runnable via a single examples lib
>Put this at the start of your script (which has to end in .py):
Yes, that format is specified by PEP 723 "Inline script metadata" (https://peps.python.org/pep-0723/). The cause was originally championed by Paul Moore, one of the core developers of Pip, who authored the competing PEP 722 (see also https://discuss.python.org/t/_/29905) and was advocating for the general idea for long before that.
It's also supported by Pipx, via the `run` subcommand. There's at least one pull request to put it directly into Pip, but Moore doesn't seem to think it belongs there.
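A minimal end-to-end sketch (the script name and dependency are illustrative):

    # write a throwaway script with a PEP 723 metadata block at the top
    cat > fetch.py <<'EOF'
    # /// script
    # requires-python = ">=3.11"
    # dependencies = [
    #     "requests",
    # ]
    # ///
    import requests
    print(requests.get("https://example.com").status_code)
    EOF

    uv run fetch.py     # uv reads the block, sets up a cached venv, runs the script
    pipx run fetch.py   # Pipx's `run` subcommand understands the same block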
I'm in the middle of replacing all of my usage of direnv with mise [1]. It does everything direnv/asdf can (for my use cases) plus a lot more. mise can install a lot of dev software/frameworks [2], run tasks, and watch files, all in a manner compatible across *nix OSes. mise is ridiculously good and even minimizes my need for docker locally.
Poetry -> uv migration is missing: if you already have a project using Poetry, with a large pyproject.toml with many extras and other settings, there currently isn't a smooth way to port it to uv. Someone can enlighten me if I missed that.
It doesn't have to. But if it does not end in .py, you have to add the --script (or -s for short) flag to tell it to interpret the file as a python script.
In before 'Python dependency management isn't hard. You just use {newest Python flavor-of-the-year way} of doing {same thing that is standardized in other languages}.'
Which, fair. Python is and will always be a bazaar.
>> PDM (Edit 14/12/2024) When I shared this article online, I was asked why I did not mention PDM. The honest reason was because I had not heard of it.
"There should be one-- and preferably only one --obvious way to do it."
It was originally not a bazaar but "batteries included" where the thing you wanted to do had an obvious best way of doing it. An extremely difficult property to maintain over the decades.
Of the list you offer, only Poetry, Hatch, PDM and Uv actually do "Python dependency management" - in the sense of offering an update command to recalculate locked versions of dependencies, a wrapper to install or upgrade those dependencies, and a wrapper to keep track of which environment they'll be installed into.
pipenv, micropipenv and pip-tools are utilities for creating records of dependencies, but don't actually "manage" those dependencies in the above sense.
Your list also includes an installer (Pip), a build backend (Setuptools - although it has older deprecated use as something vaguely resembling a workflow tool similar to modern dependency managers), a long-deprecated file format (egg) which PyPI hasn't even accepted for a year and a half (https://packaging.python.org/en/latest/discussions/package-f...), two alternative sources for Python itself (ActiveState and homebrew - and I doubt anyone has a good reason to use ActiveState any more), and two package management solutions that are orthogonal to the Python ecosystem (Conda - which was created to support Python, but its environments aren't particularly Python-centric - and Linux system package managers).
Any system can be made to look complex by conflating its parts with other vaguely related but ultimately irrelevant objects.
My entire point is that there is no best way, because people have varying needs and preferences and aesthetic senses.
In my opinion, the best state of the ecosystem involves having a single integrated tool for users (it creates venvs and installs libraries and applications to them, and runs applications in temporary venvs) plus multiple small Unix-philosophy tools for developers, to do the individual tasks they need done. There can be choices for each of these - especially for build backends (PEP 517 was designed specifically to enable competition there). In this world, people could still have all-in-one workflow tools - which mainly just integrate smaller pieces.
Nobody agrees on what the entire "dependency management" process entails. If they supported everyone's use case in a standard tool, many people would be unhappy with how certain things were done, and many more people would ignore a huge fraction of it.
New workflow tools like Poetry, PDM, Hatch, uv etc. tend to do a lot of wheel reinvention, in large part because the foundational tools are flawed. In principle, you can do everything with single-purpose tools. The real essentials look like:
* Pip to install packages
* venv to create environments
* `build` as a build frontend to create your own distributions
* a build backend (generally specified by the package, and set up automatically by the frontend) to create sdists and wheels for distribution
* `twine` to put sdists and wheels on PyPI
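In day-to-day commands, that tool set amounts to roughly this (assuming a standard pyproject.toml project; file names are illustrative):

    python -m venv .venv && . .venv/bin/activate   # venv: make the environment
    pip install -r requirements.txt                # Pip: install packages into it
    pip install build twine
    python -m build                                # build frontend: sdist + wheel into dist/
    twine upload dist/*                            # twine: publish to PyPI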
The problems are:
* Determining what to install is hard and people want another tool to do that, and track/update/lock/garbage-collect dependencies
* Keeping track of what venvs you made, and which contains what, is apparently hard for some users; they want a tool to help make them, and use the right one when you run the code, and have an opinion about where to keep them
* Pip has a lot of idiosyncrasies; its scope is both too wide in some places and too narrow in others, it's clunky to use (the UI has never had any real design behind it, and the "bootstrap Pip into each venv" model causes more problems), and it's way too eager to build sdists that won't end up getting installed (which apparently is hard to fix because of the internal structure of the code)
* Setuptools, the default build backend, has a legacy of trying to be an entire workflow management tool, except targeting very old ideas of what that should entail; now it's an absurdly large pile of backwards-compatibility wrappers in order to keep supporting old-fashioned ways of doing things. And yet, it actually does very little in a modern project that uses it: directly calling `setup.py` is deprecated, and most of what you would pass to the `setup` call can be described in `pyproject.toml` instead; yet when you just properly use it as a build backend, it has to obtain a separate package (`wheel`) to actually build a wheel
* Project metadata is atrocious, proximately a standardization issue, but ultimately because legacy `setup.py`-exclusive approaches are still supported
This really just reads like someone who ignored literally every good practice for using python and then pikachu_shocked.jpg when he set his world on fire
Just having a virtual environment and requirements.txt alone would solve 90% of this article.
Also, with recent distro builds of Python (PEP 668 "externally managed environments", which landed around the 3.11/3.12 era) you literally CAN'T install python packages at the system level. Giant full page warning saying “use a venv you idiot”
I expected something along these lines and was still disappointed by TFA
The author's complaint is that python is supposed to be a good language for people new to programming to pick up. But the default tooling manages dependencies in a way that is unsound, and this has been known for more than a decade. Yet the defaults are still terrible, and regardless of how many articles there are on "best practices" a lot of people get burned.
People who are new to programming have a long way to go before even the concept of "managing dependencies" could possibly be made coherent for them. And the "unsoundness" described (i.e. not having lockfile-driven workflows by default) really just doesn't matter a huge percentage of the time. I've been writing Python for 20 years and what I write nowadays will still just work on multiple Python versions across a wide range of versions for my dependencies - if it even has any dependencies at all.
But nowadays people seem to put the cart before the horse, and try to teach about programming language ecosystems before they've properly taught about programming. People new to programming need to worry about programming first. If there are any concepts they need to learn before syntax and debugging, it's how to use a command line (because it'll be harder to drive tools otherwise; IDEs introduce greater complexity) and how to use version control (so they can make mistakes fearlessly).
Educators, my plea: if you teach required basic skills to programmers before you actually teach programming, then those skills are infinitely more important than modern "dependency management". And for heaven's sake, you can absolutely think of a few months' worth of satisfying lesson plans that don't require wrapping one's head around full-scale data-science APIs, or heaven forbid machine-learning libraries.
If you need any more evidence of the proper priorities, just look at Stack Overflow. It gets flooded with zero-effort questions dumping some arcane error message from the bowels of Tensorflow, forwarded from some Numpy 2d arrays used as matrices having the wrong shape - and it'll get posted by someone who has no concept of debugging, no idea of any of the underlying ML theory, and very possibly no idea what matrix multiplication is or why it's useful. What good is it to teach "dependency management" to a student who's miles away from understanding the actual dependencies being managed?
For that matter, sometimes they'll take a screenshot of the terminal instead of copying and pasting an error message (never mind proper formatting). Sometimes they even use a cell phone to take a picture of the computer monitor. You're just not going to teach "dependency management" successfully to someone who isn't properly comfortable with using a computer.
I don't see any language in the blog post about "people new to programming to pick up".
In fifteen years of using Python, the only people I see getting burned are, conveniently, the folks writing blogs on the subject. No one I've worked with or hired seems to be running into these issues. It's not to say that people don't run into issues, but the problems seem exaggerated every time this subject comes up.
The design of requirements.txt is a bit outdated -- it commits the first sin of developer tooling, which is to mix manually edited and automatically generated (via pip freeze) files. Newer systems use separate lockfiles for that reason, and uv brings this state of the art to Python.
And that makes this, what, the 100th package management solution for Python? A big reason why Python package management sucks is because there are as many package managers as there are frameworks in JavaScript. Nobody ever knows what is the standard, and by the time that information propagates, it is no longer the standard.
Is it really such a big deal that there are multiple "good enough" solutions available? I understand it is suboptimal, especially for beginners, but dependency management in Python is a solved problem for a while now. And also there are not that many solutions. Previously Poetry was the best, now uv is looking to take its place. Not 100 different solutions imo.
That's why I have a requirements folder with separate files (e.g. dev.txt, prod.txt) for the various installation needs. If you want to include test dependencies in development, just add an include line to dev.txt, the same way you'd reference any regular requirements file:
-r test.txt
And, to double-down, if you read the pip documentation (the second sin of software development?), you can use things other than pip freeze. Like,
python -m pip list --not-required
That option flag is pretty nice because it excludes packages that are only installed as dependencies of something else, leaving just the primary packages that you need. If you do that you don't have to worry about dependency management as much.
(Typically you would call that file constraints.txt)
It must be done within one single install command. If you split it into two, you can end up with version conflicts: the first install resolves one way, and the second install will either fail or start uninstalling things while it retries to find a compatible set.
Rants that pip does not support lockfiles are mostly uninformed - no judgement in that, because it's easy to miss when it's not the default and the documentation doesn't mention this crucial feature until very late in the chapters.
99% of all pip complaints would be resolved if they just introduced a convention that constraints.txt is automatically detected, with installation failing if it's not found unless you pass a --i-dont-care-about-lockfiles argument.
A few lines of shell script remedies that rather nicely. And there will be times when you need to choose between installing from fully resolved, pinned dependencies versus abstract immediate ones.
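i.e. a minimal sketch along these lines (file names follow the usual convention):

    # first time (or whenever deps change): resolve once and record the full pinned set
    python -m venv .venv && . .venv/bin/activate
    pip install -r requirements.txt
    pip freeze > constraints.txt

    # every later install: one single command, reusing the pins as a lock file
    pip install -r requirements.txt -c constraints.txt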
uv solves this better, including supporting Windows, and with significantly better error reporting. It even installs Python for you. uv also has platform-independent lockfiles and a number of other features. Oh, and it's a lot faster as well.
Nix solves this better, including supporting Windows, and with significantly better error reporting. It even installs Python for you. Nix also has platform-independent lockfiles and a number of other features. Oh, and it's a lot faster as well.
Have higher standards for your devtools.
Nix is quite a bit harder to use, and absolutely does not support Windows. (WSL is nice, but does not count as supporting Windows.)
Nix is also what I like to call "totalizing" — in order to use Nix you must fit everything into the Nix world. This has many advantages (as do Buck2 and Bazel which are similarly totalizing), but is also quite onerous and might not be desirable.
Further, I believe Nix makes pinning to old versions of a dependency, and maintaining custom patches on top of a dependency, somewhat difficult.
I'm a builder of, among other things, developer tools. From my perspective, the developer experience is the user experience I'm delivering.
From the perspective of someone new to programming, the developer experience we deliver is the user experience they encounter. It is extraordinarily important that it be good.
Again, I would simply rather use uv -- I really appreciate the thoughtful engineering that's gone into it. It solves this problem completely, along with many others. (For example, uv.lock is platform-independent.)
I'm okay with the occasional workaround, but I think we should generally expect, and aim for, excellence with our tools.
No: good developers know and understand their tools. They care for them and they learn about them. Wood carvers don't use butter knives because "good woodcarvers don't care about their experience while woodcarving".
I think 'pip freeze' was introduced later, and requirements.txt was not designed with such a use in mind. pipenv and other tools have had lockfile equivalents for a while.
In my experience, one of the biggest factors driving people to have the author's experience is... backwards compatibility.
The old ways of doing things have existed for much longer than the new ways, and become well established. Everyone just accepts the idea of copying Pip into every new virtual environment, even though it's a) totally unnecessary (even before the `--python` option was introduced two years ago, you could sort of get by with options like `--target` and `--python-version` and `--prefix` and `--platform` and `--abi`) and b) utterly insane (slow, wasteful, makes it more confusing when your PATH gets messed up, leads to misconceptions...). And that's before considering the things people try to do that aren't officially blessed use cases - like putting `sudo` and `--break-system-packages` on the same command line without a second thought, or putting code in `setup.py` that actually tries to copy the Python files to specific locations for the user, or trying to run Pip explicitly via its non-existent "API" by calling undocumented stuff instead of just specifying dependencies (including "extras" lists) properly. (The Pip documentation explicitly recommends invoking Pip via `subprocess` instead; but you're still probably doing something wrong if this isn't part of your own custom alternative to Poetry etc., and it won't help you if Pip isn't available in the environment - which it doesn't have to be, except to support that kind of insane use case).
Another part is that people just don't want to learn. Yes, you get a 'Giant full page warning saying “use a venv you idiot”'. Yes, the distro gets to customize that warning and tell you exactly what to do. Users will still give over a million hits to the corresponding Stack Overflow question (https://stackoverflow.com/questions/75608323), which will collect dozens of terrible answers, many of them suggesting complete circumvention of the package-management lock. It was over a year before anything significant was done about the top answer there (disclosure: I contributed quite a bit after that point; I have a personal policy of not writing new answers for Stack Overflow, but having a proper answer at the top of this question was far too important for me to ignore), which only happened because the matter was brought to the attention of the Python Discourse forum community (https://discuss.python.org/t/_/56900).
It still doesn't make Python's dependency management any less horrible. Every other modern language has a single built-in tool for all of this. Python doesn't: half of the tools come with the core and half come from third parties, with many issues, conflicts, and incompatibilities. Even Poetry, which makes things much, much easier, makes decisions that are incompatible or that make managing dependencies more difficult in some cases.
One of the biggest usability problems with Python dependencies is that the name you import might be different from the name that you use to install the package.
So if you find some script on the web that has an `import foo` at the top, you cannot just `pip install foo`. Instead, you'll have to do some research into which package was originally used. Maybe it's named `pyfoo` or `foolib`.
Compare that to for example Java, which does not have that problem, thanks to Reverse Domain Name Notation. That is a much better system.
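Classic examples of the mismatch (these are real, long-standing package/module pairs):

    pip install beautifulsoup4    # imported as: import bs4
    pip install Pillow            # imported as: import PIL
    pip install scikit-learn      # imported as: import sklearn
    pip install python-dateutil   # imported as: import dateutil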
"install name == import name" cannot work in Python, because when you `pip install foo`, you may get more than one top-level package. Or you may get a single-file module. Or you may, validly per the spec, get no Python code whatsoever. (For example, you could publish large datasets separately from your data science library, as separate wheels which could be listed as optional dependencies.)
The lack of good namespacing practice is a problem. Part of the reason for it, in my estimation, is that developers have cargo-culted around a mistaken understanding of `__init__.py`.
Venvs are so clunky and probably the biggest stumbling block for beginners.
There was a proposal for a directory-based node_modules analogue which was unfortunately rejected.
I think that would have been the single biggest improvement to the Python onboarding experience ever.
> There was a proposal for a directory-based node_modules analogue which was unfortunately rejected.
There were many problems with the proposal. The corresponding discussion (https://discuss.python.org/t/_/963) is worth looking through, despite the length.
Installers like Pip could help by offering to install `--in-new-environment` for the first install, and Brett Cannon (a core developer) has done some work on a universal (i.e. not just Windows) "launcher" (https://github.com/brettcannon/python-launcher) which can automatically detect and use the project's venv if you create it in the right place (i.e., what you'd have to do with __pypackages__ anyway).
Indeed, python dug its own grave by not supporting in-directory venv.
One can emulate it with tools like poetry and uv, but that incurs a performance penalty: every script has to go through `poetry run` or `uv run`, which often adds a few hundred ms and is unsuitable for performant CLIs.
For me the vast majority of the pain occurs when _updating_ dependencies. It can be an all-day chore to find and force versions of everything which will work together, and occasionally you'll have an irreconcilable conflict in subdependencies. I have in the past had to hack in a second version under a different name and update all imports in the parent dependencies to match.
I know other languages have various solutions for this (to basically have package namespaces local to a branch of the dependency tree), but I don't know how much better that experience is.
In principle, yes. In practice, there are a lot of issues with that workflow.
For example, `pip install --ignore-installed --dry-run --quiet --report` will build sdists (and run arbitrary code from `setup.py` or other places specified by a build backend) - just so that it can confirm that the downloaded sdist would produce a wheel with the right name and version. Even `pip download` will do the same. I'm not kidding. There are multiple outstanding issues on the tracker that are all ultimately about this problem, which has persisted through multiple versions of the UI and all the evolving packaging standards, going back almost the entire history of Pip.
I appreciate the suggestions, unfortunately my problems aren't really with pip. It's more that packages ship over- or under-specific version requirements (e.g. dependency A wants numpy == 1.24.1, dependency B wants numpy >= 1.24 but actually doesn't work with 1.24.1 for some obscure reason (the authors probably didn't have A installed so they only tested with 1.24.2 and 1.26), but A actually will work fine with 1.25 so you need to override the 1.24.1 requirement). Or sometimes they have perfectly accurate but incompatible requirements and there's no standard or even good way to reconcile them.
Bit of a Chesterton's fence situation. One of the reasons that Python is popular at all is because the dependency management is terrible and easy to get wrong. The point is it enables relative novices to get started without engaging with important questions. Things like what environmental assumptions are being made, what dependency versions are supported, what a sustainable build pipeline looks like. They can just be vague and their project will work in the here and now - enough to do whatever they want to do.
There is a trade-off here, and the equilibrium is probably deliberate. The sort of person who tries to get dependencies right up front is a professional programmer, and although a lot of them use Python, the language is designed for a much broader audience. Java is an example of much better dependency management by default, and in the main it is only professionals with a view to the long term using Java. Setting up a Java project needs a tutorial and, ideally, specialist software.
Can you elaborate on the difficulties of setting up Java projects? I have never worked in Java at an enterprise level, but have tinkered a bit. IntelliJ makes setting up a maven-based project pretty much one-click. But I’m guessing that the complexity that you’re referring to wouldn’t be apparent to someone like me who is only using it for small hobby projects.
You have to have identified that you need IntelliJ and know what a maven-based project is (indeed, know what a 'project' is, the concept is a bit technical). And there is the split between the JDK & JVM. This is all to be learned before actually approaching the challenge of adding a dependency to the project.
Compare that to a Python beginner where they install Python & use the text editor that exists on their machine & it can be a general purpose text editor rather than something specifically developed to write Java code. There might be one `pip install` along the way, but no requirement to understand a folder layout, project concept or even create a file for dependency management. There is even a reasonable chance that Python is pre-installed on their machine.
I don't think the OP's complaint was that python's dependency management wasn't complicated enough or doesn't involve enough cryptic XML yet. The point was that the novice will find themselves massively frustrated the moment they try to run their script on a different computer.
Indeed, the novice shouldn't have to make all kinds of intricate choices about the build system to get something running. The language designers should have provided a good set of default choices here. The problem with python is that the default choices aren't actually the good ones.
Don’t we see this kind of rant pop up all the time? So much misery could be avoided by simply using a virtual environment. Plus, uv handles it all in a more elegant manner.
> Good dependency management means ...
> it is possible to safely evolve and update your environment when new versions of dependencies become available.
The post, despite its length, doesn't really spend time on this aspect, and I think it's one of the weaker areas.
Suppose my library depends on both A and B, and both of A and B depend on C, and in the current version of my code, there's some set of versions that matches all declared dependencies. But A releases a new version which has new features I want to use, and also depends on a newer version of C than B supports. I'm blocked from upgrading A with the normal tooling, unless I either ditch or modify B.
This can be a problem in other ecosystems too, but better tools support working around it. In the java world (and I may be out of date here), the maven-shade tool would allow you to effectively use multiple versions of C, e.g. by letting B use its older version but re-package everything under a non-colliding name. Or perhaps I depend on library B and B uses C but I don't use the parts of B that interact with C. I could build a fat-jar for my project, and effectively drop the parts of B that I don't need.
I think this is also most of the time in principle possible in python (though perhaps the relevant static analysis of seeing what is actually used is more difficult), but the ecosystem doesn't really support or encourage it.
>I'm blocked from upgrading A with the normal tooling, unless I either ditch or modify B.... I think this is also most of the time in principle possible in python, but the ecosystem doesn't really support or encourage it.
No, the main problems with this are at the language level. Languages like Java can do it because the import is resolved statically. Python's `import` statement is incredibly dynamic, offering a variety of rarely-used hooks (start at https://docs.python.org/3/library/sys.html#sys.meta_path and follow the links, if you dare) that ultimately (by default) put imported modules into a single global dictionary, keyed by name. If you want to work around that, you can, but you'll definitely have to roll up your sleeves a bit.
Even if you just want to choose between two different versions at runtime, the base `import` syntax doesn't have a way to specify a version. If the package version isn't explicitly namespaced (which causes more pain for everyone who doesn't care about the exact version) then you need to make special arrangements so that the one you want will be found first (normally, by manipulating `sys.path`).
But if you want to use both during the same run, you'll encounter far more problems. The global dict strategy is very deliberate. Of course it offers a performance benefit (caching) but it's also required for correctness in many normal cases. It makes sure that different parts of the code importing the same module see the same module object, and therefore the same module-level globals.
That is: in the Python world, it's common that the design of C involves mutating its global state. If you "let B use its older version" then it would have to be an entirely separate object with its own state, and mutating that would result in changes not seen by A. Whether or not that's desirable would depend on your individual use case. A lot of the time, you would want the change to be shared. But then, even if you could propagate the changes automatically, you'd have to deal with the fact that the two versions of C don't necessarily represent their state identically.
> by letting B use its older version but re-package everything under a non-colliding name. Or perhaps I depend on library B and B uses C but I don't use the parts of B that interact with C. I could build a fat-jar for my project, and effectively drop the parts of B that I don't need.
In principle, the B author can make the dependency on C optional ("extra"), for those users who do need that part of B. Or even for B to optionally depend on B-prime, which encapsulates the C-using parts.
This is actually not very difficult (and the above-described dynamic nature of Python is very helpful here), but in practice it isn't done very much (perhaps Python developers are wary of creating another left-pad situation). But I do wish e.g. Numpy were more like that.
But the "fat-jar" analog is also possible in Python, as long as it isn't a problem that you aren't sharing state between C-old and C-new. It's called "vendoring"; Pip does a lot of it; and the tooling that Pip uses to enable it is also made available (https://pypi.org/project/vendoring/) by one of the main Pip devs (Pradyun Gedam).
> No, the main problems with this are at the language level. Languages like Java can do it because the import is resolved statically.
> put imported modules into a single global dictionary, keyed by name.
> If the package version isn't explicitly namespaced (which causes more pain for everyone who doesn't care about the exact version) then you need to make special arrangements so that the one you want will be found first (normally, by manipulating `sys.path`).
> That is: in the Python world, it's common that the design of C involves mutating its global state. If you "let B use its older version" then it would have to be an entirely separate object with its own state, and mutating that would result in changes not seen by A. Whether or not that's desirable would depend on your individual use case
I haven't worked on a java project that needed this in some years so I may be out of date, but all the same things you describe are also (mostly) true of and relevant to the java ecosystem (or at least were), and these are the same considerations that come into play in the choices around shading in java land.
- the class name, method names etc are statically known at compile time, but version is not indicated by import or use. The compiler finds (or doesn't) the requisite definitions from name, based on what's available on the classpath at compile time. At runtime, we load whichever version of that class is on the classpath. If between compilation and running, some part of your build process or environment changed to make a different version of the class available, and names/signatures align, nothing at runtime knows of the change. Order still matters, and just as with python, java (outside of custom classloader shenanigans) also works hard to only load each class once, associated to a qualified name.
- shading works around these constraints by renaming. E.g. you might have two SDKs which wrap API calls to different vendors, which depend on different major versions of a serialization library. No functionality depends on passing any of internal classes of the serialization library between calls to these two vendors, so you're safe to shade one SDK's use of that library (package) to a new, unambiguous name. The key point is both the conflicted library C and the SDK that uses it B get rewritten. Note, this would break at runtime if either library had code that e.g. constructed a string which was then used as a classname, but this would already be pretty abusive.
- similarly in python, if in your project you use two libraries which each separately use e.g. pydantic at 1.x and 2.x, and your own code isolates these from each other (i.e. classes from B don't get passed through methods in A, etc), then you could pretty safely rename one version (e.g. `pydantic` 1.x in library gets renamed as `pydanticlegacy`) -- but the common tooling doesn't generally support this. Just as in the python case, if the library code does something weird (e.g. `eval`ing a string that uses the original name), stuff will break at runtime, very analogously to the java situation.
In both cases, the language on its own doesn't support a concept of import of a specified version, and at runtime each unique name is associated with code which is resolved by a search through an ordered collection of paths, and the first version found wins. What differs is the level of tooling support for consistent package/module renaming. If anything, I think the actual requirements here on the python side should be lower; because java shading must work on class files (so you don't have to recompile upstream stuff), it needs to use ASM.
>- shading works around these constraints by renaming.... one SDK's use of that library (package) to a new, unambiguous name. The key point is both the conflicted library C and the SDK that uses it B get rewritten. Note, this would break at runtime if...
> similarly in python, if in your project you use two libraries which each separately use e.g. pydantic at 1.x and 2.x, and your own code isolates these from each other (i.e. classes from B don't get passed through methods in A, etc), then you could pretty safely rename one version (e.g. `pydantic` 1.x in library gets renamed as `pydanticlegacy`) -- but the common tooling doesn't generally support this.
The thing is that Python offers way more opportunity to do weird things that would break this at runtime, while not providing the tools to enforce that kind of isolation at compile time (I don't know if type checkers would be powerful enough for complex cases there, because I don't use them; but the language implementation is free to ignore any complaints from a type checker anyway).
Someone sent this in a group I am in 2 weeks ago. I will copy my response here:
"Articles like that over-dramatize things a lot. As a self-taught Python developer, I mostly learned from looking at code. When I started, I didn't know anything about packaging. I just installed stuff globally with pip and moved on. I didn't really build projects, just scripts. Once in a while I used a venv when I was following a guide using them. To give a bit of perspective, for a while I didn't really know how to make a class, only a function.
Fast forward a few months, I learned a bit more about projects and also started contributing and/or forking open source projects (again, never took classes, I just explore). I used poetry a little, it works nicely. Now, I use uv for anything new, and it works beautifully. It's really simple.
uv init, uv add any deps, and uv run to run the project or just activate the venv. And I never run into dependency issues. It's like really simple. And being able to manage python versions really simply is a really nice bonus.
My cicd doesn't even use a special docker container or anything. Literally installs uv (using curl | sh), uv sync, uv run. Finished. And very fast too.
So yeah, Python dependencies aren't automatically vendored. And yes, Python tooling has historically been bad. But now we have amazing tooling and it's really really easy to learn. uv is wonderful, poetry is also great, and anyone complaining it's too hard is either over dramatizing it or doesn't have 5 minutes to read a readme.
So yeah, people should stop over dramatizing something that really isn't dramatic."
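For reference, the whole flow described above is roughly this (project and dependency names are illustrative):

    uv init myproject && cd myproject   # writes pyproject.toml and a starter script
    uv add requests                     # records the dep in pyproject.toml and uv.lock
    uv run python -c "import requests; print(requests.__version__)"
    uv sync                             # what CI does: recreate the exact locked env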
I recently learned that uv is written in more than 100k lines of Rust code. Kudos to the developers! This is, however, also indicative of the complexity of Python dependency management.
This post feels like it was written 10 years ago. There are plenty of excellent dependency management systems for Python these days. Honestly, the biggest problem is that there are too many ways to set it up.
One easy trick they don’t want you to know about: stop importing dependencies. You probably don’t actually need numpy. You probably don’t need requests. The stdlib is incredibly broad.
This doesn’t always work, of course (especially for large projects), but for smaller ones, it absolutely does.
I think the biggest issue is that nobody can settle on what to use and how to use it. pip, pdm, poetry, and maybe something else I don’t know about. Some new flashy dependency manager comes along and that’s the new hotness. I'm probably just as guilty of that.
> You decide to re-install your operating system and vow to go back to doing everything in Excel.
Which itself can run Python now. Only in the Microsoft Cloud, not only to rake in that sweet subscription money, but probably also to avoid these headaches.
Everyone seems to be praising uv in the comments; it seems like no one read the article. uv has its drawbacks: it can't install the system-level libraries and dependencies which many packages depend on.
No, I'm referring to system-level libs that you need to install via your operating system's package manager (like `apt`) for Python packages that depend on them.
really sucks that we're stuck with Python when languages of the same realm (JS and Ruby, for example) are so much better, both as languages and with regard to the associated tooling
FWIW, I've rolled, crashed, and burned on at least one attempt to use Python tooling due to forgotten-about installs/environments, which required uninstalling/deleting everything and starting anew.
It would be nice if there was a consensus and an agreed-upon solution and process which would work as a new user might expect.
I create a venv, pip install, and keep my direct deps in requirements.txt.
That's it. Never understood all this Python dependency-management drama.
Recently I started using pyproject.toml as well, which makes the whole thing more compact.
I make lots of Python packages too. Either I go with setup.py, or sometimes I like to use flit for no specific reason.
I haven't ever felt the need for something like uv. I'm good with pip.