They can be, pip developers just have to care about this. Nothing you described ...

woodruffw · 2024-06-12T17:48:20 1718214500

This is unreasonably dismissive: the `pip` authors care immensely about maintaining a Python package installer that successfully installs billions of distributions across disparate OSes and architectures each day. Adopting deduplication techniques that only work on some platforms, some of the time means a more complicated codebase and harder-to-reproduce user-surfaced bugs.

It can be worth it, but it's not a matter of "care": it's a matter of bandwidth and relative priorities.

lye · 2024-06-12T18:19:29 1718216369

"Having other priorities" uses different words to say exactly the same thing. I'm guessing you did not look at pnpm. It works on all major operating systems; deduplication works everywhere too, which shows that it can be solved if needed. As far as I know, it has been developed by one guy in Ukraine.

westurner · 2024-06-12T22:25:05 1718231105

Send a PR!

Are there package name and version disclosure considerations when sharing packages between envs with hardlinks, and does that matter for this application?
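For anyone who wants to poke at the hardlink question, here is a rough Python sketch of content-addressed deduplication with a copy fallback. It is not how pip or pnpm actually implement anything; `link_into_env`, `demo`, and the `store`/`env_a`/`env_b` layout are hypothetical names made up for illustration.

```python
import hashlib
import os
import shutil
import tempfile
from pathlib import Path


def link_into_env(src: Path, store: Path, dest: Path) -> Path:
    """Put src at dest via a content-addressed store (hypothetical helper)."""
    digest = hashlib.sha256(src.read_bytes()).hexdigest()
    stored = store / digest
    if not stored.exists():
        # First time this content is seen: keep one canonical copy.
        store.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, stored)
    dest.parent.mkdir(parents=True, exist_ok=True)
    try:
        # Hardlink: dest shares the stored file's data, no extra disk space.
        os.link(stored, dest)
    except OSError:
        # Fallback when hardlinks are unavailable (e.g. across devices).
        shutil.copy2(stored, dest)
    return stored


def demo() -> tuple[int, bool]:
    """Install the same file into two fake envs and report link count."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        src = root / "pkg.py"
        src.write_text("print('hello')\n")
        store = root / "store"
        a = root / "env_a" / "pkg.py"
        b = root / "env_b" / "pkg.py"
        link_into_env(src, store, a)
        stored = link_into_env(src, store, b)
        same = a.read_bytes() == b.read_bytes()
        return os.stat(stored).st_nlink, same
```

Calling `demo()` returns the stored file's link count and whether both envs see identical bytes. The link count is readable by anything that can `stat` the file, so a hardlinked file does reveal that it is shared; whether that matters depends on who can read the envs.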

Practically, caching ~/.cache/pip should save resources. From "What to do about GPU packages on PyPI?" https://news.ycombinator.com/item?id=27228963 :

> "[Discussions on Python.org] [Packaging] Draft PEP: PyPI cost solutions: CI, mirrors, containers, and caching to scale" [...]

> How to persist ~/.cache/pip between builds with e.g. Docker in order to minimize unnecessary GPU package re-downloads:

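  # requirements.txt below is a placeholder for whatever you install
  RUN --mount=type=cache,target=/root/.cache/pip pip install -r requirements.txt

  RUN --mount=type=cache,target=/home/appuser/.cache/pip pip install -r requirements.txt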