
For a long time, async in Python was a horrible hack that relied on monkey patching the socket module. The newer asyncio stuff is quite nice by comparison, but the problem is that Python, due to its popularity, has many libraries that haven't been upgraded to support it.

Python has always had deployment issues, IMO. In Java, 99% of all library dependencies are pure JARs, and you rarely need to depend on native libraries. You can also assemble an executable fat JAR that will work everywhere, and the better build tools (e.g., Maven, Gradle) help too.

Compare that with Python, where even accessing an RDBMS was an exercise in frustration: it required installing the right blobs and library headers via the OS's package manager, with Postgres being particularly painful. NOTE: I haven't deployed anything serious built with Python in a while, so maybe things are better now, but it couldn't get much better, IMO.




I worked at a place where we had machine learning systems with a big pile of dependencies that pip could not consistently resolve. I figured out what most of the technical problems were, but I was still struggling with wetware problems, and they eventually put me on a Scala/TypeScript project instead.

One big problem is that pip just starts downloading and installing things optimistically: it does not get a global view of the dependencies, and if it finds a conflict it can't reliably back out from where it is and find a good configuration. The answer is to do what Maven does or what conda does: download the dependency graph of all the matching versions and get a solve before you start downloading. Towards the end of my time on that project I had built something that assembled a "wheelhouse" of the wheels necessary to run my system and would install them directly.
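
To make the "get a solve before you start downloading" idea concrete, here is a minimal sketch of a backtracking resolver that works purely on metadata. Everything in it is an assumption for illustration: the `INDEX` dictionary is a made-up package index, constraints are plain sets of allowed version strings rather than real version specifiers, and this is not how pip, Maven, or conda actually implement their solvers.

```python
# Hypothetical index: name -> version -> {dependency: allowed versions}.
INDEX = {
    "app": {"1.0": {"lib-a": {"1.0", "2.0"}, "lib-b": {"1.0", "2.0"}}},
    "lib-a": {"2.0": {"core": {"2.0"}}, "1.0": {"core": {"1.0"}}},
    "lib-b": {"2.0": {"core": {"1.0"}}, "1.0": {"core": {"1.0"}}},
    "core": {"2.0": {}, "1.0": {}},
}


def resolve(requirements, pinned=None):
    """Return {name: version} satisfying every requirement, or None.

    requirements is a list of (name, allowed_versions) pairs.
    Nothing is downloaded or installed here; only metadata is consulted.
    """
    pinned = dict(pinned or {})
    if not requirements:
        return pinned
    (name, allowed), rest = requirements[0], requirements[1:]
    if name in pinned:
        # Already chosen: the earlier pick must satisfy this constraint too.
        return resolve(rest, pinned) if pinned[name] in allowed else None
    # Try candidates newest first. Plain string sorting is only good enough
    # for this toy data; real version ordering needs a proper parser.
    for version in sorted(INDEX[name], reverse=True):
        if version not in allowed:
            continue
        deps = list(INDEX[name][version].items())
        result = resolve(deps + rest, {**pinned, name: version})
        if result is not None:
            return result
    # No candidate worked: the caller backtracks to its next choice.
    return None


plan = resolve([("app", {"1.0"})])
print(plan)
```

In this toy index the newest-first pick of lib-a 2.0 dead-ends (it needs core 2.0 while every lib-b needs core 1.0), so the search backs out and settles on lib-a 1.0. Only once a full plan exists would you go and fetch the wheels.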

What I figured out was that you could download just the dependencies from a wheel with 2 or 3 range requests, because a wheel is just a ZIP file: you can download the header and the directory from the end of the file, which tells you where the metadata is, and then download just that. Recently PyPI got some sense and now they let you download just the metadata.
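
For anyone curious, the range-request trick looks roughly like this. It is a sketch under assumptions, not the code I actually wrote: `fetch(start, end)` is a hypothetical stand-in for an HTTP range request (here it just slices an in-memory wheel built for the demo), and it ignores ZIP64 and archive comments that happen to contain the end-of-directory signature.

```python
import io
import struct
import zipfile
import zlib

EOCD = struct.Struct("<4sHHHHIIH")           # end-of-central-directory record, 22 bytes
CDIR = struct.Struct("<4sHHHHHHIIIHHHHHII")  # central directory entry, 46 bytes
LOCAL = struct.Struct("<4sHHHHHIIIHH")       # local file header, 30 bytes


def read_wheel_metadata(fetch, size, tail=65536):
    """fetch(start, end) must return bytes [start, end) of the remote file."""
    # Request 1: the tail of the file, which holds the end-of-directory record.
    tail_start = max(0, size - tail)
    buf = fetch(tail_start, size)
    pos = buf.rfind(b"PK\x05\x06")
    if pos < 0:
        raise ValueError("no end-of-central-directory record found")
    _, _, _, _, count, cd_size, cd_off, _ = EOCD.unpack_from(buf, pos)

    # Request 2 (only if the tail did not already cover it): the directory.
    if cd_off >= tail_start:
        cdir = buf[cd_off - tail_start:cd_off - tail_start + cd_size]
    else:
        cdir = fetch(cd_off, cd_off + cd_size)

    # Walk the directory and record where every entry starts.
    entries, p = [], 0
    for _ in range(count):
        f = CDIR.unpack_from(cdir, p)
        method, csize = f[4], f[8]
        nlen, xlen, clen, local_off = f[10], f[11], f[12], f[16]
        name = cdir[p + 46:p + 46 + nlen].decode("utf-8")
        entries.append((local_off, name, method, csize))
        p += 46 + nlen + xlen + clen

    entries.sort()
    for i, (off, name, method, csize) in enumerate(entries):
        if name.endswith(".dist-info/METADATA"):
            # The entry ends where the next one (or the directory) begins.
            end = entries[i + 1][0] if i + 1 < len(entries) else cd_off
            # Request 3: local header plus compressed bytes of METADATA only.
            blob = fetch(off, end)
            hdr = LOCAL.unpack_from(blob, 0)
            start = 30 + hdr[9] + hdr[10]  # skip file name and extra field
            data = blob[start:start + csize]
            if method == zipfile.ZIP_DEFLATED:
                data = zlib.decompress(data, -15)  # raw deflate stream
            return data.decode("utf-8")
    raise ValueError("no METADATA entry found")


# Demo: build a fake wheel in memory and count the simulated range requests.
raw = io.BytesIO()
with zipfile.ZipFile(raw, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("demo/__init__.py", "x = 1\n" * 5000)
    zf.writestr("demo-1.0.dist-info/METADATA",
                "Metadata-Version: 2.1\nName: demo\nVersion: 1.0\n"
                "Requires-Dist: requests>=2\n")
wheel = raw.getvalue()
requests_made = []


def fetch(start, end):
    requests_made.append((start, end))
    return wheel[start:end]


metadata = read_wheel_metadata(fetch, len(wheel), tail=64)
print(metadata)
print(len(requests_made), "range requests")
```

The tiny `tail=64` is only there to force the separate directory request in the demo. With a realistic tail size the directory usually comes back in the first request, which is where the "2 or 3" comes from.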

And that's the story of Python packaging. Things are really going in the right direction, but progress has been slow because the community has mistaken "98% correct" (i.e. wrong) for "has 98% of the features somebody might want". It might have been a lot better if somebody with some vision and no tolerance for ambiguity had gotten in charge a long time ago.


A new dependency resolver was introduced [0] with pip 20.3 (in 2020) which sounds like it's meant to address the problem you're describing. Were you using an earlier version of pip or is this still a problem?

[0] https://pip.pypa.io/en/latest/user_guide/#changes-to-the-pip...


> In Java, 99% of all library dependencies are pure JARs

Yes, this is the difference... The Python community has in practice chosen more native dependencies; Java has not. But Java JNI code (if you ever do have it) is just as painful.

> with Postgres being particularly painful

You want a pure Python package, and those have gotten much better.

asyncpg is really, really good if you want async.

Otherwise, pg8000.


It got better, with containers.


Where I worked containers just gave the data scientists superpowers at finding corrupted Python runtimes. I don't know where they got one that had a Hungarian default charset, but they did.



