Slow imports and startup speed hurt Python developers on moderate and large projects. It's the downside of not having static module exports. I believe JS/TypeScript folks try to avoid this trap. In Python, to know what a module exports you have to run the module; there is no way to find it out through static analysis alone.
Making imports fast for data classes may solve some of the problems. Big old projects like Plone/Zope solved this by building a more generic lazy import system that is used extensively.
>Python imports and slow startup speed hurts Python developers on moderate and large projects
What KLOC counts as moderate or large? I have not run into a project where imports took that long.
>I believe JS/TypeScript folks try to avoid this trap.
Have you ever had to run a node project? I haven't benched it but I think it would be faster to compile a rust project and start that up than use node. But I can't web so maybe there's special ways to make node start quickly that I just don't know about. (I think there's a law where you can get answers more quickly by declaring something impossible to trigger people into flooding you with great suggestions :-) ).
The problem I experienced multiple times in 50-200 KLOC projects is not the time needed to import the modules, but the memory consumption caused by the imports. Moving some imports from top-level module statements to inner functions' code could improve the memory consumption several times, e.g. from 250MB per process to 80MB per process.
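A minimal sketch of that technique, using `json` as a stand-in for a genuinely heavy dependency: the import moves from the top of the module into the function body, so processes that never call the function never pay the memory cost.

```python
def parse_config(text):
    # Deferred import: the module is loaded on the first call,
    # not when this file is imported.
    import json
    return json.loads(text)

print(parse_config('{"debug": true}'))  # {'debug': True}
```

The trade-off is that the first call takes the import hit, and the dependency is no longer visible at the top of the file.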
> You should pronounce it as "kludg-in" as in "runnin" or "trippin". So, if someone asks "what are you doing?", you don't say "I'm using cluegen." No, you'd say "I'm kludgin up some classes." The latter is more accurate as it describes both the tool and the thing that you're actually doing. Accuracy matters.
I read the readme only after this comment and was surprised by how detailed it was. Why would someone put effort into a complicated tool, benchmark it, etc., and then not really take it seriously? Then I noticed who the author was, and it suddenly made complete sense.
I don't know the author at all really, but I do appreciate a professional in any field who can find humor in their work and avoid taking themselves too too seriously.
The writing in the README reminds me of people like Derek Lowe (author of the In The Pipeline pharma/bio/whatever blog) and John D. Clark (author of Ignition!) who can create exceptional things and deliver knowledge in a humorous and engaging way.
I don’t know, the code itself is shorter and easier to read than the readme initially was. And maybe this will dissuade folks from installing and using this if they’re not committed to maintaining it as part of their dependencies. Dependencies do require your maintenance and oversight of new code revisions, but few people schedule such time. It’s essential, though: any dependency could arbitrarily change its behaviour at any time, technically breaking your code until you can produce a fix. So dependencies are technically yours to maintain — you just choose to limit your maintenance from a “fork” to a version string and API usage, but the maintenance burden still exists...
The main problem with this approach is that it conflates type hierarchies (through class inheritance) with developer convenience; there are reasons why both attrs and dataclasses chose the decorator approach.
Every ingredient in the rack is spice. There is a "proper" amount to use for the recipe. Let good taste be the guide. Never go Full [language I like to drag].
i'm curious about that __eq__. it generates code like
(self.x, self.y) == (other.x, other.y)
but i think this would be more efficient:
self.x == other.x and self.y == other.y
because you avoid creating a tuple (possibly heap-allocated) and if `.x` differs, you avoid a dictionary lookup¹ for `.y` thanks to short-circuiting. i think i benchmarked it at some point, but that was a while ago (on CPython 3.5)
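a quick (and unscientific) way to re-check that on a current CPython — comparing the two shapes on slotted instances whose first field already differs, which is the case where short-circuiting should help most:

```python
import timeit

class P:
    __slots__ = ('x', 'y')
    def __init__(self, x, y):
        self.x, self.y = x, y

a, b = P(1, 2), P(3, 4)  # .x differs, so short-circuiting can skip .y entirely

t_tuple = timeit.timeit(lambda: (a.x, a.y) == (b.x, b.y), number=200_000)
t_short = timeit.timeit(lambda: a.x == b.x and a.y == b.y, number=200_000)
print(f"tuple: {t_tuple:.4f}s  short-circuit: {t_short:.4f}s")
```

(results will vary by interpreter version; the lambda overhead also dominates at this scale, so a real benchmark would inline the comparison into generated methods)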
i never got around to generating the methods lazily, maybe i should!
for correctness reasons i switched from interpolating raw strings to <array of (line, indent-level)> and a bunch of wrappers – generating if-elif-else chains via raw strings gets scary. but they work well enough here
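a tiny illustration of that representation (names hypothetical, not the actual implementation): each generated line carries its indent level, and indentation is applied only at render time, so nesting can't be silently mangled by string pasting:

```python
from types import SimpleNamespace

# Each entry is (source line, indent level); indentation is applied
# when rendering, not while assembling the lines.
lines = [
    ("def eq(self, other):", 0),
    ("if self.x == other.x:", 1),
    ("return self.y == other.y", 2),
    ("return False", 1),
]
src = "\n".join("    " * indent + text for text, indent in lines)

namespace = {}
exec(src, namespace)  # compile the rendered source into a real function

p = SimpleNamespace(x=1, y=2)
print(namespace["eq"](p, SimpleNamespace(x=1, y=2)))  # True
```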
---
1. iirc even though it's using slots, it still has to look up the descriptors for `.x` and `.y` in the class dictionary. another possible optimization would be to trade off memory (and extensibility) for time and "cache" those. something in the vein of
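(a hypothetical sketch of that pattern — binding the slot descriptors' `__get__` once, at method-generation time:)

```python
class Point:
    __slots__ = ('x', 'y')
    def __init__(self, x, y):
        self.x = x
        self.y = y

def make_eq(cls):
    # look up the slot descriptors once here, instead of on
    # every attribute access inside __eq__
    get_x = cls.__dict__['x'].__get__
    get_y = cls.__dict__['y'].__get__
    def eq(self, other):
        if other.__class__ is not self.__class__:
            return NotImplemented
        return get_x(self) == get_x(other) and get_y(self) == get_y(other)
    return eq

Point.__eq__ = make_eq(Point)
print(Point(1, 2) == Point(1, 2))  # True
```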
here, `get_x` and `get_y` would be closed-over in `eq`, so they shouldn't incur a dictionary lookup.
which of course adds significant complexity, but that may be a sensible trade-off
> Yes. Yes, you could do that if you wanted your class to be slow to import, wrapped up by more than 1000 lines of tangled decorator magic, and inflexible.
At first I thought: there's no way someone is this categorical about some obscure aspect of python's internals, then I noticed it's from dabeaz. What a man :)
> Making imports fast for data classes may solve some of the problems. Big old projects like Plone/Zope solved this by building a more generic lazy import system that is used extensively.
Use zope.deferredimport package for this:
https://zopedeferredimport.readthedocs.io/en/latest/narrativ...
Though I am not sure if zope.deferredimport has been updated to play nicely with modern typing tools like editors and MyPy.
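For newer codebases, a stdlib-only alternative is a module-level `__getattr__` (PEP 562), which type checkers can be taught about via a `TYPE_CHECKING` block or a stub file. A minimal sketch — built as an in-memory module here purely so the effect is visible in one script; in real code the `__getattr__` would sit in the lazy module's own file:

```python
import sys
import types

lazy_src = """
def __getattr__(name):
    # runs only when normal module attribute lookup fails,
    # so the heavy import is deferred until first use
    if name == "sqrt":
        from math import sqrt
        return sqrt
    raise AttributeError(name)
"""

mod = types.ModuleType("lazy_math")
exec(lazy_src, mod.__dict__)
sys.modules["lazy_math"] = mod

import lazy_math
print(lazy_math.sqrt(16.0))  # 4.0 -- math is imported only at this point
```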