pdhborges's comments | Hacker News

How much time did PyAST_mod2obj actually take? The rewrite is 16x faster, but the article doesn't make it clear whether most of the speedup came from switching to the ruff parser (especially because it puts the GC overhead at only 35% of the runtime).


That's a good question. I don't have an easy way to rerun the comparison since this actually happened a while ago, but I do remember some relevant numbers.

In the first iteration of the Rust extension, I actually used the parser from RustPython. Although I can't find the benchmark at the moment, I think the RustPython parser actually measured worse than the built-in ast.parse (when both returned Python objects).

Even with this parser, IIRC the relevant code was around 8-11x faster when it avoided the Python objects. Apart from just the 35% spent in GC itself, the memory pressure appeared to be causing CPU cache thrashing (`perf` showed much poorer cache hit rates). I'll admit though that I am far from a Valgrind expert, and there may have been another consequence of the allocations that I missed!
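For anyone curious, the kind of measurement I mean looks roughly like this (a sketch of my own, not the original benchmark): time ast.parse with the collector enabled and disabled and glance at the collection stats.

    import ast
    import gc
    import time

    # Synthetic module big enough to allocate lots of AST node objects.
    SOURCE = "\n".join(f"def f_{i}(x):\n    return x + {i}" for i in range(2000))

    def bench(repeats=20):
        start = time.perf_counter()
        for _ in range(repeats):
            ast.parse(SOURCE)
        return (time.perf_counter() - start) / repeats

    with_gc = bench()
    gc.disable()
    without_gc = bench()
    gc.enable()

    print(f"with GC: {with_gc * 1e3:.1f} ms  without GC: {without_gc * 1e3:.1f} ms")
    print(gc.get_stats())  # per-generation collection counts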


It doesn't have to be puzzling; just read the motivation section of https://peps.python.org/pep-0703/


I know that document. It doesn't really answer this point, though. It motivates the need for parallelism in slow Python by noting that this matters once the other, performance-critical code is in extensions.

But the main point against multiprocessing seems to be that spawning new processes is slow ...

That single "alternatives" paragraph doesn't answer at all why multiprocessing isn't viable for Python-level parallelism.

I am no longer heavily invested in Python; I am sure there are discussions or documents somewhere that go into this more. It might simply be that everyone is used to threads, so you should support them for sheer familiarity. All I am saying is that it's not obvious to a casual observer.
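For what it's worth, the spawn-cost claim is easy to sanity-check; a rough sketch of my own (not from the PEP):

    import multiprocessing as mp
    import threading
    import time

    def noop():
        pass

    def timed(make_worker, n=50):
        start = time.perf_counter()
        workers = [make_worker() for _ in range(n)]
        for w in workers:
            w.start()
        for w in workers:
            w.join()
        return time.perf_counter() - start

    if __name__ == "__main__":
        print("threads:  ", timed(lambda: threading.Thread(target=noop)))
        print("processes:", timed(lambda: mp.Process(target=noop)))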


> But once your tables start to grow it will become misery to work with.

I buy the data-layout argument for databases with clustered tables and no support for semi-sequential UUIDs. But the storage argument seems barely applicable to me: if someone needs to add a column to one of these tables, it basically offsets a 4-byte optimization already.


8 bytes (assuming 128-bit instead of 64-bit), but yeah.

It's not quite as simple as saving 8 bytes per row though. It's 8 bytes for the UUID, plus 8 for at least the PK, plus 8 more for any other keys the UUID is in.

Then you need to do that for any foreign-key-fields and keys those are in as well.

However, unless your table has very few non-uuid columns, the rest of the table size will dwarf the extra n*8 bytes here. And if you are storing any blobs (like pictures, documents etc) then frankly all the uuids won't add up to anything.
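As a back-of-envelope sketch of that point (the numbers are my own assumptions, not anyone's real schema):

    ROWS = 100_000_000
    EXTRA_PER_KEY = 16 - 8   # UUID (16 bytes) vs bigint (8 bytes)
    COPIES = 4               # PK column + PK index + one FK column + FK index

    extra = ROWS * EXTRA_PER_KEY * COPIES
    payload = ROWS * 200     # assume a modest 200-byte row payload

    print(f"extra key bytes: {extra / 2**30:.1f} GiB")    # ~3.0 GiB
    print(f"payload bytes:   {payload / 2**30:.1f} GiB")  # ~18.6 GiB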

In summary, whether using UUIDs is right for you depends a lot on your context. An insert-only log table with a billion rows is a very different use case to, say, a list of employees or customers.

Generally I'm very comfortable with UUIDs for tables of, say, fewer than 100M rows. Very high insert rates, though, suggest tables bigger than that, which perhaps benefit from different strategies.

When it comes to data stored in multiple places (think on-phone first, synced to the cloud, exported from multiple places to consolidated BI systems), UUIDs solve a lot of problems.

Context is everything.


I'll be blunt: having experienced the damage caused by DRF Model* accelerators in mid-sized codebases, I just hate them.

CRUD stops being CRUD quickly, and these CRUD accelerators just introduce more non-linearity into development. It's easy to hit a wall and hack around the accelerator to get a little bit of custom behaviour, leaving behind API implementations that are totally "irregular" with respect to each other, each one mixing different low-level changes into the middle of high-level accelerators.


Thanks for raising an important point, @pdhborges. You've highlighted the limitations often encountered with traditional CRUD accelerators, especially as projects scale. Django Ninja CRUD is designed to tackle exactly these challenges. Its compositional approach offers flexibility to adapt and customize as needed, without the mess of hacking around the accelerator. It's all about making it easier for developers to maintain consistency across APIs while allowing for the unique customisations each project requires.

And to echo @WD-42's point, if a specific Django Ninja CRUD view doesn't meet your evolving needs, you can seamlessly switch it out for a custom view written in vanilla Django Ninja. It's designed to be flexible and developer-friendly, ensuring you're not locked into a one-size-fits-all solution.


I'm always sensitive to this issue, because I've seen it happen too. Starlite (now Litestar) has fairly good escape hatches I think: you can convert a database model to an API model with a method call, but you can also modify and add fields, or create a totally separate representation, or return multiple database models from one API call. So far so good.
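To show the shape of that escape hatch with plain Pydantic as a stand-in (this is not Litestar's actual DTO API, just the general pattern):

    from dataclasses import dataclass
    from pydantic import BaseModel, ConfigDict

    @dataclass
    class UserRow:                 # stand-in for a database model
        id: int
        email: str
        password_hash: str

    class UserOut(BaseModel):      # separate API representation
        model_config = ConfigDict(from_attributes=True)
        id: int
        email: str
        display_name: str = "anonymous"   # added field that the DB model lacks

    row = UserRow(id=1, email="a@example.com", password_hash="x")
    print(UserOut.model_validate(row))    # one method call: DB model -> API model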


I think that's why the author went with a composable approach, as opposed to say, DRF's ModelViewSets. Similar, but I think the escape hatch here is much easier to open. I like it.


I think Rails made a good decision by adding code generators and calling it 'scaffolding'. The Rails CLI will give you all the CRUD you want. Of course, a lot of libraries try to abstract that with DSLs or extra boilerplate.

Doing it at runtime is a lot more difficult and complicated; why not just use parameterised templates to create the right files in the right place?
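Something like this is all scaffolding really needs to be (the paths and template are hypothetical):

    from pathlib import Path
    from string import Template

    VIEW_TEMPLATE = Template('''\
    from ninja import Router

    router = Router()

    @router.get("/${resource}/{item_id}")
    def get_${resource}(request, item_id: int):
        ...  # generated starting point; it is just a file, edit it freely
    ''')

    def scaffold(resource: str, app_dir: Path) -> None:
        out = app_dir / "views" / f"{resource}.py"
        out.parent.mkdir(parents=True, exist_ok=True)
        out.write_text(VIEW_TEMPLATE.substitute(resource=resource))

    scaffold("books", Path("myapp"))   # writes myapp/views/books.py once; no runtime magic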


IMO, all of these acronyms are crap except for HTTP. Stop trying to write these one-size-fits-all APIs and just provide an HTTP server that takes arbitrary data and responds with arbitrary data. If your app has functions that work this way, your API can work that way too.
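In that spirit, a minimal stdlib-only sketch (handle_request is a hypothetical app function):

    import json
    from http.server import BaseHTTPRequestHandler, HTTPServer

    def handle_request(data):          # hypothetical: an ordinary app function
        return {"echo": data}

    class Handler(BaseHTTPRequestHandler):
        def do_POST(self):
            length = int(self.headers.get("Content-Length", 0))
            data = json.loads(self.rfile.read(length) or b"null")
            body = json.dumps(handle_request(data)).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    if __name__ == "__main__":
        HTTPServer(("", 8000), Handler).serve_forever()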


If a government is small, does it have enough power to impose the rule of law and be an economic and social moderator?


> If a government is small, does it have enough power to impose the rule of law and be an economic and social moderator?

That's a very binary take. Nobody argues for "smaller in all dimensions", just smaller in specific dimensions.


Depends on how small it is. What I can argue for is that there are probably parts of the government that could be removed. Bureaucracy in Western countries has only increased over the last 60 years, for example. Rules are almost never removed.


Congrats to the team. I think it is particularly impressive (and a bit scary) that DDL changes are also synchronized to the clients automatically!


They are not polyfills. Multiple scheduling modes are provided for libraries that are not thread-safe (it's a total mess and I avoid these wrappers like the plague).


I've had to weld some async and sync Python together with queues and callbacks; it's not pretty.
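For anyone in the same spot, the least-ugly bridge I know of looks roughly like this (a sketch, not the actual code I ended up with): run the event loop in a background thread and let synchronous callers block on futures.

    import asyncio
    import threading

    loop = asyncio.new_event_loop()
    threading.Thread(target=loop.run_forever, daemon=True).start()

    async def fetch(n):          # an async function living on the loop
        await asyncio.sleep(0.1)
        return n * 2

    def fetch_sync(n):           # callable from ordinary blocking code
        future = asyncio.run_coroutine_threadsafe(fetch(n), loop)
        return future.result()   # blocks the calling thread, not the loop

    print(fetch_sync(21))        # -> 42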


Is this resource up to date? ALTER TABLE ADD FOREIGN KEY NOT VALID (PARENT) shows a stronger lock than ALTER TABLE ADD FOREIGN KEY (PARENT), while the PG docs say the locks should be the same.


Isn't a Pareto optimal solution always at the edge of the feasible space?


Yet it is usually in the middle of the respective dimensions, not at the extreme ends of either.
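Both can be true at once; a toy illustration (my own example): minimize both coordinates over the unit disk. Every Pareto-optimal point lies on the boundary arc, yet almost all of them sit strictly between the extremes of either single axis.

    import math

    # Feasible set: the unit disk. Objectives: f1(x, y) = x and f2(x, y) = y, both minimized.
    arc = [(-math.cos(t), -math.sin(t))
           for t in [i * (math.pi / 2) / 20 for i in range(21)]]   # boundary points with x <= 0, y <= 0
    interior = [(-0.3, -0.3), (0.0, 0.0), (-0.5, 0.2)]
    points = arc + interior

    def dominated(p, pts):
        return any(q != p and q[0] <= p[0] and q[1] <= p[1]
                   and (q[0] < p[0] or q[1] < p[1]) for q in pts)

    pareto = [p for p in points if not dominated(p, points)]
    print(all(p in arc for p in pareto))   # True: all Pareto points are on the boundary
    print(pareto[10])                      # ~(-0.707, -0.707): mid-range in each dimension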


Should this be the takeaway? It looks to me like people are building pipelines without taking into account that they change roles every time they have to build a package, from package consumer to package packager. Even if the package developers set an upper bound on a build dependency, it is the packager's responsibility to provide a deterministic build environment.

