
`insert_all` seems to be an example of what I mean about how the framework encourages you to do the wrong thing. Here there is a lower-level escape hatch for bulk inserts, but the docs say it doesn't run your callbacks/validations. So if you're using "good" design (or using libraries that work by hooking into that functionality), you can't use it. Laravel was the same way.

The new queue you linked is database backed, but the whole point is that you want to just run a job without needing to serialize anything outside of your process. It should just schedule it onto the thread pool and give you a promise for when it's done.
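To make "schedule it onto the thread pool and get a promise" concrete, here is a minimal sketch in Python using the standard library's `concurrent.futures`. The job function `send_welcome_email` and the pool size are made up for illustration, and nothing here survives a process restart, which is exactly the trade-off against a database-backed queue.

```python
from concurrent.futures import ThreadPoolExecutor

# In-process pool: jobs are plain function calls, nothing is serialized.
pool = ThreadPoolExecutor(max_workers=4)

def send_welcome_email(user_id: int) -> str:
    # Hypothetical job body; a real one would call a mailer here.
    return f"sent welcome email to user {user_id}"

# submit() returns a Future immediately; the work runs on a pool thread.
future = pool.submit(send_welcome_email, 42)

# The caller decides when (or whether) to wait for completion.
result = future.result(timeout=5)

pool.shutdown(wait=True)
```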

The Kafka thing also seems to be an example of what I mean: in Scala I'd just make a `new Queue` with a thread-safe library, and have a worker pull items off and do an insert every hundred rows or so, or after e.g. 5 ms have passed, whichever comes first. No extra infrastructure needed, minimal RAM used, your queueing delay is in the single-digit ms, and you get the scaling benefits. Takes maybe 10-20 lines of code.
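A rough sketch of that worker, in Python rather than Scala purely for illustration. `BatchingWriter`, `flush`, `max_batch`, and `max_delay` are hypothetical names; `flush` stands in for whatever bulk insert you have. It flushes at 100 rows or 5 ms after the first row of a batch arrives, whichever comes first.

```python
import queue
import threading
import time

class BatchingWriter:
    """Collects rows on an in-process queue and flushes them in batches."""

    _STOP = object()  # sentinel that tells the worker to drain and exit

    def __init__(self, flush, max_batch=100, max_delay=0.005):
        self._flush = flush            # callable taking a list of rows
        self._max_batch = max_batch
        self._max_delay = max_delay    # seconds
        self._queue = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def put(self, row):
        self._queue.put(row)

    def close(self):
        self._queue.put(self._STOP)
        self._worker.join()

    def _run(self):
        while True:
            first = self._queue.get()  # block until there is work
            if first is self._STOP:
                return
            batch = [first]
            deadline = time.monotonic() + self._max_delay
            while len(batch) < self._max_batch:
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    break
                try:
                    item = self._queue.get(timeout=remaining)
                except queue.Empty:
                    break
                if item is self._STOP:
                    self._flush(batch)  # flush what we have, then exit
                    return
                batch.append(item)
            self._flush(batch)

# Usage: here flush just records each batch; swap in a real bulk insert.
batches = []
writer = BatchingWriter(flush=batches.append)
for i in range(250):
    writer.put({"id": i})
writer.close()
```

A single worker thread keeps the batches in arrival order; if the database can take concurrent bulk inserts you could run several workers, at the cost of ordering.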

You can then take that and abstract it into a repository pattern so that you could have an ORM that does batching for you with single item interfaces (for non-transactional workflows), but none of them seem to do this.
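For what the single-item interface could look like, here is a hedged Python sketch. `BatchingRepository` and `insert_many` are invented names, not any ORM's API. Each `insert` returns a future that resolves once the batch containing that row has been written, so callers still think in single rows.

```python
import queue
import threading
import time
from concurrent.futures import Future

class BatchingRepository:
    """Single-row insert() on the outside, bulk inserts on the inside."""

    def __init__(self, insert_many, max_batch=100, max_delay=0.005):
        self._insert_many = insert_many  # hypothetical bulk-insert callable
        self._max_batch = max_batch
        self._max_delay = max_delay      # seconds
        self._queue = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def insert(self, row) -> Future:
        done = Future()
        self._queue.put((row, done))
        return done

    def close(self):
        self._queue.put(None)  # None tells the worker to drain and exit
        self._worker.join()

    def _run(self):
        while True:
            entry = self._queue.get()
            if entry is None:
                return
            pending = [entry]
            deadline = time.monotonic() + self._max_delay
            while len(pending) < self._max_batch:
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    break
                try:
                    entry = self._queue.get(timeout=remaining)
                except queue.Empty:
                    break
                if entry is None:
                    self._write(pending)
                    return
                pending.append(entry)
            self._write(pending)

    def _write(self, pending):
        rows = [row for row, _ in pending]
        try:
            self._insert_many(rows)
        except Exception as exc:
            # The whole batch failed, so every caller in it sees the error.
            for _, done in pending:
                done.set_exception(exc)
        else:
            for _, done in pending:
                done.set_result(None)

# Usage: the "table" is just a list here.
table = []
repo = BatchingRepository(insert_many=table.extend)
futures = [repo.insert({"id": i}) for i in range(10)]
for f in futures:
    f.result(timeout=5)  # returns once that row's batch has been written
repo.close()
```

Note the failure mode: one bad row fails every future in its batch, which is part of why this only fits non-transactional workflows.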




I suppose I've just been in Rails land for a while, so I can't make an apples-to-apples comparison to how other frameworks approach things, but I don't think insert_all is encouraging anything wrong. By the time a Rails team is reaching for it, I can almost guarantee they understand the implications of it.

And again, maybe I'm just not understanding, but I really like having our background processes handled completely separately from our main web application. Maybe it's just the peace of mind of knowing that I can scale them independently of each other.


It's not that insert_all is encouraging anything wrong; it's that the normal way to use ActiveRecord does. insert_all is the right way to do things performance-wise, so you'd want to use it when possible, but if you were using other features of the framework like callbacks/validations for create/update, then you can't. The happy path of an ORM tends to push you in a direction where bad performance all over the place is the default, and it does it in a way where, if you don't have properly calibrated performance expectations, you might think the bottleneck is slow IO, when actually the database could easily handle 10x the workload with better access patterns.
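To show the two access patterns side by side without Rails, here is a small Python sketch using the standard library's `sqlite3`. The `events` table and the function names are made up, and an in-memory SQLite database won't reproduce the network round trips of a real one, so treat it as the shape of the difference and not a benchmark.

```python
import sqlite3

def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, name TEXT)")
    return conn

def insert_one_at_a_time(conn, rows):
    # The ORM happy path: one statement and one commit per record.
    for row in rows:
        conn.execute("INSERT INTO events (name) VALUES (?)", row)
        conn.commit()

def insert_batched(conn, rows):
    # The insert_all-style path: all rows in a single transaction.
    with conn:  # commits on success, rolls back on error
        conn.executemany("INSERT INTO events (name) VALUES (?)", rows)

rows = [(f"event-{i}",) for i in range(1000)]

slow, fast = make_db(), make_db()
insert_one_at_a_time(slow, rows)
insert_batched(fast, rows)
```

Both end up with the same rows; the difference is how many statements and commits it took to get there.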



