Hacker News

But that requires essentially rearchitecting the Reddit backend to be less efficient.

A page has comments; a comment has comments; a comment has votes; a vote has a user.

That is to say, a vote is a many-to-many join between user and comment.

To do what you want, you'd also have to give the page the concept of "people who have voted on a comment that belongs to me or one of my children".

>User opens page, they get a list of comments with scores. And a possibly empty list of votes. 99.9% is probably less than 20 votes from that user so 160 bytes or less. Worst case is what 8 byte comment ID * 1,000 up-votes = 8k.

The problem isn't sending additional data over the wire. It's that if, say, I've voted on 500 of the comments on a page, then checking each of 500 comments against my 500-item vote list is 500 × 500 = 250,000 operations, whereas a bloom implementation is O(n). And also the whole thing makes your database structure repetitive and icky.
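To make the arithmetic concrete, here is an illustrative sketch (hypothetical IDs, nothing to do with Reddit's actual data model) contrasting the quadratic per-comment list scan with a one-time hash set, one standard way to get the per-check cost down to O(1):

```python
# Hypothetical page: 1,000 comment IDs, of which the user has voted on 500.
comments = list(range(1000))
my_votes = list(range(0, 1000, 2))  # 500 voted-on comment IDs

# Naive: scan the vote list for every comment -> O(comments * votes),
# i.e. up to 500,000 comparisons here (the thread's 250,000-operation point).
naive_hits = [c for c in comments if c in my_votes]

# One standard fix: build a set once, then each membership check is O(1),
# making the whole pass O(comments + votes).
vote_set = set(my_votes)
fast_hits = [c for c in comments if c in vote_set]

assert naive_hits == fast_hits
```

Both passes find the same 500 matches; only the comparison count differs.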




In your approach, comments have votes: the server must check the bloom filter for each comment, then do a fetch for each match. In the case of 500 votes that's 500 fetches, which cost at least votes × log(comment votes), because you can't trust the filter.
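A minimal bloom filter sketch (illustrative only, not Reddit's actual implementation) shows why a match can't be trusted on its own: the filter can return false positives, so every "maybe" still needs a real database fetch to confirm:

```python
import hashlib

class BloomFilter:
    """Minimal bloom filter: k hash positions set in an m-bit field."""

    def __init__(self, m=1024, k=3):
        self.m, self.k = m, k
        self.bits = 0

    def _positions(self, item):
        # Derive k bit positions from independent hashes of the item.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:4], "big") % self.m

    def add(self, item):
        for p in self._positions(item):
            self.bits |= 1 << p

    def might_contain(self, item):
        # True means "maybe present": it can be a false positive when
        # other items happen to have set the same bits, so the server
        # must still verify every hit with a real lookup.
        return all((self.bits >> p) & 1 for p in self._positions(item))

voted = BloomFilter()
for comment_id in [10, 42, 99]:
    voted.add(comment_id)
```

Items that were added always test positive; items that weren't usually test negative but occasionally don't, which is exactly why the 500 matches above turn into 500 verification fetches.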

In my approach that's client-side most of the time. Further, it's O(n) to compare matches between two sorted lists, so just sort them once in O(n log n). And a binary search over 500 items is about log₂(500) ≈ 9 comparisons, which is faster than computing most hashes.
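The O(n) comparison of two sorted lists is a single merge pass; a sketch (illustrative names, not actual site code):

```python
def sorted_intersection(a, b):
    """Intersect two sorted lists in one linear merge pass: O(len(a) + len(b))."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1  # advance the list with the smaller head
        else:
            j += 1
    return out

# Hypothetical usage: which comments on the page has the user voted on?
page_comment_ids = sorted([3, 8, 15, 21, 42])
user_voted_ids = sorted([8, 21, 77])
matches = sorted_intersection(page_comment_ids, user_voted_ids)  # [8, 21]
```

Sorting each list once up front costs O(n log n); after that, every comparison between the two is a linear walk with no hashing at all.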

And of course for the most common case of no votes it's a single lookup and done.

PS: As to changing the back end: yes, if the initial implementation is bad then you might have a point. But, again, fixing the problem also works and produces far less code to maintain.





