
Thanks for taking the effort. Are tags implemented as just references from a tag "foo" node to the tagged item?



Yes and no. This is the "small lie" mentioned in there, where I said that there's only one sort of row. The StringListColumn maintains a separate row of string values, and each reference to a tag just gets an index to the tag, but tags are not "first-class" nodes: they aren't separate nodes in the database.

The first time I implemented a system like this, back in 2004, I made tags separate nodes. That's more flexible in theory, but since we had a specific class of applications in mind this time, it's faster for our uses to check whether an item has a given tag by keeping a list of tags with each item. Our typical access pattern means that we're already looking at an item and just want to know if it has a given tag.
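
For illustration, here is a minimal Python sketch of that layout, not the actual implementation. Only the name StringListColumn comes from the description above; the internals, the method names (`add_tag`, `has_tag`, `tags_of`) and the use of integer item ids are assumptions made for the example.

```python
class StringListColumn:
    """Hypothetical sketch: tag strings stored once, items hold tag indices."""

    def __init__(self):
        self._strings = []    # tag index -> tag string (the separate string store)
        self._index_of = {}   # tag string -> tag index, for fast lookup
        self._item_tags = {}  # item id -> list of tag indices for that item

    def _intern(self, tag):
        # Return the index of an existing tag string, or store it and assign one.
        idx = self._index_of.get(tag)
        if idx is None:
            idx = len(self._strings)
            self._strings.append(tag)
            self._index_of[tag] = idx
        return idx

    def add_tag(self, item_id, tag):
        idx = self._intern(tag)
        tags = self._item_tags.setdefault(item_id, [])
        if idx not in tags:
            tags.append(idx)

    def has_tag(self, item_id, tag):
        # We already have the item in hand, so only its own short list is scanned.
        idx = self._index_of.get(tag)
        if idx is None:
            return False
        return idx in self._item_tags.get(item_id, [])

    def tags_of(self, item_id):
        return [self._strings[i] for i in self._item_tags.get(item_id, [])]


column = StringListColumn()
column.add_tag(1, "foo")
column.add_tag(1, "bar")
column.add_tag(2, "foo")
print(column.has_tag(1, "bar"), column.has_tag(2, "bar"))
```

The trade-off in this sketch is the one described above: asking "does this item have tag X?" is cheap, while asking "which items have tag X?" would need a scan or an extra index, which is where first-class tag nodes would be more flexible.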


Thanks for the detailed answer. I have a start-up idea that needs a big DAG offline to produce a smaller one (in bytes, not nodes) that is used online. My intention was to use Berkeley DB. It's good to see you say it's the fastest of the off-the-shelf options, but it only reaches 5% of your own code! Any thoughts on what BDB is doing "wrong"? Were you using hash or B-tree with BDB?


We used hashes with BDB. I didn't dig deep into what was going on, since we weren't really considering BDB because of its licensing: it's GPL, and applications link directly to it (unlike, say, MySQL). While we're currently only offering access as a web service, we'd like to keep open the option of licensing the recommendations engine in other ways down the line. So mostly we were trying it out as another data point to see how our implementation stacked up.


Interesting. Looks like Berkeley DB will be good enough for me to prove the concept then, and if, or seemingly when, it becomes the bottleneck I'll know it's possible to improve on it. Thanks again; it's great having first-hand access to those that have done it instead of just theorising about it, like me. :-)



