I've recently been looking around at possible back ends for a recommendation engine for my social news site. I've been interested in using neo4j for a while, but I was concerned about the dual licensing and the costs. I'm bootstrapping, and my budget is pretty limited. So, I emailed Neo and asked for a quote.
To make a long story short, Peter Neubauer from Neo called me, and we had a really great conversation for about 20-30 minutes. He took the time to forward some papers to me about using graphs in recommendations engines.
Their subscription prices were crazy reasonable. I was expecting Oracle pricing. I was actually surprised their prices were so low. Up to a million primitives is free. Up to 10 million primitives is $49 a month.
I'm really glad to hear they got funding. So far, I've been really impressed by them. I'm looking forward to checking out their database.
I looked into neo4j the other day, but was put off by the combination of the AGPL license (which, as I understand it, makes it impossible to use for any commercial purpose, since you are required to GPL any software that so much as talks to the database) and the lack of any indication on the web site of how much a commercial license would cost. I know this is how "enterprise" stuff works, but personally I have a strong aversion to evaluating software when I have no idea how much it would cost should I choose to use it - that's one of the reasons I like open source.
I'm trying to see how a graph database would be significantly faster than a key-value database. The only thing that comes to mind is that with graph databases a node's edges contain direct pointers or offsets to the other nodes, instead of containing the "public" string key of every other node, which has to be looked up via the B-tree index. This conclusion is supported by http://blog.directededge.com/2009/02/27/on-building-a-stupid...
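To make the distinction concrete, here's a minimal sketch (made-up class and function names, not Neo4j's actual storage format) contrasting traversal over direct object references with traversal that pays a key lookup on every hop:

```python
# Hypothetical illustration: graph-style traversal via direct pointers
# vs. key-value-style traversal via per-hop key lookups.

class Node:
    def __init__(self, key):
        self.key = key
        self.edges = []  # direct references to neighbour Node objects


def reachable(start):
    """Traverse by dereferencing pointers; no index lookup per hop."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        if node.key in seen:
            continue
        seen.add(node.key)
        stack.extend(node.edges)  # follow pointers directly
    return seen


# Key-value style: edges hold string keys, and every hop requires a
# lookup by key (here a dict; in a real store, a B-tree index probe).
store = {"a": ["b"], "b": ["c"], "c": []}


def reachable_kv(start_key):
    seen, stack = set(), [start_key]
    while stack:
        key = stack.pop()
        if key in seen:
            continue
        seen.add(key)
        stack.extend(store[key])  # index lookup on every hop
    return seen


a, b, c = Node("a"), Node("b"), Node("c")
a.edges = [b]
b.edges = [c]
```

Both traversals visit the same set of nodes; the difference is that the first pays only a pointer dereference per hop, while the second pays an index lookup per hop, which is where a graph database's advantage would come from.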
So if key-value stores gave us access to their internal record IDs or offsets, both to read and search by, we would see a considerable improvement. The trade off would be sticking to an append-only data structure without compaction, or doing some extra work to update the offsets on every compaction.
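A toy version of that idea might look like the following (the class and its layout are invented for illustration; no real store exposes exactly this API):

```python
# Toy append-only store that hands back record offsets as direct
# handles, so reads skip the key index entirely.
import io


class AppendOnlyStore:
    def __init__(self):
        self.log = io.BytesIO()

    def put(self, value: bytes) -> int:
        """Append a record and return its byte offset."""
        self.log.seek(0, io.SEEK_END)
        offset = self.log.tell()
        self.log.write(len(value).to_bytes(4, "big"))
        self.log.write(value)
        return offset

    def get(self, offset: int) -> bytes:
        """Read by offset: one seek, no index traversal."""
        self.log.seek(offset)
        length = int.from_bytes(self.log.read(4), "big")
        return self.log.read(length)


store = AppendOnlyStore()
off = store.put(b"node-42")
```

The catch is exactly the trade-off mentioned above: compaction would physically move records, invalidating every offset handed out, so either the log stays append-only forever or every stored offset has to be rewritten on each compaction.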
Great news for Neo. Most of the comments I see about neo4j are along the lines of "great technology, but doesn't quite cut it from a performance perspective".
I've seen some of the same comments, but I tend to take them with a grain of salt, in part because none of the commenters seem to have asked Neo or the community for help/advice when they experienced performance issues. (The fact that the Neo demos I've seen have been pretty decent performance-wise leads me to believe at least some of the performance issues could have been solved if the people trying Neo out had simply asked for help. It's usually difficult to wring the best performance out of a database of any kind without some hard-earned experience, and the fact that many of the experimenters probably have little graph DB experience in general couldn't have helped the situation any.)
Definitely true. There are of course patterns that can slow certain graph traversals down, like having millions of relationships on one single node. Otherwise Neo4j handles up to a couple of billion nodes/relationships/primitives out of the box, which is a good starting point for most cases.
Definitely agreed. Some of the comments I've seen from the neo4j author suggest that for most cases, the performance of neo4j itself won't be a problem at all.