Sophia: A modern embeddable key-value database – v1.2.2 released (sphia.org)
72 points by pmwkaa on April 12, 2015 | 42 comments



There seem to be a lot of embedded KV stores already, and SQLite is pretty much the de facto embedded relational DB. Is there a good embedded graph DB around? Specifically, an embedded property graph DB. Hypergraphdb is embeddable, but it's not a property graph, and while neo4j has an embedded version, I don't think it works for non-Java use cases.


There are a lot of platform-specific solutions (neo4j, networkx, Core Data, etc.) but I'm not aware of a generalized solution. I would like to know this too, because I'm often constrained to certain languages/platforms but would like to use something like neo4j.


If you are using NodeJS then http://github.com/amark/gun might do the trick. It's an embedded graph database with realtime push notifications. Properties are just regular ol' JSON.


Working on one! Give us a few more weeks though, we don't wanna release something we aren't 100% happy with, even as a 0.1.


There's Sparksee/Dex. Not sure it has everything you're looking for, and it's proprietary and a bit pricey.


I believe Core Data should count as an embedded graph DB. It's backed by SQLite but Cocoa-specific.


Be sure to check out Tarantool[1]; it uses Sophia for on-disk databases.

[1] http://www.tarantool.org


Never heard about it before, it looks interesting.

  > Tarantool combines the network programming power of Node.JS with data persistence capabilities of Redis.
Is that sarcasm? I can't tell.


I don't think so.

Tarantool uses an async evented IO model, but uses Lua coroutines and not Javascript. There are no callbacks, just 'yield points'.

Also, the primary data store backend is an in-memory database with optional 'snapshotting' to disk. An alternative backend uses Sophia, so it's not 100% in memory.
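
To make the "yield points instead of callbacks" idea concrete, here is a rough analogy using Python's asyncio rather than Tarantool's Lua. The names `fetch` and `handler` are made up for illustration and this is not Tarantool's API; it only shows the general coroutine style.

```python
import asyncio

async def fetch(key: str) -> str:
    # 'await' is the yield point: this coroutine pauses here and the
    # event loop is free to run other coroutines in the meantime.
    await asyncio.sleep(0.01)  # stands in for a network or disk wait
    return f"value-for-{key}"

async def handler(key: str) -> str:
    # Reads top to bottom like blocking code; no callback is registered.
    value = await fetch(key)
    return value.upper()

async def main() -> list:
    # Both handlers share one thread and interleave at their yield points.
    return await asyncio.gather(handler("a"), handler("b"))

results = asyncio.run(main())
print(results)
```

The control flow stays linear even though the two requests overlap in time, which is the main ergonomic difference from callback-style code.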


I tend to see Tarantool as a Lua-powered database. In this sense, you could easily implement a Redis-like system on top of it using Lua.


Do you know which version of Sophia it uses?


The latest. Sophia was created as a disk-based engine for Tarantool, and is also available as a standalone embeddable library.


Ah, thanks - didn't know that. Tarantool is very interesting, btw.


Does anyone have recommendations on a constant DB optimized for sequential integer keys? Running LZ4 over things is cool, but using delta encoding or more clever schemes, you can work right on the compressed key data. (And even more fun if the value is also just a restricted set of integers, like an inverted index.)
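
For anyone unfamiliar with the delta encoding idea, a minimal sketch follows. It assumes sorted, non-negative integer keys and uses a LEB128-style varint for the gaps; the function names are my own and do not come from any particular library.

```python
def encode_varint(n: int) -> bytes:
    """Unsigned varint: 7 payload bits per byte, high bit means 'more follows'."""
    if n < 0:
        raise ValueError("varint encodes non-negative integers only")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)

def delta_encode(sorted_keys) -> bytes:
    """Store the first key, then only the gap to each following key."""
    out = bytearray()
    prev = 0
    for k in sorted_keys:
        out += encode_varint(k - prev)  # raises if keys are not ascending
        prev = k
    return bytes(out)

def delta_decode(buf: bytes):
    """Stream keys back out one at a time, without a decompressed copy."""
    key = gap = shift = 0
    for byte in buf:
        gap |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7
        else:
            key += gap
            yield key
            gap = shift = 0

keys = [1000000, 1000001, 1000002, 1000005, 1000130]
packed = delta_encode(keys)
print(len(packed), list(delta_decode(packed)))
```

Because the decoder is a generator, a scan can stop early or skip ahead while only ever touching the packed bytes.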


LMDB has optimized support for integer keys, as well as for sequentially sorted data. http://symas.com/mdb/doc/


Comparisons to Kyoto Cabinet, LevelDB, and RocksDB (on features, maturity, and performance) would be great if anyone has any to share.


Also add LMDB/BoltDB to that list. There seems to be a convergence around a certain feature set for embeddable key/value stores: MVCC semantics and ordered keys.
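
As a toy illustration of what those two features buy you, here is an in-memory sketch. `TinyOrderedStore` is a hypothetical class, not any real engine's API, and it copies everything on snapshot, whereas real engines share unchanged data between versions.

```python
import bisect

class TinyOrderedStore:
    """Toy in-memory store: sorted keys plus full-copy snapshots."""

    def __init__(self):
        self._keys = []    # kept sorted so a range scan is a slice
        self._values = {}

    def put(self, key, value):
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def snapshot(self):
        # MVCC-style read view: later writes do not affect this snapshot.
        snap = TinyOrderedStore()
        snap._keys = list(self._keys)
        snap._values = dict(self._values)
        return snap

    def scan(self, start, stop):
        """Yield (key, value) for start <= key < stop, in key order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, stop)
        for key in self._keys[lo:hi]:
            yield key, self._values[key]

store = TinyOrderedStore()
for k in ["user:2", "user:1", "item:9", "user:3"]:
    store.put(k, k.upper())

view = store.snapshot()
store.put("user:4", "USER:4")  # not visible through `view`

old = list(view.scan("user:", "user;"))   # ';' sorts right after ':'
new = list(store.scan("user:", "user;"))
print(len(old), len(new))
```

Ordered keys make the prefix scan cheap, and the snapshot lets a reader keep a consistent view while a writer carries on.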


As well as a comparison to LMDB.


https://charlesleifer.com/blog/completely-un-scientific-benc... does some basic performance comparisons for unqlite, vedis, dbm and kyotocabinet, leveldb, rocksdb, sqlite and redis.


Just to be clear, does embeddable have a specific meaning here? I'm a firmware developer, often writing code for small CPUs and microcontrollers. Does this apply? It seems like here "embeddable" means it can be compiled into an app, as opposed to getting accessed through a server. Is that correct? Thank you.


  > It seems like here "embeddable" means it can be compiled into an app, as opposed to getting accessed through a server. Is that correct?

Yes.

Whether or not it is a good fit for small CPUs and microcontrollers is another characteristic that I can't comment on.


(since LMDB has been mentioned in this thread... LMDB is embeddable in every sense of the word. It can work with as little as 64KB of memory and is already deployed in a number of MCU-based products. Unfortunately I don't have permission to name names.)


Is LMDB running on any operating systems that do not offer mmap support?


Not currently. That's kind of a fundamental component of LMDB's design.


Thanks. That makes your comment about MCU-based systems that much more intriguing. :)


I believe embeddable is referring to the definition you inferred - that is, you can compile it into your application rather than running it as a separate service.
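
A quick way to see "embeddable" in that sense is Python's bundled sqlite3 module, used here only because it ships with the standard library (Sophia itself is a C library with its own API). The table name `kv` is just an example.

```python
import sqlite3

# The whole database engine runs inside this process: no server, no socket.
conn = sqlite3.connect(":memory:")  # a file path here would persist to disk
conn.execute("CREATE TABLE kv (key TEXT PRIMARY KEY, value TEXT)")
conn.execute("INSERT INTO kv VALUES (?, ?)", ("greeting", "hello"))
row = conn.execute(
    "SELECT value FROM kv WHERE key = ?", ("greeting",)
).fetchone()
print(row[0])
conn.close()
```

There is nothing to install, start, or connect to; the library is linked into the application, which is the meaning used in this thread.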


Since we're throwing around names here, depending on your use case something as simple as cdb can be amazing: http://cr.yp.to/cdb.html


@pmwkaa: http://sphia.org/pv12.html doesn't tell us the scaling characteristics. The cited performance page is a DB at a steady state of 6.0M keys. How does it behave under dynamic load? Various scenarios to help your potential users determine if the software is a good fit for their use case would be helpful.

Glanced at the code and the arch doc. Looks promising and shows careful crafting. Well done!


Cool! I was looking for a simple key-value alternative to SQLite3 and was going to use redislite[1]. But this is perfect, I think it has the potential to replace SQLite3.

[1] https://github.com/seppo0010/redislite


SQLite4 is being designed as a key-value alternative to SQLite3[1]. SQLite3 and SQLite4 are meant for different use cases and are expected to co-exist. Unfortunately, SQLite4 hasn't been released yet, but I wanted to let you know that the SQLite developers are actively working on addressing the need for an embedded key-value store with SQLite4.

[1] https://sqlite.org/src4/doc/trunk/www/design.wiki


BSD licensed and implemented as a small library written in C with zero dependencies.

What's not to like...


So why not BerkeleyDB? I couldn't find a comparison to the old standard on the site (granted, I only gave it a cursory glance).


Why BerkeleyDB?

BerkeleyDB is now AGPL3[0], which some projects have a problem with. (Of course, you can buy a commercial license. Some projects have a problem with that too.)

But the main reason, almost regardless of context, as to "why not BerkeleyDB" is LMDB[1]. It works way better than everything else, in just about every practical use that has more reads than writes.

The only downside, as far as I can tell, is that right now it relies on memory mapping the entire database, so you're limited to ~1GB overall database size on 32-bit systems. There is no practical limit on 64-bit systems. Also, I recall Howard Chu (main LMDB developer) mentioned that in the near future LMDB will gain the ability to manage memory manually, thus removing this restriction as well (for a performance price if used this way).

[0] http://www.oracle.com/technetwork/database/database-technolo...

[1] http://symas.com/mdb/
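
For readers who haven't used memory mapping, here is a small sketch with Python's standard mmap module. It is not LMDB code; the temp file and its `key=value;` contents are made up to stand in for a database file.

```python
import mmap
import os
import tempfile

# Write a small file to stand in for a database file.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"key1=alpha;key2=beta;")

with open(path, "rb") as f:
    # Length 0 maps the whole file into this process's address space.
    # A 32-bit process has at most 4 GiB of address space in total, so a
    # fully mapped file has to fit in whatever part of that is still free.
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
        start = m.find(b"key2=") + len(b"key2=")
        end = m.find(b";", start)
        value = m[start:end]  # read through the mapping, no read() call

os.remove(path)
print(value)
```

Reads go straight through the mapping and the OS pages data in on demand, which is why the file size is bounded by address space rather than by RAM.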


The feature for re-mapping on 32-bit systems is available here: https://gitorious.org/mdb/mdb/source/69d7cb8d44e04f02d8d0c92...


When will it be merged to the mainline? (Or is it already there?) Will it eventually be a run-time option, or always a compile-time option?

The latest reference I can find is http://www.openldap.org/lists/openldap-devel/201410/msg00001... - were the problems solved?

Thanks for LMDB. It is awesome.


It still needs heavier testing before going to mainline. I expect it will only ever be a compile-time option. On 64-bit it's pointless, and 32-bit is going the way of the dodo.


BerkeleyDB is awesome, but its license may not be suitable for many organizations. Also, the different design would probably result in different behaviors under different scenarios. It seems like Sophia is optimized for inserts.


The BerkeleyDB API is ugly. And its extra features probably impact performance.


Xodus from JetBrains (Java only / Apache 2 License) looks promising.


This looks very promising. The code is very clean and optimization is taken into consideration.


Also, what exactly is a database traversal? Is it a random read benchmark? If so, what is the distribution - uniform, zipf, or something else?
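
The distribution matters a lot for a read benchmark, because a skewed workload keeps hitting the same hot keys. Here is a rough sketch of the two access patterns; the key count, read count, and exponent `s=1.0` are arbitrary choices for illustration and have nothing to do with how the Sophia benchmark was actually run.

```python
import random
from collections import Counter

def uniform_keys(n_keys, n_reads, rng):
    """Every key is equally likely to be read."""
    return [rng.randrange(n_keys) for _ in range(n_reads)]

def zipf_keys(n_keys, n_reads, rng, s=1.0):
    """The key at rank r is read with probability proportional to 1 / r**s."""
    weights = [1.0 / (rank ** s) for rank in range(1, n_keys + 1)]
    return rng.choices(range(n_keys), weights=weights, k=n_reads)

rng = random.Random(42)  # fixed seed so runs are repeatable
n_keys, n_reads = 1000, 100_000
uniform = Counter(uniform_keys(n_keys, n_reads, rng))
skewed = Counter(zipf_keys(n_keys, n_reads, rng))

# Share of reads that hit the 10 most popular keys under each distribution.
top10_uniform = sum(c for _, c in uniform.most_common(10)) / n_reads
top10_skewed = sum(c for _, c in skewed.most_common(10)) / n_reads
print(top10_uniform, top10_skewed)
```

Under the skewed pattern a small cache absorbs a large share of reads, so the same engine can look much faster than it does under uniform access.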


What's going on with that logo? It is completely illegible.



