Review my software: Keyspace consistently replicated key-value store (scalien.com)
33 points by Maro on July 15, 2009 | 21 comments



If you want quick adoption I would have your server also speak the memcached protocol on memcached's default port, if at all possible. You'd then have a ton of client libraries already written for you, and the pain of testing and rollout is greatly reduced.
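For what it's worth, the memcached text protocol is small enough that a toy subset fits on a page. A rough sketch of the idea in Python (get/set only, with an in-memory dict standing in for the real store):

    # Toy server speaking a subset of the memcached text protocol on
    # memcached's default port, so existing memcached client libraries
    # can talk to it unchanged. Illustrative only: a real server would
    # plug its own storage engine in behind the handler.
    import socketserver

    store = {}

    class MemcachedTextHandler(socketserver.StreamRequestHandler):
        def handle(self):
            while True:
                line = self.rfile.readline()
                if not line:
                    break                      # client hung up
                parts = line.decode().split()
                if not parts:
                    continue
                if parts[0] == "get":
                    for key in parts[1:]:
                        if key in store:
                            data = store[key]
                            self.wfile.write(b"VALUE %s 0 %d\r\n" %
                                             (key.encode(), len(data)))
                            self.wfile.write(data + b"\r\n")
                    self.wfile.write(b"END\r\n")
                elif parts[0] == "set":
                    # "set <key> <flags> <exptime> <bytes>", then a data line
                    key, nbytes = parts[1], int(parts[4])
                    store[key] = self.rfile.read(nbytes)
                    self.rfile.read(2)         # swallow the trailing \r\n
                    self.wfile.write(b"STORED\r\n")
                else:
                    self.wfile.write(b"ERROR\r\n")

    socketserver.TCPServer(("", 11211), MemcachedTextHandler).serve_forever()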

Very nice homepage btw.

Edited: added "also" between "server speak". I didn't mean you should dump your current API.


We originally had memcached support but discontinued it somewhere along the way. We think supporting the memcached protocol is confusing to our end users, as there are strong differences between memcached and a consistently replicated (disk-persistent) key-value store --- see, for example, my answer to the post below about safe and dirty reads. Also, we support a much richer set of commands than memcached.

However, we are certainly going to bring back memcache protocol support if users demand it (the old memcache code is still in the source tree right now).

Thanks.


Yeah, I could see where it could muddle your marketing message - you might end up looking like a "memcached clone" in people's minds. However, I think if you positioned it right you could come across as a superset of memcached's features, with the bonus of an easy migration for existing memcached users. Something like "we do memcached and so much more".


You're certainly right.

One of the reasons we focused on an early release is to gauge user (client) interest. What's the level of interest in replication, consistent replication, or distributed systems in general? Are people willing to use a newcomer's software? And so on.

Right now we have a few beta users, and are hoping to pick up new users with this release.


Here's what I understand: at any one time a Keyspace cluster has one master server which ALL writes go to - the other servers in the cluster replicate from it. If the master goes down another master is elected automatically, giving you your fault tolerance.

You can only write to the current master, and reading from slaves may suffer from replication lag - so if you need to write and then read back the same value you need to read against the master as well.

Your underlying storage is BerkeleyDB, so your write speed is limited to BerkeleyDB's write speed (which is pretty fast). If I wanted to scale writes I'd need to run multiple sharded keyspace instances, in the same way I might run sharded MySQL databases.

Is this accurate, or did I misunderstand something?


It is accurate and contains no misunderstanding.

Some terminology: we differentiate safe reads, which go through the master and always reflect previous writes that have been acknowledged, from dirty reads, which can go through any server, even if no other servers are up. There are absolutely no guarantees with dirty reads, but they scale linearly, although this is not the intended primary use-case of Keyspace.
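To make the distinction concrete, here is a toy model of the two read modes in Python (not our actual client API, just the semantics):

    # Toy model: safe reads consult the master's authoritative copy,
    # dirty reads consult a possibly-lagging replica. Class and method
    # names are made up for illustration.
    class ToyCluster:
        def __init__(self):
            self.master = {}     # authoritative copy; all writes land here
            self.replica = {}    # lags behind until replication runs

        def set(self, key, value):
            self.master[key] = value

        def safe_get(self, key):
            return self.master.get(key)    # reflects every acknowledged write

        def dirty_get(self, key):
            return self.replica.get(key)   # any server can answer; may be stale

        def replicate(self):
            self.replica.update(self.master)

    c = ToyCluster()
    c.set("user:42", "alice")
    print(c.safe_get("user:42"))    # "alice" -- read-your-writes
    print(c.dirty_get("user:42"))   # None -- the replica hasn't caught up
    c.replicate()
    print(c.dirty_get("user:42"))   # "alice"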

Keyspace does not contain horizontal partitioning, a.k.a. sharding; we plan to build such a layer on top of Keyspace. The exact details of this layer are still being debated, so its exact position in the design space is unclear --- it partly depends on the input we get from this release.
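In the meantime, sharding can be done client-side, exactly the way one shards MySQL. A minimal sketch (plain dicts standing in for independent Keyspace clusters):

    # Minimal client-side sharding sketch: map each key to one of
    # several independent clusters by hashing it. Dicts stand in for
    # real cluster connections here.
    import hashlib

    clusters = [dict(), dict(), dict()]    # three independent "clusters"

    def shard_for(key):
        h = int(hashlib.md5(key.encode()).hexdigest(), 16)
        return clusters[h % len(clusters)]

    def sharded_set(key, value):
        shard_for(key)[key] = value        # only this shard sees the write

    def sharded_get(key):
        return shard_for(key).get(key)

    sharded_set("user:42", "alice")
    print(sharded_get("user:42"))          # "alice"

Note that plain modulo hashing reshuffles most keys whenever the number of clusters changes; a production layer would use consistent hashing or a directory to avoid that.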


This sounds much like the way we were using MySQL when I worked on a large consumer site. We both sharded and replicated to scale out, and we used memcached to bridge the replication lag. For better or worse, all writes auto-committed independently. What strategies should Keyspace users employ to counteract similar replication delay issues?

I presume Keyspace would offer much better performance because it doesn't have to worry about 95% of the stuff a SQL db does. Drizzle seems to fall somewhere between the solution in the above paragraph and Keyspace. Is that a fair assessment?

One thing I wanted (among many) with our solution was the ability to specify async disk forcing (an oxymoron?). I'm willing to allow a database to return control to me before writing if it thinks it will be able to write something to disk within N seconds. The increase in performance for most non-transactional systems (meaning that a few seconds of data loss occurring once a year wouldn't hurt too badly) would more than make up for the risk. Terracotta, Gigaspaces, etc. are interesting to me because they facilitate async persistence.


Without being familiar with the details of your use-case, one of the nice properties of Keyspace is that it is a high-performance (async, C/C++, epoll on Linux) UNIX server --- we are able to get 100.000 writes/sec with n=3 replication on fast Linux PCs with SCSI disks.

If you use memcached appropriately, then I suppose Keyspace should be able to handle the "remaining" reads and writes?

With Keyspace, you currently cannot turn off disk syncs --- note that Keyspace is still very fast with disk syncs --- as this would break our replication algorithm, Paxos. We might make this an option if users request it, which would result in a weird replication consistency somewhat like "eventual consistency", but differing from what is usually referred to by that term. Btw, making this an option is quite easy, roughly two minutes of work (TXN_NOSYNC with BDB).
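For the curious, this is the BerkeleyDB knob in question, sketched here with the bsddb3 Python bindings (Keyspace itself would set it through the C/C++ API):

    # With DB_TXN_NOSYNC set, BerkeleyDB commits return before the
    # transaction log is forced to disk: much faster, at the price of
    # possibly losing the last few seconds of writes on a crash.
    from bsddb3 import db

    env = db.DBEnv()
    env.set_flags(db.DB_TXN_NOSYNC, 1)    # don't fsync the log on commit
    env.open("/tmp/bdb-env",
             db.DB_CREATE | db.DB_INIT_MPOOL | db.DB_INIT_LOG |
             db.DB_INIT_TXN | db.DB_RECOVER)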


We were writing first to the master MySQL instance and then immediately putting the same data in memcached. When we read, we'd go to memcached first and then go to one of the read slaves. This way we would limit the effects of replication delay. Note that I don't necessarily like this scheme, but it's the one we used.
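In code, our scheme was roughly this (python-memcached for the cache; db_master_write and db_slave_read are hypothetical stand-ins for the real MySQL calls):

    # Cache-aside with write-through to memcached to mask replication
    # lag: writes go to the master and the cache, reads try the cache
    # before falling back to a slave. db_master_write/db_slave_read
    # are hypothetical placeholders for the actual database calls.
    import memcache

    mc = memcache.Client(["127.0.0.1:11211"])

    def write(key, value):
        db_master_write(key, value)     # hypothetical INSERT/UPDATE on master
        mc.set(key, value)              # cache immediately, hiding slave lag

    def read(key):
        value = mc.get(key)             # fresh if still cached
        if value is None:
            value = db_slave_read(key)  # hypothetical SELECT on a read slave
            mc.set(key, value)
        return value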

Is there a replication delay issue with Keyspace?


Something I noticed here and on your blog - I recommend you adopt the American comma convention for writing "100.000". Many people will read that as one hundred (with a decimal point); 10e6 is a lot more interesting than 10e3!


1.0e5 is a lot more interesting than 1.0e2 :P


Great! While I go deeper into the whitepaper: what is the business model you're counting on, and what is the primary usage you're seeing?

Very interesting!


Our original vision is related to cloud computing. We suspect that Amazon will not be the only player, and that other cloud providers will pop up to compete with them. Also, many small-to-medium size companies we talk to don't like Amazon (too expensive) but have a few tens of servers they wish to use as a distributed system (e.g. a database). We're attacking the distributed systems market bottom-up, using open-source software to gain a foothold.

Keyspace was originally planned to be a controller/meta-server, a low-level building block for other distributed systems, similar to how Google uses its Chubby server for GFS and BigTable --- Keyspace, like Google's Chubby, has an implementation of Lamport's Paxos consensus algorithm at its heart. As development proceeded, we made it into a more general key-value store that can also be used in e.g. web 2.0 scenarios where high availability is required.


You should make an Amazon EC2 image and offer an already-set-up service (with maintenance and whatnot) for 110% of the cost of EC2. That gives me a low barrier to trying it out (it should cost me like 20 bucks or less for a low-usage trial) before making the time commitment to install it.


Interesting idea. We're already using EC2 as one of our test platforms anyway. We'll look into it.


The blog with some more info is here:

http://blog.scalien.com


eeww, AGPL is a big fail here if you want folks to actually use this thing.


Why?

Our intention with the AGPL, versus the GPL, is to close the SaaS loophole: if you base a service on top of a modified Keyspace, you have to publish your changes. But you can use it, sell it, distribute it, etc. just as you can with the GPL.

Also, you can license Keyspace under different terms, i.e. under a non-open-source license, at a reasonable price.


Just keep in mind that if your project is under active development, people already have an incentive to contribute their private patches back to the mainline because it makes it easier to take advantage of other people's improvements. Otherwise they have to deal with managing their own fork.


It's a big leap to try to be the very first to succeed with a dual AGPL/commercial licensed product. Hell, if you did that you'd also be the first AGPL project of any sort to succeed!

It's quite a gamble you're taking with the license, and one fairly orthogonal to your actual project.


You're right, if by "orthogonal" you mean that licensing was not a major concern. We needed an open-source license; I looked around and chose the AGPL over the GPL, and that was that. I didn't think it was going to be an issue.



