Aerospike goes Open Source

ZenoArrow · on July 27, 2014

Worth pointing out that unlike some NoSQL engines, Aerospike has its own query language (AQL) that is syntactically similar to SQL... https://docs.aerospike.com/pages/viewpage.action?pageId=3807...

From the AQL query documentation you have... SELECT name, age FROM users.profiles WHERE age BETWEEN 20 AND 29 ...which is pretty easy to understand.

There's also a Python client (Apache licensed)... https://github.com/aerospike/aerospike-client-python

ZenoArrow · on July 27, 2014

This video shows off how easy it is to manage, pretty impressive (I don't work for Aerospike by the way, I'm just happy to share about it)... https://www.youtube.com/watch?v=CF83TmR-NME

yeukhon · on July 27, 2014

The AGPL license caught my eye. Does anyone take this license into account (deciding between an Apache/MIT/BSD DB vs. an AGPL DB) when they use it in their service / software stack?

For example, this openstack thread always keeps me alert about using AGPL when I am developing a solution. http://lists.openstack.org/pipermail/openstack-dev/2014-Marc...

and here is MongoDB's FAQ explaining AGPL in plain English: http://blog.mongodb.org/post/103832439/the-agpl

throwaway6829 · on July 28, 2014

Where I work[1], AGPL software is strictly and unconditionally forbidden to use for anything, even things that are completely internal and will never see a public user.

The fear that our lawyers have is that, since putting up the software in a service counts as a derived work, our whole software stack (including the stuff we don't open source) will have to be opened along with it. There have to be clear service boundaries between the AGPL software and the stuff we write ourselves, and the lawyers don't trust us to write in appropriate boundaries.

It's really kinda tragic, because we actually do submit source code upstream when we make changes to open source software that we run internally. As in, if it's an OSS product that we just use for some dumb internal automation thing, we'll submit patches if the license is BSD or MIT, but as soon as GPL (especially AGPL) hits anything suddenly the lawyers get paranoid because of what constitutes a "derived work", which can be interpreted as anything that links against the software to make a complete product.

The upshot of this is, if the OSS software is on an unrestrictive license like BSD or Apache, we contribute upstream. If it's GPL or especially AGPL, we simply don't touch it, ever.

[1] A very, very well known technology company.

_delirium · on July 28, 2014

That's often not an unwelcome side-effect. If you're at a large technology company, making sure companies like yours don't use the free version is a common motivation. Aerospike sells a commercial enterprise license; in that kind of a "dual-license" setup, the GPL/AGPL can function as a useful poison pill to keep from cannibalizing enterprise sales with the free version (where a common alternative would be to just not open-source at all, for fear of that cannibalization).

belorn · on July 28, 2014

A lawyer's job is to think about all the things that could go wrong and to prepare for that possibility. Their job is to think what would happen if the company goes under, or what if the newly bought property would burn down, or what if the business partnership you just signed up for went sour. Their job is not to do cost-benefit analysis, or even consider how high risk something actually is. Their job is to handle the what-ifs.

So the reaction you are talking about is natural behavior of lawyers being exposed to legal documents and contracts. If there is anything that could be interpreted to impact the company, their job is to consider it and think "what-if".

The question comes down to: what is a healthy way to handle the result of the lawyers' paranoia? Best practice is to do a cost-benefit analysis and balance the benefits with the insight of the legal advice. Second-worst is to avoid anything with a risk, regardless of benefits, simply to avoid the risk. Worst choice is to ignore the lawyers. Most companies, including the "very well known technology companies", pick the second-worst option for anything that is not critical to the company's survival. It's clearly not the best option, but it keeps the status quo.

ZenoArrow · on July 27, 2014

I can't see it being an issue. As mentioned in the useful MongoDB link you shared, the licence will require sharing only when modifying the database code, but not require automatic sharing of the rest of the software stack.

Have there been any court cases involving AGPL violations? I wonder if some of Gil Yehuda's fears are partly out of lack of clarity on where the reach of the AGPL ends? For example claiming the MongoDB drivers 'violate the AGPL license', I'd prefer to see a response from GNU on this.

e12e · on July 28, 2014

I seem to recall neo4j (could be it was another nosql db/company) had some strange ideas about the agpl. Of course if the drivers are covered by agpl and linked into your application, then you might have to create a new api "border" (eg:HTTP REST) between whatever code you want to license differently and the agpl server. I don't think the agpl should be a problem in most cases - at any rate you don't have to contribute patches up-stream, only to those of your users that ask for the code (if any). Unless upstream is also using your service, that is.

A perhaps more interesting question is how they manage contributions and pull requests - if i'm using the server under agpl, it seems natural to contribute code under agpl. But now that code can't be used in the commercially licensed upstream "fork" unless that fork is sold under the agpl... So either contributors will have to donate code, or sell code to upstream for use in the "closed" project.

I could see that get a bit hairy with user-contributed bug fixes?

gyehuda · on July 30, 2014

So far, AGPL has not been tested in court. GPL has. The scope of Derivative Works (reach) has as well (but the case law in this area is more complicated than most open source developers are aware of since copyright law is understood differently based on which federal circuit court hears the case). The more challenging consideration for me is the AGPL/Apache (DB/Driver) method that these companies are using -- and especially when you have Apache licensed community contributions (these apparently violate AGPL's own terms). I see this as an area that can backfire against the open source community -- and indeed I'd also like to know if FSF, SFLC, or GPL-violations takes a position on the use of Apache drivers to AGPL DBs. If so, it would be best for the open source community to know about this soon rather than after many have placed themselves in a position where their best option is to pay the vendors that have somehow managed to achieve lock-in. Ironically, this is the opposite of the goals that many in the open source community endorse. So if FSF takes a strict position on AGPL, it could accidentally result in the opposite of the freedoms we have been working for.

I'm also fascinated when companies use a license and then state in their FAQ that they didn't really mean all the terms of the license. If this comes up in court, the judge might take the FAQ into account and interpret the intent of the licensor, or might just look at the license text itself. It'll be interesting to see what happens. My advice for now is to assume the text of the license to be what the licensor intended and my hope is that companies that use open source licenses simply use the licenses that match what they mean. If they really mean that they want you to pay, they should just be clear about it and people can decide if they want to pay.

yeukhon · on July 27, 2014

Well, in the case of companies like Google or Facebook, they might actually modify the database for a variety of reasons (e.g. internal policy, speed, etc). I don't have first hand experience with modifying databases so I thought I would ask :)

_delirium · on July 28, 2014

If someone modified the database and deployed it publicly (or as part of a webapp that was deployed publicly), they'd need to share the modified source, yes. AGPL is roughly GPL but where the traditional definition of "shipping software" is expanded to include deploying on a network.

My guess is that they have two motivations, both of which are fairly traditional GPL motivations: 1) sell AGPL exceptions to commercial licensees; and 2) prevent a competitor from making a private commercial fork, where the competitor improves the DB and licenses their version to clients, without sharing the source to their improvements.

e12e · on July 28, 2014

To be pedantic: deployed it as an external service, not necessarily a public service (I suppose to a legal entity that would not normally share license rights, such as an individual not part of the organisation or to another organisation). Those external users would have to be given an option to access the full source with modifications.

As for the "part of a web service"-bit, I'm not sure what the agpl's actual "reach" is. My understanding is that with a (modified) db under agpl powering eg a web app running in php, the end users (accessing only the web server) would not be entitled to the db source. If the agpl covered the web server itself or a php library on the other hand, the users would be entitled to that code?

Similarly if one sold a modified db-as-a-service, modifications would be covered by the agpl.

belorn · on July 28, 2014

While it's interesting to theorize about databases and AGPL, I am rather sure there hasn't been any actual case or a legal reason to think that the database would become a derivative work when used together with a webserver.

If it was, the EULA for SQL server and oracle would have to include copyright permission for derivative works. That we do not see that should be a clear sign that the scope of copyright has not reached that far yet.

richardfontana · on July 30, 2014

Matt Asay of MongoDB, Inc. (disclosure takes care of Asayroll Rule<tm>) responded to that openstack-dev posting, FWIW: http://lists.openstack.org/pipermail/openstack-dev/2014-Marc...

seiji · on July 27, 2014

AGPL is essentially the "corporate coward" license. They want to capture all of your private changes to basically get free (legally mandated, zero "community good will") development resources. Companies think it's "safer" than BSD or straight up GPL because lawyers feel "omgz, source codes, zero cost IP copies, instant competition!"

The next step after AGPL will probably be BrainGPL requiring you to publish all thoughts you have about any code you look at ever.

arrryarr · on July 28, 2014

You could just pay for a license if you're that paranoid. It's remarkable to what extent people will complain about free not being "free enough". I assume you haven't contributed a line of code to the project, so it's a bit presumptuous to knock it.

throwaway6829 · on July 28, 2014

> I assume you haven't contributed a line of code to the project, so it's a bit presumptuous to knock it.

It's not that simple. Even if you're a company that wants to use the software and contribute changes upstream, it's still a dumb idea to use AGPL software.

Because of the loose definition of what constitutes a derived work, modifying AGPL software (even if you put those modifications upstream) means you might have to open source your entire software stack. The AGPL is not like the LGPL, it doesn't contain any exceptions for linking against it in an overall product. Unless you have clear service boundaries, you are creating a derived work every time you use AGPL software in a service.

In my company, our lawyers outright forbid the use of AGPL software altogether for this reason. Even software that's only used internally. As _delirium mentioned above[1], that's not entirely an unwanted thing from the licenser's perspective: they typically do this so you'll buy their commercial license. But don't assume the only people who don't like AGPL are non-contributing freeloaders.

We submit tons of stuff upstream on a regular basis at my company, but we simply can't do it with AGPL software. Seeing any reference at all that our company (which has some of the deepest pockets in the planet) uses AGPL software anywhere in its stack will open ourselves up to lawsuits immediately. So we just don't touch it, ever.

[1] https://news.ycombinator.com/item?id=8095024

michaelmior · on July 28, 2014

You make it sound like it would be better not to open source code at all.

riyer · on July 28, 2014

Frankly speaking I wouldn't care if I can read code and learn something new ....

pwnna · on July 27, 2014

You know, when I first saw this I thought it's the aerospike engine: http://en.wikipedia.org/wiki/Aerospike_engine

wlesieutre · on July 28, 2014

Same here, I'm assuming that's what the name is borrowed from.

ZenoArrow · on July 27, 2014

That's pretty cool in its own right, thanks pwnna.

remon · on July 28, 2014

"A database literally ten times faster than existing NoSQL solutions, and one hundred times faster than existing SQL solution".

The odds of that claim being true and/or supported by benchmarks are somewhere well below the 1% mark. Why do companies keep making that sort of obviously questionable claim, knowing the negative backlash that will surely follow? Boggles the mind really.

bbulkow · on July 28, 2014

CTO, co-founder, initial coder here.

Aerospike really is a lot faster than Mongo and Cassandra. It's open source, and you can run whatever benchmarks you'd like yourself. It's about as fast as a well-tuned multi-core sharded Redis system, except you don't have to write or configure the sharding, and you can have a combination of RAM or Flash, with different data in each; of course Flash is cheaper/slower, but that's why we give you both.

You can run a single c3.8xlarge on amazon and see 1m tps, or 250K on a c3.2xlarge. We're doing a lot of benchmarks on EC2 and GCE because they're "reference platforms" that you'll all believe. More details in the coming weeks from us, or publish your own.

Just try it yourself; this isn't marketing.

Everyone I talk to coming from Cassandra is seeing a server reduction of 4x~5x, with higher levels of stability (overhead for peaks). I was at a conference late last week and the company I was with (adform's founder, Jakob) said they had a major Cassandra outage that week that cost them a lot of money, and Adform is a Cassandra contributor and knows what they're doing.

Same thing with Mongo shops. They do about 5x reduction and see much higher performance.

Technical points of why we're faster:

* Coded in C, multithreaded, with reference counting.

* Avoid malloc, but if you have to malloc, avoid the CLib memory allocator. We do a lot of slab allocation (a la memcache) and use jemalloc for variable sized allocs.

* Use epoll directly and be careful about IO. Don't use mmap, which is 4x slower than read and write.

* Code directly to device, with your own data layout. Databases are a reliability layer, everything else is extra complexity. O_SYNC is better than fsync.
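As a rough illustration of the two durability calls named in the last bullet (this is not Aerospike's code, and it is not a benchmark), here is a minimal Python sketch. The helper names `write_with_o_sync`, `write_then_fsync` and `demo` are made up for this example, and `os.O_SYNC` only exists on POSIX systems, so the sketch falls back to a plain write where the flag is missing.

```python
import os
import tempfile

# O_SYNC is POSIX-only; fall back to 0 so the sketch still runs elsewhere
# (in that case the first helper degrades to an ordinary buffered write).
O_SYNC = getattr(os, "O_SYNC", 0)


def write_with_o_sync(path: str, payload: bytes) -> None:
    """Open with O_SYNC, asking the kernel to make each write() synchronous."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC | O_SYNC, 0o600)
    try:
        os.write(fd, payload)
    finally:
        os.close(fd)


def write_then_fsync(path: str, payload: bytes) -> None:
    """Ordinary write into the page cache, then one explicit fsync() to flush it."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.write(fd, payload)
        os.fsync(fd)
    finally:
        os.close(fd)


def demo() -> bool:
    """Write the same small record both ways and check that it reads back intact."""
    payload = b"record-1\n"
    with tempfile.TemporaryDirectory() as tmp:
        path_a = os.path.join(tmp, "o_sync.dat")
        path_b = os.path.join(tmp, "fsync.dat")
        write_with_o_sync(path_a, payload)
        write_then_fsync(path_b, payload)
        with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
            return fa.read() == payload and fb.read() == payload
```

With `O_SYNC` the flush cost is paid inside every write call; with `fsync` it is paid once, at the point you ask for it. Which one wins depends on the workload and the device, so it is worth measuring on your own hardware.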

There's a lot of smaller tricks in the code, but it all adds up to speed, and I don't expect you to believe me. I've spent 25 years in Silicon Valley writing high performance software, and so has most of the team. We come from a strong background of embedded, set-top box, cell phone programmers.

Let me tell you a short story. I brought my particular bag of tricks to a streaming video server company in the mid 90's. I produced an internal product that was 100x faster - that is, required 100x lower cost hardware than the company's existing product (133mhz Pentium instead of high end sun machines). The product got buried - because the sales guys couldn't make their commission checks.

I'm tired of that mentality.

Aerospike has been running in production at seriously high loads for years. I work with a lot of guys who say - "What else am I going to use?" For the use case where you want KVS, with decent API support (redis-like lists and UDFs), and a little analytics, and scale-out adding nodes under production load, it's the right choice.

If you're thinking of a Mongo KVS, Cassandra, Redis, you really need to look at Aerospike. Do yourself, and your startup, a favor.

( And, yes, the name is based on the Aerospike engine, but we were thinking more of the Trident II D5, which uses an Aerospike at the front, to essentially extend the aerodynamic length of the missile. The problem with sub-based missiles is they have to be short to fit in the sub, and a use of the aerospike was one of many techniques for making the US based deterrent accurate. We used the name Aerospike because there are a lot of small techniques that make an "unbelievable" difference - that's what engineering is, compared to theory. )

PM me directly if you're having trouble running benchmarks or anything.

_benedict · on July 29, 2014

As a Cassandra Committer, I would really appreciate you updating your ycsb integration so we can corroborate your benchmarks, as I currently doubt their authenticity/honesty. It looks very much like you compared non-durable performance in Aerospike to durable performance in Cassandra.

It also appears from your documentation that you do not support any kind of safe active-active multi-dc mode, even in your paid-for offering (http://www.aerospike.com/docs/architecture/xdr.html), so even if faster than Cassandra, users should carefully read the fine print before deciding to use Aerospike, unless as the application grows in size and necessity (i.e. uninterruptibility in event of disaster) you want to find multi-dc is not actually viable for you.

NB: I'm unaware of any adform contributions to Cassandra, although a senior developer there has filed a few bug reports.

remon · on July 29, 2014

100x faster != 100x cheaper. Anyway, I'm familiar with the codebase of both Cassandra and especially MongoDB and I just don't see room for a 100x performance improvement without sacrificing something. A bit of I/O tweaking certainly isn't going to result in such improvements, nor will "Coding in C", which in and of itself does very little for performance at all.

That said, you are correct that the proof is in the pudding and everyone is free to do their own benchmarks. I'll do my own and see what's what.

derekchiang · on July 29, 2014

Hi Brian, could you recommend some resources for learning to write high-performance code?

And also, would you care to explain why O_SYNC is better than fsync?

Thank you!

regularfry · on July 28, 2014

> Don't use mmap, which is 4x slower than read and write.

For what sorts of access patterns?

bbulkow · on July 28, 2014

I believe my numbers were 4k random access. Admittedly, this was an ages-ago kernel (2.6.18 derivative). mmap as a pattern has issues with concurrency, because you burn threads, and you can't cancel them. The only way out is to try to predict when an mmap page access will block and thread it differently, but then your code path gets longer.

There might be single-thread single-core patterns where mmap works best, or if newer kernels have changed. The reality: you have an IO, you put this "action" aside, you need to be woken up when complete, do you want to burn a thread or an IO context?
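For anyone who wants to try the 4k random access comparison themselves, here is a minimal Python sketch of the shape such a test could take. All function names are hypothetical, `os.pread` is POSIX-only, and the test file is small enough to sit in the page cache, so the timings it reports say nothing about the old kernel or the flash devices discussed above. It only shows the two access styles side by side.

```python
import mmap
import os
import random
import tempfile
import time

PAGE = 4096  # 4 KiB, the block size mentioned in the comment above


def make_test_file(pages: int) -> str:
    """Create a temporary file holding `pages` random 4 KiB blocks; return its path."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(os.urandom(pages * PAGE))
    return path


def read_with_pread(path: str, offsets: list) -> list:
    """Explicit read calls: one pread() per 4 KiB block."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return [os.pread(fd, PAGE, off) for off in offsets]
    finally:
        os.close(fd)


def read_with_mmap(path: str, offsets: list) -> list:
    """Memory-mapped access: slicing the map may fault pages in behind the scenes."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return [m[off:off + PAGE] for off in offsets]


def compare(pages: int = 256, reads: int = 1000, seed: int = 0) -> dict:
    """Time both styles over the same random page-aligned offsets."""
    rng = random.Random(seed)
    path = make_test_file(pages)
    try:
        offsets = [rng.randrange(pages) * PAGE for _ in range(reads)]
        t0 = time.perf_counter()
        via_pread = read_with_pread(path, offsets)
        t1 = time.perf_counter()
        via_mmap = read_with_mmap(path, offsets)
        t2 = time.perf_counter()
        return {
            "same_data": via_pread == via_mmap,
            "pread_seconds": t1 - t0,
            "mmap_seconds": t2 - t1,
        }
    finally:
        os.remove(path)
```

A fair test would use a file much larger than RAM, many threads, and direct device access. This sketch is single-threaded, so it cannot show the thread-burning problem described above.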

We also have recent numbers about using Linux's epoll / eventfd / signal mechanism, like Nginx seems to use, and it's so deeply inferior to doing Linux AIO that it's hard to choose that path, as seductive as single-event-loop is.

mxpxrocks10 · on Aug 5, 2014

hi! how does it compare to tokumx?

ZenoArrow · on July 28, 2014

It's likely from the marketing team, not the engineering team. That doesn't mean the product itself cannot be solid.

remon · on July 28, 2014

Nobody's arguing that the product isn't solid. Such claims are just a recipe for negative attention and are almost certainly not correct. And actually it'd be very odd for a marketing team to not verify their front page claims with the engineering team. This is a tech company after all.

perlgeek · on July 28, 2014

Now I'm waiting for aphyr to test aerospike under cluster partition :-)

remon · on July 29, 2014

Indeed.

alexnewman · on July 28, 2014

OK I can't find any tests. It also disturbs me when people "open source" projects without any real revision history.

Also is it true that https://github.com/aerospike/aerospike-server hasn't been updated?

cstivers1978 · on July 28, 2014

Aerospike maintains a comprehensive set of tests for the server. Every commit goes through functional and regression tests. Each release goes through a gauntlet of performance and clustering tests. The test system is a standalone system from the database, and is integrated with our CI system. Unfortunately, we have not been able to publish our test system, yet.

ryanobjc · on July 28, 2014

A system without comprehensive tests is one that cannot be changed.

The big question is, do they have tests or not?

alexnewman · on July 28, 2014

AFAIK basically not. Check my above comment. Those they have seem to be cryptic and few and far between.

TheCondor · on July 27, 2014

It's an impressive cache. Last I looked at it they were using Lua and it looked like they were going squarely after Mongo.

ZenoArrow · on July 27, 2014

The core code is written in C, but they're definitely using Lua in places, I believe they've incorporated some code from AlchemyDB... https://code.google.com/p/alchemydatabase/

From what I've read it could easily surpass Mongo, just look at the cost savings... http://www.datanami.com/2013/09/06/aerospike_says_secret_to_... "The second comparison (a video ad serving platform) had much bigger requirements, including a 5TB database processing 500,000 TPS. The hybrid SSD-DRAM setup running the AeroSpike database was able to handle the load with just 14 servers, at a total cost of $322,000, compared to 186 servers using NoSQL running on clusters of servers that use a lot of DRAM and cost $5.6 million."
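To put the quoted figures side by side, here is a small Python sketch of the arithmetic. It uses only the numbers in the quote above; the variable and function names are mine, and the quote does not say what the costs include, so treat the ratios as a reading of the vendor-reported figures and nothing more.

```python
# Figures taken from the quoted Datanami article.
WORKLOAD_TPS = 500_000
AEROSPIKE_SERVERS, AEROSPIKE_COST = 14, 322_000
OTHER_SERVERS, OTHER_COST = 186, 5_600_000


def summarize() -> dict:
    """Derive per-server and ratio figures from the quoted totals."""
    return {
        # How many times more servers and dollars the comparison setup needed.
        "server_ratio": OTHER_SERVERS / AEROSPIKE_SERVERS,
        "cost_ratio": OTHER_COST / AEROSPIKE_COST,
        # Throughput each server would have to carry for the same workload.
        "aerospike_tps_per_server": WORKLOAD_TPS / AEROSPIKE_SERVERS,
        "other_tps_per_server": WORKLOAD_TPS / OTHER_SERVERS,
        # Average spend per server in each setup.
        "aerospike_cost_per_server": AEROSPIKE_COST / AEROSPIKE_SERVERS,
        "other_cost_per_server": OTHER_COST / OTHER_SERVERS,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:,.1f}")
```

The quoted gap comes almost entirely from server count: the average cost per server is in the same ballpark in both setups.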

They're ACID compliant as well, which Mongo is not (AFAIK)... https://www.youtube.com/watch?v=nnxj77NNEeg

nlavezzo · on July 28, 2014

They don't have ACID transactions. We had to put up a page to dispel some of the misuse of the term ACID. Aerospike's excerpt:

"Aerospike does not provide true ACID transactions. Just like Cassandra 2.0, Aerospike only provides compare and set, and misleadingly labels it as ACID."

https://foundationdb.com/acid-claims

ZenoArrow · on July 28, 2014

That's interesting, thank you. In practical terms, what does ACID provide that compare and set does not?

nlavezzo · on July 28, 2014

The page linked goes into more detail, but among other things, Compare and Set operations are only concerning one data element at a time - meaning you cannot do Atomic updates to multiple keys. The ability to do multiple key updates atomically makes it possible to build higher level abstractions by reliably combining data from multiple keys under concurrent workloads.

Basically, transactions that can span an arbitrary number and set of keys are what makes it possible to build rich data models from simple ones. SQL databases are a perfect example of this - most use a simple transactional data store on the bottom to store complex relational data structures. A single SQL operation may require many key-level updates - but this is OK if you can wrap them all in ACID transactions. Without ACID transactions you can't guarantee data consistency because keys will be getting updated at different times, allowing for a mix of old and new values.

It's a shame to see vendors trying to change the meaning of ACID to fit the limitations of their databases. It means more confusion and bad decisions in a market that needs clarity and honesty for people to make the right decisions for their applications.

baotiao · on July 28, 2014

Yes, I really hate this. When I first saw that Aerospike supports ACID transactions, I thought, wow, it is really amazing. After I read the paper and saw that "ACID" means single-key transactions, I was disappointed. I don't even want to look at their code again.

riyer · on July 28, 2014

Well, claiming ACID while specifying that it is only for single-record transactions seems to be stated clearly.

As you were describing, there could be many changes performed when doing single record operations as well, e.g. multiple column updates / secondary index updates.

Does it really need to be multi record transaction to claim ACID ??

theossuary · on July 28, 2014

I'm not going to comment on whether or not Aerospike is ACID, I have no idea, but I can give a bit of info on CAS vs ACID.

CaS (Compare and Swap, or Compare and Set; they're almost identical, and don't really have a clear distinction) is the process of validating a record's value before performing an update. This helps a lot with Consistency (the C in ACID), but doesn't guarantee the other three (Atomicity, Isolation, and Durability).
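A minimal Python sketch of the single-record check described above, assuming a toy in-memory store. `TinyStore` and `increment` are hypothetical names, and the store compares a per-record generation counter instead of the value itself, which is a simplification chosen for brevity and not a description of how Aerospike or any other database implements it.

```python
import threading
from typing import Any, Dict, Tuple


class TinyStore:
    """Hypothetical in-memory key-value store with a generation counter per record."""

    def __init__(self) -> None:
        self._data: Dict[str, Tuple[Any, int]] = {}
        self._lock = threading.Lock()

    def get(self, key: str) -> Tuple[Any, int]:
        """Return (value, generation); a missing key reads as (None, 0)."""
        with self._lock:
            return self._data.get(key, (None, 0))

    def compare_and_set(self, key: str, expected_generation: int, new_value: Any) -> bool:
        """Write only if nobody has updated the record since it was read."""
        with self._lock:
            _, generation = self._data.get(key, (None, 0))
            if generation != expected_generation:
                return False  # stale read: the caller must re-read and retry
            self._data[key] = (new_value, generation + 1)
            return True


def increment(store: TinyStore, key: str, retries: int = 10) -> bool:
    """Typical CAS loop: read, compute, attempt the write, retry if it was stale."""
    for _ in range(retries):
        value, generation = store.get(key)
        if store.compare_and_set(key, generation, (value or 0) + 1):
            return True
    return False
```

The check protects one record at a time. Nothing in it coordinates updates across two different keys.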

Aerospike might have systems in place to address all of ACID, but if they're claiming CaS is ACID then they're just lying.

Look at the Wiki page on ACID ( http://en.wikipedia.org/wiki/ACID ), it's actually pretty good.

deschutes · on July 28, 2014

A straightforward usage of CAS doesn't provide transactions. In other words, there is no way to update multiple records atomically.* For instance, perhaps you wish to update two balances to reflect the result of the transaction. Using CAS, a reader may observe that only one balance has been updated -- the reader sees inconsistent data.

* no sane (or performant) way.
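The two-balance example can be sketched in a few lines of Python. `Accounts`, `transfer_with_cas` and `transfer_atomically` are hypothetical names for a toy in-memory model; the `observe` callback stands in for a concurrent reader that happens to look between the two single-key updates.

```python
import threading
from typing import Callable, Dict


class Accounts:
    """Hypothetical store of balances, used only to show the intermediate state."""

    def __init__(self, balances: Dict[str, int]) -> None:
        self._balances = dict(balances)
        self._lock = threading.Lock()

    def read_all(self) -> Dict[str, int]:
        with self._lock:
            return dict(self._balances)

    def cas(self, key: str, expected: int, new: int) -> bool:
        """Single-key compare-and-set: succeeds only if the balance is unchanged."""
        with self._lock:
            if self._balances[key] != expected:
                return False
            self._balances[key] = new
            return True

    def transfer_atomically(self, src: str, dst: str, amount: int) -> None:
        """What a multi-key transaction provides: both updates become visible together."""
        with self._lock:
            self._balances[src] -= amount
            self._balances[dst] += amount


def transfer_with_cas(accounts: Accounts, src: str, dst: str, amount: int,
                      observe: Callable[[Dict[str, int]], None]) -> None:
    """Two independent CAS updates; `observe` plays the reader in between them."""
    before = accounts.read_all()
    if not accounts.cas(src, before[src], before[src] - amount):
        raise RuntimeError("debit lost a race; a real client would retry")
    observe(accounts.read_all())  # the debit is visible, the credit is not yet
    if not accounts.cas(dst, before[dst], before[dst] + amount):
        raise RuntimeError("credit lost a race; a real client would retry")
```

Starting from balances of 100 and 0, a transfer of 30 via `transfer_with_cas` lets the reader see a total of 70 in between, while `transfer_atomically` never exposes that state.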

wheaties · on July 28, 2014

It's called marketing and as evidenced by MongoDB, people will believe anything you tell them. Web scale my ass Mongo.

ris · on July 27, 2014

"just look at the cost savings"

Purported cost savings.

arrryarr · on July 28, 2014

There's an article from an AppLovin (RTB ad-network) engineer (http://liviutudor.com/2014/07/24/of-advertising-and-scaling-...) talking about how Aerospike DB is essential to their serving up 15B (as in billion) ad impressions per-day.

ZenoArrow · on July 27, 2014

An alternative example... http://www.aerospike.com/blog/big-bucks-for-bluekai-snapdeal... "Snapdeal replaced 10 MongoDB database servers with the Aerospike database on two servers on Amazon EC2—an 80% reduction—while continuing to rapidly scale the business."

ris · on July 28, 2014

Again, an alternative example from the database vendors. Please, display some critical thinking.

arrryarr · on July 28, 2014

Critical thinking includes not suggesting that someone should prove a negative. Those are independent benchmarks with the sole 'violation' of being re-posted on the vendor's site. Unless you can provide metrics that disprove them, they stand.

ZenoArrow · on July 28, 2014

It's an example case from a customer of the database vendor. I'm unlikely to find anything better until after people start playing around with the open source version.

meritt · on July 28, 2014

I don't think this press release has nearly enough buzzwords.

jaseemabid · on July 28, 2014

Yep. Everything about Aerospike so far has been filled with buzzwords. There was a flash talk by an employee of theirs recently at a conference in Bangalore and it was just marketing BS. I'm still skeptical.

arrryarr · on July 28, 2014

I think the independently written article from AppLovin (link above) points to it actually working as promised.

rasz_pl · on July 28, 2014

I don't know, are they web scale?

alexnewman · on July 28, 2014

Common at least has some tests:

    posix4es-MacBook-Air-3:aerospike-common posix4e$ cloc src/main/ - 6603
    posix4es-MacBook-Air-3:aerospike-common posix4e$ cloc src/test/ - 1247

I worry about test coverage stats like that

Not to mention if you look at the tests

    /* TEST CASES */

    TEST( msgpack_roundtrip_integer1, "roundtrip: 123" ) {
        as_integer i1;
        as_integer_init(&i1, 123);

        as_integer i2;
        as_integer_init(&i2, 456);

        as_val * v2 = roundtrip((as_val *) &i1);

        assert_val_eq(v2, &i1);

        as_integer_destroy(&i1);
        as_val_destroy(v2);
    }

Not exactly terse and readable

shabinesh · on July 28, 2014

Aerospike guys were at a big data workshop in Bangalore last week. Its performance is pretty impressive. They claimed >1M TPS with just 3 nodes and compared it with Couchdb, which is claimed to achieve 1M TPS with 330 nodes. But unclear about their benchmarking method.

riyer · on July 28, 2014

I believe that was cassandra... latest from netflix on that http://techblog.netflix.com/2014/07/revisiting-1-million-wri...

remon · on July 28, 2014

"But unclear about their benchmarking method". That can be said for every single performance claim they make. There's a rather distinct lack of objective facts.

manojlds · on July 28, 2014

Fifth Elephant - https://fifthelephant.in/2014/

qmaxquique · on July 31, 2014

You can try Aerospike in a Terminal.com container. I've created a snapshot with a simple CRUD example at https://terminal.com/tiny/yEITIwNyLT

rolfvandekrol · on July 28, 2014

Hmm, if the community edition server is AGPL licensed, are they even allowed to have an enterprise edition that is not AGPL licensed? I suppose the enterprise edition is an altered version of the community edition, so can they be 'forced' to publish their changes?

infinite8s · on July 28, 2014

This seems to be a common misconception about the interplay of copyright and GPL style licenses. Since they own the copyright, they can relicense it however they choose. The GPL and other open source licenses just give non-copyright holders additional rights beyond what copyright law provides (which for most works is nothing beyond a bit of fair-use). In that way the GPL is a clever hack of copyright, since it relies on the default of no-rights granted by copyright to enforce its terms.

rolfvandekrol · on July 28, 2014

I get that part, but how does that work when other people also contribute changes for which they own the copyright and which are also AGPL licensed, but are contributed back to the Aerospike devs?

nl · on July 28, 2014

Usually the way it works is the company requires copyright assignment to them.

coldpie · on July 28, 2014

Right. The owner could also refuse outside contributions, or perhaps there simply haven't been any outside contributions.

infinite8s · on July 28, 2014

Well, since the project was just released as open source, presumably all work on it so far has been by employees of Aerospike (who automatically grant copyright to their employer by virtue of the work-made-for-hire exception).

gyehuda · on July 31, 2014

I see this as one of the fundamental problems with the AGPL/Apache licensing patterns. The company can create an Apache driver, but it's not clear when a member of the community creates a driver if they can also use the Apache license. If all they do is present the driver, then presumably yes, but if they publish a service using their driver calling the AGPL licensed code, their driver should be subject to AGPL's terms which would apply the AGPL to it.

Alternatively they could assign the copyright of their code to a company? But why would they be motivated to do so?

This creates an asymmetry of rights where the company can do things that community members cannot. This seems to be the very opposite of what the Open Source movement has been all about.

acdha · on July 28, 2014

Most projects like this have a contributor release which requires outside contributions either to be entirely licensed to the project or to have a specific grant allowing the founding company to use them as part of their non-GPLed release.

weitzj · on July 28, 2014

If the server uses the AGPL instead of the GPL, why does it matter that the clients are under an Apache License? I thought if you use the AGPL you have to contribute back the client code as well, when you use the server remotely.

DoubleMalt · on July 28, 2014

That is not true.

If you expose a service based on an AGPL-licensed service, you have to make its source code available to the services that use it.

For example you could modify WordPress (which is GPL licensed), put it on your server and let it serve pages without providing your modified source code to anyone.

If WordPress was AGPL licensed you would have to provide your modified source code to anyone using the system.

This also affects services that use AGPL-licensed libraries (like newer versions of iText), but not services that use other services.

The point is that the AGPL only adds that if you consume the software over the network, you have the right to the source code. If you use it as a network service for your webapp, your webapp is the consumer.

MongoDB has the same licensing model, and nobody sued Foursquare for the source code, so I guess this is legally tested ;)

teddyh · on July 28, 2014

I’m sorry, but people seem to believe all manner of wild and crazy things about the AGPL – this means that you have to back up your claims with references.

ZenoArrow · on July 27, 2014

To give some idea of speed... http://www.aerospike.com/blog/aerospike-doubles-in-memory-no...

It's got some impressive responses from people in industry too, check out this post about its use at eBay... http://www.aerospike.com/blog/ebay-helps-retailers-know-your...

ris · on July 27, 2014

"To give some idea of speed"

Purported speed. Please, we've all seen enough wildly hyped NoSQL databases now to remain a little cynical, haven't we?

ZenoArrow · on July 27, 2014

Here's a benchmark from 2013 from Thumbtack Technology, comparing Aerospike, Cassandra, MongoDB and Couchbase... http://www.odbms.org/2013/01/ultra-high-performance-nosql-be... http://www.odbms.org/wp-content/uploads/2013/11/NoSQLBenchma...

In this benchmark, Couchbase gets some impressive results, but it does appear that Aerospike is the overall winner when it comes to speed and reliability. Anyway, the code is free to install, it's easy enough to validate the speed claims... http://www.aerospike.com/blog/aerospike-doubles-in-memory-no...

ris · on July 28, 2014

In the former study, if you read it carefully, Aerospike were essentially able to choose the hardware for the test.

"Anyway, the code is free to install, it's easy enough to validate the speed claims"

So why don't you do so and come back with your own results that can at least pretend to be neutral instead of spreading empty hype around here?

hackalyst · on July 28, 2014

/me Aerospike employee here. We did a live demo of running the Aerospike server on AWS EC2 during The Fifth Elephant last weekend in Bangalore. The demo ran at 1M TPS, and latency for an 80/20 load (80% read, 20% write) was <1ms for >99.8% of queries.

This demo was done on 4 r3.4xlarge nodes. We did earlier runs on r3.2xlarge as well, with similar results.

https://twitter.com/anshprat/status/492971667493122048

I didn't grab a screenshot of the latency, but those who saw the demo can comment.
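For anyone who wants to check a figure like "<1ms for >99.8% of queries" themselves, here is a rough sketch of how such a number can be computed from raw timings. It is not the harness used in the demo: `run_mixed_load` and `fraction_under` are made-up helper names, and a plain Python dict stands in for the database, so the absolute numbers say nothing about Aerospike. Swap the dict for calls to a real client to get a meaningful result.

```python
import random
import time


def fraction_under(latencies_ms, threshold_ms):
    """Return the share of samples strictly below threshold_ms."""
    if not latencies_ms:
        raise ValueError("no latency samples")
    return sum(1 for x in latencies_ms if x < threshold_ms) / len(latencies_ms)


def run_mixed_load(store, n_ops, read_ratio=0.8, seed=0):
    """Time n_ops reads/writes against a dict-like store; return latencies in ms.

    The store is a stand-in (assumption): any object with .get() and item
    assignment works. read_ratio=0.8 gives the 80/20 read/write mix.
    """
    rng = random.Random(seed)
    latencies_ms = []
    for i in range(n_ops):
        key = rng.randrange(1000)
        start = time.perf_counter()
        if rng.random() < read_ratio:
            store.get(key)   # read
        else:
            store[key] = i   # write
        latencies_ms.append((time.perf_counter() - start) * 1000.0)
    return latencies_ms


if __name__ == "__main__":
    samples = run_mixed_load({}, 10_000)
    share = fraction_under(samples, 1.0)
    print(f"{share:.2%} of operations finished in under 1 ms")
```

Timing each operation individually like this measures client-side latency only; it does not reproduce a throughput figure such as 1M TPS, which needs many concurrent clients.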

ZenoArrow · on July 28, 2014

"So why don't you do so and come back with your own results that can at least pretend to be neutral instead of spreading empty hype around here?"

I don't own a computer, unless you count the smartphone I carry in my pocket. I somewhat suspect I'm in the minority on this on a site like HN.

threeseed · on July 28, 2014

I see Aerospike being the fastest but nothing is mentioned anywhere about error rate or reliability.

Also that test is interesting in that it favours Aerospike's use case i.e. when you have enough data to comfortably fit on SSDs. Somewhat unfair given that the majority of people using Cassandra would be doing so with large data sets.

blitzprog · on July 28, 2014

How does this perform in comparison to Riak (cluster, v2.0)?

simi_ · on July 28, 2014

> Aerospike’s mission is to rain bullshit on the entire field of databases by offering an addictive proposition: a database literally ten times faster than existing NoSQL solutions, and one hundred times faster than existing SQL solutions.

Gotta love Disrupt to Bullshit: https://chrome.google.com/webstore/detail/disrupt-to-bullshi...