NoSQL is a bullshit marketing term (groups.google.com)
69 points by jamesgolick on Nov 13, 2010 | 39 comments



All marketing terms are bullshit, but what is the point?

The point is to use a single word that people can rally behind to be disruptive. In my world, NoSQL is a movement about using the right tool (especially open source) for the job. It is a movement about building these tools to the point where they can be consumed and put into production quickly.

There is no value in being precise (in marketing) since then we are just cats that need to be managed. By being imprecise and using a bullshit marketing term, we gain collective market power.


Yep, giving names to high-level abstract concepts is one of the most important skills humans have.

I mean, you wouldn't want to say "database management systems that differ from classic relational database management systems in some way, which may not require fixed table schemas, and usually avoid join operations and typically scale horizontally" in every third sentence would you?


Or, as we say at the meetup: "DMSDCRDBMSWMNRFTSAUAJOTSH"

Now, that is marketable.


I think you mean D23H...


I don't know that it's really much to do with using the right tool for the job.

Risk, reliability, performance expectations, etc. I don't know if I've ever met a project of any significant complexity that would have been more of a success in any area, deployed any quicker, or developed any easier by not using a traditional RDBMS.

There are those projects out there, I just don't have any personal experience with them. But what's more, I seriously doubt 99% of people using NoSQL solutions do either.

Databases cross a funny line. There's language, syntax, a lot going on under the covers, and complex technical decisions and trade-offs at every turn to maintain durability and performance on modern hardware.

Sure, I can set work_mem in Postgres to 1GB and sort this query in-memory like a king, but then I'm (worst-case-scenario) reducing my 20-connection server by 20GB it might have used for cache instead speeding every other query up.
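
To make that arithmetic concrete, here is a rough sketch in Python. The function name and the `sorts_per_query` parameter are made up for illustration; the 20 connections and 1GB are the figures from the paragraph above. PostgreSQL applies work_mem per sort or hash operation rather than per connection, so a query with several such operations could claim more than this simple product suggests.

```python
# Back-of-the-envelope estimate only; the inputs are the example figures
# from the comment, not measured values from a real server.
GIB = 1024 ** 3


def worst_case_sort_memory(max_connections: int,
                           work_mem_bytes: int,
                           sorts_per_query: int = 1) -> int:
    """Upper bound on memory claimed if every connection sorts at once.

    sorts_per_query is a hypothetical knob: work_mem is a per-operation
    limit, so a query with two sorts may use twice as much.
    """
    return max_connections * work_mem_bytes * sorts_per_query


reserved = worst_case_sort_memory(max_connections=20, work_mem_bytes=1 * GIB)
print(f"worst case: {reserved // GIB} GiB that could have been cache")
```

With the default of one sort per query this gives the 20GB from the comment; passing `sorts_per_query=2` doubles it.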

I can imagine issues are as simple as having the right LRU in front of my data, or I can guarantee durability and achieve really amazing performance if I'm willing to invest myself into learning why particular design decisions, made over years by some genuinely brilliant people, were made.

Truth is, just to use one of these systems, you mostly don't have to care to meet your own requirements. And I think people take that for granted. When performance requirements are no longer being met, it's easy to blame the tool (the RDBMS), but I think a much safer assumption is to blame the developer and hit the books. The PostgreSQL 9.0 Administration Cookbook and PostgreSQL 9.0 High Performance books are surprisingly good reads. It doesn't take all that much personal effort really. Less certainly than switching tool-chains.

This is one of those comments you write, goes off-track, and then decide to just delete. Well I'm going to post it damn-it. :-)


> There is no value in being precise

Are you an engineer? I ask because I find it very hard to believe an engineer would write that.


I am.

I'm thinking on how to rewrite that. My aim was to say that when we communicate to non-engineers, it should be imprecise and more sales pitchy.

When we need to sell our ideas, building an awesome taxonomy doesn't help us sell the right tool. More often than not, when we try to get precise with non-engineers, it owns us. It creates the need for a manager to organize the cats.

If we were precise, then we would have 10+ movements: a "column store" movement, a document movement, a map reduce movement.


Who the dick is selling anything? I thought we were building things. Call it DongDB, I don't give a damn, I just need to push some shit that's not going to break.

It does not matter what we call the products or the product category or the "movement". Products are built on technology, and the vast majority of the technology in the NoSQL space is pre-existing and appropriately named. A column store is a column store, not a CassandraStore(tm). A bloom filter is a bloom filter, a vector clock is a vector clock, consistent hashing is consistent hashing, ad nauseam. Those who are actually interested in understanding the tools spend their time reading technical and academic documentation that describes the foundation, building blocks, patterns, and concepts that are in play.

What matters for products is not what they are called, but whether they will work or not for your problem set. The best way to help this "movement" is to write code, run code, and provide examples, use cases, and data about real world environments that will help people choose the appropriate tools and approach to manage their data. Compared to actually using and building things, chatting it up about terminology isn't going to make a god damn bit of difference. Which is to say, get things done or shut up.


It doesn't help to build something if you can't sell it.

The NoSQL movement could be called a marketing and sales effort to sell people those ideas that you listed. Just because you know what vector clocks and consistent hashing mean doesn't mean that all the managers and all the technical directors in the world do.


Speaking as an engineer, there is great value in being imprecise.

Think of it as a shallow search. When we start a conversation, I first want you to be overview-y and high-level. Then I will (or you will) steer the conversation into the direction where more detail is required.

Wouldn't want a conversation polluted with unnecessary detail. It would drag on forever and bore even the most engineering of minds ... but damn it would be precise!


I'm an engineer and I agree with Swizec and mathgaldiator. As an engineer I see great value in abstracting layers until any specification or detail is needed. If NoSQL as a term helps me convey a meaning of "newish edgy non relational datastore" then the term serves its purpose, the same way people used Web 2.0 to describe pretty gradients, huge plasticky buttons, and shiny bubble navbars.

It may seem like a fad to most people but it still is the easiest, shortest way to describe a group of aesthetics, use cases, or features in only one word. Think of it this way: We do it all the time with races, skin colors, and religions. White, Black, Muslims, Christians, Latinos, Asians... RDBMS, and NoSQL...


"Are you an engineer? I ask because I find it very hard to believe an engineer would write that."

How about "In this case, the cost of being precise outweighs the benefits"?


NoSQL = "We don't have to solve every damn Database problem with Oracle 10G, despite what our (brought up in enterprise IT) tech-services group keeps insisting to our COO every time the engineering group rolls out a new product that uses something different. It does mean that we can't rely on the crutch of (1) optimizer/statistics gathering, (2) RAC, (3) Dataguard, (4) MASSIVE spindle group SANs w/ vertically scaled Sun M9000/IBM P795 Database Server and our small group of DBAs to solve all of our query, HA, DR and scaling problems, but instead have to shift the responsibility for solving those problems onto the Engineering team, which may or may not make sense depending on whether we believe our core competence should be solving those problems with engineering instead of (1), (2), (3), (4)."

NoSQL is a phrase that means "use the right tool for the job" instead of just throwing endless amounts of money at low-risk, but 100x (1000x?) more expensive solutions that will eventually fail at internet scale load anyways.

NoSQL means taking on more of the engineering risk, rather than shifting it to your vendor.


Oracle Coherence seems pretty NoSQL too (sharded key value store, no or very slow JOINs), BTW. So even the 'Oracle guys' seem to be adopting this approach for some apps...


I thought the whole point of the term was not to refer to products which solve a particular problem, but to simply refer to anything other than the traditional relational databases that use SQL.


In my opinion, NoSQL is bullshit because it markets itself on what it's not rather than what it is, which is not so easily explained in one simple buzzword.

Grouping Cassandra, CouchDB, MongoDB, Redis, etc. is the same as grouping MySQL with git - they all store data in one way or another, but the data they store and the way they store it varies wildly.

It's also sad, because all these "alternative" databases offer a lot of features that RDBMSs don't, and they should be promoted instead of stuck under an "umbrella" term to protest against SQL databases.


Quite frankly, the anti-NoSQL-as-a-term is getting a bit tiring. As someone who has been working on "NoSQL" systems since 2000, I have been extremely thankful that there is finally a broad-based movement exploring alternative datastore architectures. It used to be extremely depressing to have discussions with fellow engineers on the benefits of alternative architectures and for them to simply reject it on the grounds that "SQL DBMS must be the best since they've had decades of work behind them". At the very best, someone might have been radical enough to contemplate the use of an Object Database.

In contrast, thanks to the NoSQL movement and the exploration of alternative models that it has encouraged, the quality of discussions is extremely different today. More and more engineers are aware of the benefits and issues with various models and are much more open to alternatives. So when I say that I am extremely thankful to whoever coined the damn term and for efforts like NoSQL Summer, I really mean it. It has really improved my quality of life.

Now I get that "NoSQL" isn't a 100% accurate term. But what marketing term ever is? Take something like AJAX — most "AJAX" apps have never dealt with XML, yet the term has been extremely useful. It helped solidify a broad-based effort to explore using JavaScript and thanks to the "AJAX movement" of a few years ago, we now have XHR in all modern browsers and awesome libraries like jQuery!

The real issue as I see it is that projects are keen to differentiate and are thus reacting to being lumped together with extremely different systems. Now no-one who understands the technologies is ever going to compare the likes of Redis, CouchDB, Neo4j, Cassandra and Hadoop as equivalents, but it is understandable that projects are afraid of being considered equivalent by those who are simply choosing a NoSQL system for their project without understanding the differences.

This follows onto another issue: a leader (or two) often emerges once a new domain has been established and the "smaller" projects are cautious of being sidelined by "big boys" like Cassandra/MongoDB/Hadoop. To continue with the AJAX example, in the early days there used to be a whole bunch of options regarding JavaScript libraries: Prototype, jsolait, MochiKit, MooTools, jQuery, Dojo, etc. In contrast, nowadays, jQuery is the default choice for the vast majority of developers. I don't think that there is such a clear winner in the NoSQL field yet. In fact, given the massive fragmentation, we probably haven't even heard of the final winner yet!

That is not to say that the concerns of the various projects aren't totally valid. But the issue is not with the "NoSQL" term but rather with differentiation and understanding — both of which can only be solved by better communication. Phrases like "Online Request Processing Systems (ORPS)" or even "Alternative Datastores" aren't exactly catchy marketing terms. NoSQL may not be perfect, but it's here and it's more than good enough. So can we please stop bashing it and focus on coming up with clearer differentiators? Thanks!


You didn't read the post, did you?


I don't see any point in the parent's comment that indicates he didn't read the original messages. He's making clear that he doesn't like the new trend of calling the NoSQL term marketing bullshit, as the term actually serves to differentiate one type of store from another. I'm pretty sure this has to do with the original post.


I think I see your confusion. The headline is not the message thread. It's a link you need to click. Then you can read the actual message thread. Hope this helps!


I don't see why you were downvoted, but I did read the message thread and both my comment and the comment of the grandparent make sense after having read the series of messages.


The parent to your comment is a great example of the kind of content-free snarky comment that Reddit rewards handsomely and Hacker News tends to punish. Ben's new here, but he'll get the hang of it. Took me a bit, too.



Your aggressive posturing in this snarky response caught my attention. Seeing your bio, I was intrigued to discover that you work for a multinational agricultural biotechnology corporation? I had no idea that makers of "Roundup" were involved in info security. Looking forward to learning more from you.


I think you might be confusing Matasano with Monsanto — unless Thomas hasn't told us something... ;p


I really didn't mean to sound aggressive. Sorry.


aside: has anyone read a good discussion of Codd's original relational paper? http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.98.... Thanks!

I'm confused by his five examples for "1.2.3. Access Path Dependence" (page 378), where an app would fail if the data representation changed, because I think an app using a relational store would also fail if the relations were organized differently. I can see some possible resolutions, but the paper doesn't address the issue...

I concede that it's hard to assess a proposed approach when it doesn't actually exist yet; but I think that if you raise an issue with the existing approaches in a paper, it's reasonable to also assess your own proposal with respect to that issue.

e.g. maybe he imagined automatic views to convert the underlying relations (so that different relations are identical if they represent the same information...); or a manual conversion layer with views (but the same could be done for the other store!); or maybe he was only thinking of different physical representations when he wrote that part and it didn't occur to him that different relations also might be used

EDIT http://www.aisintl.com/case/library/Date_Birth%20of%20the%20...

I think he's saying that while the relational model has the same problem of retaining compatibility for old apps when it evolves, it is *easier* to do this with the relational model, i.e. the "number of access paths" for old apps becomes "excessively large" for non-relational models. He talks a bit about the complexity of representing different queries later on, but somewhat obliquely and doesn't draw the connection (and I don't quite follow what he means in the second last paragraph of section 1.5, where he mentions n!, 2n-1 and n+1 - I understand it so little, that I think there might be a typo).

Ah! He seems to have addressed it more directly in a previous, less-cited IBM-only paper from 1969... to which I happen to have a link right here: http://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.173.5...


NoSQL is a bullshit marketing term coined to insight ire in the heart of DBA's.

No it's not. It's a typical human response to needing to give names to groups of items. Collective nouns, even misjudged ones, bring concepts to light and allow us to discuss them collectively without lengthy explanations. The elements behind AJAX existed before the term was coined, but its coining gave everyone a single point to discuss and support and its usage exploded rapidly after its coining.

RadioLab's "Words" show - http://www.radiolab.org/2010/aug/09/ - goes into the value of words as a way to represent concepts and feelings and, specifically, how those things fail to exist without the words to define them.


All marketing terms are bullshit? Wrong. All marketing terms are a way to provide a simplified way for consumers to conceptualize a product. In that respect, "NoSQL" is not bullshit at all, even though it may not be as technically accurate as some would prefer.


Who are these mythical consumers that are going to use NoSQL products but are too dumb to understand what they are without some Jeffrey genius to explain it to them in simple marketing terms? Last I checked, the commercial NoSQL efforts aren't trying to take the dumb money away from Oracle. They're just trying to keep the doors open so they can keep coding and continue helping engineers solve problems.


So you're of the opinion that technology markets itself? That's pretty naive.


I experiment with NoSQL stores because they're fun to work with. Who gives a shit what the proper classification scheme is? Why are people so religious over this?


I agree. It should come down to "right tool for the job." All these religious zealots are disgusting at times.

Just because someone wouldn't use "it" doesn't mean one shouldn't use "it."


Show us on the doll where NoSQL touched you.


Funny thing is when you go see the profile of the guy you realise he's a functional alcoholic.


You realize that's a joke, right?


Not really. Me and whiskey have moved in together and are contemplating having some kids.


> to insight ire

Ouch. Nothing hurts your credibility quite like an egregious spelling mistake just as your rant is taxiing onto the runway.


Your rite. Next time I will spellcheck before taxing the runway.



