Hacker News: pbogdan's comments

I've been struggling with this one a lot - could someone explain where a document store is a good solution?


Imagine an RDBMS table where many columns are nullable and many rows contain nulls in practice. Dealing with these is really painful in traditional database systems (you don't know how painful until you try a document store).

A document store flips the default. It makes dealing with data that has lots of nullable columns much, much easier. (It also makes dealing with hierarchical data a breeze)

There are lots of details, but this is the gist of it.
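
To make the nullable-column point concrete, here is a minimal Python sketch. It uses plain dicts as stand-ins for documents rather than any real database, and the column names are made up for illustration. The relational-style rows carry every column, mostly as NULL, while each document keeps only the fields it actually has.

```python
# Hypothetical column set for a wide, mostly-empty relational table.
COLUMNS = ["id", "name", "fax", "twitter", "middle_name", "company"]

# Relational style: every row carries every column, mostly as NULL (None).
rows = [
    (1, "Ada", None, None, None, "Analytical Engines"),
    (2, "Bob", None, "@bob", None, None),
]


def row_to_document(row):
    """Drop the NULLs: keep only the fields that actually have a value."""
    return {col: val for col, val in zip(COLUMNS, row) if val is not None}


documents = [row_to_document(r) for r in rows]

# Reading a field that may be absent is a lookup with a fallback.
twitter_handles = [doc.get("twitter", "(none)") for doc in documents]
```

Whether this is nicer than nullable columns is of course the thing being argued about; the sketch only shows what "flipping the default" looks like in practice.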


How is it better than an XML field in SQL Server, which allows indexing, schemas (if you want), and full querying inside the document? I think Postgres also has similar functionality with JSON, now, too.

Surely a lot of applications would benefit from having a full RDBMS where they can opt in to document-style data when they feel like it?

Built-in horizontal scaling is one selling point for non-RDBMS stores, but large systems seem to just shard on top of RDBMSes anyways, right?


> How is it better than an XML field in SQL Server

It changes the default, which results in a drastically different programming experience. The difference is difficult to describe in the same way a dynamically typed programming language is difficult to describe to someone who's never tried one.

I'd encourage you to try a document store (Mongo, Rethink, whatever) for a throw-away project. A ten minute tutorial walkthrough is worth a thousand HN comments when it comes to stuff like this :)

Here's the Rethink tutorial: http://rethinkdb.com/docs/guide/python/. Just play with it and see if you like it!


OK, I will try it out for something and see how it feels different.

Related: The comparison to a dynamically typed language makes me suspicious. I spent a bit of time trying to find any examples of dynamic code that actually provided any benefit. Even read "Metaprogramming Ruby" and was dismayed to see examples of reading a CSV - big deal if I save a few quotation marks. The others were just places where the static type system wasn't good enough (duck typing), or dynamic code was a pain to get going (poor reflection/codegen APIs).


I'll try to add some additional color here. My background is that I work for Google (where the vast majority of our data is stored in what amounts to document databases, generally protobufs in a key-value store), I prototype in Python and JavaScript, and I write production code in Java and C++.

Both document databases and dynamic typing are at their best when you don't understand your problem domain. They let you express what you do know about your problem domain concisely, and then fill in the blanks later on. So in a document database, when you find that you want to record a new bit of data - just add it as a field to newly-created XML/JSON documents, and only display it in the UI if it's present. Or pick a default value if you need to perform computations with it. Don't bother with data migrations, don't bother with schemas, don't bother trying to backfill previous data. Try out your idea and see if it works first, because chances are, it doesn't.

If you always work on projects where the requirements are handed to you, specs are complete, and the problem domain is understood, this will seem terribly irresponsible to you. And it is - if you understand your problem domain, you should capture as much of that knowledge as possible in the software system you build to understand it.

But if you are working in startups, or in consumer web, where you absolutely have to be on the leading edge or die and the only opportunities that haven't been picked over yet are the ones that nobody understands - being able to try things out without having to flesh out all your assumptions is crucial. You will run circles around the people who spend time defining their data model and speccing out their objects. And then when consumer tastes change - which happens quite regularly - you can adapt to them immediately instead of throwing out all the work you did under the old assumptions.

The other bit of context I'll toss in is to get in the mindset of solving a problem that you don't know how to solve and assume that your first 10 solutions won't work. For example, if you're reading a CSV - everybody knows how to do that, dynamic typing doesn't really help there. If you're cloning Stack Overflow, you can probably figure out what your database schema should be. But what if you're trying to figure out a new way for people to socialize over mobile phones? Where do you start there? That's the use case for dynamic languages and document DBs. The problems where technology is a tool for understanding & manipulating vaguely-defined social behaviors.
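
As a rough illustration of the "just add a field" workflow, here is a small Python sketch. The documents are plain dicts and the "votes" field is a hypothetical addition, so this is an assumption-laden toy rather than how any particular store works. Old documents are left alone, the UI shows the field only when present, and computations fall back to a default.

```python
# The first document was written before the (hypothetical) "votes" field existed.
posts = [
    {"id": 1, "text": "first post"},
    {"id": 2, "text": "second post", "votes": 3},
]


def render(post):
    """Show the new field only when the document actually has it."""
    line = post["text"]
    if "votes" in post:
        line += f" ({post['votes']} votes)"
    return line


rendered = [render(p) for p in posts]

# For computations, pick a default instead of backfilling old documents.
total_votes = sum(p.get("votes", 0) for p in posts)
```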


Thank you for the explanation. I can understand the logic.

In F#, adding a field requires "field : type". If I don't care about type checking, I can just add "Props : dict<string, object>" and go to town. Or I can opt-in to the dynamic features and just do "foo?bar <- baz". When I change types around, things either just work due to type inference, or the compiler helpfully points out every place that'd be a runtime error. I've never felt this slows me down. I feel the type checking and autocomplete is worth the tiny amount that specifying a record takes. (I've spent days finding minor issues in JavaScript, stuff that'd instantly be caught by a type checker.)
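
For readers who don't know F#, roughly the same shape can be sketched in Python: declared fields for what you understand, plus an open-ended property bag for everything else. This is only an analogy. The class and field names are invented, and Python checks the annotations only if you run a separate type checker, which is weaker than the compile-time checking described above.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class User:
    # Fields you understand get a declared type.
    name: str
    age: int
    # Everything still in flux goes into an untyped bag.
    props: Dict[str, Any] = field(default_factory=dict)


user = User(name="Ada", age=36)
user.props["favourite_editor"] = "emacs"  # new field, no record change needed
```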

Databases make it more cumbersome, and it takes more than one line to start using a new field. I totally sympathize with the flexibility issue there. Even with a document type, most syntax I've seen doesn't have truly first-class querying support (not as easy as a column, anyways). And it feels ugly to have some fields defined in schema, and some in a document. But that seems like a minor tooling issue -- there's no fundamental reason SQL can't let me do "WHERE x.SomeDoc.SomeField.OtherField > 5" (perhaps some minor scope resolution issues to ensure I'm not referring to some other multi-part name).
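
To spell out what that wished-for WHERE clause would mean, here is a Python sketch over nested dicts. It is not SQL Server or Postgres syntax; `get_path` is a hypothetical helper, and treating a missing path as "no match" is an assumption about the desired semantics.

```python
def get_path(doc, dotted_path, default=None):
    """Follow a dotted path through nested dicts; return default if any step is missing."""
    current = doc
    for key in dotted_path.split("."):
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


rows = [
    {"id": 1, "SomeDoc": {"SomeField": {"OtherField": 7}}},
    {"id": 2, "SomeDoc": {"SomeField": {"OtherField": 3}}},
    {"id": 3, "SomeDoc": {}},  # path missing entirely
]

# Equivalent in spirit to: WHERE x.SomeDoc.SomeField.OtherField > 5
matching_ids = []
for row in rows:
    value = get_path(row, "SomeDoc.SomeField.OtherField")
    if value is not None and value > 5:
        matching_ids.append(row["id"])
```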


I'm using it to collect stats from various systems, events and alerts too. It makes it easy to collect new data points and add fields to new events and query things in a consistent way across the entire data set.

When the schema is not rigid and likely to change on a daily basis I prefer a document database over an SQL one. There are probably other use cases as well, this is just my favourite.
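
A toy version of that stats/events setup, in Python with dicts standing in for documents. The event fields and the `where` helper are invented for illustration. The point is that events with different shapes sit in one collection and are still queried the same way.

```python
events = [
    {"type": "cpu", "host": "web1", "value": 0.42},
    {"type": "alert", "host": "db1", "severity": "high", "message": "disk full"},
    {"type": "cpu", "host": "web2", "value": 0.91, "cores": 8},  # field added later
]


def where(docs, **criteria):
    """Return documents whose fields equal all criteria; missing fields never match."""
    missing = object()  # sentinel that compares unequal to every real value
    return [
        d for d in docs
        if all(d.get(key, missing) == wanted for key, wanted in criteria.items())
    ]


cpu_events = where(events, type="cpu")
high_alerts = where(events, severity="high")
```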


One example is a CMS. An article these days is a title, body and 5-10 comments, and whatever other metadata that you would want to present on a page. Don't update the corresponding NoSQL record till there are writes in the RDBMS. Serve from the NoSQL system, with at least another caching layer in front. You get the best of both worlds and just one query instead of many.
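
A minimal sketch of that flow, assuming SQLite as the relational side and a plain dict as a stand-in for the NoSQL store. The table and function names are made up, and a real setup would add the extra caching layer.

```python
import json
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT, body TEXT);
    CREATE TABLE comments (id INTEGER PRIMARY KEY, article_id INTEGER, text TEXT);
    INSERT INTO articles VALUES (1, 'Hello', 'First article');
    INSERT INTO comments VALUES (1, 1, 'Nice'), (2, 1, 'Agreed');
""")

document_store = {}  # stand-in for the NoSQL side: article id -> JSON string


def rebuild_document(article_id):
    """Re-assemble one article and its comments into a single document."""
    title, body = db.execute(
        "SELECT title, body FROM articles WHERE id = ?", (article_id,)
    ).fetchone()
    comments = [
        text
        for (text,) in db.execute(
            "SELECT text FROM comments WHERE article_id = ? ORDER BY id",
            (article_id,),
        )
    ]
    document_store[article_id] = json.dumps(
        {"title": title, "body": body, "comments": comments}
    )


def add_comment(article_id, text):
    """Write to the relational side, then refresh the denormalized copy."""
    db.execute(
        "INSERT INTO comments (article_id, text) VALUES (?, ?)", (article_id, text)
    )
    rebuild_document(article_id)


rebuild_document(1)
add_comment(1, "Late to the party")
page = json.loads(document_store[1])  # one lookup serves the whole page
```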


Why not just use a materialized view inside your SQL system? No need for cache invalidation; let the system handle it for you. Same benefits, truly a single query, and only one system.

If you're going to concatenate relational data into a document, I'm not sure why a simple KV table doesn't fit the job.
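
For what the KV-table variant might look like, here is a small Python sketch using SQLite, chosen only because it ships with Python (it has no materialized views, so that half isn't shown). The table name, key format, and helper names are assumptions.

```python
import json
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE page_cache (key TEXT PRIMARY KEY, value TEXT NOT NULL)")


def put(key, document):
    """Insert or overwrite the serialized document stored under key."""
    db.execute(
        "INSERT OR REPLACE INTO page_cache (key, value) VALUES (?, ?)",
        (key, json.dumps(document)),
    )


def get(key):
    """Return the stored document, or None if the key is absent."""
    row = db.execute(
        "SELECT value FROM page_cache WHERE key = ?", (key,)
    ).fetchone()
    return json.loads(row[0]) if row else None


put("article:1", {"title": "Hello", "comments": ["Nice"]})
put("article:1", {"title": "Hello", "comments": ["Nice", "Agreed"]})  # overwrite
```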


Thanks, had to read up on what exactly a materialized view is. Not something I was familiar with.


So I might be missing something but please show me how Docker packages are being distributed via official repositories in Arch and Gentoo?

EDIT: Not to mention that packages mentioned at http://docs.docker.io/en/latest/installation/archlinux/ no longer even exist..

As far as I can tell they do as good a job as Debian - they don't package it at all.


Could you share any details about your setup?


I didn't really do much customization. For Firefox, I just used a different theme and got rid of all the UI bits I didn't use. For Chrome, I only set it to use system borders. I didn't even configure XMonad very much--the defaults are very well thought out.

In practice, it looks like this:

Firefox: http://jelv.is/xmonad-ff.png Chrome: http://jelv.is/xmonad-chrome.png Tiled: http://jelv.is/xmonad-tiled.png

The only real problem I've had is that form widgets are rather ugly. I think this should be easy enough to change, I just haven't gotten around to it yet.


Not sure of its quality, but WordPress does have an automated test suite: http://make.wordpress.org/core/handbook/automated-testing/ and https://unit-tests.svn.wordpress.org/trunk. Do you have any experience with those?


Well I am happy to eat my hat on this one, with a side of crow.


Would love to have a look - @pbogdan


Slightly off-topic, but could anyone recommend any good resources for getting started with iOS development?


Buy a book. Make an app. Release it on the App Store. Don't worry about finding the perfect book, just buy one (I'm partial towards the LaMarche book), work through the examples, then build your own app. If you run into any hurdles while building your app, Google and Stack Overflow are your tech support.


I've heard this is a really good introduction book, http://www.amazon.com/Objective-C-Programming-Ranch-Guide-Gu...

Having read their Cocoa programming for OS X book (http://www.amazon.com/Cocoa-Programming-Mac-3rd-Edition/dp/0...), I can wholeheartedly recommend their books.


I've looked at most of the online resources. The best tutorial videos seem to be on Lynda.com. Simon Allardice is really clear and concise in his explanations. The best practice exercises I've found are on Treehouse. The Mobile Makers program mentioned above is really for people who want an immersive program that combines the social, constructive, and cognitive learning methods with mentorship and career opportunities. But I would say, get started with what's online and only commit to a program when you've graduated from curiosity to genuine interest.


Check out the Stanford University iOS classes on iTunes University - very good video resources stepping through all the basic building blocks. Also, finding decent open source components on GitHub and the like, then manually copying the code, is helpful for building up muscle memory of the APIs used.


You might want to have a look at JMeter (http://jakarta.apache.org/jmeter/).


It's been slow for me as well for the past 1-2 weeks and yes, it's a problem with static files - they take 20-30s to load.

Caching seems to be fine - I'm getting 304s for stylesheets and images (and they are served from the local cache according to Firebug and Chrome dev tools), which means it actually takes 20-30s of wait time for the server to just return the headers.

I wonder if it might be a problem with Keep-Alive, which is on (with a 15 sec. timeout, I think) - there are probably a bunch of Apache processes / threads sitting idle waiting for connections to close, and with the amount of traffic HN gets I would imagine there are quite a few of them.
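
A back-of-envelope way to size that guess, as a Python sketch. It assumes a one-worker-per-connection model where each connection holds its worker for the full keep-alive timeout, and the traffic figure is purely hypothetical; only the 15 second timeout comes from the guess above. Under those assumptions Little's law gives the average number of tied-up workers.

```python
def idle_workers_estimate(new_connections_per_sec, keepalive_timeout_sec):
    """Little's law: average number held = arrival rate * time each is held."""
    return new_connections_per_sec * keepalive_timeout_sec


# Hypothetical arrival rate; the 15 s timeout is the value guessed in the comment.
estimate = idle_workers_estimate(new_connections_per_sec=20, keepalive_timeout_sec=15)
```

If the real worker limit is lower than a number like this, new requests would queue behind idle connections, which would be consistent with long waits even for 304 responses.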


It seems to be on SliceHost (if nothing changed in the meantime) - http://news.ycombinator.com/item?id=1283430.


Please correct me if I'm wrong, but I don't think their niche overlaps much with PayPal's customers. So why would they market themselves as a direct competitor to PayPal? Why would they, given this premise, advertise to PayPal customers at all?

If you read the whole story you will get the impression they are PayPal competitors - that is, unless you visit their website and read the comments here. Then you will find out they are nothing like PayPal and they won't replace it.

They're riding the PayPal disappointment tide (cough, Diaspora, cough), yet they don't offer a viable replacement.

There have been a lot of negative opinions about PayPal coming from this very community but, in my opinion, if they position themselves as a PayPal competitor, a lot of people will be disappointed in their offering right now.

Do you think it's a good way to market themselves? Or should they focus on their core audience via other channels? Or maybe PayPal customers are indeed WePay's customers in the making?


they are not a direct competitor. their offering is not equal to paypal's. they only do a few things, but try to do them well.

the services they provide are similar to some things that people are doing with paypal.


Have you used the service? I've been using it for close to a year, and I love it.

I've been using it to manage sponsor payments for Hackers and Founders for some time. It's wonderful, easy and simple to use.

And, the reason that they are building this is because PayPal sucks.

