Hacker News new | past | comments | ask | show | jobs | submit login

For a database like postgres, or even for something like SQLite, this mostly becomes a distinction without a whole lot of difference, since the database can index and access JSON structures just like regular columns.



JSONB columns are incredibly useful if you have data that doesn't fit well into a strict relational schema. And they are very powerful in terms of query ability in Postgres, but still quite far from a conventional relational schema.

Queries that act on the insides of a JSONB column are much harder to write than the equivalent conventional queries. They are also much, much slower in certain cases as the DB has to read the entire JSONB blob in many cases, and because they don't have proper statistics. The performance can range from slightly slower to completely pathological query plans that take ages.

I'm a big fan of JSONB in Postgres, but it is no replacement for a relational schema, there are far too many problems you inflict on yourself if you try to use it like that.


Access, but not constrain or type-check.

And it's just harder to use, I'm with it being generally a 'mistake' (in the 'you will regret this later' sense) - it's better than text, but I'm in the camp of as much structure as possible ('you will thank yourself later').


> Access, but not constrain or type-check.

You can create CHECK constraints involving JSON operations in postgres, which is the same way you'd enforce any constraint.


Manually, right? Django autogenerates all of the SQL schema for you on non-JSON fields. You could replicate that work, but that sounds tedious.


As others have mentioned, it definitely does have downsides when it comes to writing queries and especially when using frameworks/libraries to interface with the db.

But for the right job, JSONB columns are awesome!


What’s a real world example where you would choose to index the JSON as you said rather than putting them into the fields/schema.


Any time you'd otherwise use EAV.


EAV is considered an antipattern. While wouldn't rule out using it, it probably means it should be avoided if there is another solution.


I've been working on a project using schema.org objects (100's of types of objects) in postgres in a graph/tree like structure. While structured the objects are not consistent enough for me to want separate schemas for them. Using JSONb in postgres allows me to index separate fields within the JSON while still having all the objects in a single graph.

If I couldn't use JSONb with indexes (and triggers for custom foreign key checks) it'd be much harder to do this in in a relational database (and the data is relational, it just isn't consistent across all types).

Perhaps this would be better in a Triplestore but there are a lot of features of postgres that are not available in them.


I used to believe in these database antipatterns (e.g. avoid EAV, avoid attribute tables, avoid denormalization, avoid overnormalization, avoid inheritance, avoid wide tables, only use synthetic keys, only use natural keys, never use autoincrement, always use autoincrement), but these days I'm almost certain all of them are wrong, because they are independent of business domain needs. Business domains have antipatterns for their data, data in general decidedly does not. For example, storing physical addresses is subject to a number of antipatterns, including stuff like "natural keys are not a good idea for this".

GP cited EAV as an antipattern, yet it is one of the most reasonable approaches to user-defined fields (especially when suitably augmented), which are frequently a business requirement.


Not even just user-defined fields, any time where the fields need to be defined as data values rather than part of your concrete data model. Product attributes on a big multi-category e-commerce site being another example. You could just stick them in JSON, but it's still essentially EAV, just with a different storage mechanism.

Another advantage of EAV as a pattern, is it makes it straightforward to add metadata to values, like "where did this value come from?". For systems with audit trails, just having keys and values isn't enough.

EAV wouldn't be the first tool I'd jump to -- I've definitely seen it go wrong -- but it feels like it's going too far to call it an anti-pattern.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: