JSON Support in SQL Server 2016 (msdn.com)
89 points by us0r on May 19, 2015 | 41 comments



If I'm reading this correctly, the unfortunate part about using nvarchar rather than a native JSON data type is that it's entirely possible to store malformed JSON in the database, putting the responsibility for validation on your business logic. When querying, you need to exclude malformed JSON by using the ISJSON() operator.

I get the argument that nvarchar makes it work with all existing tooling, but one could counter that if your tools don't support JSON already, perhaps they need to get with the times.

Imagine if SQL Server didn't have any date types and you had to query an nvarchar data type with ISDATETIME() and then get the year via DATE_VALUE(t.OrderDate, '$.year'). Seems a bit too hackish.
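
For reference, the actual JSON pattern the article proposes looks roughly like this; JSON_VALUE() and ISJSON() are the functions announced for 2016, while the Orders table and OrderInfo column are invented for illustration:

    SELECT JSON_VALUE(t.OrderInfo, '$.customer.name') AS CustomerName
    FROM Orders t
    WHERE ISJSON(t.OrderInfo) > 0;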


The article points out that you can use the ISJSON function in a check constraint on your column, guarding it against malformed data:

> ISJSON( jsonText ) checks whether the NVARCHAR text input is properly formatted according to the JSON specification. You can use this function to create check constraints on NVARCHAR columns that contain JSON text.
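
A minimal sketch of that constraint, assuming an Orders table with an OrderInfo NVARCHAR column (names invented):

    ALTER TABLE Orders
        ADD CONSTRAINT CK_Orders_OrderInfo_IsJson CHECK (ISJSON(OrderInfo) > 0);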


Thanks... I somehow missed the check constraint ability, which addresses the malformed data issue.


Not to nitpick, but you want this validation to be the default, which it can't be in this case.

The question is what stopped MS from just wrapping this NVARCHAR with JSON validation and JSON access and shipping it as a JSON type. Maybe a native JSON type will be added in future versions. I think they just wanted to get the ball rolling.


> The question is what stopped MS from just wrapping this NVARCHAR with JSON validation and JSON access and shipping it as a JSON type

A large part of the blog post [1] is dedicated to this issue?

> In SQL Server 2016, JSON will be represented as NVARCHAR type. There are three reasons for this:

[1] http://blogs.msdn.com/b/jocapc/archive/2015/05/16/json-suppo...


Surprised that they mention PostgreSQL. Surprised even more that they seem to admit PostgreSQL is more advanced now than what they are releasing next year.

However, they incorrectly mention that PostgreSQL supports BSON. It doesn't. Rather, it has a data type called "jsonb", which is a binary JSON serialization but has nothing to do with BSON (which happens to be MongoDB's serialization format, and a relatively poor one at that).

Edit: the blog post has been updated by the author to reflect that the type is "jsonb" rather than BSON. Thank you!


As someone who extensively uses pgsql and SQL Server, I don't think it's fair to say that PostgreSQL is "more advanced" because it has one particular, somewhat trendy feature. There are many critical features that pgsql either lacks (query parallelization) or gained many years after SQL Server (index-only scans, true materialized views, etc.).

And really, JSON in the database is just the more modern version of XML in the database, which of course SQL Server has supported for years. The problem with XML in the database was not a fundamental one but an implementation issue: where the JSON model in most solutions is some variation of toss-some-structured-data-in and then work with it, the XML model of SQL Server required significant, poorly documented, confusing-to-implement configuration to work with in any meaningful way. It really killed the feature.

Personally I still think XML is superior to JSON, but it got usurped by the architectural astronauts who kept layering noise on it to the point of being unusable.


I thought it was implied by the context, but "more advanced" obviously meant "more advanced in terms of JSON support". Sorry if clarification was needed.

What SQL Server 2016 will have next year seems more limited than what PostgreSQL 9.2 had by 2012. PostgreSQL at least had a native datatype ("json") that stored only syntactically valid JSON, while in SQL Server, four years later, you may have invalid JSON in a column expected to hold JSON (unless you set up the validation as a constraint, which is error-prone and doesn't help functions that expect or return JSON data anyway). Plus, by 9.2 PostgreSQL had many functions to query JSON types very easily, and I don't see the same functionality coming to SQL Server 2016.

Regarding indexing, by 9.2's time you could use functional indexes to index arbitrary JSON paths (one index per path). In SQL Server you'd only be able to use computed (persisted) columns to index JSON paths, which seems a less elegant and less performant solution to the same problem (more storage required).
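
Roughly, the two approaches side by side (all names invented; the Postgres side uses the current ->> operator for illustration):

    -- PostgreSQL: an expression (functional) index directly on a JSON path
    CREATE INDEX books_title_idx ON books ((data ->> 'title'));

    -- SQL Server 2016: promote the path to a computed column, then index it
    ALTER TABLE Books ADD Title AS JSON_VALUE(Data, '$.title');
    CREATE INDEX IX_Books_Title ON Books (Title);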

So all in all, I honestly think that in terms of JSON support SQL Server 2016 will be less advanced than 2012's PostgreSQL 9.2, and definitely way less advanced than the current 9.4, let alone this year's 9.5. But that's, of course, only my opinion :)


JSON's been around for over ten years and is the default serialization format for most web frameworks. To call it trendy in the same breath as singing the praises of XML is a bit gauche.


JSON is used for persistence in a limited number of projects, and it still lacks fundamental things like a validation schema or even a date type. Yes, it absolutely is "trendy", and it absolutely pales compared to XML. They both fill virtually the same requirement, with one slowly repeating all of the mistakes of the other, as history tends to repeat itself.


JSON is an improvement overall, even though there are certainly issues. It is easier to read, easier to parse, and usually fewer bytes for the same data. A validation schema can be provided by JSON Schema [1]. The lack of a date type and comments is annoying, but not that big a deal. I think parsing and reading are far more important to get right, and XML didn't get them right.

1. http://json-schema.org/


> but not that big a deal. I think parsing and reading are far more important to get right, and XML didn't get them right.

...in your opinion, for your use cases. I think XML is eminently more readable because it has named types, and because I never have to read the following and figure out what goes where:

    }}}}}]]}}]]}}
JSON is great for sending data to Javascript though and I'm pretty sure that's the only reason anyone is using it.


You have the same problem in XML with that amount of nesting. Then you pretty-print the XML or JSON and the problem is solved.


Postgres became our go-to RDBMS because of its JSON support. It is really sad that it has taken Microsoft this long to get native JSON support into the database. It makes .NET projects more difficult than they are on competitors or NoSQL databases. When clients pick .NET, we always hit a little annoyance every time we need to use JSON in the database, which we solve with caching/redis/mongo/file IO/compression depending on the situation.

JSON support has lagged in all areas of Microsoft's development platforms. They didn't embrace it, and most .NET developers weren't using it or didn't know what it was until WCF in 2006-2007, when it had already been heavily used for years. Newtonsoft really pushed it in the Microsoft/.NET world, which is awesome. Ten years later, after SOAP web services went away in favor of REST/JSON, it is finally getting into MS SQL Server. MS SQL Server had support for XML searching early; this is a poor showing on JSON support, but I'm glad they are getting it done.


SQL Server 2005 and newer continued to invest in the XML datatype, which in hindsight was a poor product choice.

While I applaud this product feature, I'm not convinced of the value of storing JSON in a relational database (any more than storing XML).


Turns out relational databases make quite good NoSQL datastores; it just takes a little bit of thought instead of jumping onto the bandwagon of the month.


Or not ^_^ Have a look at ToroDB (https://github.com/torodb/torodb). It is a mongo-compatible database sitting on top of PostgreSQL. You don't need thought, just use it :)

Disclaimer: I'm a ToroDB developer


Surprised by the downvote. It is an open-source solution that does exactly what the parent comment was asking for (not having to think about how to map JSON to a relational database). I don't see a problem with mentioning some software (again: open source) as a viable solution to that problem.


It's trendy on HN to like SQL databases and bash MongoDB.


Really? I thought it was the other way around.

Try not to make generalizations of a rather diverse crowd.


I know. But there was no MongoDB bashing -- at least in the comment ;)


SQL =>

Step 1: remove data integrity.

Step 2: remove powerful query language.

Step 3: only store strings.

=> No-SQL

It's interesting that the technology is named after a missing feature.


As someone who had to do aggregation on hstore... Gosh! Why? It's not that difficult to add a column!


So I take it you have zero experience with NoSQL databases then.

HiveQL is far more powerful than SQL, and MongoDB's query language is pretty powerful for querying document data structures. The others do 95% of what most people are doing in SQL. And the overwhelming majority of NoSQL databases support data types.

And there is no issue with the data integrity of NoSQL databases. Do you really think hundreds of top companies (e.g. Apple, Google, Twitter, Netflix) would use them if there were?


With the added value of constraints and data integrity.

I have yet to do any project where a NoSQL database makes sense.


Looks a lot like what Postgres originally did with its JSON support pre-9.4, before it got significantly beefed up.


But can it parse and output a proper CSV yet?


What's the best use case for this? I've seen lots of requests but I'm struggling to find a really good scenario beyond just avoiding the work of normalizing into regular DB tables and all the schema management that comes with it (which isn't necessarily a bad thing).

Any top-level fields can just be made columns, and we already have 1:M and M:M mappings done well in relational tables. Is it purely being able to serialize into one field and then query on that? Why not just use a document database then?


Leaving out JSON indexes seems like a tremendous oversight.


If you really need indexes, extract the fields from your JSON and keep them as additional columns in the table. Key-value stores have always had problematic indexing on anything that wasn't the key. RDBMSs do indexing really well; lean on that side for what it is good for.


> extract the fields from your json and keep them as additional columns in the table

You can do that only for the simplest key-value JSON, not the more complicated documents you see in real life.

E.g. an array

ID, book_title, book_intro, book_tags

Here book_tags is a JSON array. Now try to index that!

MongoDB can index it, full-text search can index it, pg 9.4+ can, but no other RDBMS can.

Supporting nested data structures in an RDBMS is hard. You have to implement flatten/unflatten voodoo in a weird and lame non-SQL DSL.
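
For reference, the pg 9.4+ way is roughly this, assuming a jsonb column named data (all names invented):

    -- GIN index on the tags array; @> containment queries can then use it
    CREATE INDEX books_tags_idx ON books USING GIN ((data -> 'tags'));

    SELECT id FROM books WHERE data -> 'tags' @> '["scifi"]'::jsonb;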


> Here book_tags is a JSON array. Now try to index that!

You just need to create a new table that links tags to books. That's how relational databases are normalized ...
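
A sketch of what that normalization looks like (names invented):

    -- pull the tags array out into a link table
    CREATE TABLE book_tags (
        book_id INT  NOT NULL REFERENCES books (id),
        tag     TEXT NOT NULL,
        PRIMARY KEY (book_id, tag)
    );

    -- a plain b-tree index now covers lookups by tag
    CREATE INDEX book_tags_tag_idx ON book_tags (tag);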


Expanding a billion-row table into one with hundreds of billions of rows is not really a good idea.

MongoDB handles this really well: it just creates two index entries pointing to the same row.


w00t? A billion attributes on hundreds of billions of rows? Could you come up with more of an edge case? You're just coming up with shit so you can show off how l33t you are. If you have that kind of need, just go for a NoSQL server; nobody is stopping you. This feature is a nicety to have, not a replacement for NoSQL.


You could also do it in pg pre-9.4: just create functional indexes on the paths you wanted to index. The advantage of 9.4 is special indexes for jsonb, which can index the whole field or just some paths. So even pg 9.2 was better than SQL Server's approach...
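
E.g. the 9.4 whole-field option, sketched (table and column invented):

    -- one GIN index over the entire jsonb document
    CREATE INDEX books_data_idx ON books USING GIN (data);

    -- containment queries on any path can now use it
    SELECT id FROM books WHERE data @> '{"author": "Frank Herbert"}';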


Postgres perfectly supports indexing attributes within JSON fields, whether stored as the native JSON type or using the hstore extension. I just created a couple of indexes this afternoon on our DML audit-logging archive table, which stores old and new row state in hstore fields.
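
For the hstore case that's just an expression index, something like (names invented):

    -- -> on hstore returns text, so a plain b-tree expression index works
    CREATE INDEX audit_new_row_status_idx ON audit_archive ((new_row -> 'status'));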


"if you really need" = do loads of work you can't afford to do

"it just works!" = means you will actually use it

Your post is one of those that is both technically correct and practically useless.


What's the use of this? Something like JSON.NET makes it extremely easy to map between JSON and objects. That way you only store the useful data and not all the JSON markup.


At first sight, this looks like a competitor to JSON-based data stores such as MongoDB. However, it looks like JSON is just a wire format here, and you still need to write T-SQL.


Well, Entity Framework 7 is meant to have support for non-relational data sources [1], so ... at some point you probably won't have to write Transact-SQL to deal with JSON data in your cookie-cutter OLAP applications written with .NET.

[1] https://msdn.microsoft.com/en-us/magazine/dn890367.aspx


Knowing SQL is a way to beat most ORM mappers hands down.



