JSON-LD and Why I Hate the Semantic Web (sporny.org)
106 points by DamonHD on June 3, 2017 | 46 comments



Would you be so kind as to add (2014) to the end of the title? Since the post starts off by saying JSON-LD was made official "last week", it could mislead readers who don't notice the date in a small font between the title and the body.


I'm glad you pointed this out; without it, I'd have spent the entire article thinking "how long did it take you to agree on this!?!"


Sorry: didn't see this in time to edit the title.


Good point. It’s from 2014--an oldie but a goodie.


This is a great essay.

I love so many of the quotes:

[the RDF Working groups] continue[s] the narrative that the Semantic Web community creates esoteric solutions to non-problems.

When you chair standards groups that kick out “Semantic Web” standards, but even your company can’t stomach the technologies involved, something is wrong. That’s why my personal approach with JSON-LD just happened to be burning most of the Semantic Web technology stack (TURTLE/SPARQL/Quad Stores) to the ground and starting over.

As someone who has actually built a non-trivial RDF/SPARQL/Quad Store based app (and would love to avoid having to do that again..) I've avoided JSON-LD because of its association with the semantic web. Maybe I should reconsider.

I'll add another quote for good measure:

"The semantic web is the future of the internet and always will be." - Peter Norvig.


As someone who has actually built a non-trivial RDF/SPARQL/Quad Store based app (and would love to avoid having to do that again..)

Having worked on WinFS, I'm a huge believer in linked data... Can you share more about the app and your experience with semantic web tech?


Uh, oh: "It was initially developed by the JSON for Linking Data Community Group before being transferred to the RDF Working Group for review, improvement, and standardization."

https://en.wikipedia.org/wiki/JSON-LD


Right, that's what TFA is about.


"let’s create a graph data model that looks and feels like JSON". This was à stupid idea.

N3 is, and has always been, the only readable (and human-editable) serialization format for graphs.

N3.parse() can recreate an in-memory graph from it, and provides an API to find a node by id, or all nodes of a given type. (So your entry points into the graph are your decision, not materialized once and for all by fixed root nodes, as in JSON.)

From there, you traverse the graph just as you would after a JSON.parse(), i.e. people.class.teacher.name. And that's it.

(Think of N3 as a set of JSON documents that can hold references to each other, and N3.parse() as its API.)
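
For what it's worth, here is roughly what that looks like with the real N3.js library (npm package "n3"); note that the bare N3.parse() above is an idealized API, and the IRIs below are made up:

    // Sketch using N3.js (npm "n3"); the bare N3.parse() described
    // above is an idealized API, and these IRIs are made up.
    import { Parser, Store, DataFactory } from 'n3';
    const { namedNode } = DataFactory;

    const doc = `
      @prefix : <http://example.org/> .
      :teacher1 a :Teacher ; :lastName "Doe" .
      :class1   :teacher :teacher1 .
    `;

    // Parse into an in-memory set of triples.
    const store = new Store(new Parser().parse(doc));

    // Pick any entry point and walk outward, roughly the
    // people.class.teacher.name traversal described above.
    const ex = (s: string) => namedNode('http://example.org/' + s);
    const [teacher] = store.getObjects(ex('class1'), ex('teacher'), null);
    const [name] = store.getObjects(teacher, ex('lastName'), null);
    console.log(name.value); // "Doe"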

More philosophically, JSON-LD made the same mistake as RDF/XML. It chose the trending data exchange, and bastardized N3 to fit into it.

But N3 is indeed a superset of both JSON and XML (or whatever tree-oriented data exchange structure).

I will also add that N3.stringify() is quite easy to implement.

So imho, we should promote N3 and its API.


Just gonna mention that N3 stands for Notation3 [0]. I'd never heard of it before, and just googling for N3 doesn't lead to any direct results.

My immediate reaction from reading about N3 is that it doesn't seem very easily approachable, especially when compared to JSON. If I type "json" on Google, the first result [1] is a fairly human-friendly spec with a huge list of implementations at the bottom. If I then type "json example" on Google, the first result takes me to a page with lots of examples [2] of JSON side-by-side with XML. In comparison, looking up Notation3 takes me to a w3 spec page for which my first reaction is to say "tl;dr", and looking up examples takes me to two pages [3] [4] which fail to immediately clarify anything.

[0] https://www.w3.org/TeamSubmission/n3/

[1] http://www.json.org

[2] http://www.json.org/example.html

[3] https://www.w3.org/2000/10/swap/Examples.html

[4] https://www.w3.org/2000/10/swap/Primer.html


N3 in a nutshell is:

-----------------------------------

1) an object has an id and some types. Both are prepended with a ':'

+

an object definition ends with a '.'

For example:

:teacher1 a :Teacher.

2) an object has property values. Each property name is prepended with a ':'. A property value of type string is surrounded with '"""'. A property value of type integer is as-is. Several values for the same property are separated by ','. Properties are separated by ";"

For example:

:teacher1 :firstNames """John""","""Michael""";

          :lastName """Doe""".

3) Object properties are properties whose value is another object. Such an object property's value is simply the id of the referenced object.

For example:

:teacher1 :studiedAt :Yale, :Stanford, :Miskatonic.

4) Your object definitions may be split into multiple parts.

For example:

:teacher1 a :Teacher.

:teacher1 :firstNames """John""","""Michael""";

          :lastName """Doe""".

:teacher1 :studiedAt :Yale, :Stanford, :Miskatonic.

:Miskatonic a :University.

:Miskatonic :locatedIn :Arkham;

            :name """Miskatonic University""".

:Arkham a :City.

:Arkham :famousTeacher :teacher1.

-----------------------------------

Nothing especially difficult, in my humble opinion.

The great benefit of N3 (and the Semantic Web in general) is that object definitions can come from separate datasets and be merged into a single graph, because identifiers are globally unique IRIs across the Semantic Web.


The triple double quotes are horrible.


I infer from that statement that the other characteristics of N3 did not trouble you that much.


Oh god, my eyes are completely refusing to count these


This is absolutely optional. You can use a plain " instead; you then need to escape any inner " as you would in JSON.
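
For example, these two spellings denote the same literal:

    :teacher1 :bio """He said "hello".""".

    :teacher1 :bio "He said \"hello\".".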


Note: This is a completely optional feature of N3:

N3 can also express rules to apply upon data. For example, add this to the above definition:

{ ?x :studiedAt ?y } => { ?y a :EducationalInstitution }.

and that infers the following object definitions:

:Yale a :EducationalInstitution.

:Stanford a :EducationalInstitution.

:Miskatonic a :EducationalInstitution.


> This was à stupid idea.

Sorry, but I can't figure this out. Was "à" a typo, and if so, what kind of keyboard are you using that would make that easy to write by accident?


Not op, but I guess thé answer is "french soft keyboard on mobile with autocorrect on"


Bingo!


Does anyone on HN use JSON-LD? I hadn't even heard of it before. It looks like it's useful in SEO (https://developers.google.com/search/docs/guides/intro-struc...) but that's about it?


I spent some time attempting to work with the W3C Web Annotation Data Model. That data model is serialized as JSON-LD.

After spending about 50 hours reading the documents and attempting to implement some of it, I have a general idea what JSON-LD is.

I wasn't really trying to achieve anything, so I basically quit once something seemed opaque enough I couldn't figure it out in a short period of time. When I visited the JSON-LD Test Suite page to see what implementations are expected to do [0], I found:

> Tests are defined into compact, expand, flatten, frame, normalize, and rdf sections

I had a hard time figuring out what each of these verbs meant, and they were about all that the various implementations I found did. For example, the term normalize doesn't even appear in the JSON-LD 1.0 specification [1]. shrug I'm sure I could have figured out more if I spent the time to actually read the whole thing and all the related documents.
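
For what it's worth, here is a sketch of two of those verbs using the jsonld.js library (npm package "jsonld"); the document and context here are made up:

    // Sketch using jsonld.js (npm "jsonld"); document and context
    // are made-up examples.
    import * as jsonld from 'jsonld';

    const doc = {
      '@context': { name: 'http://schema.org/name' },
      name: 'Jane Doe',
    };

    // expand: strip the context, spelling every key as a full IRI
    const expanded = await jsonld.expand(doc);
    // [ { "http://schema.org/name": [ { "@value": "Jane Doe" } ] } ]

    // compact: reapply a context, shortening IRIs back to short terms
    const compacted = await jsonld.compact(expanded, doc['@context']);
    // { "@context": { ... }, "name": "Jane Doe" }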

[0]: https://json-ld.org/test-suite/

[1]: https://www.w3.org/TR/json-ld


JSON-LD is RDF. Or rather, it is what RDF would look like if it were serialized as JSON instead of XML. Seen from the Semantic Web angle, JSON-LD is just a serialization format, like e.g. Turtle, but using JSON because JSON is popular nowadays.
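
To make that concrete, here is the same single triple (with a made-up subject IRI) in Turtle and then in JSON-LD:

    <http://example.org/book/1> <http://purl.org/dc/terms/title> "Moby-Dick" .

    {
      "@id": "http://example.org/book/1",
      "http://purl.org/dc/terms/title": "Moby-Dick"
    }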

Sometimes I wonder why this is not said directly; probably because the Semantic Web and RDF are passé now.

Actually the post's author addresses this point:

> I made it a point to not mention RDF at all in the JSON-LD 1.0 specification because you didn’t need to go off and read about it to understand what was going on in JSON-LD.

...

> Tests are defined into compact, expand, flatten, frame, normalize, and rdf sections

These are just sub-formats of JSON-LD: the information represented is the same, but the JSON looks a little different. Some sub-formats are easier for tools to process, some are better for humans.


> Sometimes I wonder why this is not said directly; probably because the Semantic Web and RDF are passé now.

It seems to me that on one hand JSON-LD wanted to bootstrap the network effects by bringing along people who were doing RDF both technically (the JSON-LD spec says "JSON-LD is a concrete RDF syntax as described in [RDF11-CONCEPTS].") and socially (published by the RDF WG), but on the other hand the negative brand equity of RDF is recognized as an obstacle for bringing along even more people, hence the OP professing "Hate the Semantic Web" and "Kick RDF in the Nuts".

It's kinda weird how mentioning the RDF connection of JSON-LD in the sense of past experiences with RDF having any bearing on JSON-LD is treated as a social no-no. Despite the above-quoted bit from the spec saying "JSON-LD is a concrete RDF syntax as described in [RDF11-CONCEPTS].", we are supposed to play along with JSON-LD totally having nothing to do with RDF.


> Sometimes I wonder why this is not said directly

There are multiple flavours of RDF and I think JSON-LD only supports a subset of one of them. It's been a while since I read the spec, but I believe there are various fudges with lists, reification and datatype coercion.

Any time I get close to an RDF stack I find it's a broken mess. Its complexity seems to almost guarantee incompatibility instead of interoperability.


I use JSON-LD for the ConceptNet API at http://conceptnet.io .

I'm not sure if anyone cares that it's JSON-LD as opposed to any other decent JSON API, to be honest.

But here's an oblique benefit. I used to be asked why, as a knowledge graph, ConceptNet wasn't in RDF. The undiplomatic answer was "because the RDF technology stack is a pain in the ass and I hate it". But now JSON-LD is a form of RDF that I don't hate.


I posted the story because, egged on by Google's 'Rich Data Cards' item in their search console, I finally snapped and thought that I'd try it on one of my sites.

I am not encouraged by lots of handwaving in various docs, apparent schoolboy errors in (e.g.) Google's examples, and apparent incompleteness in the spec for such things as temporalCoverage (can an open-ended and still-continuing data set be indicated with "temporalCoverage":"20140721T10:13Z/" ?).


Turns out that the open-ended temporalCoverage thang is an open issue (and I may have correctly guessed the 'right' answer):

https://github.com/schemaorg/schemaorg/issues/1365

In a way, what I learned with my 30-year-old AI degree was that meaningful semantic networks are hard ("is a", anyone?), but I'd have thought that some of these basics would have been nailed down in a practical schema by now!


A year ago I implemented JSON-LD using Google's documentation and validator on a brand-new website. Months later, it had had no impact on the results; it hadn't even picked up the city. Only after I added the city to the footer did the results start changing (like "business + city").


A practical benefit for using JSON-LD in API development is being able to link every field in the response directly to a definition, ideally a shared one such as schema.org. This overlaps a bit with definition formats like OpenAPI (Swagger), RAML, et al, but to completely replace those, you'd need a media type that documents the API semantics.

It's possible to use JSON-LD in an opaque fashion by remapping all of its reserved keys to whatever your JSON looks like, e.g. from "@id" to "href", which is more familiar. It doesn't have to look like an ugly mess of "@" prefixes, for example: http://micro-api.org/#finding-resources
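
For instance, a context along these lines (a hypothetical mapping, not necessarily Micro API's actual one) aliases the reserved keys away:

    {
      "@context": {
        "href": "@id",
        "type": "@type"
      },
      "href": "http://example.com/people/1",
      "type": "http://example.com/Person"
    }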

(disclosure: I am the author of Micro API)


I build tech with json-ld.

This uses json-ld: http://beta.einnsyn.no

And this uses json-ld: https://difi.github.io/dcat-ap-no-validator/

I like JSON-LD because we have a graph database in the back, and JSON-LD can support graph data.

In the frontend we transform JSON-LD into the JavaScript object model. JSON-LD is great for this, and much better than plain JSON, since it supports datatypes, languages and many-to-many relationships (e.g. a book has many authors, and authors can write many books).
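
A small made-up example of what that buys you over plain JSON: language-tagged titles, a typed literal, and a multi-valued author link, all in one document:

    {
      "@id": "http://example.org/book/1",
      "http://example.org/title": [
        { "@value": "Sult", "@language": "no" },
        { "@value": "Hunger", "@language": "en" }
      ],
      "http://example.org/published": {
        "@value": "1890",
        "@type": "http://www.w3.org/2001/XMLSchema#gYear"
      },
      "http://example.org/author": [
        { "@id": "http://example.org/person/hamsun" }
      ]
    }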


Would be interested in learning more about how you use JSON-LD with a graph database.


That's really the easiest part.

We use Stardog, but there are a bunch of other databases you can use including Oracle Spatial and Graph, Fuseki, GraphDB, Virtuoso, AllegroGraph, MarkLogic and Blazegraph.

Many of these databases will answer queries in JSON-LD, but since you are likely to want a backend anyway, you can use Apache Jena or Eclipse RDF4J (previously Sesame) with Java to connect and extract data. Both Jena and RDF4J will let you output that data as JSON-LD.
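
If you want to skip the Java layer, a rough sketch is to ask the SPARQL endpoint itself for JSON-LD via content negotiation (the endpoint URL here is made up, and support for application/ld+json on CONSTRUCT results depends on the store):

    const query = 'CONSTRUCT { ?s ?p ?o } WHERE { ?s ?p ?o } LIMIT 100';

    const res = await fetch('http://localhost:5820/mydb/query', {
      method: 'POST',
      headers: {
        'Content-Type': 'application/sparql-query',
        'Accept': 'application/ld+json',
      },
      body: query,
    });
    const doc = await res.json(); // JSON-LD, ready for the frontend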


It's the format used for reservations and other rich data in email. I dealt with it quite a bit when implementing the email parser that would look for your reservations, and found the format quite easy to approach.


Yes! In a web application where the underlying RESTful service returns JSON-LD, because the underlying store is an RDF database.


We added JSON-LD to our site at work so that our results in Google would be richer. It's annoying always having to play catch-up with whatever new toy Google decides to implement on their search engine, which is for all practical purposes the heart of the entire web.

JSON-LD is extremely mediocre in my experience. It's more human-readable and easier to edit than XML, but it suffers from some of the same issues, where specifications quickly turn into a mess that is anything but specific. Do a Google search for any kind of product that returns enhanced results and look at how each of those sites implements its JSON-LD data: even if they are using the same spec, each will have come to different conclusions about what it means.
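
For reference, the Google flavour is a schema.org blob in a script tag, along these lines (values made up):

    <script type="application/ld+json">
    {
      "@context": "https://schema.org",
      "@type": "Product",
      "name": "Example Widget",
      "offers": {
        "@type": "Offer",
        "price": "19.99",
        "priceCurrency": "USD"
      }
    }
    </script>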


> Reading sections of the specification that have undergone feedback from more nitpicky readers still make me cringe because ease of understanding has been sacrificed at the altar of pedantic technical accuracy. However, I don’t feel embarrassed to point web developers to a section of the specification when they ask for an introduction to a particular feature of JSON-LD. There are not many specifications where you can do that.

Seeing as how there are many opportunities to provide documentation for beginners and only one place to provide authoritative answers to pedantic technical questions, I feel that the author has misplaced priorities.


Indeed: approachable and readable are good, but not at the cost of delivering resolution for knotty ambiguities, as Algol found in its day...


I hope GraphQL will provide some standard for defining schemas with the schema.org vocabulary...

GraphQL could be designed to incorporate most of the JSON-LD specs (and Hydra too)...

Then everyone can benefit, because it's machine-friendly annotated data everywhere :)


A similar thought crossed my mind too. I've never used JSON-LD. I'm not quite sure where I would use it. One of the ideas was what you said.

The other idea I had is that it may help with transforming payloads from their serialized format into a static type system like TypeScript. It could be nice to have a system that can see an incoming response, validate the type using JSON-LD, and construct an instance of that object with it. It can be done without JSON-LD, but it may be useful.
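
A rough sketch of that idea, treating the JSON-LD "@type" as a runtime tag and narrowing with a TypeScript type guard (the Person shape here is made up):

    interface Person {
      '@type': 'Person';
      name: string;
    }

    // Narrow an unknown payload to Person based on its JSON-LD type tag.
    function isPerson(doc: unknown): doc is Person {
      const d = doc as Record<string, unknown> | null;
      return typeof doc === 'object' && d !== null &&
        d['@type'] === 'Person' && typeof d.name === 'string';
    }

    const payload: unknown = JSON.parse('{"@type": "Person", "name": "Ada"}');
    if (isPerson(payload)) {
      console.log(payload.name); // statically typed as string here
    }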


...or FB could create 'sdk-code-generation' like they did with Thrift, along with TypeScript generation.

Also, your idea is possible with TypeScript 2.3's language-server specs :)

But, this is quite separate from 'generic-machine understandable' data - which is what json-ld is all about.


A message from the (now defunct) XML community: "We've gotten to a point where a human-readable, human-editable text format for structured data has become a complex nightmare where somebody can safely say "As many threads on xml-dev have shown, text-based processing of XML is hazardous at best" and be perfectly valid in saying it. -- Tom Bradford"


I never used XML. Can anyone who used it when it was in vogue give a brief overview of "what happened"? I'd love to read it, if you have the time (or point me toward a good summary).


Very basically, XML is a document format, inherited from SGML. The point of the statement is that all the "clever tricks" XML permits (entities, encoding, well-formedness, validation, ...) make it unsafe to manage by hand (i.e. by appending strings together to build an XML document, or by tokenizing an XML input string). From there, you need tools and APIs to manage your XML representation, and mapping technologies to recreate objects in memory from XML content. All these (over-engineered) tools make things overly complex when you want to exchange documents, and a true nightmare when you exchange data structures.

The alternative of JSON and JSON.parse() is extremely lightweight in comparison. JSON is basically JavaScript already, and you can create JSON from any in-memory data structure, with any programming language.
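
Concretely, the whole JSON "stack" is two built-ins:

    // Serialize any plain data structure and get a structurally
    // identical copy back; no schema machinery required.
    const obj = { name: 'Ada', langs: ['en', 'fr'] };
    const wire = JSON.stringify(obj);
    const back = JSON.parse(wire); // deep-equal to obj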

Today, XML should be used only to exchange structured documents. (Which was its primary goal, btw ;)


Linking to a video for a basic introduction to the concept will not persuade anyone.

No triples for the semantic web is a joke, right? Facts are stored in context, not as pure facts. If this, then this is that. Subject - predicate - object. "William eats apples", not "William apples".

Why should I give up N3 or its bloated XML-ified web transports?


Every time I feel compelled to serialize data with relationship information, all I have to do is look at the extra file size -- with all the namespaces and trivial definitions -- and I immediately trash the idea.

I do like separate data files and schema files, though I haven't seen them pop up in a while.


I'm experimenting with "Noise", the new database from Damien Katz (of CouchDB). I can feed it JSON-LD documents and query them. Very useful for microservices, because each service only needs to know one part of the overall schema/data model.



