One of the best tools I've found for manipulating JSON via JsonPath syntax is https://jsonata.org
In addition to simple queries that allow you to select one (or multiple) matching nodes, it also provides some helper functions, such as arithmetic, comparisons, sorting, grouping, datetime manipulation, and aggregation (e.g. sum, max, min).
Thanks for this! The 5-min intro really undersells it, as that part is not that much better than JSONPath. I started watching the London Node User Group video next and was about to switch it off, but decided to hang on a bit longer because I was busy with something anyway. Then it finally started getting into its real differentiators: constructing new objects and reductions.
I even had to click around the documentation link three times until I figured out how to get to the real docs. It's a really well thought out query language that is actually Turing complete. I really want to try it out on some real data in anger to see how well it holds up in practice. I'm also thinking about what could be taken over into PRQL which I'm somewhat involved in.
Not sure if this is a common problem, but I built a tool to help me quickly understand the main schema, and where most of the data is, for a new JSON file given to me. It makes the assumption that sometimes peer elements in a list will have the same structure (eg they'll be objects with similar sets of keys). If that's true, it learns the structure of the file, prints out the heaviest _aggregated_ path (meaning it thinks in terms of a directory-like structure), as well as giving you various size-per-path hints to help introduce yourself to the JSON file:
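Not the actual implementation, just a sketch of the aggregation idea in Python (the file name is a placeholder): serialize each subtree, attribute its size to an "aggregated" path where list indices are collapsed into a single "[]" segment, and print the heaviest paths:

    import json
    from collections import defaultdict

    def path_sizes(node, path="$", sizes=None):
        # attribute serialized size to an aggregated path,
        # collapsing list indices into a single [] segment
        if sizes is None:
            sizes = defaultdict(int)
        sizes[path] += len(json.dumps(node))
        if isinstance(node, dict):
            for key, value in node.items():
                path_sizes(value, f"{path}.{key}", sizes)
        elif isinstance(node, list):
            for item in node:
                path_sizes(item, f"{path}[]", sizes)
        return sizes

    with open("data.json") as f:
        doc = json.load(f)

    for p, size in sorted(path_sizes(doc).items(), key=lambda kv: -kv[1])[:10]:
        print(size, p)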
Is there a general name for the kind of data structure JSON represents?
We see this kind of nested data all over the place (json, yaml, python dictionaries, toml, etc, etc) and I’m thinking wouldn’t it be nice if we had a path language that worked across these structures, just like how we can regex any strings?
So we can have a pathql executable that we can feed yaml and json data to, but I can reuse the query in Python when I want to extract values from a json stream I just deserialized.
Is it anything else than a tree with properties for each node?
I think you could well apply jsonpath to yaml, except for the different data types, which is what makes you need xpath, jsonpath, file paths, css selectors and so on. If you're willing to do some automagic conversions, you could probably do that right now, if you write the code for it.
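For example, with a bit of Python glue (assuming the PyYAML and jsonpath-ng packages; the file name and query are placeholders), the "automagic conversion" is just loading YAML into plain dicts/lists and querying from there:

    import yaml                    # PyYAML
    from jsonpath_ng import parse  # jsonpath-ng

    doc = yaml.safe_load(open("config.yaml"))  # YAML becomes plain dicts/lists
    expr = parse("services[*].image")          # ordinary JSONPath from here on
    print([m.value for m in expr.find(doc)])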
They vary in syntax and how they deal with scalars, objects and collections. Having to write quoted keys ("myKey") in json, for example, is a bit unfortunate. Still, for any document larger than 20 lines yaml will fall apart easily.
Xml (dialects) mark the beginning and end of nodes explicitly, which deviates from json/yaml. Xml can represent nodes within the value, which is impossible for json/yaml. To convert such xml value to the latter, you have to break up such a node in fragments and represent them as a collection in json/yaml.
> Xml can represent nodes within the value, which is impossible for json/yaml. To convert such xml value to the latter, you have to break up such a node in fragments and represent them as a collection in json/yaml.
Are you talking about XML like `<text>Something something <para> inside </para> something else</text>`? I thought this would also be presented as the text element having three children: the text "Something something", the <para> tag with its subtree, and the text "something else". Am I misremembering?
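A quick check with Python's minidom suggests the three-children model is right, for what it's worth:

    from xml.dom.minidom import parseString

    doc = parseString("<text>Something something <para> inside </para> something else</text>")
    for node in doc.documentElement.childNodes:
        print(node.nodeName)
    # #text
    # para
    # #text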
Agreed, a lot of the features that make XML very nice for embedding tags in text documents make it very inefficient at expressing explicit tree data structures.
That is half the reason. JSON was limited to Javascript when SOAP/WSDL was king for data exchange. If I remember correctly, XML APIs predate JSON in Javascript. IE5 and XML were really hot back then. xml.loadString() is equally simple.
Json got introduced because XMLHttpRequest was an IE-only component. Data exchange was done mostly between servers; back then the front-end was really dumb. XML got slowly replaced because, for machine-generated data, JSON proved to be sufficient. Any JSON library is easier to work with than its XML counterpart, exactly because of what we discuss here.
The complicated part is that this tree has different types of nodes. Some support properties and children, some don't. So I would call it a chimera tree or mixed tree, or something like that.
For a query language, you can generalize to all nodes supporting all things. You will be able to write some queries that will never return results, at least in a given input data format. That does not seem so bad.
Coincidentally, this is exactly what XPath and XQuery do. E.g. text nodes don't have children nor attributes, yet it is perfectly legal to do text()/foo or text()/@foo - and the result is an empty sequence.
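A quick way to see that behaviour (here via lxml, but any XPath 1.0 engine should behave the same way):

    from lxml import etree

    root = etree.fromstring("<root>hello<child/></root>")
    print(root.xpath("text()/foo"))   # [] - text nodes have no children
    print(root.xpath("text()/@foo"))  # [] - ...and no attributes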
I think one popular name, independent of the format, is to call them documents. Or more specifically, the type of database which use them often runs under that term. Maybe specify it as data-document to not confuse it with freeform office-documents.
But one problem is that each format works in slightly different ways. Some have nodes which support properties, some don't. This makes building a proper query-language a bit more complicated if it should not end up ugly.
What's the problem with nodes supporting or not supporting some properties? For querying you can treat all nodes as supporting all things, and the user just needs to write a query that actually works, which is what they would need to do anyway.
Yes, if you want to make the query language reject as incorrect queries that specify a property on a node that can't have properties, it is messy. But is this really necessary? You can write faulty queries anyway. And it's not a programming language for complex systems, where it really helps to prevent mistakes early. SQL works fine with no type safety and so on.
The problem is that you can't easily reuse a query across different formats with a shallowly designed language. You will still end up writing format-targeted queries, which then opens the question of why you would even bother using a watered-down language in the first place, instead of a format-optimized one.
XML is for data-exchange, competing with JSON and others. So I have no problem putting it to the data-documents, even though it's more a frankenstein. But HTML is an office-document, used for freetext, nobody really should use it for data, even though sometimes it's used that way.
So are JSON and YAML. The point is whether you have programmatic and structured handling of a document's content, or whether it's random, where every line and word can demand a different parser.
As to cross format path querying, I see limited value in such an endeavour; the reason that one deals with such formats is generally because it is used as an interface to configure some tool, but moving across different tools, besides the path queries, even the configuration schemas differ, and having a cross format converter would do you no good. The fact that many such configuration formats support subtly (or drastically) divergent types of objects does this idea no good either.
Speaking of this topic, I would be very interested in a CLI tool that translated between different dialects of regex. As opposed to general data or configuration, I find the case for cross format conversion very compelling for regexes; the objects that different regex languages deal with are functionally completely identical. I would be very happy to be able to craft a regex to search for something in vim, then convert the regex to grep or use another tool on the shell, then perhaps adapt it to some other scripting languages e.g. Javascript/Ruby/Perl.
I feel like `fq` has a query path language that's kind of generic across lots of file types. It can be fairly verbose for that reason. I was using it to debug MsgPack documents and it was a lot less intuitive than just using some dotted string paths with `jq`.
Hey, fq author here. Happy to hear it's useful! could you elaborate a bit more on how it was less intuitive? fq's query language is jq with some small additions, so i wonder if you might mean the decoded structure is more detailed/verbose as it includes all the "low level" details? maybe you're looking for the "torepr" function that converts the detailed structure into the "represented" value?
Yes that's exactly what I meant, the MsgPack documents had quite a detailed structure.
torepr didn't quite work for me as I was dealing with objects containing large binary blobs and it was awkward.
fq is a great tool and I shouldn't have suggested this was a problem unique to it! I think this kind of "issue" is inevitable when dealing with so many types of input. And to be honest I struggle hard using jq as well for anything other than very basic paths, due to infrequent usage.
I see, thanks for replying and no worries! yeap, some of the "self-describing" formats like msgpack, cbor etc will, because of how fq works, have to be decoded into something more like a meta-msgpack etc.
Although there are similar data structures, not all of them are supersets or subsets of the data structures of JSON. For example, some data types that JSON lacks are:
- Integers (including 64-bit integers and longer)
- Non-finite floating point (Infinity, NaN)
- Keys of types other than strings
- Non-Unicode strings (e.g. byte sequences, TRON code, etc)
- Date/time
- Links
Additionally, they may differ in whether or not the order of keys should be retained.
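The non-finite floats bite surprisingly often in practice; for instance, Python's standard json module will happily emit tokens that strict parsers then reject:

    import json

    print(json.dumps(float("nan")))   # NaN      (not valid JSON)
    print(json.dumps(float("inf")))   # Infinity (not valid JSON either)
    json.dumps(float("inf"), allow_nan=False)  # raises ValueError in strict mode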
grove - tree decorated w/ key-value pairs. eg JSON, XML w/o text nodes
object graph - grove plus object references. eg serialization formats
knowledge graph - object graph where parent-child relations are explicit, using subject-verb-object clauses. like how Java Spring "flattens" object graphs.
The approach is to take the JavaScript object, convert it to an XML DOM, run the query (either using standard XPath, or standard CSS selectors) and then either convert the DOM back into objects, or (another way I've seen it done) keep a register of the original objects and retrieve them from there.
In this way, JSON, and any non-circular JavaScript object, can be sifted and searched and filtered in reliable ways using already-standardized methods, just by using those technologies together in a fun new way.
There is not necessarily a need for inventing a new custom syntax/DSL for querying unless you don't want to make use of CSS and XPath, or have very specific needs.
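A rough sketch of the round trip, in Python rather than the browser DOM and with made-up helper names, just to show the shape of the idea (plain dicts/lists become an element tree, then a standard query engine takes over):

    import xml.etree.ElementTree as ET

    def to_element(name, value):
        # turn nested dicts/lists into elements; list entries become <item>
        el = ET.Element(name)
        if isinstance(value, dict):
            for k, v in value.items():
                el.append(to_element(k, v))
        elif isinstance(value, list):
            for entry in value:
                el.append(to_element("item", entry))
        else:
            el.text = str(value)
        return el

    data = {"users": [{"name": "ada"}, {"name": "grace"}]}
    root = to_element("root", data)
    print([e.text for e in root.findall(".//users/item/name")])  # ['ada', 'grace']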
I've used postgresql's jsonpath support to create user defined filtering rules on db rows. It made things a lot easier than whatever other methods I could come up with.
Is the necessity of tools like JSON Path really just an indication that APIs are increasingly returning too much junk and / or way more data than the client actually requested and / or needs?
In dev mode, our internal APIs return pretty printed JSON so one can inspect via view-source, more, or text editor.
Not necessarily. JSON is used in a lot of places, also for large documents in data lakes and archives. It's useful to be able to query them with tools.
Only as inevitable as the dearth of interpolation/parametrized query primitives… though whether the industry has actually learnt the bitter lessons of SQL injection remains to be seen. I don’t hold my hopes up too much.
The standard was developed because it's being used by so many people in so many ways with so many different implementations that a standard was required to align them all, so I'm not sure the "nobody is really using it" argument holds much weight.
JSONPath is good when it comes to querying large JSON documents. But in my opinion, more than this is the need to simplify reading and writing from JSON documents. We use POJOs / model classes which can become a chore for large JSON documents. While it is possible to read paths, I had not seen any tool using which we could read and write JSON paths in a document without using POJOs. And so I wrote unify-jdocs - read and write any JSON path with a single line of code without ever using POJOs. And also use model documents to replace JSONSchema. You can find this library here -> https://github.com/americanexpress/unify-jdocs.
AAH-HA... That's why this felt familiar to me. I haven't used K8 in over a year now, but I used this all the time at my previous job. Didn't know it was "JSON Path" just knew it as something I used in kubectl often.
I wish there weren’t so many JSON path syntaxes. I’m comfortable with jq, then there’s JSON path, I forget which one AWS CLI is using, MySQL has their own. It’s impossible for me to get muscle memory with any of them.
Has anyone had performance issues using JSONPath? We are processing large pieces of data per request in a node express service and we believe JSONPath is causing our service to lock up and slow down. We’ve seen the issue improve as we have started refactoring out JSONPath usage for vanilla iteration and conditional checks.
There are a lot of factors at play so we can’t quite put our thumb on JSONPath, but it’s the current suspect and curious if others have run into anything similar.
This is exactly the type of thing that LLMs are very good at generating and explaining for you. I've done this countless times to create and understand regex patterns.
We're using JSONPath to annotate parts of JSON fields in PostgreSQL that need to be extracted/replaced for localization. Whilst I'd naturally prefer we didn't store display copy in these structures, it was a fun thing to implement.
There's nothing wrong with that approach if you're working with jsonpaths on the regular. It's all about time management, I guess. With json, and xml, and probably yaml, there is this recurring long-term pattern of:
1. Create a tree-structured document format that is flexible enough to handle all use cases.
2. Write a ton of content in this format.
3. Have to figure out a query pattern to accurately retrieve good info out of these structures.
Generally, I feel we’ve become good at querying normalized table data. But (and maybe it’s just me being stupid) wending through tree-structured data is still tricky. And I recently discovered LLMs are great at solving it, if you ask clearly.
The thing about querying tree-structured data being currently humanly harder than tabular data rings true to me, I always struggle with some very simple tree-sitter queries.
These types of languages are a bad idea, just as XPath was. They are complex enough to be a maintenance/bug risk AND don't bring any additional benefit to just writing code in your normal programming language to do the same thing.
You can take my list comprehensions from my cold, dead hands.
There isn't a use case I've seen where these types of mini languages fit well. Ostensibly, you could give it to a user to write to query JSON in a domain-agnostic way in an app but I think it would just confuse most users as well as not being powerful enough for half of their use cases.
Yeah, I don't think javascript has that function in the standard library. Writing one is not super complicated, but having to put that into every file (or importing it) is not ideal.
You wouldn't use json path as a replacement for any language. It might be marginally useful in configurations or for passing queries between different services. But the complex syntax limits it in both cases, because you cannot easily automatically modify the query. In the case of configs it would be great to analyze hundreds of configs on different systems and change them automatically; same with queries exchanged between services, which might even get stored in a database.
I do not understand why domain languages aren't designed with limited syntax in mind, in the style of lisp for instance. Actually being able to programmatically work with the language is a massive advantage that imo far outweighs your own frustration with typing a parenthesis or two extra.
Yeah, it's not bad. Guido famously preferred the list comprehension style over the functional style, but when you have nested data types and the "monadic style" (ish) functions, it does really make sense. You could imagine it in Python (let's pretend lists have map and filter):
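(A toy sketch; the wrapper class and sample data below are made up just to show the two styles side by side.)

    # pretend lists had these methods, via a tiny wrapper
    class L(list):
        def map(self, f):    return L(map(f, self))
        def filter(self, f): return L(filter(f, self))

    orders = L([{"name": "tea", "price": 3}, {"name": "gin", "price": 24}])

    # chained / "monadic" style:
    cheap = orders.filter(lambda o: o["price"] < 10).map(lambda o: o["name"])

    # comprehension style:
    cheap_too = [o["name"] for o in orders if o["price"] < 10]

    print(list(cheap), cheap_too)  # ['tea'] ['tea']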
I’m a fan, but let’s go even further. JavaScript has pleasant definitions of functions with
filter((x) => x.price < 10)
but why can’t we just write
filter(x.price < 10)
and add a rule to the JS engine that says “when you encounter a ‘syntax error: undeclared identifier x’, rewrite the code to add `(x) => ` in front of where the syntax error occurred, if and only if this rewrite prevents the syntax error”.
You might protest that reacting to syntax errors by inserting extra code and checking if the errors go away is an insane strategy, but I would note that JavaScript is actually a semicolon-terminated language in which most developers never write a semicolon, and the JavaScript engine is already using this insane strategy on nearly every line of modern JS to insert a semicolon whenever it encounters a syntax error, so it’s obviously practical.
The problem with your approach is that `filter(x.price < 10)` is perfectly valid syntax, it's filter with a single boolean arg. You need something else to trigger the magic: change `x` to `it` and you have Kotlin and Groovy's shorthand syntax -- you just can't define a variable called `it` anymore. If you want closure semantics on arbitrary undefined variables, I think I might have to slap you on general principle ;)
I dislike that syntax, but it might be better to instead use a special syntax that can't be the name of any variable, like the asterisk in Raku as mentioned below. In such a case, perhaps the function that would call it should check if the value is already the correct type and use that instead of calling it as a function, since then it would be possible for the filter condition to be a constant true or false that does not depend on the values being filtered (this is not usually useful, but it is more consistent and sometimes it is useful).
I think that the automatic semicolon insertion is a bad feature of JavaScript.
Some languages do this already but with a designated placeholder, like
filter(_.price < 10)
That may not work because a plain underscore is already a valid identifier, but another placeholder could potentially be used, with no need for the parser backtracking / function insertion (which I don't like the idea of; there may be cases where an undeclared identifier was a bug and it shouldn't be turned into a function).
I will never understand why people find the need to cram as much "mystery meat" code into one line as humanly possible. It makes it much harder to understand, debug, and optimize.
Meanwhile JS is still struggling to define even version 1.0 of a pipe operator, and the proposal has been bikeshedded into shabby oblivion with a syntax that's worse than just pulling out lodash pipe() or similar. TC39 does not fill me with hope.
You know, I think you're right. I never realized that Python list comprehension is basically filter/map (and reduce is an external function). That is actually pretty horrible syntax for it from that perspective.
So do you regularly hard code the number 10 in your code?
I think you missed my point.
In realistic code you'd be using string interpolation to put the number 10 into this query language, and worrying about injection vulnerabilities while you did it.
Or even calling another arbitrary function to do the filtering. Which this query language can't handle at all.
> don't bring any additional benefit to just writing code in your normal programming language to do the same thing.
In some cases the advantage is that you don't create new code and you just use some relatively standard tool. You just fetch some public package that handles various edge cases and you just prepare a script that describes what you want to do with some program. This is useful if you work in a containerized environment and configuration exists as json or yaml. Often I just use jq or yq, instead of reinventing the wheel just to read or write some values.
One of the advantages of this sort of DSL is that they're easy to share between languages. One needs one implementation of JSON path per programming language, and then it's easy to, for example, iterate on the query in the REPL of a dynamic language, then copy it to a fast compiled language. Or share a query between the browser and the server. That sort of thing.
Regex has similar advantages, if one sticks to the subset of regex which is commonly understood between languages: so less so, for that very reason.
Another plus is the principle of least power. A JSON Path will halt, and it won't make syscalls. There are circumstances where that's useful.
I agree, it gives the same vibe as wanting to somehow bring back the simplicity of excel formulas rather than having to write normal code, to revive that dream of convenient one-liners.
But there is a reason excel has a ceiling of maintainability that always turns it into a spaghetti mess once it's big enough.
It's pretty much the exact same thing - it lets you specify a file to access with a string (pathname) such as "/foo/bar/cat" rather than having to go step by step: first open/read directory "foo", then open/read directory "bar", then finally access file "cat".
With XPath and JSONPath you're just dealing with DOM nodes and children rather than directories and children.
Except that XPath and JSONPath are overpowered. I'd welcome a simple standard path syntax for JSON, for instance one I'd want to use for reporting schema errors in a JSON document ($.users[10].name must be a string). I can use JSONPath, sure, but even this subset is annoying to parse - compared to a filepath, which you can parse with `path.split('/')`.
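For the simple subset that's enough for error reporting, a few lines of Python will do (a sketch that deliberately ignores everything JSONPath can do beyond plain keys and numeric indices):

    import re

    def parse_simple_path(path):
        # handles only $.key.key[3].key - no wildcards, filters or slices
        tokens = re.findall(r"\.([A-Za-z_]\w*)|\[(\d+)\]", path)
        return [int(idx) if idx else key for key, idx in tokens]

    print(parse_simple_path("$.users[10].name"))  # ['users', 10, 'name']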
The disadvantages in being tied to the execution model of said Turing Complete language are also significant when compared to one that can do optimizations like stream fusion without having to work such optimizations into the language as a whole. But there's no technical reason we can't have both.
One advantage is slightly less typing, which might be useful in CLI tools and UI filters. Also, people don't want to bring a full JS interpreter in those kinds of interfaces for various reasons.
If you read the article you'd know that this is about using OpenAPI Overlays and other use cases where JSONPath is literally a requirement. I think you just read the first paragraph then wrote a few about how it's bad.
More and more parts of the API ecosystem require JSONPath, and just saying "you should write code instead" doesn't actually help anyone write OpenAPI Overlays, so what's the point?
There is possibly a need for a more unified standard across different implementations, particularly from a software development and API design perspective.
During parsing and manipulation of JSON data, the syntactical discrepancies/behaviours between various libraries might need a common specification, for interoperability.
Features like type-aware queries or schema validation may be very helpful.
JSONata is written in JS and can be used in Node or in the browser, and there's also a Python wrapper: https://pypi.org/project/pyjsonata/