> [[]] | implode crashes jq, and this was not fixed at the time of writing despite being known for five years.
Well, taking into account that jq development has been halted for 5 years and only recently revived again, it's no wonder that bug reports have been sitting there for that time, both well known and new ones. I bet they'll get up to speed and slowly but surely clear the backlog that has built up all this time.
It's so awesome when projects shout out other projects that they're similar to or inspired by or not replacements for. I learned about https://github.com/yamafaktory/jql from the readme of this project and it's what I've been looking for for a long time, thank you!
That's not to take away from jaq by any means; I just find the jq-style syntax uber hard to grok, so jql makes more sense for me.
Very nice in this regard is gron, too. It simply flattens any json into lines of key value format, making it compatible with grep and other simple stream operations.
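To make the flattening idea concrete, here is a rough sketch in Python of what a gron-like tool does. This is not gron's actual implementation; the `flatten` helper and the sample document are made up for illustration, and unlike gron it does not switch to bracket notation for keys that are not valid identifiers.

```python
import json

def flatten(value, path="json"):
    """Yield one 'path = value;' line per container and leaf (hypothetical helper)."""
    if isinstance(value, dict):
        yield f"{path} = {{}};"
        for key, child in value.items():
            yield from flatten(child, f"{path}.{key}")
    elif isinstance(value, list):
        yield f"{path} = [];"
        for index, child in enumerate(value):
            yield from flatten(child, f"{path}[{index}]")
    else:
        # Scalars are printed in their JSON form, so strings keep their quotes
        yield f"{path} = {json.dumps(value)};"

# Made-up sample document
doc = json.loads('{"user": {"name": "ann", "tags": ["a", "b"]}, "x": 2}')
lines = list(flatten(doc))
print("\n".join(lines))
```

Every line carries its full path, so a plain `grep tags` over the output finds the nested values without knowing the structure up front.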
Nice find. I think I'll try it out. Although I was hoping for a real SQL type experience. I don't understand why no one just copies SQL so I can write a query like "SELECT * FROM $json WHERE x>1".
Everyone seems to want to invent their own new esoteric symbolic query language as if everything they do is a game of code golf. I really wish everyone would move away from this old Unix mentality of extremely concise, yet not-self-evident syntax and do more like the PowerShell way.
> Although I was hoping for a real SQL type experience. I don't understand why no one just copies SQL so I can write a query like "SELECT * FROM $json WHERE x>1".
With somewhat tabular data, you can use sqlite to read the data into tables and then work from there.
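As a minimal sketch of that approach, assuming the JSON is a flat array of records with known keys (the table name `data` and the sample records are made up), Python's built-in `sqlite3` module is enough:

```python
import json
import sqlite3

# Made-up sample: a flat array of records with the same keys
records = json.loads('[{"x": 1, "y": 2}, {"x": 2, "y": 3}, {"x": 5, "y": 8}]')

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (x INTEGER, y INTEGER)")
# Named placeholders let each JSON object be passed straight in as a dict
conn.executemany("INSERT INTO data (x, y) VALUES (:x, :y)", records)

rows = conn.execute("SELECT x, y FROM data WHERE x > 1 ORDER BY x").fetchall()
print(rows)
conn.close()
```

This only works this easily because the records are flat; nested objects would need to be split into extra tables or stored as text first.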
While I agree with the general sentiment of preferring a well-defined, explicit standard over "cute" custom-made languages, in this case I am not convinced that SQL would be the best candidate for querying nested structures like JSON. Something like XPath, maybe.
I agree, it wouldn't be the best to handle all json edge cases, but it would be a super easy way to quickly get data from a big chunk of simple json and you could just use subqueries or query chaining for nested results.
For anyone who hasn't used powershell, this is the difference I'm talking about. I would not be able to write either of these without looking up the syntax. But knowing very little about powershell, I can tell exactly what that command means while the bash command, not so much.
```powershell
$json | ConvertFrom-Json | Select-Object -ExpandProperty x
```
On the other hand, I find the bash one clear and concise. That PowerShell example is so verbose, it'd drive me crazy to do any sort of complex manipulation this way! To each their own, I guess.
I personally don't understand why people aren't willing to learn instead. It's not hard to sit down and pick up a new skill and it's good to step out of one's comfort zone. I personally hate Powershell syntax, brevity is the soul of wit and PS could learn a thing or two from bash and "the linux way".
We seem obsessed with molding the machine to our individual preferences. Perhaps we should obsess over the opposite: molding our mind to think more like the machine. This keeps a lot of things simple, uncomplicated, and flexible.
Does a painter wish for paints that were more like how he wanted them to be? Sure, but at the end of the day he buys the same paint everyone else does and learns to work with his medium.
> I personally don't understand why people aren't willing to learn instead
You misunderstand. As programmers we learn every day, obviously that's one of our strong points.
The real problem is that every single tool wants you to go deep and learn their particular dyslexic mini programming language syntax or advanced configuration options syntax. Why? We have TOML, we have SQL, we have a bunch of pretty proven syntaxes and languages that do the job very well.
A lot of these programmers authoring tools suffer from a severe protagonist syndrome which OK, it's their own personal character development to grapple with, but in the meantime us the working programmers are burning out because everyone and their dog wants us to learn their own brain child.
> We seem obsessed with molding the machine to our individual preferences. Perhaps we should obsess over the opposite: molding our mind to think more like the machine.
How so? Everything in "the machine" was created by other humans; from the latest CLI tool, to the CPU instruction set. As computer users, given that it's practically impossible for a single person to be familiar with all technologies, we must pick our battles and decide which technology to learn. Some of it is outdated, frustrating to use, poorly documented or maintained, and is just a waste of time and effort to learn.
Furthermore, as IT workers, it is part of our job to choose technologies worth our and our companies' time, and our literal livelihood depends on honing this skill.
So, yes, learning new tools is great, but there's only so much time in a day, and I'd rather spend it on things that matter. Even better, if no tool does what I want it to, I have the power to create a new one that does, and increase my development skills in the process.
> I personally don't understand why people aren't willing to learn instead.
Mostly because if you don't use it that often then it ends up forgotten again. I can smash out plenty of trivial regexes, but anything even slightly complicated means I'm learning backreferences again for the 6th time in a decade.
I have that same problem, the advanced features I use too little to remember. Then I started working on a configuration language that should have a non-surprising syntax (json superset, mostly inspired by Python, Rust, Nix). And it turns out, this works well as a query language for querying json documents. https://github.com/ruuda/rcl Here is an example use case: https://fosstodon.org/@ruuda/111120049523534027
While I appreciate the sentiment for bending your mind, rather than the spoon, the practical reality is that developer time is far costlier than compute time.
It is easier to map compute structures and syntax to existing mental models than to formulate new mental models. The latter is effortful and time-consuming.
So, given the tradeoffs, I could learn a new language, or leverage an existing language to get things done.
And yes, given sufficient resources (particularly time), developing new mental models is ideal, but reality often prohibits the ideal.
If the crux is that you want something that maps closer to your personal mental model than what's available, I guess the other option is to build the missing tool yourself. That's the other side of "be the change you want to see".
> So, given the tradeoffs, I could learn a new language, or leverage an existing language to get things done.
There is also the option to create a new language (jqsql or whatnot), optionally sharing it publicly.
If you do this I think you'd find out why beyond very trivial stuff, sibling commenters have a point in that SQL isn't a good fit for nested data like JSON. Would still be a useful exercise!
I think the closest I've seen to a SQL experience for JSON is how steampipe stores json columns as jsonb datatypes and allows you to query those columns w/postgres JSON functions etc.
I just checked the GitHub page [1] for Microsoft PowerShell. It looks to be written in C# and is available on Win32/macOS/Linux, where .NET is now supported. Do you use PowerShell only on Win32 or on other platforms too?
> Everyone seems to want to invent their own new esoteric symbolic query language
Can you give an example of something that PS can do that is built-in for text processing, instead of a proprietary symbolic query language?
By "the powershell way" I don't mean actually using powershell. I just mean using verbose, descriptive commands whose purpose one can easily understand without having a working knowledge of the scripting language.
Have you looked at [duckdb's JSON support](https://duckdb.org/docs/extensions/json.html)? It's pretty transparent and you can do exactly what you say: `select * from 'file.json' where x > 1` will work with "simple" json files like {"x": 1, "y": 2} and [{"x": 1, "y":2}, {"x":2, "y":3}]
> I don't understand why no one just copies SQL so I can write a query like "SELECT * FROM $json WHERE x>1".
You could ask the same with respect to XML too -- why XPath/XSLT instead of SQL?
The problem is that SQL isn't that convenient when you're querying data in a free-form and recursive schema. Especially the latter, because recursive queries in SQL are just not pithy. I say this as someone who loves SQL.
I do sympathise with that a bit, but for me at least it does not look like jql is the solution:
'|={"b""d"=2, "c"}'
this appears to be something like jq's:
'select(."b"."d" == 2 or ."c" != null)'
which.. is obviously longer, but I think I prefer it, it's clearer?
(actually it would be `.[] | select(...)`, but I'm not sure something like that isn't true of jql too without trying it, I don't know if the example's intended to be complete - and I don't think it affects my verdict)
I have the same problem. Then, unrelated, I started building a configuration language, and it turned out it's quite nice for querying json [1]. Here is an example use case that I couldn't solve in jq but I could in RCL: https://fosstodon.org/@ruuda/111120049523534027
I had the same problem, keeping me from really exploiting the power of jq. But for this and similar cases I am really glad about copilot being available to help. I just tell it what I need, together with a reduced sample of the source-json, and it generates a correct jq-script for me. For more complex requirements I usually iterate a bit with Copilot because it is easier and more reliable to guide it to the solution gradually than to word everything out correctly in the question in the first go. Also I myself often get new and better ideas during the iterations than I had in the beginning. Probably works the same with ChatGPT and others.
How does that usually play out in the Rust ecosystem? Lots of dependencies tell me there's a huge risk of the dependencies becoming inherently incompatible with each other over time, making maintenance a major task. How will this compile in say, 2 years?
Because of the lockfile, it will use the same library versions when compiling again in the future. The main question for "will this compile" is whether the Rust compiler is sufficiently backwards-compatible, which (at least from my experience) it certainly is.
Also re "lots of dependencies": This is kind of unavoidable in Rust because the stdlib is deliberately very lean, and focuses on basic data structures that are needed for interop (e.g. having common string types is important for different libraries to work together with each other) or not possible to implement without specific compiler support (e.g. marker traits or boxing). Contrast this with Go where the stdlib contains things like a full-fledged HTTP server and regex engine. It's easy to build things in Go with a rather short go.mod file, but only because the go.mod file does not show all the stdlib packages that you're using.
I understand the concept of a lock file and they are a blessing, but inevitably one will need to upgrade at least one of the dependencies. Whether this is due to desired functionality or a bug, it is bound to happen.
Lock files won't solve that problem if one of the other libraries will be incompatible. Add more time and the problem compounds. Major problem in e.g. the npm ecosystem.
It is somewhat similar to Linq in C# although SQL there is more standardised so I like it more. Also, it would be fantastic to have in-language support for querying raw collections with SQL. Even better: to be able to transparently store collections in Sqlite.
It is always sad to see code which takes some data from db/whatever and then does simple processing using loops/stream api. SQL is much higher level and more concise language for these use cases than Java/Kotlin/Python/JavaScript
I've found the same. I store all raw json output into a sqlite table, create virtual columns from it, then do a shell loop off of a select. Nested loops become unnested, and debugability is leagues better because I have the exact record in the db to examine and replay.
I've noticed what I'm creating are DAGs, and that I'm constantly restarting it from the last-successfully-processed record. Is there a `Make`-like tool to represent this? Make doesn't have sql targets, but full-featured DAG processors like Airflow are way too heavyweight to glue together shell snippets.
Yes. SQL is much better for relational data with a strict schema. Though you'll still never get a way to express recursive queries in SQL w/o a lot of verbosity.
Ah, you're right. TextQL combined with Miller would be closer, but DuckDB can do the same things all in one. Always good to have a variety of tools to choose from.
Unfortunately JSON numbers are 64 bit floats, so if you're standards compliant you have to treat them as such, which gives you 53 bits of precision for integers.
Also hey, been a while ;)
Edit: I stand corrected, the latest spec (rfc8259) only formally specifies the textual format, but not the semantics of numbers.
However, it does have this to say:
> This specification allows implementations to set limits on the range and precision of numbers accepted. Since software that implements IEEE 754 binary64 (double precision) numbers [IEEE754] is generally available and widely used, good interoperability can be achieved by implementations that expect no more precision or range than these provide, in the sense that implementations will approximate JSON numbers within the expected precision.
In practice, most implementations treat JSON as a subset of Javascript, which implies that numbers are 64-bit floats.
I'm being pedantic here, but JSON numbers are sequences of digits and ./+/-/e/E. Whether to parse those sequences into 64-bit floats or something else is left up to the implementation.
However what you say is good practice anyway. The spec (RFC 8259) has this note on interoperability:
> This specification allows implementations to set limits on the range and precision of numbers accepted. Since software that implements IEEE 754 binary64 (double precision) numbers [IEEE754] is generally available and widely used, good interoperability can be achieved by implementations that expect no more precision or range than these provide, in the sense that implementations will approximate JSON numbers within the expected precision. A JSON number such as 1E400 or 3.141592653589793238462643383279 may indicate potential interoperability problems, since it suggests that the software that created it expects receiving software to have greater capabilities for numeric magnitude and precision than is widely available.
JSON does not define a precision for numbers, so: it's often float64 (but note -0 is allowed, but NaN and +/-Inf are not), but it depends on your language, parser config, etc.
Many will produce higher precision but parse as float64 by default. But maximally-compatible JSON systems should always handle arbitrary precision.
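To illustrate the "depends on your parser config" part, here is a small sketch using Python's standard `json` module. The sample document is made up; `parse_int` and `parse_float` are real hooks of `json.loads`, and passing `float` for `parse_int` is just a way to imitate a parser that only has binary64.

```python
import json
from decimal import Decimal

# Made-up sample: 2**53 + 1 and a number with more digits than a double holds
text = '{"id": 9007199254740993, "pi": 3.141592653589793238462643383279}'

# Default: integers become arbitrary-precision ints, non-integers become floats
default = json.loads(text)
# Imitate a binary64-only parser by pushing integers through float
as_double = json.loads(text, parse_int=float)
# Keep the full textual precision of non-integers
exact = json.loads(text, parse_float=Decimal)

print(default["id"])         # integer kept exactly
print(int(as_double["id"]))  # off by one after the round trip through a double
print(default["pi"])         # rounded to the nearest double
print(exact["pi"])           # all digits preserved
```

Same input text, three different answers, all within what the spec allows.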
From a quick test it looks like it supports exponents up to 9 digits long (i.e. 1.0e999999999), which, frankly, seems pretty reasonable; it's hard for me to imagine a use case where you'd want to represent numbers larger than that.
jq 1.7 does preserve large integers, but will truncate them if any operation is done on them. Unfortunately it currently truncates to a decimal64, which is a bit confusing; this will be fixed in the next release, where it follows the suggestion from the JSON spec and truncates to binary64 (double): https://github.com/jqlang/jq/pull/2949
I guess it's cute that there's some terminal line art library in Rust somewhere, but when I tried to invoke jaq it just pooped megabytes of escape codes into my iTerm and eventually iTerm tried to print to the printer. Too clever.
I tried to do `echo *json | rush -- jaq -rf ./this-program.jq {} | datamash ...` and in that context I don't think it's appropriate to try to get artistic with the tty.
The cause of the errors, for whatever it's worth, is that `jaq` lacks `strftime`.
and (after commenting out halt_error) slower than both jq and gojq
$ time jq -sf aoc22-13.jq input.txt
6415
20056
real 0m0.023s
user 0m0.010s
sys 0m0.010s
$
$ time gojq -sf aoc22-13.jq input.txt
6415
20056
real 0m0.070s
user 0m0.030s
sys 0m0.000s
$
$ time ./jaq-v1.2.0-x86_64-unknown-linux-gnu -sf aoc22-13.jq input.txt
6415
20056
real 0m0.103s
user 0m0.065s
sys 0m0.000s
Is this wrong behavior from jq, or some artifact consistent with how the floating point spec is defined: surprising, but faithful to IEEE 754 nonetheless?
I think it is a bit more complex, since NaN is defined to be "unordered" with respect to all other values (including other NaNs), and so any relation for which unordered values result in true (e.g., compareQuietNotEqual) will return true. (See section 5.11)
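A quick way to see the "unordered" behavior is to try the comparisons in a language whose floats follow IEEE 754. This is a sketch using Python floats, and it says nothing about how jq itself orders `nan`:

```python
import math

nan = float("nan")

# Ordered comparisons involving NaN are false; only "not equal" is true
comparisons = {
    "nan < nan": nan < nan,
    "nan > nan": nan > nan,
    "nan == nan": nan == nan,
    "nan != nan": nan != nan,
    "nan < 1.0": nan < 1.0,
}
for label, result in comparisons.items():
    print(f"{label}: {result}")

# Because nan == nan is false, a dedicated check is needed to detect NaN
print(math.isnan(nan))
```

So a tool that reports `nan < nan` as true is making its own ordering choice rather than following the IEEE 754 comparison predicates.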
I used Bard after trying unsuccessfully to decipher the wikipedia page and Bard says, according to IEEE 754, nan < nan should return false (0); while nan > nan should return false (0)
I wish there was some version of Wikipedia for people who speak good English (not Simple English), but aren't assumed to already be experts on the topic. Technical articles are pretty much impenetrable.
So you basically wish for Wikipedia to also feature simplified explanations of technical topics.
I don't think "good English vs simple english" plays into this.
It's not like the problem with technical articles being impenetrable on Wiki is that Wiki doesn't have an intermediate level between expert-talk and simple english.
It's just that it doesn't have simple english explanations of some technical topics.
Quite a lot! I use it to explore both JSON and text (parsed using jq functions). I also use it for exploring and debugging binary formats (https://github.com/wader/fq). Nowadays I also use it for some ad hoc programming and as a calculator.
Yeah, I've always liked the idea of jq but personally I find it easier to open a REPL in the language I'm most familiar with (which happens to be JS, which does make a difference) and just paste in the JSON and work with it there
It may be more verbose, but I never have to google anything, which makes a bigger difference in my experience
https://github.com/wader/fq has a REPL and can read JSON. A tip is to use "paste | from_json | repl" in a REPL to paste JSON into a sub-REPL; you can also use `<text here>` with fq, which is a raw string literal.
Yes. So much easier to reuse other common helper functions. Once you’ve finished exploration you can just copy the code into production instead of translating.
My most common usage is pretty-printing the output of curl, or getting a list of things from endpoint service/A and then calling service/endpoint B/<entry> to do things for each entry in the list.
I find jq's syntax (and docs) kind of opaque, but I guess we have no other options. And I don't think this latest incarnation breaks any new ground there. But it'd be better if I just wrote it myself - "be the change ...."
Well, as pointed out in the jaq docs there is jql.
But I just looked at jql and I liked it even less. The pedantry about requiring all keys in selectors to be double quoted is, um, painful for a CLI tool.
It has some learning curve, but it actually makes sense once you get used to it, and it works for other formats too. It is much better than other transformation languages, and you can even call Java.
I think they're kind of stuck in development; even the Mule engine only has one active developer, going by the GitHub commits…
and in powershell you don't need to learn all those syntaxes for different tools for different formats like jq, xmlstarlet, etc. Just convert everything to an object and query the data by using powershell syntax
I applaud this project's focus on correctness and efficiency, but I'd also really like a version of `jq` that's easy to understand without having to learn a whole new syntax.
`jq` is a really powerful tool and `jaq` promises to be even more powerful. But, as a system administrator, most of the time that I'm dealing with JSON files, something that behaved more like grep would be sufficient.
This is awesome, thanks! Not OP, but this will help me to write specifications for modifying existing JSON structures immensely. It's kind of a pain parsing JSON by (old man) eye to figure out which properties are arrays, and follow property names down a chain. This will definitely help eliminate mistakes!
Doesn't work in my terminal. When you recommend yq behavior, please specify which yq you're using. There are at least two incompatible implementations.
This looks so much better as an ad-hoc tool. Would be cool if it supported more formats: plist, yaml, xml (how to do body, or conflicting attr/elements?)
It is a little early to say, but I have been learning how nushell deals with structured data and it seems like it is very usable for simple cases to produce readable one-liners, and if you need to bring out the big guns the shell is also a full fledged scripting language. Don't know about how efficient it is though.
It needs to justify moving to a completely different shell, but the way you deal with data in general does not restrict itself to manipulating json, but also the output of many commands, so you kinda have one unified piping interface for all these structured data manipulations, which I think is neat.
Maybe like SQL for relational algebra? Codd made two query languages that were "too difficult for mortals to use". (B-trees for performance was a separate issue)
But jq's strength is its syntax - the difficulty is the semantics.
there's got to be some syntax though. jq does a unique function that isn't defined in any other syntax. i'm with you, the jq syntax is weird and sometimes difficult to understand. but the replacement would just be some different syntax.
these little one-off unique syntaxes that i'm never going to properly learn are one of my favourite uses of chatGPT.
You could use an elaborate filter with jq (see https://stackoverflow.com/a/73040814/452614) to transform JSON to XML and then use an XQuery implementation to process the document. It would be quite powerful, especially if the implementation supports XML Schema. I have not tested it.
Yes. jq is essentially an XPath/XSLT for JSON. I'd say that jq is more powerful than XPath/XSLT, but that's neither here nor there since both can evolve to be as powerful as they need to be.
I know perl is useful. I know it's going to help me. It seems like you can get away with a quick perl script whereas a python script would attract scrutiny.
jq has been in my toolbox for a while; it’s a great tool. But it’s yet another query language to learn, and jaq seems identical in that respect. I think that’s where LLMs can help a lot to ease adoption. I started a project along those lines to manipulate data with just natural language: https://partial.sh
‘cat’ your json file and describe what you want I think should be the way to go
The obvious reason here is jaq makes some changes to semantics, changes which would be rejected by jq.
Another likely reason is that it seems a motivation for jaq is improving the performance of jq. Any low-hanging fruit there in the jq implementation was likely handled a long time ago, so improving this in jq is likely to be hard. Writing a brand new implementation allows for trying out different ways of implementing the same functionality, and using a different language known for its performance helps too.
Using a language like Rust also helps with the goal of ensuring correctness and safety.
jq hasn't had much work done to make it fast though.
There's two classes of performance problems:
- implementation issues
- language issues
The latter is mainly a problem in `foreach` and also some missing ways to help programmers release references (via `$bindings`) that they no longer need.
The former is mostly a matter of doing a variety of bytecode interpreter improvements, and maybe doing more inlining, and maybe finding creative ways to reduce the number of branches.
jq maintainer here. We love that there are multiple implementations of jq now. It does several things: a) it gives users more choices, b) it helps standardize the language (though we've not yet written a formal specification), c) it brings more energy to jq because the maintainers of the other tools have joined jq as maintainers. I also love that these alternative implementations relieve my growing dislike of C.
Somewhat off-topic, but is there a tool which integrates something like this/jq/fx and API requests? I’d like to be able to do some ETL-like operations and join JSON responses declaratively, without having to write a script.
I think a query language would be great, with a way to subquery/chain data from previous requests (e.g. by jsonpath) to subsequent ones.
The closest I’ve gotten is to wrap the APIs with GraphQL. This achieves joining, but requires strict typing and coding the schema+relationships ahead of time which restricts query flexibility for unforeseen edge cases.
Another is a workflow automation tool like n8n which isn’t as strict and is more user-friendly, but still isn’t very dynamic either.
Postman supports chaining, but in a static way with getting/setting env variables in pre/post request JS scripts.
Bash piping is another option, and seems like a more natural fit, but isn’t super reusable for data sources (e.g. with complex client/auth setup) and I’m not sure how well it would support batch requests.
It would be an interesting tool/language to build, but I figure there has to be a solution out there already.
This is exactly what Murex shell does. It has lots of builtin tools for querying structured data (of varying formats) but also supports POSIX pipes for using existing tools like `jq` et al seamlessly too.
I think it's more the hand-in-handedness that seems to exist between "rewrite an existing, mature tool" and doing it in Rust. Half the time it's hard for me to know which caused which — the need for the tool, or the desire to rewrite something in Rust.
The other options are C, C++, Go, and maybe Ada or Zig, though I haven't seen many CLI tools written in those two in practice. In practice, it seems like Go, Rust, and C++ are the preferred languages for newer CLI tools, although I have no data; my conclusion is based on my general perception. Older ones, C and Perl.
I'm a lot happier with a fad for Rust-written CLI tools than the disappointment of reading install instructions for a simple CLI tool that starts with "First... npm... bower..."
"Yak", just like "javascript" is pronounced "yavascript"[1], and "JIF" peanut butter is pronounced "yif". Universally pronouncing "j" as "y" maximizes confused amusement.
> Jaques is the French spelling of Jack/Jaak/Jak/Jaq. They're all pronounced the same
They're not, though. The French pronunciation of 'j', as in the word Jaques is /ʒ/. In English, 'j' at the beginning of the word 'Jack' is pronounced /dʒ/. And 'Jaak' makes me think of Dutch, where that 'j' is pronounced as /j/.
In the real world, the descriptivist realizes an individual's pronunciation of the concept labelled Jac/Jack/Jacques/Jacq/Jak/etc. depends much more on their personal context and stylistic choice than the spelling used.
I've heard many folks (American and otherwise) pronounce "Jack" many times in my life, and the range of utterances very comfortably includes Pépin's own "Jacques".
There’s no single way “native” speakers say any word, and the fact that you think there is shows you have had no exposure to the massive diversity of American accents.
What dialect of American English treats /ʒ/ and /dʒ/ as allophones at the beginning of a word (or any other context)? You've already (weirdly) accused me of being a prescriptivist, here's your chance to counter with some descriptive evidence of the kind of variation you are talking about.
I'd love to be wrong, because I'd learn a new thing. Please, educate me.
I'm also from the US and I've spent quite a lot of time listening to people speaking, specifically listening for how they realize phonemes. Your experience doesn't at all align with mine, either what I heard or what I've read about dialect variation in the US.
As someone who has heard the name "Jack" pronounced by Americans many, many, many times in their life... that Jaques video sounds entirely in-range of the variety of pronunciations I hear for Jack.
Well, as one particular American, who has spoken to many, many other Americans in their life, I can only tell you what I think.
Something I find interesting is that Americans say the a in the word taco (a word borrowed from Spanish) with the a as in father, that is, [ɑː], but English people say the a as in cat, [æ]. Different dialects approximate the Spanish [a] differently.
Yes they're slightly different in theory, but not in any way that would prohibit mutual understanding. Besides, if you're telling anyone about this library you're most certainly going to spell it out anyways.
I'm American and pronounce Jacques and Jack the way they described. If someone said [ʒak], I would transcribe it as Jacques, and if someone said [dʒæk], I would transcribe it as Jack. It may be a French name, but it's not very foreign. (If I heard [dʒak], I would assume the speaker is British and transcribe it as Jack).
I was confused reading people say that Jacques is pronounced the same as Jack, so it does seem like mutual understanding is inhibited.
It's just like how, even though Johann is a German name (though borrowed from Latin), I know to pronounce it in English not as [dʒoʊhæn] (the naive English pronunciation), but as [joʊhan], which is similar to the German pronunciation, [johan].
You're implying I subconsciously view them the same and pronounce them the same. But I don't. Maybe your dialect of English is different than mine, but I am not you. And it was there for a while because I use Hacker News on my phone and don't check it all the time.
My original sentence repeated the same word twice as a typo. It was this:
> I was confused reading people say that Jacques is pronounced the same as Jacques.
I realized my mistake and edited it to this:
> I was confused reading people say that Jacques is pronounced the same as Jack.
If we'd been discussing the words "chick" and "chic," I might have accidentally written:
> I was originally confused reading people say that chick is pronounced the same as chick.
Then I'd realize my error and edit it to:
> I was originally confused reading people say that chick is pronounced the same as chic.
That doesn't mean I actually pronounce "chick" the same way as "chic" and it doesn't make the words interchangeable in the dialect I speak. "Chic" is pronounced like "Sheikh," referring to the Arab leader, or like "Sheik" from the Legend of Zelda. I'll be confused if you say "a baby Sheikh" instead of "a baby chick," and if you say "chick fashion" instead of "chic fashion" I'll be thrown off but realize you meant "chic."
The implication I'm positing is "if you mix up words without notice, they are conceptually interchangeable". You can't disprove it by stating that words you didn't mix up without notice aren't interchangeable.
Sometimes I have accidentally written "chick" when I wrote "chic" due to autocorrect, just not during this conversation.
Regardless, I guess I can't make you believe me when I say what sounds natural to me. Ignore what I say if you really want. The fact that you insist that I say the two names interchangeably does not make it so.
How does this relate to navigating structured documents? Even if you use XML, presumably you will want to programmatically navigate/query it at some point.
That's my whole point. The tools for navigating, transforming, streaming, parsing, etc. XML are genuinely terrific, like nothing else, and it's demoralizing to see younger devs throw it all away because they prefer not to have to learn anything with more than trivial complexity.
XML's downfall was not providing built-in serialization/de-serialization. Things might have gone differently if XML had started with libraries like https://pydantic-xml.readthedocs.io/en/latest and people had understood that this was the way to produce and consume XML -- that if you're using something like xpath or touching the raw tree with getChildElement and the like for more than one-off scripts, something has gone wrong. And that xslt is at best an optimization and at worst staring into the abyss, so don't start with it.
But now it doesn't matter because the backing format doesn't really matter and JSON was there at the right place right time.
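For what it's worth, here is a rough sketch of the "typed serialization instead of raw tree walking" idea. It uses only Python's standard library rather than pydantic-xml itself, and the `Book` class, the helper names, and the XML snippet are all made up for illustration. The point is just that the tree access lives in one small mapping layer and the rest of the program sees a typed object.

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET


@dataclass
class Book:
    # Hypothetical record type; the fields are illustrative only.
    isbn: str
    title: str
    pages: int


def book_from_xml(text: str) -> Book:
    """Deserialize one <book> element into a typed Book."""
    root = ET.fromstring(text)
    return Book(
        isbn=root.attrib["isbn"],                         # attributes arrive as str
        title=root.findtext("title", default=""),
        pages=int(root.findtext("pages", default="0")),  # convert text to int here, once
    )


def book_to_xml(book: Book) -> str:
    """Serialize a Book back into a <book> element."""
    root = ET.Element("book", isbn=book.isbn)
    ET.SubElement(root, "title").text = book.title
    ET.SubElement(root, "pages").text = str(book.pages)
    return ET.tostring(root, encoding="unicode")


doc = '<book isbn="123"><title>XML in Anger</title><pages>321</pages></book>'
book = book_from_xml(doc)
round_tripped = book_from_xml(book_to_xml(book))
```

Calling code then works with `book.pages` as an `int` and never touches the element tree directly, which is roughly the workflow that comment argues should have been the default.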
It's not really because they don't want to have to learn it, it's because XML is fundamentally the wrong data model for most data. JSON is great because it matches the object structure used in 99% of programming languages - for JS it is the object structure.
Find me a programming language where objects have attributes, the order of members is significant and can be interleaved, everything is stringly typed etc...
It's a shame because I agree the tooling for XML is still better than JSON. But not better enough that it's worth fighting the data model mismatch.
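To make the data model mismatch concrete, here is a small sketch using only Python's standard library. The record contents are invented for the example. JSON parses straight into native dicts, lists, and numbers, while the XML version comes back as an element tree with string-only attributes and text interleaved between child elements.

```python
import json
import xml.etree.ElementTree as ET

# The same made-up record in both formats.
json_text = '{"name": "jaq", "stars": 1200, "tags": ["json", "rust"]}'
xml_text = '<tool name="jaq" stars="1200">A <b>jq</b> clone in <b>Rust</b>.</tool>'

# JSON maps directly onto native types: dict, str, int, list.
record = json.loads(json_text)
json_stars = record["stars"]   # already an int
json_tags = record["tags"]     # already a list of str

# XML yields a tree of elements instead of plain objects.
root = ET.fromstring(xml_text)
xml_stars = root.attrib["stars"]   # attribute values are always strings

# Mixed content: text is interleaved with child elements, and the order matters.
# root.text is the text before the first child; each child's .tail is the text after it.
mixed_text = [root.text] + [child.tail for child in root]
child_text = [child.text for child in root]
```

There is no single obvious object shape for the XML version: you have to decide what to do with the attributes, the interleaved text, and the string-typed numbers before it looks like anything a typical program would hold in memory.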
I'm not sure if there is any open source XSLT tool as complete as jq is for JSON. There is xsltproc but IIRC it does not support streaming scenarios (jq has some support for streaming processing)
Though, personally, I prefer JSON. Probably due to superior tools (thanks to its popularity) and less-bloated syntax (it is somewhat easier for me to read a raw JSON file than a raw XML file).
I do not see a license in either repository, and it seems that this tool only has a 30-day evaluation tier for free. Anyway, using this means that you depend on a single vendor and accept their future pricing changes.
If XML tools aren't open enough for certain needs, then sure, I get it. But it's tragic to see highly-engineered, pro solutions just die out because younger devs don't like the learning challenge or because business owners are cheapskates.
Sure, but not everything uses XML. Lots of things use JSON, so even if you do not like it, presumably you will have to work with it at some point. So this is a tool that lets you do that. I do not think it is reasonable to expect that everyone uses XML, or should use XML, even if it is your favorite.
You don't understand the power of XML and committee design. XPath could do almost everything. And XSLT in skillful hands could give birth to a black hole due to information density alone.
I think we all understand this to some degree, but working on open source, outside of a few flashy projects, is some of the most thankless work there is. And contributing an immense amount of difficult work (such as perf and correctness improvements across the board) to a repo that you don't own and won't be recognized for is somehow significantly more thankless than that. For whatever reason, people only really care about the creator of a project, and virtually no one else.
For instance, do you know who Junio Hamano is? Oh, he's just a guy who's been maintaining a fairly minor project called Git for the last 15 years. But everyone can connect Linus Torvalds with git, even though he only worked on it consistently for a year or two before leaving it [1].
Also, and I think we all know this too, but working on someone else's codebase kinda sucks. Greenfield is so much more fun. It's a shame, but I'm really not surprised in the slightest.
As an outsider, getting your code merged into a popular open source project involves a political process of convincing the maintainers that your fix should be addressed, and then convincing them they should merge your code.
Writing a fork involves sitting down at your laptop and coding it out.
As the benchmarks show, jaq is pretty significantly faster than jq.
I've commented before that I expect Rust to be a language that is generally faster than even C or C++ in a way that's hard to capture in small benchmarks, because the borrow checker lets you safely write code that skips copying that other languages have to do for safety. Given the nature of what jq/jaq does, I wouldn't be surprised if that is some of the effect here. It would be interesting to instrument them with tools that can track the amount of memory traffic each benchmark does to compare (that is, not memory used but total traffic in and out of RAM); I bet the Rust code shows a lot less.
That would still be a microbenchmark. Given that the benchmarks in the post take on the order of seconds to run, I am assuming they are not microbenchmarks, or at least, much less "micro"benchmarks. I would hope some sort of standard JSON querying benchmarking suite would include some substantial, hundreds-of-kilobytes or more JSON samples in it.
I think in this case it's for the completely reasonable reason that he wanted to write it in Rust and asking jq to rewrite their whole project in rust would be obnoxious.