I love JQ so much we implemented a subset of JQ in Clojure so that our users could use it to munge/filter data in our product (JVM and browser based Kafka tooling). One of the most fun coding pieces I've done, though I am a bit odd and I love writing grammars (big shoutout to Instaparse![1]).
I learned through my implementation that JQ is a LISP-2[2] which surprised me as it didn't feel obvious from the grammar.
I can't stand jq. I realize this is an unpopular opinion, and our codebase at work has plenty of jq in the bash scripts, some of it even code that I wrote. I begrudgingly use it when it's the best option for me. But something about it rubs me the wrong way - I think it's the unintuitive query syntax and the need to search for every minute step of what I'm trying to do, and the frequency with which that leads to cryptic answers that I can only decipher if I am some sort of jq expert.

But I have this instinctive reaction to all DSL languages that embed themselves into strings, like htmx and tailwind (both embedded in attribute string values). I realize some people like it, and it's a well-made piece of software, and I will even admit that sometimes there is no better choice. But I guess I just hate that it's necessary? I guess I could also admit it's the least-bad option, in the sense that it's a vast improvement over various sed/awk/cut monstrosities when it comes to parsing JSON in bash.

Certainly once you find the right incantation, it's perfect - it transforms some raw stdin into parsed JSON that you can manipulate into exactly what you need. But for me, it ranks right next to regex in terms of "things I (don't) want to see in my code." I hate that the jq command is always some indecipherable string in the middle of the script. The only real alternative I've ever used is piping to a Python program that I define inline in a heredoc, but that ends up being at least as nasty as the JQ script.
> I hate that the jq command is always some indecipherable string in the middle of the script
It might be worthwhile to just learn how jq works. At the end of the day, you need to learn some language to parse json. I hate DSLs too, but I cannot think of anything as useful and concise as jq.
> but that ends up being at least as nasty as the JQ script
That's exactly why jq is so nice. Nice alternatives just don't exist
> That's exactly why jq is so nice. Nice alternatives just don't exist
Write a simple Python script, parse JSON into native objects, manipulate those objects as desired with standard Python code, then serialize back into JSON if necessary. Voila, you have a readable, maintainable, straightforward solution, and the only dependency (the Python interpreter) is already preinstalled on almost every modern system.
Sure, you may need a few more lines of code than what would be possible with a tailor-made DSL like jq, but this isn't code golf. Good code targets humans, not "least possible number of bytes, arranged in the cleverest possible way".
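A minimal sketch of that approach (the `items`/`done` field names are invented here for illustration):

```python
import json
import sys

def transform(doc):
    # Hypothetical transformation: keep only the completed items.
    return [item for item in doc.get("items", []) if item.get("done")]

def run(stdin=None, stdout=None):
    """Read JSON on stdin, manipulate it as native objects, write JSON to stdout."""
    doc = json.load(stdin or sys.stdin)
    json.dump(transform(doc), stdout or sys.stdout, indent=2)
```

Used as `some-command | python3 filter.py`, this replaces the jq one-liner with plain Python, at the cost of a few extra lines.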
The simple existence of DSL tools like jq is a testament to the fact that people don't want to go to a generic language to solve every kind of problem.
I'm also convinced that a big subset of the "use a generic language for everything" crowd do it because they want to use their shiny hammer on that nail as well.
> Sure, you may need a few more lines of code than ...
jq integrates very nicely into bash scripts. Especially in between pipes, a short & simple jq snippet can work wonders for the readability of the overall script.
On the other hand, if the bash script becomes too complex, it may be a good idea to replace the entire bash script with Python (instead of just the json-parsing part)
> ... if the reader happens to be familiar with the niche language "jq".
Eh. Linux/Unix has always had an affinity for DSLs and mini-languages. If you're willing to work with bash, sed, awk, perl, lex, yacc, bc/dc etc. jq doesn't seem like it should cause too much consternation.
> Especially in between pipes a short&simple jq-snippet
Many of them are not short and simple though. And each time you do some transformation, you pretty much need to go in and out of jq at each step if you want to make some decisions or get multiple kinds of results without processing the original multiple times.
The point in my career at which I used jq the most was when I was doing a lot of work with Elasticsearch doing exploratory work on indexed data and search results. Doing things such as trying to figure out what sort of values `key` might have, grabbing ids returned, etc.
Second to this, I've mostly used jq to look at OpenAPI/swagger files, again just doing one-off tasks, such as listing all api routes, listing similarly named schemas, etc.
From what I've seen in the companies I've worked for, this is fairly consistent, but naturally I can't speak for everyone's use cases. At the end of the day, I don't think most people use jq in places where readability or maintainability would be most important.
Yea except the python solution is probably going to be several hundred lines, instead of a few.
Python is often not installed in server environments unless it's a runtime environment for Python.
Want to use a non standard library? Now your coworkers are suddenly in Python dependency hell. Better hope anyone else that wants to use this is either familiar with the ecosystem, or just happens to have an identical runtime environment as you.
Or someone could just curl/apt/dnf a jq binary to use your 3 line query, instead of maintaining all of this + 200 lines of Python.
I got to jq for the same reason I go to regular expressions. If you tell me this is too complex
    (?:[A-Z][a-z]+_?(\d+))
Then I don't know what to tell you. Do you think that's too complex and should be a python script too? I don't think so. It looks complex, but if you just learn it, it's easier than a 'simple' script to do the same thing.
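For what it's worth, the pattern above dropped into a Python script is still only one line of actual logic (the sample strings here are made up):

```python
import re

# The pattern from the comment above: a capitalized word, an optional
# underscore, then a captured run of digits.
PATTERN = re.compile(r"(?:[A-Z][a-z]+_?(\d+))")

def extract_number(s):
    """Return the first captured digit run, or None if nothing matches."""
    m = PATTERN.search(s)
    return m.group(1) if m else None
```

The regex carries all the meaning; the surrounding "simple" script is just plumbing.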
I'd argue it's good code if you don't have to sift through lines of boilerplate to do something so trivial in jq or regex syntax.
I do lots of exploratory work on various structured data, in my case often debugging media files via https://github.com/wader/fq, which means doing lots of use-once queries on the command line or in a REPL. In those cases jq's line-friendly and composable syntax and generators really shine.
> Something not having alternatives doesn't make it necessarily nice
Of course not, but compared to every alternative today, jq is eons better than everything else. Its conciseness, ease of use, and ease of learning all make it awesome. So as of right now, it is the nicest thing to use by far.
Personally though, I don't wish for better. jq is missing nothing that I want.
I really like jq, but I think there is at least one nice alternative to it: jet [1].
It is also a single executable, written in Clojure, and fast. Among other niceties, you don't have to learn any DSL in this case -- at least not if you already know Clojure!
I hadn't seen this before. At a quick glance, the syntax looks fine. Though I don't know what command line utility I'd need to use it. It makes me wonder how hard a translator from jq syntax to jsonpath would be... Then we could have our cake and eat it too.
In my opinion (potentially not a popular one), JQ has this appeal to nerds the same way that stuff like Perl does. I say this as someone who did Perl for 20 years but now prefers Python or JS…
For many people regexes are as bad as jq queries… and vice versa. I would not recommend writing a Python script instead of a regexp, but indeed it may work the same for small data and be more readable.
I love regex and have been mastering it since 1999. So much so that in 2013 I used it in production to parse a binary protocol with dynamic-sized fields. I believe the project is still talking to 10k-plus devices. Google must've just released protocol buffers… I would love to finally see regexes which can work over custom flows of objects and also on trees.
I also loved XPath, which is very powerful and very comprehensible; then there is CSS1/2/3, which are again query languages for tree-like structured data.
The prospect of now learning jq does not appeal to me that much, even though I appreciate its ingenuity. I may recommend it to dev/ops colleagues now and then, but for me this syntax is a lot of additional cognitive pressure that does not necessarily pay off. Of course, if there is a large amount of JSON data, it is the Swiss Army knife.
But nowadays I’ll likely use some LLm to generate the jq query for me. Also would joke with my bash-diehard colleagues who would love one more DSL…
For simple things, like navigating down one key or one array entry, I know the syntax by heart, and it's incredibly useful. But for anything more complicated, I'm too lazy to look up the documentation.
jq will fall into the bucket along with sed/awk of "tools I once wished to become an expert on, but will never do so because ChatGPT came along".
Would also put regex into that bucket, but they're so ubiquitous that I've already learned regexes. I wonder if the new wave of coders learning coding via ChatGPT will think of regexes the same way I think of sed/awk.
I think these very terse languages are precisely the ones you shouldn't unleash ChatGPT on. It needs to be really exact and if it is wrong, you can easily end up with something that is an infinite loop or takes exponential time with respect to the input.
My way of using ChatGPT is just to ask it to give me some complicated sed/awk command, and then I can usually understand easily if the command is correct, or easily look it up. So it is very good for learning.
Many problems seem to have the property that it's easier to verify a solution than to come up with one. If someone provides a filled-out sudoku puzzle, it's relatively straightforward to check if they've followed the rules and completed it correctly. However, actually solving the puzzle from scratch requires a different kind of thinking and might take more time.
I've also found that learning by "ask ChatGPT, paste, verify" is so much faster and more fun than banging my head against concrete to deeply read documentation to reason about something new.
I've started doing this for new programming languages and frameworks as well, and it shortens the learning curve from months down to days.
Agree - by the time I need more than grep and reach for JSON parsing, it's already complicated enough for a Python script. stdin piped to json.loads ain't that bad.
Def. seen jq thrown into sed/awk scripts where a readable programming language was the right move. People spend hrs finding the right syntax to these things ~ not always well spent.
I've got similar feelings about it and recently I started experimenting with writing scripts in Nushell rather than bash + jq. I get the json object as a proper type in the script, get reasonable operations available on it and don't have to think of weird escaping for either the contents or the jq script. It cuts down the size my scripts by about a half and I'm very happy with the results.
Yeah, Python is like 10-20x the number of lines required to do the same thing as jq (especially with the boilerplate of consuming stdin), but that's also why it's more readable. But generally I agree - I would choose jq over some weird bash/python hybrid most of the time. I just wish it was more immediately readable.
Simple jq programs are easy to read because simple jq programs are just path expressions, and the jq language is optimized to make path expressions easy to read. Path expressions like
    .[].commit | select(.author == "Tom Hudson")
which basically says "find all commits by Tom Hudson" in the input.
`.[]` iterates all the values in its input (whether the input be an array or an object). `.commit` gets the value of the "commit" key in the input object. You concatenate path expressions with `|`, and array/object index expressions you can just concatenate without `|`, so `.[]` and `.commit` can be `.[] | .commit` and also `.[].commit`. Calls to functions like `select()` whose bodies are path expressions are... also path expressions.
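For comparison, a rough Python equivalent of that one-liner (assuming the input is a list of objects that each carry a "commit" key, as in the example) might look like:

```python
def commits_by(entries, author):
    """Rough equivalent of  .[].commit | select(.author == $author)  .
    Assumes `entries` is a list of dicts, each with a "commit" key."""
    return [
        e["commit"]
        for e in entries
        if e.get("commit", {}).get("author") == author
    ]
```

The jq version expresses the same traversal-plus-filter directly as a path expression, which is what makes simple jq programs so compact.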
Perhaps the most brilliant thing about jq is that you can assign to arbitrarily complex path expressions, so you can write things like `(.posts[] | select(.author == "Tom Hudson") | .title) |= ascii_upcase` and have jq update just those parts of the document.
The syntax is strange probably because of this effort to make path expressions so trivial and readable.
jq programs get hard to read mainly when you go beyond path expressions, especially when you start doing reductions. The problem is that it resembles point-free programming in Haskell, which is really not for everyone.
The other thing is that jq is very much a functional programming language, and that takes getting used to.
Also, here’s something that seems not widely appreciated: You can write super clever unreadable one-long-line jq programs embedded in bash scripts (I hear you on the point-free thing), or you can write jq programs that live in their own files, with multiple lines, indentation, comments, and intermediate assignments to variables with readable names. I recommend the latter!
This also won't work since it'll crash on missing fields. e.get("commit", {}).get("author", "") maybe (ignoring the corner case of non-list top level object).
This is a non-problem solved by the jq example. Clearly nobody sane writes (or consumes) APIs which sometimes produce an array of objects and sometimes produce singular objects of the same shape... Or maybe I'm spoiled from using typed languages and cannot see the ingenuity of the python/javascript/other-untyped-hyped-lang api authors that it solves?
> Clearly nobody sane writes (or consumes) APIs which sometimes produce array of object, sometimes produce singular objects of the same shape...
Has nothing to do with arrays, it has to do with the fact that Python dicts with string indexes and Python objects with properties are different things, unlike JS where member and index access are just different ways of accessing object properties.
> Or maybe I'm spoiled from using typed languages and cannot see the ingenuity of the python/javascript/other-untyped-hyped-lang api authors that it solves?
This isn't an untyped thing. It's that JavaScript (and thus JSON) and Python have type systems (even if they usually don't statically declare them), those type systems differ, and thus the syntax around objects differs between the two.
Oops, yep totally. Even more futzy! I think if I were doing this a lot I'd totally pull out one of those "dict wrappers that allow for attr-based access" that lots of projects end up writing for whatever reason
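A bare-bones sketch of such a wrapper (real libraries like addict or munch handle nesting, missing keys, and serialization much more carefully):

```python
class AttrDict(dict):
    """Dict that also exposes its string keys as attributes."""
    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None
```

With this, `d.commit` and `d["commit"]` both work, which papers over the JS-vs-Python access difference discussed above (for flat dicts, anyway).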
I wish it had won over jq because JMESPath is a spec with multiple implementations and a test suite where jq is... well jq and languages have bindings not independent implementations.
> I wish it had won over jq because JMESPath is a spec with multiple implementations and a test suite where jq is... well jq and languages have bindings not independent implementations.
jq has multiple implementations too! In Go, Rust, Java, and... in jq itself.
> jackson-jq aims to be a compatible jq implementation. However, not every feature is available; some are intentionally omitted because they are not relevant as a Java library; some may be incomplete, have bugs or are yet to be implemented.
Where JMESPath has fully compliant 1st party implementations in Python, Go, Lua, JS, PHP, Ruby, and Rust and fully compliant 3rd party implementations in C++, Java, .NET, Elixir, and TS.
Having a spec and a test suite means that all valid JMESPath programs will work, and work the same, anywhere you use them. I think jq could get there but it doesn't seem to be the project's priority.
I've found Ruby much nicer for writing dirty parsing logic like this in a "real" language, it lets you be more terse and "DRY" than Python. Which in bigger software projects doesn't hurt me as much but when I'm primarily trying to write something that otherwise would be well handled by SQL or JQ I found Ruby the better middleground for me.
"Indecipherable string" to me means you likely don't understand the language or how it works.
The language itself works very well for what it needs to do.
It does not work the same way as something like parsing an object and manipulating it in python.
It is a query language. You are building up a result not manipulating objects.
Definitely unintuitive if you are coming from a programming language.
Once learned it makes a lot more sense and is even preferable depending on your needs.
> it's the unintuitive query syntax and the need to search for every minute step
I love jq as a power tool and have the same challenges. I think the best path would have been for JavaScript to adopt something akin to JsonPath, although I more often reach for jq out of familiarity than use it in kubectl.
I hadn't looked into JsonPath as a standard, and on closer inspection, it looks to be stalled out. Maybe I'll keep piping kubectl get <resource> -ojson | jq '<what I'm looking for>'.
The responses to this comment seem to miss a vital point that the comment is making: languages executed within a different primary language are usually opaque to the tools in use. Those tools are usually aimed purely at the primary language, not any secondary languages used within it. Tools for the secondary language are now much harder to use because they (usually) have to be invoked and used via the primary language.
If I’m working on a Python script which has some jq embedded in it, then these problems probably exist:
- My editor will only syntax colour the Python, and treat jq code as a uniform string with no structure
- My linter will only consider Python problems, not jq problems
- My compiler, which is able to show parsing errors at compile time rather than runtime, will not give me any parsing errors for jq until execution hits it (yes, Python has a compilation step)
- jq error messages that show a line number will give me a relative line number for the jq code, rather than the real line number for where that code lives in the Python file
- My debugger will only let me pause and inspect Python, and treat the jq execution as a black box of I/O
I’m discussing this as a jq problem, but this happens far more commonly with SQL inside any host language. No wonder ORMs are so popular: their value isn’t just about hiding/abstracting SQL, it’s about wrangling SQL as a secondary language inside a different primary one.
Other attempts at tackling this same problem:
- Microsoft’s LINQ for C#
- Webdev-focused IDEs which aim to correctly handle HTML and Javascript inside server-side languages (e.g. PHP)
jq is way too much for what I need. I hacked together a filter in C to reformat JSON and I like it better than every JSON library/utility I have tried. For simple reformatting, jq is slow and brittle by comparison. Also, I can extract JSON from web pages and other mixed input. All the JSON utilities I have tried expect perfectly-formed JSON and nothing else.
I also find VisiData is useful for adhoc exploring of JSON data. You can also use it to explore multiple other formats. I find it really helpful, plus it gives that little burst of adrenaline from its responsive TUI, similar to fx and jless mentioned.
For my toolbox I include jq, gron, miller, VisiData, in addition to classics like sed, awk, and perl.
I understand where you're coming from and often feel the same, but I'm also afraid that this is a clear case of inherent complexity: querying JSON is just a complex problem and requires a complex query language, regardless of how well a piece of software implementing it is designed. The same is valid for regexes of course.
The main problem is treating one-thing and many-things the same way. It's not a great PL design choice (and it's why we can't have slurp as a filter). If streams (not arrays) were also first-class, we would easily have `smap`, `sselect`, etc., and the code would look like a functional programming language where | is the pipeline operator.
Otherwise, it's fine if you try to keep the thought "everything is a 'filter' or a composition of filters, and a 'filter' is a function that either maps, flatMaps or filters things" in your mind at all times
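As a sketch of that idea, here is what the hypothetical `smap`/`sselect` (names from the comment above, not real jq builtins) might look like with Python generators standing in for first-class streams:

```python
def smap(f, stream):
    """Lazily map f over a stream of values (sketch of the hypothetical smap)."""
    for item in stream:
        yield f(item)

def sselect(pred, stream):
    """Lazily filter a stream of values (sketch of the hypothetical sselect)."""
    for item in stream:
        if pred(item):
            yield item
```

Composing them, `smap(f, sselect(p, values))` behaves like a `values | sselect(p) | smap(f)` pipeline: values flow one at a time, never materialized into an array.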
`jq` and `GNU Parallel` share a world in my brain where I know they're wonderful tools, but I spend more time grokking the syntax of each one as rarely as I need either, than just writing a bash/sed/awk/perl, ruby, or python script to do what I need.
`jq` solves the problem of JSON in legacy shells. But I think the real problem is that the world is stuck using Bash rather than a more modern shell that can parse JSON (as well as other data structures) as natively as raw byte streams.
The problem with Bash is to do anything remotely sophisticated you end up embedding DSLs (a bit of awk, some sed, a sprinkle of jq, and so on and so forth) into something that is itself already a DSL (ie Bash).
Whereas a few more modern shells have awk, sed and jq capabilities baked into the shell language itself. So you don’t need to mentally jump hoops every time you need to parse a different type of structured data.
It’s a bit like how you wouldn’t run an embedded Javascript or Perl engine inside your C#, Java or Go code base just to parse a JSON file. Instead you’d use your languages native JSON parsing tools and control structures to query that JSON file.
Likewise, the only reason jq exists is because Bash is useless at parsing anything beyond lists of bytes. If Bash supported JSON natively, like Powershell does (and to be clear, I’m not a fan of Powershell, but for wholly different reasons), then there would be literally no need for jq.
The community refuses to admit that powershell is a much better alternative to the bash/python combo, and here we are, stuck in this mess. CI/CD script spaghetti is usually the most unstable piece of code in a company.
> Community refuses to admit that powershell is much better alternative to bash/python combo
Because it's not.
Powershell is very nice as a glue language for .NET components, and it's better as a general-purpose shell/scripting language than the old DOS-inspired Windows Command Prompt, for sure.
I greatly dislike case-insensitivity. It's a source of many problems for users and implementors.
For implementors, case-insensitivity makes the need for full Unicode support urgent in a way that Unicode canonical equivalence alone usually does not. In practice one often sees case-insensitivity implemented for ASCII only, and later, when full Unicode support is added, you either have a backwards-compatibility break or new functions/operators/whatever to support Unicode case insensitivity.
For users case-insensitivity can be surprising.
For code reviewers, having to constantly be on the lookout for accidental symbol aliasing via case insensitivity is a real pain.
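Python's string API gives a concrete taste of why Unicode makes this urgent: ASCII-style lowercasing and full Unicode case folding genuinely disagree.

```python
# lower() leaves the German sharp s alone; casefold() applies the full
# Unicode case-folding algorithm, which expands it to "ss".
assert "ß".lower() == "ß"
assert "ß".casefold() == "ss"

# So a case-insensitive comparison done via lower() and one done via
# casefold() give different answers for the same pair of strings:
assert "STRASSE".lower() != "straße".lower()
assert "STRASSE".casefold() == "straße".casefold()
```

Any language that bolts ASCII case-insensitivity on first and Unicode later has to pick one of these behaviors and break the other.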
Why does it have to be bash+python? I'm finding myself using node.js scripts glued together by bash ones these days unless I'm working on a lot of data. Doing that means you can work with json natively.
`json.loads` in Python exists, and Python does the intuitive thing when you do `{"a": 1} == {"a": 1}`, at least for most purposes (you want the other option? `is` is right there!). Stuff like argparse is not the easiest thing to use but it's in the standard library and relatively easy to use as well.
Not going to outright say that node.js scripts are the worst thing ever (they're not), but out-of-the-box Python is totally underrated (except on macOS, where `urllib` fails with some opaque errors until you run some random script to deal with certs)
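For instance, out of the box:

```python
import json

# json.loads gives you native Python objects, and == compares them structurally.
a = json.loads('{"a": 1}')
b = {"a": 1}
assert a == b        # structural equality: the intuitive thing
assert a is not b    # identity is the separate question, answered by `is`
```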
Assuming <data> will be a key-value-object aka dict, it would be something like this:
    import json

    data = json.loads('<data>')
    bar = None
    if foo := data.get('foo'):
        bar = foo[0].bar
    print(bar)
If you can't be sure to get a dict, another type-check would be necessary. If you read from a file or file-like-object (like sys.stdin), json.load should be used.
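A sketch of the variant with that extra type check (key names follow the example above; `get` is used for the inner lookup as well, to sidestep the crash-on-missing-field problem):

```python
import json

def safe_first_bar(raw):
    """Parse raw JSON and return data['foo'][0]['bar'], or None at any miss."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        return None                  # top-level value wasn't an object
    foo = data.get('foo')
    if isinstance(foo, list) and foo:
        return foo[0].get('bar')     # dict index access, not attribute access
    return None
```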
I love nodejs, it's my go-to language for server side stuff.
Even with that bias though, I have to admit that it's awful for typical command line script stuff.
Dealing with async and streams and stuff for parsing csv files is miserable (I just wrote some stuff to parse and process hundreds of gigs of files in node, and it wasn't fun).
Python is the right tool for that job IMHO.
Also, weirdly, maybe golang? I just came across this [1] and it has one of my eyebrows cocked.
Any not-designed-specifically-for-shell language will suck for shell, more or less. Ruby, python, node, whatever, they all have the same problem - you type too much and have to care about stuff you shouldn't have to care about while in a shell.
You're probably right. I just wish there was an easier way to handle json on the command line that didn't turn into its own dsl. The golang scripting seems interesting, might be what motivates me to learn the language.
Apparently, the old community needs to literally die out with their old habits for the new to take their place. There is no amount of good argumentation that can be fruitful here. And there is a ton of it; pwsh is simply on another level than the existing combos.
The fact that you have to learn a new language to parse JSON is frankly insulting. If you've gotten to the point you're parsing JSON with a shell script, you should've switched to a real language a week ago.
Some people are weird and stand in awe at the elegance of piping 8 obscure commands, but if I'm given this shit and have to keep it working, I'm rewriting it on the spot.
Are you rewriting it in the first language you learned?
Sometimes less general tools are nice. If they fit the problem space well, they can be very expressive without feeling unwieldy. And in some contexts reducing the power/expressivity is actually a good thing (e.g. not using a C interpreter to make your program and your config file use the same 'language')
I also just added a JQ parser/grammar to the online LALR(1)/FLEX grammar editor/tester at https://mingodad.github.io/parsertl-playground/playground/ — select "Jq parser (partially working)" from the examples, then click "Parse" to see a parse tree of the source in "Input source".
A related question for you and anyone else into this kind of tooling: if you had to automate some structural edits across a codebase that contains a wide range of popular languages (say: C++, C#, Java, Ruby, Python), and you had to do it with a single tool, which tool would you use?
jq is great for letting users munge their data; we do something similar letting users provision an incoming webhook endpoint, send us arbitrary json data, and set up mappings to do useful things with their data, along with regression tests, monitoring, etc. jq makes the majority of cases straightforward (basically json dot notation) and the long tail possible.
I love jq, but I also use JMESPath (especially with AWS CLI), yq (bundled with tomlq and xq as well), and dasel [2]. I also wish hclq [3] wasn't so dead!
I've been using `jq` for years and I'm always able to cobble together what I need, but I have yet to find it intuitive and I'm rarely able to arrive at a solution of any complexity without spending a lot of time reading its documentation. I wish I found it easier to use. :-(
I also love gron, if nothing else to find the paths I need to use with jq later.
But ChatGPT has genuinely solved my suffering writing jq; it does a pretty good job. It almost even replaces gron: if you feed it an example JSON and ask for jq, it gives you something. It usually needs a little adjusting, but it gets me 90% of the way there and saves me a bit of time.
I rarely use it for much else, but it's a jq winner :)
I think parent is referring to the habit of technically inclined folks of using "trivially" similar to words like "simply" and "just" [0][1] in a way that assumes too much about what the reader already knows.
Or maybe I'm trying to whet one's appetite for learning the thing by showing a relatively simple expression that demonstrates the power of the language.
Looks handy, but I'd rather go the other way and extend grep (and diff, etc.) to also work on things that aren't restricted to be lines in a text file. The number of times I've needed to go through contortions for things that should be easy, solved problems (e.g., grepping for patterns of line pairs, grepping for records that use a different delimiter when the record data itself could contain linefeeds, etc.)...
Tip for pairs of lines. Use grep -A, -B, or -C to emit lines surrounding the first line to match, then pipe that into a second grep (also with -A, -B or -C). e.g.
    grep -A1 foo | grep -B1 bar
Will find a line with "foo" followed by a line with "bar" and emit both. Of course, it will also find a single line with both "foo" and "bar", so it's not perfect. This is a quick and dirty solution. Beyond that, break out sed and awk, or maybe the Practical Extraction and Report Language... it's really good at that stuff.
Thanks, it's a brittle hack as you acknowledge but at least there won't be false negatives. Funny how often I wind up crawling back to Perl.
The problem here is literally that someone hardcoded "IT'S ALWAYS LINEFEED" into an algorithm that could work equally well with any record separator character -- in fact probably with any record separator regex. I notice there's now `grep -z` which is one small step towards sanity... but the fully general problem is so easy and so useful to solve it's exasperating.
I guess I should stop complaining and submit a patch to grep to add a `--dont-use-linefeed-instead-use <arg>` option already.
I really like the JMESPath interactive tutorial page (https://jmespath.org/tutorial.html). It helped me when I was first learning the syntax and I still go to it if I run into a particularly weird syntax that throws me off.
Another great alternative is JSONPath[1], which unfortunately is not as widely supported and known despite being brilliant!
It's inspired by XPath, so it's very familiar instead of a completely new DSL. The killer feature imo is the recursive key lookup: you can write `people..address` and it'll find all "address" keys that descend from "people" anywhere in the JSON. It's by far my favorite parsing language for JSON, and I wrote an introductory blog post on how to use it for JSON dataset parsing [2] :)
I do this for archiving my YouTube watchlist into git - all the video/thumbnail/subtitle URLs are dynamic with expiry which makes them pointless to keep around. A quick `gron | sed | gron -u` replaces them all with "URL" and my diffs are much smaller and happier now.
One of the reasons I like/tolerate jq is that it's stable, i.e. scripts written for it a few years ago still work the same today.
I have some code around for yq instead that keeps breaking because yq keeps improving in non-backward-compatible ways. (I didn't investigate how often yq introduced backwards-incompatible changes, but the issue affected me several times in unrelated places, CI scripts or whatnot, that by their nature end up running with different versions of base tooling and update them at various paces.)
I was thus always grateful for the great wisdom of the jq maintainers and their understanding of the importance of backwards compatibility.
I hope this announcement doesn't mean that this stability was just an accidental side product of stagnation and that once stagnation is "fixed" it will be done at the expense of stability.
I wonder how often someone will develop a script on Archlinux and later be surprised that it will not work in our Debian CI. One nice property about jq was that 1.6 was everywhere, remains to be seen how annoying this will be. Probably not that much.
Is there a way to get jq version inside the script?
Its error reporting is also clang-vs-gcc level wizardry, and I often use it to get a helpful message instead of "ENOWORKY" from jq (I haven't tried 1.7 yet, so it could be better for all I know)
I'd personally use that, but in the context of sharing scripts and snippets with colleagues, the strength of the incumbent `jq` is that we can all assume everybody will have it installed on their machine.
Not yet, but jq is getting into the kind of shape where we could add support for lots of formats. I'd like to have support for various binary JSON types (e.g., CBOR, but maybe too JSONB and others), YAML, and XML.
What are you using yq for? Unless you use YAML-only features (e.g. integers, arrays, and objects as object keys), it seems like it would be easier to just pipe-convert your YAML to JSON and process it with jq.
Depending on whether or not the resulting YAML is supposed to be human-readable[1], you can just produce JSON output, since YAML 1.2 is a superset of JSON. I did that at one of the places I worked at the time, and it worked very well.
[1]: I personally think that JSON is plenty readable, but a lot of people seem to disagree.
I use it to make small edits to yaml configuration files which are supposed to remain human-editable and keep their comments and ideally also whitespace. I'm sure some people do enjoy working with raw JSON, but I'm very much not such a person.
That's a very negative view. If Excel were to fix their 1900-is-a-leap-year bug, I'd call that a clear improvement, even though it would break some spreadsheets that work around the bug. Seen through that lens every major version of almost all programming languages would be a deterioration.
Yes, I tried not to be too aggressive towards yq's authors, who surely don't deserve bad words, but at the same time I wanted to express how painful even the smallest backwards-incompatible change is in a tool that may end up being used in many tiny dark corners of your automation that everybody forgets to maintain.
In addition to my previous comment about jq-like tools, I want to share a couple of other interesting tools that I use alongside jq: jo [0] and jc [1].
This is the first I'm hearing of gron, but I'm adding it here for completeness' sake. Meanwhile, JSON seems to be becoming a standard for CLI tools. The ideal scenario would be every CLI tool having a --json flag or something similar, so that jc wouldn't be needed anymore.
It's really awesome how the community pulled together and helped us recruit new maintainers to revive the project. Special thanks to, well, all involved, but especially @stedolan, @itchyny, and @owenthereal (all GitHub usernames).
This is a fantastic new feature. I would also love a version of 'pick' that works on streaming data, since that doesn't seem to be possible currently without reassembling the stream first.
I'll give a plug for jaq [0], a clone focused on correctness, speed, and simplicity. It only implements a subset of jq, but I've been enjoying it so far.
Seeing this news today, I decided to give jq another try and ended up discovering jq-mode [1] for emacs. It doesn't just support jq filter file editing, it supports jq in org-mode and something else called 'jq-interactively'. This interactive mode allows you to apply jq interactively on a JSON or YAML (with yq) buffer. The buffer contents become the filtered value when you finish editing the jq filter with a return. This is especially impressive to see in yaml files.
Personally, when I test REST APIs, I use "restclient.el" all the time, which also comes with great jq integration ("jq-set-var", for example, for deriving request variables from responses). For traversing larger responses I use "counsel-jq" in a customized JSON mode: https://github.com/200ok-ch/counsel-jq
jq is great. I don't know how many times I've had to explain to engineers that the "invalid numeric literal" error means their JSON is bad. No really, don't trust me, copy it into the IDE. It's not jq; your message is malformed.
Strangely, I also have ECMA-404 and RFC 8259 open in other tabs. Mostly annoyance, with occasional flashes of anger, over number formats and duplicate keys.
That's an error because you can't select an env var key '[env.AUTH_USER_HEADER]' in the middle of a chain like that, only immediately following a pipe:
> That's an error because you can't select an env var key '[env.AUTH_USER_HEADER]' in the middle of a chain like that, only immediately following a pipe:
You can use `env.AUTH_USER_HEADER` as a key the way you wanted. The issue is that you had to write `... | .match[0].header[env.AUTH_USER_HEADER] ...` -- no `.` between "header" and the index operator!
This complaint is a fairly frequent one, so in fact we did "fix" this in 1.7! You can now write `.a.[0]` and it works.
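For anyone else who hits this, the two spellings side by side (the first line reuses the names from the parent comment; behavior as I understand the 1.7 changes):

```jq
# jq 1.6 and 1.7 both parse this: brackets attach directly to the
# name, with no dot in between.
.match[0].header[env.AUTH_USER_HEADER]

# jq 1.7 additionally accepts a dot before the brackets:
.a.[0]
```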
A slightly tangential question. I use ripgrep-all along with fzf to interactively search through files from the CLI. Is it possible to integrate jq (or some equivalent) into this search ecosystem to search through JSON files?
I wish there were a faster version of jq; I was only able to get a few MB/s of throughput out of jq vs. a few hundred MB/s out of ripgrep.
I often use ripgrep to set up quick bash pipelines for rapid data analysis, and would love to be able to use jq for that purpose. These days I am setting up scripts with simdjson, but the cost of writing a script vs. quickly setting up jq or ripgrep in a bash pipeline is orders of magnitude different.
This is a common refrain. So much so that when we changed the release artifact names in 1.7, we broke scripts and Docker recipes that would download jq from the GitHub releases 'latest' URIs! (This is fixed now.)
ChatGPT has massively improved my ability to work with jq. I use it just infrequently enough that I constantly have to read the docs for anything non-trivial, but ChatGPT lets me pump out scripts quickly. It's been super nice for my workflow.
jq is amazing, and I'd like it even more if it supported CBOR too. The forks are just too finicky and are quickly abandoned, and the format is basically binary JSON.
jq is a tool so powerful and useful that it really should be made part of the POSIX standard. It should come preinstalled with everything. It's an amazing tool from an earlier, more refined age.
I love JQ so much we implemented a subset of JQ in Clojure so that our users could use it to munge/filter data in our product (JVM and browser based Kafka tooling). One of the most fun coding pieces I've done, though I am a bit odd and I love writing grammars (big shoutout to Instaparse![1]).
I learned through my implementation that JQ is a LISP-2[2], which surprised me, as it wasn't obvious from the grammar.
[1] https://github.com/Engelberg/instaparse
[2] https://github.com/jqlang/jq/wiki/jq-Language-Description#:~....
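The LISP-2 bit shows up directly in the language: function names and variable names live in separate namespaces, so the same name can denote both at once. A tiny illustration (my own example, not from the wiki page):

```jq
# `f` resolves in the function namespace, `$f` in the variable one.
def f: 10;
20 as $f
| [f, $f]
# jq -n 'def f: 10; 20 as $f | [f, $f]'  outputs [10,20]
```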