I'm a huge fan of awk but the "Python vs awk" page this links to [1] shows python code that's almost deliberately atrocious.
Take this function the author wrote for converting a list of integers (or strings) into floats:
    def ints2float(integerlist):
        for n in range(0,len(integerlist)):
            integerlist[n]=float(integerlist[n])
        return integerlist
Using `range(0,len(integerlist))` immediately betrays that the author doesn't understand python. The first arg to `range` is entirely redundant. Mutating the input list like this is also just bad design. Anyone who has used python for longer than a month would write this as just `[float(i) for i in integerlist]`.
Further down, in the function `format_captured`, you see another attempt at obfuscation, this time with a `filter`. Why bother with a `filter`? Who hurt you? That said, the author's implementation in awk does look pretty clean. I'm just peeved that they straw-manned the other language.

[1] https://pmitev.github.io/to-awk-or-not/Python_vs_awk/
> python code that's almost deliberately atrocious
That code was so bad I felt I had to step in too. I used ChatGPT to simplify it a bit, but it also introduced some errors, so I found what appears to be an input file to test it on [1]. The only difference from the awk program is that it uses spaces, while the original python program used tabs.
    #!/usr/bin/env python3
    import sys
    freq, fc, ir = [], [], []
    with open(sys.argv[1]) as f:
        for line in f.readlines():
            words = line.split()
            if "Frequencies" in line:
                freq.extend(words[2:])
            elif "Frc consts" in line:
                fc.extend(words[3:])
            elif "IR Inten" in line:
                ir.extend(words[3:])
    for i in range(len(freq)):
        print(f"{freq[i]}\t{fc[i]}\t{ir[i]}")
Agreed - this is pretty much the perfect use case for list comprehensions, which are one of the best features of Python. Normally "oh but there's a better way to do it in that language" isn't a particularly interesting observation, but here it completely turns the author's point on its head. I can't think of many more elegant ways to convert a list of ints to floats, in any language, than `[float(i) for i in integerlist]`.
I prefer the Python syntax generally, but throw in some typing (with TypeScript inferring the post-map type from the map function's return type) and I'd definitely go with `integerlist.map(float)`.
I wanted to point out that the Python code was written to be 2.7 compatible, and maybe the atrociousness was due to that, but then I looked up when list comprehensions were introduced - 2.0, with PEP202.
Looks like he copy-pasted the python version from another forum post and didn't look at it carefully. I'd suspect it can be made to look a lot cleaner (edit: yes, e.g. by just translating each of the main lines in the awk script to an if statement). I agree with the strawman comment.
I don’t think you’re right to put Awk into the one-liner category. It actually scales up remarkably well to a couple hundred lines or so, as long as the problem does not strain its anaemic data-structure capabilities.
Compared to straight Python (i.e. not Numpy/pandas), it can also be surprisingly fast[1]. I experienced this personally on a trivial problem: take a TSV of (I,J) pairs, histogram I/J into a given number of bins. I can’t remember the exact figures now, but it went like this: on a five-gig file, pure Python is annoyingly slow, GNU awk is okayish but I still have to wait, mawk is fast enough that I don’t wait for it, C is of course still at least an order of magnitude faster but at that point it doesn’t matter.
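(For flavour, here is a rough sketch of that kind of binning in awk. It is not my original script: it assumes tab-separated i/j pairs, j never zero, and the bin count and upper bound passed in as variables.)

    mawk -F'\t' -v nbins=50 -v hi=10 '
        {
            # bin index for the ratio i/j, clamped into [0, nbins)
            b = int(($1 / $2) * nbins / hi)
            if (b < 0) b = 0
            if (b >= nbins) b = nbins - 1
            count[b]++
        }
        END {
            # print the lower edge of each bin and its count
            for (i = 0; i < nbins; i++)
                printf "%g\t%d\n", i * hi / nbins, count[i]
        }
    ' pairs.tsv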
Oh, I agree with you and the article that Awk is a wonderful scripting language. It is just that personally I use it mostly for one-liners (even wrote a book) and prefer Python for longer scripts.
If I have a text file of CSV data that I need to do something with, I'll usually start with a shell script and coreutils + sed and awk.
If I need a script to generate some output (a regular use case that seems to come up for me is random numbers with some property), I tend to use python.
I also use python if I need to do more complicated aggregations or something on tabular data that pandas is better at. Though it's fun to try with `join` and awk sometimes (parsing csv can get tricky).
If I need to plot something I tend to use jupyter notebook but it's way more satisfying to use gnuplot, which I mention because it fits naturally into workflows that use shell tools like awk.
Honestly for what you describe R might be quite comfortable. pandas was built to try and get R sensibilities into python, so you might as well drop the dependency and use the real deal all in one place.
I'm glad you mention that because I forgot something: I try and stick with only stuff I can find on most computers in order to minimize dependencies. That's a big advantage imo of learning how to use awk and tr for basic data analysis, for example. You can ssh into any server and work without installing anything.
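(For example, and with a made-up file name, a quick word-frequency count needs nothing beyond tools that are already on the box:)

    tr -s '[:space:]' '\n' < access.log |
        awk '{ count[$0]++ } END { for (w in count) print count[w], w }' |
        sort -rn | head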
I don't like this approach. One is being peddled a bucket of old shit that might have been okay in the '80s for portability reasons. This portability is usually not necessary for my one-time tasks anyway. Today there are really cool and small tools to do everything with comfort, and if you want some kind of portability you can still put statically linked tools on the given system quickly. But of course that is a personal opinion and depends on the situation.
Really depends on your use case though. If you're always going to be working on your own machine, of course customize it to make yourself maximally productive. But if you're regularly going into new situations, it's good to know how to be effective with tools you can count on being there.
My (admittedly very lame) analogy: you're hiring a ninja assassin and have two candidates. One tells you about all the special swords and staffs and smoke bombs he carries, the other says I just need my hands. Who do you hire?
I have a simple thought about this. If your work involves many machines, you usually have enough permissions to transfer your programs. I think that the benefits of a pleasant way of working outweigh the effort.
And by the way, I would prefer the first ninja if the second one is handless.
I'm surprised with some of the responses you've received on this.
But I absolutely know what you are talking about. I often SSH into new machines, that I may not have root access on, that may not even have internet access, or be a distro with a package manager (e.g. some switch running a custom distribution).
In those situations it is a huge advantage to know the tool, rather than try to do some gymnastics to get your tools onto the box or the data off the box.
These "old" and "broken" tools are installed as defaults on all these systems for a reason.
I agree 99%, but sometimes a tool is such a huge productivity boost that it’s worth installing. Fzf is one of those tools for me. If I had to deal with CSV files a lot, then xsv would be worthwhile.
Here is a script that I use, mainly composed with the Toybox version of awk, that will extract all of the WiFi passwords stored on an Android device.
This is enormously portable, and does not require any new software installations for Android users who have root.
Requiring xsv would reduce availability to a fraction of where it can be run now.
    #!/bin/sh

    find /data \
        -name WifiConfigStore.xml \
        -print0 |
    xargs -0 awk '
        /"SSID/ { s = 1 }
        /PreShared/ { p = 1 }
        s || p {
            gsub(/[<][^>]+[>]/, "")
            sub(/^[&]quot;/, "")
            sub(/[&]quot;$/, "")
            gsub(/[&]quot;/, "\"")
            gsub(/[&]amp;/, "\\&")
            gsub(/[&]lt;/, "<")
            gsub(/[&]gt;/, ">")
        }
        s { s = 0; printf "%-32.32s ", $0 }
        p { p = 0; print }
    ' | sort -f
Note that the -print0 null processing is not POSIX. This is a reasonable compromise on standards compliance, as it does not reduce the base of available users.
I did try to do this first with arrays, but awk segfaulted.
That is quite a nifty implementation of reverse HTML escaping. But in python it could be done with much less work:

    import html
    print(html.unescape(foo))

And the best part: you don't need to debug/update the (g)sub list every time you stumble upon a new weird &whatever; entity. And there are a lot of those out there.
XSV is a tool for exploring and manipulating X-separated value files (CSV, TSV, …), which is why I mentioned it in reply to a comment which talks about the exploration of CSV files, and furthermore specifically mentions that parsing CSV can get tricky.
Awk, sed, bash, and perl are extremely underrated, and they nearly always beat python for elegance and succinctness on the repetitive problems that make up the daily chores of sysadmin work.
Then you know neither awk nor sed. Most of awk can be redone in Perl with ease, and for the regex matching and replacing of sed, PCRE and Perl were practically made for that.
I mean, sometimes sed is better for small edits, and awk for some tabular files, using xml2tsv or lots of TSV-related tools.
But for medium-sized projects, Perl is the obvious tool when you need something similar to awk/sed but with more complex data parsing.
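(To illustrate the overlap, with made-up file names: the basic substitution is near-identical in both, and Perl only really pulls ahead once you want PCRE features like non-greedy matches.)

    sed 's/foo/bar/g' notes.txt
    perl -pe 's/foo/bar/g' notes.txt

    # non-greedy matching is where basic sed regexes run out of road
    perl -pe 's/<a href="(.*?)">/$1/g' page.html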
> sometimes sed is better for small edits, and awk for some tabular files, using xml2tsv or lots of TSV-related tools.
That, and shell, sed and awk are standard tools which are easy to learn, whose basics every unix user should know (even a java developer), and this won't change anytime soon.
However, this can't be said of Perl - it is a powerful tool, but it became culturally obsolete and deprecated, and most unix users in the present and the future won't be bothered to learn it, when learning more modern languages like Python is a better investment.
A long time ago, I built a relatively complex program that managed some other systems in awk. It was really a great fit for the problem and I was, at the time, working in an environment with poor developer tooling. The target systems were heterogeneous and I could not depend on Perl even being available. But awk was guaranteed to be there.
The problem was that every time someone else was asked to add any feature to it, they freaked out at the language choice and I had to get on a plane.
If you find yourself piping any combination of cut, grep, sed, uniq (and likely others I'm missing) together, you can probably do it all in awk. If you can guarantee usage of gawk, you can add sort to that list (tbf you can also implement any algorithm you want in awk, but arguably at that point you're wasting time) - and it's also worth noting that you can dedupe in awk _without_ having sorted input, albeit at the cost of storing all unique lines in memory.
Pipes are great because they enable you to trivially send data between programs, and they're terrible for the same reason. While the execution time on modern computers for the average data size isn't noticeable, on larger datasets or repeated execution, it absolutely is. If you don't have to pipe, don't.
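(For instance, with a hypothetical log file and field: a filter-project-dedupe chain collapses into a single process, using the unsorted-dedupe idiom mentioned above.)

    # three processes and two pipes
    grep 'ERROR' app.log | awk '{ print $2 }' | sort -u

    # one awk: same set of values, in first-seen order rather than sorted,
    # with every unique value held in memory
    awk '/ERROR/ && !seen[$2]++ { print $2 }' app.log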
Awk is actually amazing as long as you operate within its limitations and stick to the core problems it solves well: it is definitely not limited to one-liners, but it does have other limits.
It is just so damn good at the things it is good at, but nobody learns to use it, because learning Awk is inefficient in the grand scheme of things. There are better things you can learn.
There is one place where Awk has undeniable superiority, and that is its use in environments where bureaucratic rules prohibit the distribution of programs / code (Perl, Python), but where Awk is permitted.
My case for learning AWK is if you spend any significant chunk of time in the terminal (sysadmin, DevOps, data eng). Notwithstanding any other scripting languages you know.
The CLI world is full of bespoke text interfaces. AWK is the tool for dealing with those programmatically.
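A typical example (column positions assumed from GNU coreutils df; other platforms lay the columns out differently): flag filesystems that are more than 80% full.

    df -h | awk 'NR > 1 && $5 + 0 > 80 { print $6 "\t" $5 " used" }'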
In 1996 I was paid to teach Unix to a group of customers. As graduation work for my course, they had to write an awk program for budgeting without the use of any databases. I'm sure they had a lot of fun and cursed me for this sadistic approach to teaching at the time. However, decades later some of them were still thankful they had learned regex and awk back then.
PDF / CSV / Excel exports from my three web banks, a bit of pdftotext or soffice conversion, just to pipe into awk to augment the data and render a properly formatted spreadsheet.
I found the awk syntax to be surprisingly discoverable, once I got the rough structure of scripts.
I think the confusing factor with awk is that it allows you to leave out various levels of structure in the really simple scripts, meaning that equivalent scripts you see around can look quite different.
E.g. all the following would be the same (looking for the string "something" in column 1, and printing only those lines):
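(Illustrative one-liners to that effect; "file" stands in for the input:)

    awk '$1 ~ /something/ { print $0 }' file    # everything spelled out
    awk '$1 ~ /something/ { print }' file       # print defaults to printing $0
    awk '$1 ~ /something/' file                 # a pattern with no action defaults to { print }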
Awk syntax is basically what became core Javascript, according to its creator [1]. Bourne shell syntax is very different, so I take your comment as a frustrated reaction to the "Python obsolete" comment, which must be seen in the context of Python introducing itself as an awk replacement, among other things (though not nearly as aggressively as Perl, which used to have the a2p script to bogusly convert awk to Perl code).
I can agree on bash syntax being crazy, but certainly not on awk. Awk is very simple, a man page is all you need if you need to find something. Otherwise what's so complex with awk?
If you don't like learning from a man page (understandable), I can recommend a quick read of the Awk book by A, K, and W. It's very short and has fun examples.
For larger programs, I wrote and use ngetopt.awk: https://github.com/joepvd/ngetopt.awk. This is a loadable library for gawk that lets you add option parsing to your programs.
This can be a very powerful idiom (basically, code generation at the shell prompt).
It’s well suited to iterative composition of the commands: I’ll write the query/find part, and (with ctrl P) add the awk manipulations, and then pipe to sh.
If it doesn’t have side effects you can pass through “head” before “sh” to check syntax on a subset.
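Something like this, say (a made-up find/awk pipeline; it assumes filenames without spaces):

    # run only the first few generated commands as a sanity check
    find . -name '*.log' -type f | awk '{ print "wc -l " $0 }' | head -n 3 | sh

    # happy with the output? drop the head
    find . -name '*.log' -type f | awk '{ print "wc -l " $0 }' | sh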
It is a guilty pleasure but I like writing awk scripts that write shell scripts that get piped into sh, for example
ls | awk '{if (length($1)==7) {print "cat " $1 }}' | sh
It is something you really aren't supposed to do, because bad inputs could be executed by the shell. Personally, the control structures for bash never stick in my mind because they are so unlike conventional programming languages (and I only write shell scripts sporadically), so I have to look them up in the info pages each time. I could do something like the above with xargs, but same thing: I find it painful to look at the man page for xargs.
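(For reference, the xargs version of the example above is roughly this, assuming simple filenames without spaces:)

    ls | awk 'length($1) == 7' | xargs cat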
When I show this trick to younger people it seems most of them aren't familiar with awk at all.
For me the shell is mostly displaced by "single file Python" where I stick to the standard library and don't pip anything. For simple scripting it can be a little more code than bash, but there is no cliff where things get more difficult, and I code Python almost every day, so I know where to find everything in the manual that isn't at my fingertips.
That’s not so terrible if you at least verify the output before the final “| sh”.
Though you’d have to be confident that running it twice is going to give the same results. If it’s remote data that could change then weird/bad/nasty things could happen.
For anything non-trivial, it's best to separate those steps and generate a temp script to execute.
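E.g., with hypothetical paths: write the generated commands to a file, read them over, then run the file.

    ls | awk 'length($1) == 7 { print "cat " $1 }' > /tmp/generated.sh
    less /tmp/generated.sh    # eyeball the generated commands first
    sh /tmp/generated.sh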
Once I wrote a 300-line Awk script to install a kernel driver. It would scan a hardware bus, and ask the user questions before loading the driver onto the system. Lots of fun!
The existence of things like "git implemented in awk"* serves as a great reminder that there are a lot of developers out there who are far, far, far, far more talented than I will ever be. I salute them.
*Especially keeping in mind that these people wrote things like this for fun.
I asked ChatGPT to explain that and "prettify" it, for anyone interested (everything below from ChatGPT):
This is an AWK script that serves HTTP requests by listening on port 8080.
It defines several functions:
1. `send` function takes in status code, status message, content, content type, and content length and sends an HTTP response with the provided information.
2. `cf` function checks if a path contains `..` and returns 0 if it does, otherwise returns 1.
3. `mt` function determines the MIME type of a file using the `file` command.
The script sets the record separator RS and output record separator ORS to \r\n which is the line ending used in HTTP.
It enters an infinite loop listening for incoming connections on port 8080. When a connection is established, it reads the HTTP request line by line using `getline` and processes the `GET` request. If no path is provided, the script serves `index.html`. It checks if the requested path is safe using the `cf` function, and if it is a file, reads the file and sends an HTTP response using the `send` function. If the path is not valid or the file is not found, it sends a "404 Not Found" response.
    gawk '@load "filefuncs"
    @load "readfile"

    function send(s, e, d, t, b) {
        print "HTTP/1.0 " s " " e |& S
        print "Content-Length: " b |& S
        print "Content-Type: " t |& S
        print d |& S
        close(S)
    }

    function cf(x) {
        split(x, y, "/")
        for (z in y) {
            print "FOUND " y[z]
            if (y[z] == "..") {
                return 0
            }
        }
        return 1
    }

    function mt(f) {
        c = "file -b --mime-type " f
        r = ""
        while ((c | getline z) > 0) {
            r = r z
        }
        close(c)
        return r
    }

    BEGIN {
        # Change to the specified directory
        if (ARGV[1] != "") {
            if (chdir(ARGV[1])) {
                print "Failed to chdir to " ARGV[1]
                exit
            }
            ARGC = 1
        }

        # Set the record separator and output record separator
        RS = ORS = "\r\n"

        # Listen for incoming connections
        while (1) {
            S = "/inet/tcp/8080/0/0"
            while ((S |& getline l) > 0) {
                split(l, f, " ")
                if (f[1] == "GET") {
                    p = substr(f[2], 2)
                }
                if (p == "") {
                    p = "index.html"
                }
                stat(p, s)
                if (cf(p) && s["type"] == "file") {
                    m = mt(p)
                    o = readfile(p)
                    send(200, "OK", o, m, s["size"])
                    break
                }
                n = "<html>Not Found</html>"
                send(404, "Not Found", n, "text/html" RS, length(n))
                break
            }
        }
    }'
I added some comments to explain what each section of the code does. Let me know if you have any questions!
Looks like ChatGPT got it right with the description. Perhaps obvious, but the ".." check is to stop path traversal. Also, if you pass an additional arg, it takes that as a directory for the script to serve over http, defaults to CWD. The rest seems like enough explanation.
You can also pretty-print gawk source with -o[filename], using - as the filename to print to stdout; so you can take the one-liner version, lead with "gawk -o- ", and it will print the pretty version.
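E.g. (a throwaway one-liner just for illustration):

    # prints a formatted, multi-line version of the program to stdout
    gawk -o- 'BEGIN { x = 1; if (x) print "yes"; else print "no" }'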
Edit: Also, the performance is terrible. Gawk's listen sockets suck because you get no granular control over listen/accept. The socat based one I replied to is probably much better.