This is good, but the data structures - e.g. arrays of arrays - don't really match the underlying data. Imagine what 'ip addr' or 'ifconfig' would look like - they output paragraphs rather than lines, so scraping lines wouldn't produce good output.
It'd be better - and FAR more work - to make an object-pipeline-based PowerShell equivalent for *nix, based on JSON. You could write cmdlets in any language that outputs JSON and do better at this project's goal of avoiding scraping.
(not the syntax but the idea). You want to allow interested individuals to write the wrappers "forcefully" against uncooperative maintainers (separate people, separate ideas, separate motivations, separate release cycle, etc).
Eventually, some enlightened shell-command owners might add "ls -al --json", "ps -ef --json", or "git status --json", which removes the need for a wrapper script. But the wrapper script allows innovation peripheral to the core without affecting the core, until finally, due to overwhelming support, the core is extended with something "good" and agreed upon via consensus usage.
For the short term, perhaps if "ps" and "ps.wrap" both exist in your path, your awkward shell can inject the "*.wrap" around the given command in a way that doesn't require a lot of typing, either automatically or via shortened syntax:
(ps -ef).map(...)
!(ps -ef).map(...)
{{ ps -ef }}.map(...)
I don't know, but eventually you'd want to get to a point where you're bridging between the interactive use case (ps -ef printing a nice text table) and the pipeline use case (ps -ef --json letting you easily access different fields). Maybe {{ ps -ef }} either calls *.wrap if it exists, or, if it doesn't exist, calls the command with --json (this assumes the command supports --json directly).
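The dispatch itself could be tiny. A sketch, assuming the *.wrap and --json conventions above (none of this exists today):
    # prefer a community wrapper if one is on the PATH, else assume native --json
    run_structured() {
      cmd="$1"; shift
      if command -v "$cmd.wrap" >/dev/null 2>&1; then
        "$cmd.wrap" "$@"
      else
        "$cmd" --json "$@"
      fi
    }
    run_structured ps -ef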
Regardless, this idea has some legs and I look forward to seeing how much traction it gets.
I'll grant that the syntax of these shell programs takes some work to understand, but they're hard to beat for the purpose they serve.
If you're parsing plenty of JSON these days I find 'jq' indispensable. It's basically awk/sed for JSON.
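For instance, a contrived sketch (the JSON here is made up):
    echo '[{"user": "root", "pid": 1}, {"user": "astro", "pid": 942}]' \
      | jq -r '.[] | select(.user == "astro") | .pid'
    # prints: 942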
I would just be sure to study those programs, as a lot of effort went into making them fast. And there are plenty of alternatives with extended syntax and features.
I'm sorry but that seems to be some of the least intuitive code I have ever seen. What are the practical applications of knowing AWK over knowing JS? (Besides the shell voodoo). I'm not trying to start a language war, I just want to understand why learning a completely new set of grammars with a very limited domain would be worth the effort.
Awk is actually very intuitive once you learn the basics (should take about an hour). Awk is essentially a line-oriented pattern matching language, that is, a collection of patterns -> actions that are applied to every line of input. Syntax is very similar to C, but feels much more lightweight.
To decompose the example above:
NR > 1 {
    print $1, $4
}
There are two parts to this, the pattern (the part outside the brackets), and the action (the part inside the brackets). Every line parsed is split into fields and stored in $1..$NR, where NR is the number of fields. The entire line is also available in the $0 variable. The default separator is the space character, though that can be changed.
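For example, to split /etc/passwd on colons instead:
    awk -F: '{ print $1 }' /etc/passwd   # first field of every line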
So, knowing the above semantics, the meaning of the above example should be clear: If the number of fields for this line is larger than 1, print the first and fourth field of the line. It's a very powerful paradigm, and you can do crazy stuff with Awk. Examples are an x86 assembler [0] and a SASS-style CSS preprocessor [1] (plugging myself there a bit).
Unix coreutils are very powerful once you're familiar with them.
NR stores the number of records read so far (i.e. the current record number), not the number of fields (that's NF).
In the example the "NR > 1" is meant to exclude the header line from the output.
You are entirely correct -- I got tunnel vision/brain fart and mixed NF and NR up, newbie mistake!
NR is initialized to 1 on the first line, and is incremented for every subsequent line read. So, in this case, the above pattern will match any line after the first.
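A quick way to see the two side by side:
    printf 'a b\nc d e\n' | awk '{ print NR, NF }'
    # 1 2
    # 2 3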
> What are the practical applications of knowing AWK over knowing JS?
That it is present on pretty much every Unix-like system, and most other platforms with a "real" OS, including platforms where finding a modern Javascript implementation is impossible.
It's also a very small language, and its default of parsing records from standard input, splitting them on a field separator, and pattern matching on them makes it very well suited for compact filters without extra ceremony. (Some other interpreters have flags to give an awk-like experience - e.g. Ruby (MRI) - but none are remotely as likely to be installed everywhere unless you go for more cryptic tools like sed.)
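For comparison, a Ruby version of the example upthread (illustrative; -n loops over input, -a autosplits each line into $F, and $. is the line number):
    ps | ruby -nae 'puts "#{$F[0]} #{$F[3]}" if $. > 1'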
I think of it as a poorly designed DSL. Still, a poorly designed DSL, once memorized, is way more practical on the command line than typing out Javascript function calls, even with the fat-arrow syntax. Maybe it's a declarative vs. imperative kind of thing. I hate having to reference manpages or use trial and error every time I need to sed/awk, but I don't know if wrapping commands in JS functions and their results in Arrays is the best solution.
> that seems to be some of the least intuitive code I have ever seen
There's nothing intuitive about programming. A pencil is intuitive. Learning how to program gives us advantages but it requires effort.
The real problem here is that you don't want to learn. You'll be repeating the same mistakes.
> What are the practical applications of knowing AWK over knowing JS?
It's fast, has a small memory footprint, and plays well with unix pipes.
> Besides the shell voodoo
It's not voodoo. It's a useful programming environment for getting work done.
> I just want to understand why learning a completely new set of grammars with a very limited domain would be worth the effort.
It's a specialized language for extracting data from delimited byte-streams. It's not a big language and not a lot of effort is required to benefit from it.
Where the domain is well-understood it's great to have a domain-specific language with short-hand syntax that abstracts away the unimportant details. I don't have to define functions or call any methods. I just get some variables because I know awk just takes a pattern and splits it on a separator... it saves me a bunch of work. Once I understand the language I can manipulate field-delimited byte-streams with little ceremony.
The bonus is that awk plays well with the *nix environment. Anonymous and named pipes, etc. are really useful.
Normally I have a severe allergic reaction to anyone trying to do anything serious with Javascript, but this looks like a legitimately good idea and a decent innovation on how shells work.
I'm beginning to think that it isn't Javascript that I hate, it's the way people use it (i.e. to build bloated, slow, terrible webapps). But this is actually pretty cool. JS is shockingly decent at text processing so repurposing it as a shell language kinda makes sense.
It's a bit verbose for a shell language (all the async stuff tends to just get in the way in my experience using it like that), but it's great in the sense that everything can be treated as "text" and the functional bits work well when parsing over lines.
Sorry I'm late to this, but I'm a big AWK fan; I use it daily and have done so for years. For sysadmin things it's great for doing data manipulation in shell scripts.
I get that Awkward is a shell, and I can see the use. But it seems very clunky to use.
This AWK script was posted in this thread:
ps | awk 'NR>1{ print $1, $4}'
It takes the output of the PS command, tosses the header away and then prints the value of the first and fourth fields out. This data can be piped to other programs for additional processing.
The NR>1 idiom is pretty well known after you look at a few example scripts that use it. To replicate the script that the Awkward video uses, it's just
ps | awk '{ print $1, $4}'
It's shorter than the Awkward example. I think it's simple: for every line you get from ps, print the first and fourth fields. Not sure what Awkward would need to do to skip the header.
@erikrothoff makes a point about what the value is in knowing AWK over JS. I'd like to think that it gives you a powerful, portable tool that can be used either as a standalone application or as part of a shell script. If you are a Windows user, then it's not going to fit as well, but if you are a Unix/Linux/OS X user, it's a worthy tool to add to your arsenal.
The syntax may be "some of the least intuitive code" but investing about 20 minutes can fix that. There are lots of tutorials that can give you a quick leg up. There are also a few "one line AWK scripts" collections that are good samples.
One of the most powerful things about AWK is the ability to use regular expressions to see if you want to process the record. The ability to go through a file, pluck out the record(s) and process them is very nice.
ps | awk '/astro/ {print $1, $4}'
will give me any line that has astro in it. Not sure how that would work in Awkward. The use of regular expressions like this is what makes AWK a go-to language for text processing.
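If matching the whole line is too broad, the regex can also be anchored to a single field - e.g. with ps aux, where the first field is the user:
    ps aux | awk '$1 ~ /astro/ { print $1, $2 }'   # user and PID, matching only the user field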
Not here to add to the AWK vs Javascript debate, but wanted to add AWK is a powerful tool, having it in your toolbox is a good thing.
This is very cool. It does remind me of PowerShell in its aims & inspiration. But I agree with nailer that using JSON (or Javascript data structures) would hopefully end up with a better, more cohesive shell in Awkward. I no longer use PowerShell (I got sick of the Microsoft ecosystem).
I'm not sure what format PowerShell uses to pipe structured data around, but it seemed to be some kind of ad-hoc 'whatever-they-came-up-with-at-the-time' mishmash, instead of using, say, JSON and leveraging the power of arrays, dictionaries et al. In fairness, JSON was just getting started when I was using PowerShell, so it wouldn't have been on their radar.
One question: how are you turning output from ls into arrays? Or is ls just a Javascript function that behaves like ls?
Oh, also, what strategies do you envision for wrapping arbitrary commands, or creating an ecosystem of wrappers? Perhaps some kind of plugin system?
I am currently also getting excited by Hyperterm (which is getting _a lot_ of traction), and wonder if there is some kind of fusion there with your project.
The point being that the output of each of these functions is available (technically, it may require refactoring to get at the output pre-stringified) without having to parse formatted text.
For the same reason people don't write every program in binary using a magnetized needle straight to their hard disk. People like abstractions, and this guy happened to like CoffeeScript as his abstraction du jour.
The fact that it uses whitespace instead of braces is a big plus for people used to that, and generally you write less CoffeeScript compared to TS or ES. Plus it has things like switch expressions, list comprehensions, chained comparisons, the safe navigation operator, and a ton of other little features that don't show up in TypeScript or ES.
And it's very simple (no configuration needed) and compiles to very readable Javascript, which means converting to plain JS is very easy (hell, there's even a purpose-built tool called decaffeinate which will translate it directly to ES2015).
You might not like the language, I don't really like it all that much anymore either, but it's not irrelevant, and for a small personal project like this it seems like a great choice which is easy to switch away from if needed.
I wonder how long it will take before something like this gets implemented as a `Hyperterm` [0] plugin.
I think it could be quite useful to combine `npm` packages with shell commands and `map`, `filter`, etc, in your terminal.
The only thing I am unsure about is the underlying representation of the shell output. Arrays of arrays are not very semantic, so the code that parses them would not be very pleasant.
Though if this were the case, you would want some way of applying the right transform depending on the arguments passed to the command.
I guess the bad thing is that there is currently presumably no way of using other shell features like pipes.
Also, if you're going to start outputting differently shaped data, then I think you should consider hooking this all up to `flowtype` [1] and creating type signatures to provide DX [2].
FWIW, I've been using Groovy for this purpose to encapsulate processes in *Nix and Windows. Not perfect, not lightweight, but happens to get the job done.
Apache Groovy's perfect for these kinds of tasks, where it doesn't need to be lightweight. Just don't use it for building actual applications -- build them in a decent typed language like Java, Scala, or Kotlin, then use Groovy as a multi-OS version of Bash to automate builds, tests, and runs.
Neat hack! I like the idea. Initially, when I read the title, I thought it would be yet another 'make my shell behave more like my preferred language' type of project. Those, while fun to code and play with, aren't really useful in the long run. This, though, might just be useful.
I'm imagining a world where coreutils and built-ins all respect an environment variable, say STDOUT_JSON=1, and would output JSON. That would be a great start. Especially if the JSON-Schema could be output with a new built-in, manj.
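In the meantime, one could fake it per command with wrapper scripts. A rough sketch (the JSON shape is invented, and the awk part is naive about spaces in filenames):
    #!/bin/sh
    # ls.wrap: emit one JSON object per entry when STDOUT_JSON=1
    if [ "$STDOUT_JSON" = "1" ]; then
      ls -l "$@" | awk 'NR > 1 { printf "{\"size\": %s, \"name\": \"%s\"}\n", $5, $NF }'
    else
      ls "$@"
    fi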
Not a good idea. Imagine that you invoke some program A that in turn invokes a coreutils program B. Imagine that A was written before today (in other words, it is not aware of the STDOUT_JSON technique). Since A and B inherit the STDOUT_JSON environment variable, A will receive JSON from B. But it is not expecting this, so it will probably crash.
Thanks for that answer; it led me to learn and understand more about standard streams and file descriptors than I ever cared to before! That's a good thing. :)
So then, what about this approach: since the de facto standard streams are STDIN, STDOUT and STDERR, using file descriptors 0, 1 and 2 respectively, are there three more reserved numbers (I couldn't find any in my google-glance) that could be used for, say, JSONIN, JSONOUT and JSONERR?
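As far as I can tell nothing above 2 is reserved, but a program could adopt, say, fd 3 by convention, as long as the caller opens it. A sketch:
    # human-readable table on stdout (fd 1), JSON on fd 3 if the caller opened it
    mycmd() {
      echo "PID CMD"
      echo "1   init"
      echo '[{"pid": 1, "cmd": "init"}]' >&3
    }
    mycmd 3>out.json   # the terminal gets the table; out.json gets the JSON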
I think the heart of the matter is: unless it becomes part of the POSIX standard, is there any hope of agreeing on an informal "UNIX philosophy" sort of approach that would allow coreutils and shell built-ins to accept and pass structured responses in some kind of object form?
Making everything in a terminal an object is very interesting! I'm curious about applicability, though. What can you do with this that you wouldn't normally do in a traditional term?
Just a note, _iostreamer_: you can use `command.parse` in Vorpal to prepend parens to every command before it's evaluated (your project is written with Vorpal).
The following comment could apply to hundreds of similar submissions to HN in recent years.
Many people -- many of them young people I suspect -- have invested a lot of time learning Javascript. It has become a very popular language.
But it remains to be seen whether Javascript will be as long-lasting as the UNIX shell and standard utilities. Historically speaking, computer languages have been known to fall into and out of popular usage.
For people who invested a lot of time learning the shell and ubiquitous utilities such as AWK, it appears the investment has paid off. I'm not too worried about the terminal disappearing any time soon.
How long until the next submission that aims to abstract away the need to learn how to use a UNIX terminal -- directly?
There are probably hundreds more on GitHub alone. What if we consolidated them all in one place: 1001 attempts to abstract away the need to learn UNIX.
By no means am I suggesting these attempts have not been successful.
What I'm suggesting is that the need for them will not abate. It could be that UNIX terminals, and programs like AWK, are not just a fad.
Perhaps there are features from other programming languages that should exist in shell programming?
For example, I've always felt that the fact that almost everything in shell programming is a gigantic string is really bad. It would be much nicer if all commands had type signatures, so that piping commands into each other could throw type errors when the combination doesn't make sense.
This would also allow much better feedback when stringing together commands, since the shell IDE would be able to recommend commands that act upon a particular type. For example, if a command produces output that could be considered to be N > 1 lines long, then `wc -l` would be available; if this isn't the case, it does not make much sense for it to be recommended to the user.
The interface is not necessarily the content of the message conveyed by the interface. Lines of text std(in|out) is a well-defined interface. Some programs may go further and specify the format of messages passed on that interface. https://github.com/synacor/dtk is one such system that specifies (for most of its commands) tab-delimited records as the message format. It interfaces well with other elements of the UNIX ecosystem, however, because its interface is simply lines of text on std(in|out).
Well, I think trying to abstract something away does not mean it's a fad, but totally the opposite: it means it's something people think they'll have to deal with a lot, but for whatever reason find too much of a burden to deal with directly.
It's like the ORMs. None of them mean SQL is a fad, it's just that some people seem to dislike SQL in a programming context.
They're going to continue to pop up, just like JavaScript frameworks will. Because they're not popping up due to any necessity, but because someone (1) learned JavaScript, (2) learned some shell commands, (3) had the epiphany that they could abstract shell commands away using JavaScript, and (4) isn't aware of the previous thousand attempts by others to do the same thing.
AWK isn't perfect (nothing is), so there are still improvements that can be made to it. One improvement to AWK (or something awk-like) would be to give it a more familiar syntax for those who don't already know awk.
I'll admit, the only thing I use awk for currently is to pull out columns of the output to pass on to the next tool. I know it can do more, but I don't have time to really learn and understand the language.
While I don't like many things about this project, I really like some of the ideas that it's suggesting. I'd love an "awk but with Javascript" where you can use a subset of Javascript to work like awk: pipe in data and have it output "regular" lines that can be passed to the next tool on the command line.
Letting me use familiar Javascript functions and syntax to map/reduce/filter over lines of input and output plaintext would be amazing. This doesn't quite do that in the way I want, but it's a cool thing nonetheless.
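You can get part of the way there today with node -e, though it's a mouthful. A sketch (plain Node, not anything Awkward provides):
    ps | node -e '
      const lines = require("fs").readFileSync(0, "utf8").trim().split("\n");
      lines.slice(1)                          // drop the header line
           .map(l => l.trim().split(/\s+/))   // split into fields
           .forEach(f => console.log(f[0], f[3]));
    '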
What mainstream languages, especially standardized or open-sourced ones, have you seen completely abandoned?
Even COBOL, Fortran and Lisp are still active.
Javascript at this moment probably has the highest number of programmers using it. It is standardized and has several competing state-of-the-art OSS implementations. It's going to be with us, at least as legacy code, for decades :)
It'd be better - and FAR more work - to make an object-pipeline-based PowerShell equivalent for *nix, based on JSON. You could write cmdlets in any language that outputs JSON and do better at this project's goal of avoiding scraping. E.g., something like:
    (ps).filter(p => p.name == 'node').map(p => p.pid)
instead of:
    ps -ef | grep node | awk '{ print $2 }'
Since the fields have keys, you avoid magic numbers and the code's easier to read. (You could of course alias those to make 'pidof node' like Linux distros do.)
The end result would be much better than PowerShell, since it would use JSON and not be tied to .NET languages.