Unix is not an acceptable Unix

useerup · on June 7, 2015

Interesting examples, especially in light of the recent debates about Unix shells and PowerShell.

In PowerShell, the `ls` command is a simple command that produces output without flags to control output format, ordering or filtering. Like with higher order functional programming, those tasks are delegated to cmdlets like Format-List, Format-Table, Select-Object, Where-Object.

Some PowerShell examples:

To list the files and directories (children) of a directory:

ls

To get just the names or the full path:

    ls | select Name
    ls | select FullName

To discover which properties are available from `ls`:

    ls | gm

To select only files larger than 200MB:

    ls | ? Length -gt 200MB

(ls is alias for Get-ChildItem, gm is alias for Get-Member, ? is alias for Where-Object)

It is somewhat ironic that PowerShell is more true to the principle of "do one thing and do it well" than the Unix shells.

ekidd · on June 7, 2015

I recently discovered that I can do this on Linux, too!

After over two decades of using Unix and Linux, I ran into jq, a tool for querying and transforming structured JSON pipelines: http://stedolan.github.io/jq/

This can be used to do many of things you demonstrate with PowerShell. Here's an example from the docs:

    curl 'https://api.github.com/repos/stedolan/jq/commits?per_page=5' | \
      jq '.[] | {message: .commit.message, name: .commit.committer.name}'

This outputs:

    {
      "name": "Nicolas Williams",
      "message": "Add --tab and -indent n options"
    }
    ... more JSON blobs...

You can output a series of individual JSON documents, or convert the output to a regular JSON array. You can also output raw lines of text to interoperate with existing tools. And this works especially well with commands like 'docker inspect' that produce JSON output.

I think that PowerShell and jq get this right: More command-line tools should produce explicitly-structured output, and we should have powerful query and transformation tools for working with that output. For day-to-day use, I do prefer jq's use of JSON over a binary format.

rane · on June 8, 2015

I've written a tool inspired by jq that builds on top of ramda and LiveScript. The motivation behind it was to be able to write pipelines with tools that already exist, instead of having to learn new syntax that can't be used outside that domain.

https://github.com/raine/ramda-cli

The example from before would look like this:

    curl https://api.github.com/repos/stedolan/jq/commits\?per_page\=5 |\
      R 'map -> message: it.commit.message, name: it.commit.committer.name'

Canada · on June 7, 2015

That's cool, though powershell deals in actual objects which are more powerful than better structured input and output. For example, the objects returned by ps have methods for interacting with them.

myhnaccount108 · on June 7, 2015

I would argue the object approach limits the universality of the PowerShell way as a general purpose computer interface because it binds it to the necessity of a particular flavor of object system (e.g. .Net) and mutable data which is venom to general purpose pipe and filter systems.

jq looks very interesting though note that it builds upon an underlying notion of text to serialize JSON.

Canada · on June 9, 2015

You can parse streams of text in PS too, so it's not like cmdlets are making the pipeline any less powerful.

As for binding to .NET, I don't think that's very limiting. A surprising amount of stuff is already easily hosted in .NET.

I would argue that all PS needs to be more competitive is: Ports to Linux/*BSD/OSX and a terminal implementation for Windows that doesn't suck.

Cmd.exe is a piece of shit that needs to die:

- Command editing. Is support for at least ^A, ^E, ^W, and friends too much to ask?

- Completion. Who wants to cycle through 4253 possibilities one at a time?

- Copy/paste. Programs actually STOP executing when you select text, like as if you ^Z in Unix. Even with quick edit enabled so you don't have to ALT-SPACE-E-ENTER<select stuff>ENTER, the fastest way to paste is 4 keys: ALT-SPACE-E-P.

- No SSH. Microsoft is addressing this. It's borderline criminal that Windows doesn't just ship with OpenSSH server already.

- No screen/tmux. I can't even talk about how it deals with resizing without using a lot of profanity.

- Lack of echo by default is seriously annoying.

In short, make the terminal feel like Putty and the editing/completion features feel like bash and I think PS could give all existing shells a run for their money.

nailer · on June 7, 2015

This, a thousand times.

As a Linux user / developer, it's surprising to hear colleagues talk about how important it is to separate content from presentation regarding, say, TeX, but ignore the benefits of Powershell doing the same. Unix users actually expel cycles trying to express things like times and file size as regular expressions to operate on strings, rather than dealing with them using their real data types.

A big part of this is that remoting has sucked - most webb app developers don't use Windows - so a lot of people who should have been able to find out for themselves hasn't been able to. Microsoft supporting SSH (as they announcedd recently) should fix that.

chongli · on June 7, 2015

it's surprising to hear colleagues talk about how important it is to separate content from presentation regarding, say, TeX

This debate never ends because, like the static vs dynamic types debate, it ignores the main tradeoff in favour of a one-size-fits-all solution. In both debates (separation and types) the tradeoff can be expressed most generally as an up-front vs deferred time investment. Static types require more up-front time investment with the advantage that they save time later. The same goes for having separated content from presentation.

Now, in light of the tradeoffs above what is the appropriate choice to make? That depends on how much time you're going to spend writing the script/document/whatever and how much time you're going to spend running/maintaining it later. For scripts, it makes no sense to have a heavyweight type system if you're only going to run the thing once and throw it away. Likewise, for documents it makes no sense to put in the extra time planning all the styles if you're only going to make a document (say, a shopping list) and then throw it away. For a long-term document such as a book or a thesis it makes a ton of sense to use something like TeX.

alkonaut · on June 8, 2015

> Static types require more up-front time investment with the advantage that they save time later.

What exactly is it that saves time? Is it to not have to type as much? In my experience if you use a lang with good type inference you type a lot less with strict typing than without, since you don't have to have unit tests to guard for things like typos/param order/param types.

chongli · on June 8, 2015

Typing on the keyboard is not what takes up most of the time in programming; thinking is. With static types you are forced to put more time into thinking about how to make the program compile whereas in a dynamic scripting language you only care about the result output. Your script may be ill-typed but if it gives the correct output for that situation then it doesn't really matter.

Static types save time later by giving you the opportunity to do a lot of refactoring with high assurance of correctness. Unit tests bring back some of this ability to dynamic languages but they require extra time to write and maintain the tests.

alkonaut · on June 8, 2015

I just can't think of many situations when a dynamically/weakly typed program produces a meaningful result in that case. In my experience the errors are the same in both the (strong/static) case and the (weak/dynamic) case, with the only difference being whether the issue shows up at compile time or runtime. Dealing with N runtime issues is slower than fixing N compiler errors when you discover runtumr problems one at a time, but compiiler errors you get in batches of up to N.

chongli · on June 8, 2015

Then you must not spend much time doing shell scripting. Strong/static types are not much help when everything is a string.

alkonaut · on June 8, 2015

Yes if everything is untyped then types are less useful. And for interactive scripring, compilation feels a bit strange too.

(Of course, I think text-only Unix tools as a way of interacting with a computer is a fundamentally broken idea). Powershell is an interesting idea, though maybe not the best implementation of that idea.

dllthomas · on June 8, 2015

"Yes if everything is untyped then types are less useful."

Everything sharing a common representation does not make types less useful. I would rather my program crash then confuse Bid and Ask prices.

"And for interactive scripring, compilation feels a bit strange too."

I wonder if you've used a REPL for a typed language. I agree that peeling out a compilation step would be odd, but adding types doesn't actually have to change the interaction that much.

icebraining · on June 8, 2015

since you don't have to have unit tests to guard for things like (...) param order (...)

Well,

  precoding :: Int -> Int -> Int -> (Int, Int) -> Precoding

  rsa_dp :: Integer -> Integer -> Integer -> Either RSAError Integer

  compile :: String -> String -> String -> Map Name Interface
             -> Either String (Interface, String)

alkonaut · on June 8, 2015

You can always choose not to use types even when you have them. If a method has multiple args of the same (system) type it's poorly typed, at least in a language that allows type aliases.

raverbashing · on June 7, 2015

This

Linux/Unix can't even get the concept of "one thing" right. Sometimes it's separated by spaces, sometimes by lines (and then you need the IFS gore to make it behave the way we want)

I mean, the shell is great, once you get around all those idiosyncrasies, but we can evolve.

CrLf · on June 7, 2015

However, after a few years of using both, my conclusion is that PowerShell isn't a better shell. It is programatically superior (for scripting) but inferior as a CLI language. And your examples are one of the reasons why.

"Do one thing and do it well" is about the goal of the tool, not about its features. With all its options, `ls` is still just about listing directory contents.

And that motto is still just a guideline, not an absolute rule. What makes the *nix shell great is that it follows that motto enough to be sensible and structured, but not enough that it becomes a burden. Just like a human language does.

wumbernang · on June 7, 2015

ls != list files in powershell.

ls is an alias for Get-ChildItem which is simply "get me a list of objects attached to the path". That's about as orthogonal as you can possibly muster.

It's actually more like plan 9 than unix.

Canada · on June 7, 2015

If only it wasn't hosted in the awful cmd style terminal with it's frustrating style of edit functionality and lousy completion. If it had the feel of a Unix terminal I think I could get used to it.

agumonkey · on June 7, 2015

I am really curious about how much parsing and formatting occupy *nix source code. (IIRC 30% of ls), and if it would be a good idea to decouple the way you mention it. my json-infused distro is still on my mind.

ps: this lisp machine talk mention how 'programs' (I guess functions) exchanged plain old data rather than serialized ascii streams making things very fast. There are caveats of adress space sharing though but it's another nice hint we could investigate other ways.

http://www.youtube.com/watch?v=o4-YnLpLgtk

sudioStudio64 · on June 7, 2015

That's an interesting point. abstracting out parsing/serialization from the data piped between commands would lead to more consistent argument handling for all commands.

agumonkey · on June 7, 2015

somebody wrote docopt (started as python lib) as a generic POSIX usage string parsing (part of the standard). Maybe it could lead to simpler argument parsing and 'user interface' generation, whether static documentation or shell completion scripts.

Also structured output may lead to more relation tools, less regexful ad-hoc parsing, maybe some kind of typecheck so callers can be notified when they need to be rewritten.

andmarios · on June 7, 2015

So, what would happen if I ran:

    ssh myrouter "ls /var/log" | ? Length -gt 50ΚΒ

Unix shells aren't confined on one's computer. It may be on your server, your android phone or your ten year old adsl router.

useerup · on June 7, 2015

    irm myrouter {ls c:\logs | ? length -gt 50kb}

This will emit "deserialized" objects from myrouter. PowerShell remoting works across machine boundaries by defining an xml based format for serializing and deserializing objects.

During serialization, all properties are serialized, recursively to a configured max depth. The local client will see the deserialized objects - ie they are not "live" any more.

Note that the PS team recently announced that they would support SSH as a remoting mechanism. I suspect that they will still send CliXML objects when using powershell to powershell communication.

Powershell uses an industry standard for http based remote commands. It should be adaptable to SSH.

geofft · on June 7, 2015

Or `irm myrouter {ls c:\logs} | ? length -gt 50kb` to match the original example, right?

Serializing structured data is a pretty solved problem, and I wonder if decades of UNIX going out of its way to screw this up is confusing the parent poster. XML is certainly a way to do it that works just fine, but there are a million others. None of them require awk or counting spaces.

useerup · on June 7, 2015

You're right. And I also screwed up "irm". It should have been "icm". Apologies.

chris_wot · on June 7, 2015

That won't work, because where-object needs a PowerShell object, and ssh is returning just text. I'm curious how Microsoft will deal with this.

alkonaut · on June 7, 2015

They are confined to Unix (or Unix-y) computers. It's as limited as anything else really.

mercurial · on June 7, 2015

> It is somewhat ironic that PowerShell is more true to the principle of "do one thing and do it well" than the Unix shells.

It's not particularly surprising that a system designed a long time after the original is more consistent (I wonder how the Plan 9 shell fares in this regard?).

vbezhenar · on June 7, 2015

Another UNIX principle is that programs deal with text. PowerShell is huge violation of that principle. You can't easily write PowerShell command with C or Rust or Java, as far as I understand. Yet you can write command-line tool with any language and use it with combination of hundreds existing tools.

wz1000 · on June 7, 2015

A solution to this might be a standardized object format, like JSON.

Slackwise · on June 7, 2015

YAML seems more preferable, since it would be far more readable when STDOUT is the shell.

Oh, and JSON is a functional subset of YAML, so you'd still be able to output as JSON and a YAML parser will read it.

jsjohnst · on June 7, 2015

Except what about Dates? Or apps were floating point precision is important? Or needing to deal with very large integers? Also, have you thought about how large the output would be for ls as JSON? Think about the I/o usage if every time you invoked ls it had to do a deep walk of the file system to get all the possible data ls might output with all its command line arguments.

vbezhenar · on June 7, 2015

JSON doesn't support behaviour (methods). They are very useful for composing programs IMO.

masklinn · on June 7, 2015

Methods are useful to organise and abstract programs, they're really quite bad at composition. Many functions operating on the same primitives is much more composable than small sets of functions operating on their own custom primitives.

wz1000 · on June 7, 2015

Behaviour could be implemented by shell commands. That way, if you wanted to implement a method for a particular object, you could just write a binary/shell script/whatever that reads the object from stdin and writes its result to stdout

current_call · on June 7, 2015

Then you'd need a JavaScript interpreter. Which would be terrible.

icebraining · on June 8, 2015

Well, it doesn't have to be JS specifically. In theory, it could even be a piece of machine code.

sudioStudio64 · on June 7, 2015

if the deserialization mapped to a system wide type system then the methods could be mapped in at that point.

chaz72 · on June 7, 2015

And a small number of tools for composing JSON.

nailer · on June 7, 2015

Do your own apps communicate via text? Eg, rather than use JSON for the REST APIs you write, do you use only strings?

Text made sense because text was the universal format. Now we have JSON. In the Windows world .net objects are universal (though I'd prefer JSON).

espadrine · on June 7, 2015

I don't understand your point, since JSON is a textual format.

Sure, the fact that Unix chose text meant they created a lot of different formats, some non-standard, but I don't see this being an issue, except for configuration purposes.

nailer · on June 7, 2015

Do you deal with the serialized JSON text directly, or run JSON.parse() and JSON.stringify() to turn your JSON into objects?

Do you ever use regular expressions to parse unstructured text when using a Unix shell? Do you think 'grep' and thinking of a regex is more or less efficient that using 'where'?

camperman · on June 7, 2015

"or run JSON.parse() and JSON.stringify() to turn your JSON into objects?"

By that time, the communication has already happened.

throwawaykf05 · on June 7, 2015

Exactly. His point is that whatever operations you perform from that point on is on the object and not the string that was used to represent the object as it was being communicated. The string is just an intermediate representation. It's the object that's relevant to your business logic.

nailer · on June 7, 2015

Well the only point of serialising was communication, so I'd argue unwrapping a presentation layer format is included as part of communication.

camperman · on June 8, 2015

Fair enough. I was coming at this from my current experience where a couple of different programs I have grab JSON from a server: the Python one puts it into an object, the bash script doesn't because I couldn't be bothered :)

throwawaykf05 · on June 7, 2015

JSON is not a textual format. What it is, is right there in the name: It's an Object Notation. Now, JSON objects are most frequently serialized to text, but they are still objects. The format is an orthogonal matter.

espadrine · on June 8, 2015

Let me cite you json.org:

> JSON is a text format

json.org is owned by Douglas Crockford, who should know what JSON is, since he was the first to specify that format.

nailer · on June 8, 2015

Agreed, but nobody uses it without parsing it. You're not grepping JSON, you're JSON.parse() ing it.

espadrine · on June 9, 2015

Why does it matter whether the exchange format is a regular or a context-free language? It is a textual representation of binary data either way.

Sure, it is harder to pattern-match context-free, and we have a convenient syntax for textual regular languages, but we create tools such as jq so that we can pipe JSON and extract data.

icebraining · on June 8, 2015

The "notation" part is the serialization. JSON is a "system of written symbols used to represent" objects.

sudioStudio64 · on June 7, 2015

PS will take the text output of any command line program and turn it into .Net strings that can then be manipulated like any other object. You can write PS cmdlets in C/C++, there is an SDK. It will also marshal COM objects into and out of the appropriate .Net types.

When dealing with output from command line apps I usually take the time to parse out the data I want from the strings and turn them into strong types if I'm writing a script...If I'm just trying to get something done with a command prompt I just do whatever gets it done the fastest.

cremno · on June 7, 2015

>You can write PS cmdlets in C/C++, there is an SDK.

Are you sure? AFAIK you have to use C++/CLI which isn't C++ and the official examples are either C# or VB.NET: https://code.msdn.microsoft.com/site/search?f[0].Type=Topic&...

sudioStudio64 · on June 7, 2015

In the 2012 R2 time frame Jeff Snover said that they opened up cmdlet authoring to subsystem teams to use C++ to build cmdlets. Maybe they haven't released it yet?

That may mean that you have to implement at least some part of it as a .Net class. That may mean that they are doing COM components that inherit certain interfaces...it may mean that they are doing PInvoke...to be honest I haven't looked into it. It may be that it's still internal. Huh. I should look that up.

wumbernang · on June 7, 2015

You can.

Half way through a powershell script you can switch to C#, VB, JavaScript if you want and implement a pipeline or shell out to a C++ program, talk to something over the network or even Cygwin if you really want.

hurin · on June 7, 2015

I'm not familiar with PowerShell, but wouldn't following through with this principle mean that the default program has to give maximally verbose output to the piped formatter, since the formatter can only filter rather than extend the original command.

I imagine to achieve the default functionality your command must be significantly more verbose, if this philosophy is followed through 100%, or you have a lot of per-configured default formatters (and you have to remember what to pipe to which). Maybe I'm misunderstanding this, it's very neat in principle though.

useerup · on June 7, 2015

PowerShell cmdlets return (active) objects in the same process space as the hosting process (usually the shell). As such they can expose expensive-to-compute properties as late-evaluated properties that are only evaluated if and when they are invoked later.

Take for instance Get-Process:

    $p = ps powershell

(ps is an alias for Get-Process). Now $p is the process information onject about the PowerShell process itself.

Type

$p

And PowerShell will respond with

    Handles  NPM(K)    PM(K)      WS(K) VM(M)   CPU(s)     Id ProcessName
    -------  ------    -----      ----- -----   ------     -- -----------
        359      24    57096      61972   611     2,20   9608 powershell

If you request the content of $p at some later point in time, you will notice how the CPU has increased, for instance. The point is that a lot of the properties are computed when read, and hence do not incur overhead in the pipeline.

AndrewDucker · on June 7, 2015

PowerShell returns objects, not strings. Which means they can have as much information as you like without being overly verbose :-)

hurin · on June 7, 2015

> PowerShell returns objects, not strings. Which means they can have as much information as you like without being overly verbose :-)

How does the user know exactly which output he is going to get? The formatting program cannot know anything about default expected output - so it either must be specified explicitly, or objects must distinguish between default_for_formatter_to_output fields and more_verbose_hidden_fields - I'm really not a fan of the amount of man <command_name> using Unix involves, - but is the alternative really much better?

sudioStudio64 · on June 7, 2015

There are some sane defaults in the PS case along with heuristics that try to map arguments between pipe-out and pipe-in cmdlets. The parsing/serialization and formatting are done by the environment, not by an individual program.

It works pretty well. In the end there is also some convention that helps it work more consistently.

In practice, it is kind of cool to be able to construct new objects as the pipeline progresses so that can customize that behavior. I've found that with some aliases for default commands it can be reasonably succinct as well.

AndrewDucker · on June 7, 2015

Each object specifies how to format itself by default.

So ls, by default, gives Mode,LastWriteTime,Length,and Name.

But if I check all of the properties I get PSPath, PSParentPath, PSChildName, PSDrive, PSProvider, PSIsContainer, BaseName, Mode, Name, Parent, Exists, Root, FullName, Extension, CreationTime, CreationTimeUtc, LastAccessTime, LastAccessTimeUtc, LastWriteTime, LastWriteTimeUtc, Attributes

lake99 · on June 7, 2015

Your comment is breaking the formatting of the entire page.

AndrewDucker · on June 7, 2015

Fixed, thanks!

Didn't realise it wouldn't line-break!

wz1000 · on June 7, 2015

AFAIK PowerShell pipes stream objects, not text. I've never used it though.

nickysielicki · on June 7, 2015

The reason that a program should strive to be unixlike is portability. The nice side effect is little bloat. Being unixlike is about the scope of a program rather than trying to make the implementation as basic as possible to the extent of not taking flags. If a program relies on another program to be useful, that's clearly not portable. It's no different if it's a library or another executable.

These cmdlets are excessively small in scope and that makes them intrinsically tied to one another. They're useless without their sibling cmdlets. In effect they're one giant program.

So I completely disagree that these are more unixlike.

hobarrera · on June 7, 2015

This actually looks pretty nice and usable.

I do wonder why they use TitleCase for somethings, and lowercase for others ("ls", "gt"). I think it kind of makes this non-obvious and non-predicable. (reminds me of PHP a bit).

crdoconnor · on June 7, 2015

My one wish for UNIX coreutils would be that they all had a switch to output JSON.

ls --json

ps aux --json

If they did that, all that powershell stuff would become trivial.

Wouldn't even be hard.

_paulc · on June 7, 2015

See libxo[1]

There has been added in FreeBSD 11 but I am not sure how many of the commands have been converted.

[1] https://juniper.github.io/libxo/libxo-manual.html

McElroy · on June 7, 2015

>I am not sure how many of the commands have been converted

There's a page [1] on the FreeBSD wiki with a list of converted and ongoing conversion. I don't know if the list is complete but the page was last modified 2015-05-22, so it should be fairly up to date i guess.

[1]: https://wiki.freebsd.org/LibXo

jsjohnst · on June 7, 2015

Except how do you represent every possible data type without losing fidelity? I made this point in another leaf, but it's worth echoing again. A perfect example of this is Dates. There isn't a defined JSON type for that so we'd be back to basic text parsing again (which will cause anyone who has dealt with parsing dates to recoil in agony).

When you try to ram a one size fits all approach to everything, you end up with abominations like SOAP with XML. I simply don't get the fascination with trying to use the same tool for every job even if it's not applicable.

oinksoft · on June 7, 2015

Ten years ago this would be --xml. Who knows what it would be ten years from now. Text streams are timeless.

alkonaut · on June 7, 2015

Text streams are just an ad hoc structured format. Even an out of date format is better than one you have to invent or decode on a per case basis.

The whole xml vs json debate feels pretty pointless, they are for all practical purposes equal (in size, complexity etc). Sure xml has some weird design decisions around namespaces and so on, but if you use it as a simple hierarchical markup it's pretty much equivalent to json with different brackets? and often easier to parse for humans because of closing tags instead of }}} and of course, comments.

The xml-hate I think isn't really xml-hate it's the reaction against the xml-y things from 10-15 years ago: the IBM/Struts/Blah we all had to wade through. These days it feels frameworks pick json over xml even when it's clearly an inferior choice (such as for config files). Json is an object/message markup, not a good human readable config format.

jstimpfle · on June 7, 2015

>> Ten years ago this would be --xml. Who knows what it would be ten years from now. Text streams are timeless

> Text streams are just an ad hoc structured format. Even an out of date format is better than one you have to invent or decode on a per case basis.

I won't argue which one sucks more, XML or JSON. They are both inferior to what you call "ad-hoc structured" text files.

XML and JSON are both hierarchical data models. It has been known for fourty years that these are inferior to the relational model because they make presumptions about the access paths of the consuming algorithms.

Put differently, hierarchical data models provide just a view of the information that is actually there. (Implicitly -- consistency and normalization are not enforced). Relational databases on the other hand are concerned with the information and provide much better mechanisms for enforcing consistency and normalization.

By coincidence, the "unstructured" text files in Unix are just miniature relational database tables. Think passwd, shadow, hosts, fstab, ... . Consistency is not technically enforced (that would be a huge overkill at this level of abstraction), but there are even checker programs like pwck.

alkonaut · on June 7, 2015

A good standardized relational model format would be cool, and I'm sure such formats exist. Feels like we could do better than spitting out randomly (I.e per- tool) formatted data with so-so encoding support!

A sequence of tables in csv with decent encoding support would go a long way towards a good machine parseable relational text output.

It's really two separate discussions though: what is s good input/output format for Unix-like tools, and what makes a good format for a config file.

jstimpfle · on June 7, 2015

> It's really two separate discussions though: what is s good input/output format for Unix-like tools, and what makes a good format for a config file.

I don't see where these are not one single problem. Everything is a file.

> A good standardized relational model format would be cool, and I'm sure such formats exist. Feels like we could do better than spitting out randomly (I.e per- tool) formatted data with so-so encoding support!

I'm actually currently trying to realize such a thing in Haskell, for usage in low traffic websites. There are obvious advantages in text DBs compared to binary DBs, for example versioning.

But I doubt we can do better than current Unix text files if we don't want to lock in to some very specific technology.

alkonaut · on June 7, 2015

> I don't see where these are not one single problem. Everything is a file.

A very general format could solve more problems but as I said earlier I think the lack of comments in json makes it sub par as a config format for human editing.

jstimpfle · on June 7, 2015

I generally like the idea of tabular text files where the last column is free form text.

If you need more commenting freedom or flexiblity, why not make another indirection and generate the data from some source which is tailored to your needs? After all, relational data is often not suited for manual input. It's meant for general consumption by computers.

For example, as a sysadmin, passwd / shadow / group is not enough to model my business objects directly. I keep my business data in a custom database, and generate the text files from that.

crdoconnor · on June 7, 2015

XML hugely complex and difficult to parse despite not being any more useful than JSON as a means of representing or transmitting data.

YAML is a far superior format to XML for config files, and is basically JSON in a readable format.

There's literally zero reasons to choose to use XML where you could use JSON/YAML instead.

vezzy-fnord · on June 7, 2015

YAML may look quite simple on the outside, but the specification of it is surprisingly complicated.

alkonaut · on June 7, 2015

I really don't care what a format is called so long as it fulfills the basic requirements: 1) can be written and parsed with the std libs of the language in question, and 2) supports comments if it is to be used by both humans and machines, such as in config files.

crdoconnor · on June 7, 2015

XML is a mess and it was no less of a mess 10 years ago.

Text streams are unstructured, so parsing them is a pain in the ass.

JSON is simple, standardized, and not going anywhere.

GFK_of_xmaspast · on June 7, 2015

With --xml ten years ago we would have had ten years of structured output from the coreutils and probably more good praxis of working with the format.

Comaleaf · on June 7, 2015

How would that work for late-evaluated data? A PowerShell object doesn't have to provide all of the data up-front, and can give updated data when checked at a later date. JSON is still just text, it still needs to provide all of the data you might need up front.

jsjohnst · on June 7, 2015

Great point that I think all the JSON advocates are missing.

rlguarino · on June 7, 2015

Text streams also have that problem.

jsjohnst · on June 7, 2015

Not necessarily. Text streams don't have to by default provide all possible data for the next process in the pipe. Sure, you could keep all the command line arguments you had before to make JSON output manageable, but then you have two problems rather than one.

pronoiac · on June 7, 2015

Hey, would that be easier if we rewrite the coreutils in, say, Rust? Honest question.

agumonkey · on June 7, 2015

Efforts have been done, see:

https://github.com/uutils/coreutils

https://www.google.com/search?q=coreutils+rust

bishop_mandible · on June 7, 2015

Why not TOML?

hk__2 · on June 7, 2015

JSON is a standardized format, TOML is not.

MrBuddyCasino · on June 7, 2015

Exactly. I think the Unix philosophy of "everything is a file" has an impedance mismatch with "streams of text". Files and directories are hierarchical lists of objects with certain properties, similar to JSON.

The "directories of files" abstraction is versatile and useful, simply because it is a versatile and simple data structure. Streams of text are too limited.

andmarios · on June 7, 2015

As I see it ls does one job (list directory contents) and does it well (by supporting 37 flags).

If ls was a human, we would call him an expert.

Since unix tools expect input as plain text, the flags that ls supports are absolutely useful and I believe this is obvious to anyone who has written more than 50 lines of bash. When your input doesn't support a standard complex structure (e.g json with some standard set fields), claiming that “sort by creation time” is a simple filter is funny.

First you would need a ls flag so that ls will output creation times in an easy to process time format (seconds since epoch), then you would need to ask sort to work on a certain (1 flag) numerical (1 more flag) field. Last you would have to pass sort's output to a stream processor to get rid of unwanted fields.

Current “list files by creation time”:

    ls --sort=time --time=ctime

Replace sort flag with filters:

    ls -l --time=ctime --time-style="+%s" | sort -n -k 6 -r | awk '{ $1=$2=$3=$4=$5=$6=""; print substr($0,7) }'

useerup · on June 7, 2015

The same in PowerShell - using ls without the abundance of parameters - would look like:

   ls | sort -des LastWriteTime | select name

1: Get the "child" items of the current directory 2: sort them descending according to the last write time 3: strip out all other properties but name

No need for ls to support a plethora of formatting options. And the typing is actually pretty short due to tab completion:

    ls | sort -des *time<tab><tab><tab><tab><tab> | select n<tab>

I could not remember what the "last change time" was called, so I just entered *time and used tab completion. PowerShell is smart enough to be aware that (one of) the output types from "ls" has a property called "LastWriteTime" - it came up after 5 tabs. Similarly when pressing n<tab> in the select it completed directly to Name because it knew (still from the pipeline) that output of "ls" would have a "Name" property - even if sorted first.

zymhan · on June 7, 2015

I totally agree that trying to filter the output of ls is madness when compared to using the built-in flags. But doesn't this also indicate that command output should be more unified in it's output format? "cut", for example, is a useful tool, but it seems ham-fisted as a way of formatting output data for anything other than a one-time job.

hobarrera · on June 7, 2015

> As I see it ls does one job (list directory contents) and does it well (by supporting 37 flags).

That's exactly write. The author of the article seem to confuse "do one thing" with "implement one primitive function".

"List directory contents" is just one thing. How it's done is configurable, but it still does the same thing.

joepvd · on June 7, 2015

Nitpick, sorry. `ctime` is not Creation Time, but Change Time. The traditional *NIX file systems do not have a meta data field for creation time.

andmarios · on June 9, 2015

Thank you; I naively assumed since we have modification time (mtime), ctime would stand for creation time. :)

pjmlp · on June 7, 2015

Yes, there is this strange idea in new generations that GNU/Linux is UNIX, without having tried DG/UX, Tru64, Solaris, HP-UX, Aix, Irix and many others.

Just drop a GNU/Linux user into a default HP-UX installation and watch them getting around the system. Hint, those nice GNU flags and utilities aren't there.

As for what UNIX turned out to be, I think Rob Pike as one of its creators is a good quote:

<quote>

I didn't use Unix at all, really, from about 1990 until 2002, when I joined Google. (I worked entirely on Plan 9, which I still believe does a pretty good job of solving those fundamental problems.) I was surprised when I came back to Unix how many of even the little things that were annoying in 1990 continue to annoy today. In 1975, when the argument vector had to live in a 512-byte-block, the 6th Edition system would often complain, 'arg list too long'. But today, when machines have gigabytes of memory, I still see that silly message far too often. The argument list is now limited somewhere north of 100K on the Linux machines I use at work, but come on people, dynamic memory allocation is a done deal!

I started keeping a list of these annoyances but it got too long and depressing so I just learned to live with them again. We really are using a 1970s era operating system well past its sell-by date. We get a lot done, and we have fun, but let's face it, the fundamental design of Unix is older than many of the readers of Slashdot, while lots of different, great ideas about computing and networks have been developed in the last 30 years. Using Unix is the computing equivalent of listening only to music by David Cassidy.

</quote>

Taken from http://interviews.slashdot.org/story/04/10/18/1153211/rob-pi...

FooBarWidget · on June 7, 2015

> Yes, there is this strange idea in new generations that GNU/Linux is UNIX, without having tried DG/UX, Tru64, Solaris, HP-UX, Aix, Irix and many others.

I think that it's not so strange, seeing that you have to fork over lots of $$$ to get your hands on hardware that would run Solaris/HP-UX/AIX/etc, plus licensing fees. You can't easily rent a cloud server for an hour to play with it (with some exceptions, like SmartOS on Joyent). The high barrier to tinkering reduced proprietary Unices to expensive niche environments that are willing to pay lots of cash.

justincormack · on June 7, 2015

IBM offers free access to AIX machines[1]. Joyent is the easiest way to try Solaris.

[1] http://www-304.ibm.com/partnerworld/wps/servlet/ContentHandl...

CrLf · on June 7, 2015

> Yes, there is this strange idea in new generations that GNU/Linux is UNIX, without having tried DG/UX, Tru64, Solaris, HP-UX, Aix, Irix and many others.

Actually, one can say that nowadays Linux is unix. It isn't Linux that has strayed outside the unix philosophy, it's the commercial unixes you've mentioned that have become stuck in the past. Even those that still get under-the-hood work and new features, are not even trying to become nicer to end-users (sysadmins).

On second thought, they are, since you can easily install GNU utilities from vendor-provided repositories (on AIX you even use RPM to do it).

pjmlp · on June 7, 2015

Except AFAIK there are zero features in POSIX that came from GNU/Linux.

Now, if anything I do agree that UNIX is stuck on the past by not having any standard workstation environment or adoption of modern kernel architectures.

Also CDE is not what one would expect from a 2015 workstation.

CrLf · on June 7, 2015

> Except AFAIK there are zero features in POSIX that came from GNU/Linux.

My point is that commercial unixes are old, dying, relics that don't define what unix is anymore, Linux does (and to a lesser extent, so do the BSDs). There is no such thing as a "real unix" anymore.

> Also CDE is not what one would expect from a 2015 workstation.

The unix workstation/desktop market was lost long ago. Unfortunately, Linux hasn't successfully recovered it and, IMHO, never will. The commercial vendors stopped caring when big hardware margins dissappeared and the opensource community lacks the unifying vision to build something sensible.

icebraining · on June 7, 2015

I found this a weird post. The shell does know what files are (how else could you do "echo * "?) and programs do take arguments and have return values.

In any case, I think the author is kind of tilting at windmills; as far as I know, that Unix and Linux don't really follow the Unix philosophy is reasonably accepted today, even if it wasn't back in '83, when Rob Pike and Brian Kernighan made the presentation called "UNIX Style, or cat -v Considered Harmful", which pointed out exactly those problems, which were already in development:

  Bell Laboratories
  
  Murray Hill, NJ (dec!ucb)wav!research!rob
  
  It seems that UNIX has become the victim of cancerous growth at the hands of
  organizations such as UCB. 4.2BSD is an order of magnitude larger than Version
  5, but, Pike claims, not ten times better.
  
  The talk reviews reasons for UNIX's popularity and shows, using UCB cat as a
  primary example, how UNIX has grown fat. cat isn't for printing files with line
  numbers, it isn't for compressing multiple blank lines, it's not for looking at
  non-printing ASCII characters, it's for concatenating files.
  
  We are reminded that ls isn't the place for code to break a single column into
  multiple ones, and that mailnews shouldn't have its own more processing or joke
  encryption code.
  
  Rob carried the standard well for the "spirit of UNIX," and you can look
  forward to a deeper look at the philosophy of UNIX in his forthcoming book.

Yet, Rob Pike himself uses Linux nowadays - the problem today isn't so much acceptance, but backward compatibility.

soyiuz · on June 7, 2015

I think the author misses the point. As far as I see it, Unix tools are optimized for one thing: the human user. If `ls` is bloated, it is only bloated from the system point of view. As a user, I only see a small command that is quick and consistent (to a reasonable extent) with other tools. I don't need to know about 39 flags--that complexity is revealed with time, as I need to accomplish more complex tasks. For now, I only use and "see" two or three. And if more is needed, the program is small enough to be reasonably well documented. `man ls` perhaps with `grep` is all most people need.

This brings me to the second point. "Streams of text." Just like `ls`, streams of text are an optimal format/convention for humans. Many other things are better at being more compact or more efficient etc. But as formats and conventions proliferate, streams of plain text remain: readable and universal. Humans will ALWAYS be able to work with text. It is something that all humans kind of agreed upon--which cannot be said for any other formats or standards, which can offer various technical benefits at the expense of longevity, universality, and readability.

These two features (`ls` being relatively light and text streamy) leads to the "bootstrapping" effects sought after by the first generation of Unix developers. Learning about pipes and filters in one part of the system is applicable to all others. These tools scale with your level of expertise. They grow with you because despite the small quirks, there's a remarkable consistency of interface: text! Consequently `ls` (along with many other core tools) is implemented in the same way across a staggering variety of platforms. It has survived decades of alternatives touted as "better," "faster," "more usable," etc. etc. etc. That is the remarkable achievement of *nix / GNU etc. approach to creating human-centric software. As we architect ever more complex systems, we would do well to understand why and how Unix has endured, warts and all.

iopq · on June 8, 2015

No, I'd rather be working with FILES, not paths to files.

This is how you get `rm -rf $STEAMROOT/*` problems

adambatkin · on June 7, 2015

This article gets it all wrong: The ls command does one thing well (LiSting files) and it isn't optimized for terminal size, it's optimized for PROGRAMMER productivity. The ls command gives you a default view that is exactly what you want most of the time and is easily copy/pastable for other uses (it's a single column) or more importantly, piped to some other program for further processing. And then there are easy-access options for common tasks (extra details, hidden files, sorting by attributes, etc...)

Powershell largely does exactly what the author is talking about. And yet it's completely unusable for interactive use (though it's great for writing longer-lived automations).

mangecoeur · on June 7, 2015

I think there's a bit of a disconnect between how people talk about/defend the unix shell and the reason it's actually still there. People often defend the simplicity or philosophy - despite, as the article points out very well, much of it being clearly poorly designed.

But really the reason most of it exists unchanged is simply path-dependancy - we used it then so we use it now. It works well enough that there's not enough incentive to drive a large change. There are dozens of different shell-like environments and scripting languages, but they never really took over out of plain intertia. That, and the conservative attitudes of people defending their Unix traditions (related in no small part to how much effort they put into learning the thing in the first place). For new people arriving on the scene, its still less effort to learn things, even when they get truely arcane, than to use something which relatively few other people use or even roll your own alternative...

ised · on June 7, 2015

I have never thought that UNIX approach is the "best" in a universal sense.

But when compared to how large, slow, complicated and opaque the "alternatives" are, UNIX is the clear choice for me.

I can modify and recompile UNIX to meet my own ends. That is all but impossible if I chose an alternative such as Windows.

For example, if I do not want ls to have 30+ options, I can trim it down to just a few options and recompile.

There are other utilities besides ls for viewing file information, e.g., BSD stat(1) or mtree -cp. The later displays mode information in octal which is something ls, despite its 30+ options, does not do.

Or I can write my own simple utility. I am given the full UNIX source code. Where is the source code for Windows?

Personally I keep my filenames short and never use spaces, so I sometimes use the shell's echo builtin and tr(1) to get a quick list of files.

   echo * |tr '\040' '\012'

If there were non-UNIX alteratives that were small, simple and transparent, perhaps I might not be using UNIX.

Because I have become very comfortable with UNIX, any alternatives that others suggest have to be comparable with UNIX on size and simplicity before I will take them seriously.

Currently, I use a kernel source tree that compresses to under 40MB; I can compile kernels with about 200MB of RAM and fully loaded kernels are about 17MB. Userland utilities are usually around 5MB as I prefer to put them in the kernel's embedded filesystem. I do not like to rely on disks. My "IDE" is the same system I am compiling. There is no GUI overhead, everything can be done in textmode. The importance of the preceding two sentences cannot be understated.

I am always willing to consider non-UNIX alternatives that can offer the same or better flexibility, size constraints and simplicity.

But after decades of being open to alternatives, I am still not aware of any.

throwawaykf05 · on June 7, 2015

Why do you need the source code to write a utility when all you need is an API? I've written software for Linux and Solaris but never had to look at the source code, so I imagine it's the same for Windows.

ised · on June 7, 2015

"... you need the source code to write a utility..."

Where in the comment is this statement?

Personally the primary reasons I would want the source code for the kernel and utilities would be 1. to assess its quality and, assuming the quality meets my standards, 2. to modify it to meet my own ends. In my case, the less I have to write things from scratch the better.

Let me know if you still have questions.

throwawaykf05 · on June 14, 2015

> Where in the comment is this statement?

> Or I can write my own simple utility. I am given the full UNIX source code. Where is the source code for Windows?

I agree with your reasons, it is useful to have the source code. Just saying tat in the many, many years I've been writing software that's deployed on Linux, I've never had to look at it.

mwcampbell · on June 7, 2015

The essay "Free Your Technical Aesthetic from the 1970s" [1] by James Hague is appropriate here. See also The UNIX-HATERS Handbook; much of it is outdated, but I think some of it is still relevant, particularly the part on the shortcomings of pipelines.

I think it's unfortunate that even if a good open-source PowerShell clone is developed (or PowerShell itself is open-sourced), it will probably never gain widespread acceptance outside of Microsoft shops. Tribalism so often trumps actual merit. But if someone did a PowerShell-like thing on top of Node.js, I could see that taking off, particularly if it were integrated with the Atom editor and its package manager.

[1]: http://prog21.dadgum.com/74.html

bishop_mandible · on June 7, 2015

> But if someone did a PowerShell-like thing on top of Node.js, I could see that taking off

I think it should also use React.

RubyPinch · on June 8, 2015

"open-source PowerShell" PaSh http://pash.sourceforge.net/ , a bit unloved, but definitely exists

"node.js" termkit, which is what partially inspired the current article.

I think the problem is there is no authoritative backing for anything on the linux side, with windows, if you are going to make a object-aware-pipes tool, its going to be powershell without a doubt. In linuxland, the decision is not so clear (does one implement their own? go with termkit? go with pash? go with one of the other more deviant shells?)

ska · on June 7, 2015

> it will probably never gain widespread acceptance outside of Microsoft shops.

This seems fairly natural for something so deeply coupled with .NET.

Perhaps mono and Microsofts current efforts will make this less of a constraint, but still -- .NET has made some deep architectural tradeoffs that you will mostly have to live with if you want to play in this particular sandbox, and I can't see that helping with wider spread acceptance.

vanilla · on June 7, 2015

I can't follow the comparison.

Linux is just a Kernel, most distros use the coreutils package which is developed/maintained by GNU [1].

So any comparison is not between Linux and Unix, rather the GNU part in GNU/Linux

[1]: https://www.gnu.org/software/coreutils/coreutils.html

justizin · on June 7, 2015

Right, and OS X is based on FreeBSD.

sigsergv · on June 7, 2015

Linux is not Unix, that's just fine, I'm happy it is not. I've tried “re-evaluated” shell (Powershell), no, thanks, it's not intended for using by humans. Currently shell is still simple and powerful. Powershell is powerful and NOT simple.

UK-AL · on June 7, 2015

What does intended for humans even mean? At least powershell has command names that describe what they do.

dijit · on June 7, 2015

Powershell is readable but barely writeable.

Bash is the opposite, it's very easy to do complex things if you have some basic knowledge.

for instance a one liner nested loop to replace a string in a bunch of text files which are a few directories deep. Easy to write but it takes a while to read.

sigsergv · on June 7, 2015

Good (==effective) UI does not require commands to be descriptive. Yes, it requires some learning, but then you can use them very effectively. And shell is not a tool for “general audience”, so it's fine to have some initial learning steps.

bediger4000 · on June 7, 2015

At least powershell has command names that describe what they do.

That's always debateable. Since nobody writes command names that are incomprehensible gibberish (to them), then I think if powershell command names describe (to you) what they do, then good. Almost certainly, a sizeable fraction of any populace will find those command names "unintuitive".

Svip · on June 7, 2015

Indeed. In PowerShell, you have to write 'programs' to get things done, in Bash (as an example), you write commands to get things done. I'd like to think that commands are allowed to contain shorthands. I don't see a problem with `ls`'s flags, even if they are numerous, because they all help reduce the interaction from programs to commands.

useerup · on June 7, 2015

> In PowerShell, you have to write 'programs' to get things done, in Bash (as an example), you write commands to get things done

Huh? I use PowerShell (almost) daily. I do not write 'programs'. I use small steps, modify and repeat if the result was not as desired, just as with any REPL.

If you are referring to how one creates commands for the shell, PowerShell cmdlets can be written as PowerShell functions or as .NET classes. Not "programs". A very nice feature of PowerShell in that regard, is how the shell performs the parameter parsing. The cmdlet declares the parameters with name, type and optionally position, and the shell does any type coercion from e.g. string to the parameter type.

This means that PS cmdlets - unlike sh commands - contain no parameter parsing logic whatsoever - only the logic and declarative parameters. This also means that the information is readily available to the shell (or extensions) to be used for auto suggestions ("intellisense"), tab completion, early error checking etc.

sudioStudio64 · on June 7, 2015

what you end up using in bash is actually the defaults that people have come up with after decades of use.

taking PowerShell straight out of the box isn't that different from running bash with no config file. I use lots of aliases and custom cmdlets to make things smoother for my particular workflow. PS is still pretty new so a lot of that stuff is still being built.

sigsergv · on June 7, 2015

Powershell is hard even with config file.

It really requires “coder” mode of thinking.

sudioStudio64 · on June 7, 2015

That may very well be...the guy that created powershell always says that Unix is document oriented configuration while windows is API oriented configuration. To do anything in Windows you have to know API's. To be fair someone has to write the code that parses all of those config files in Unix...but I think the over all idea is accurate.

They are changing that with PS Desired State Configuration...if you haven't seen it you should check it out. But you are right...it's coming from a different place than Unix shells.

chris_wot · on June 7, 2015

Someone needs to port PowerShell to Unix :-)

I'm confused about the response to his tweet though - ls has a lot of switches, but I can't really see any that don't do something with listing files.

And then the response is "Unfortunately, the lineage is _not_ Unix -> Linux. It is Unix -> Plan 9. Linux doesn't follow Unix philosophy :-("... Except Unix's ls eventually had 19 options, and the ls used by Linux is a port Gnu's coreutils.

Nevertheless, I tend to agree. ls does a lot of formatting that could be moved to ancillary utilities. I guess what I like about PowerShell is the way it allows for redirecting into objects - that really is an innovation that Unix shells don't have (at least as far as I'm aware...).

guard-of-terra · on June 7, 2015

"A system composed of a zillion tiny modules is itself a pile of mud"

Having so many args to `ls' might be an overkill, but I certainly don't want to combine four programs to get full coloured sorted listing of current directory.

wz1000 · on June 7, 2015

Many shells already come with a few default aliases that call ls with different arguments. Your "four programs" can be combined in a simple alias.

nly · on June 7, 2015

And what would you call an alias that gave you a coloured, sorted listing of the current directory?

mseepgood · on June 7, 2015

icebraining · on June 7, 2015

Why not, if you can combine them permanently using a shell script?

guard-of-terra · on June 7, 2015

I won't be doing that.

And if someone else does that for me, I've suddently got 400 (n^2) aliases that I can't reason about.

icebraining · on June 7, 2015

I wouldn't say you need 400 aliases. You'd only need aliases for the commonly typed pattern, not for any possible combination you might ever use.

Dewie3 · on June 7, 2015

One solution is to optimize shell programming for terseness, since it has to be so interactive. Then you can sacrifice this Unix rule in the name of terseness and get commands that are nice and simple to write and good enough for 90% of uses-cases, but perhaps bloated and not as strictly composable as the Unix philosophy would like. Then these commands can be implemented by simple compositions of other commands that are a bit more lean and verbose, and so probably won't be used much for shell programming.

A nice bonus is that you can answer "what is `ls -m`?" by referring to an alias of something simple like first using a command that outputs a list of file names which is composed with a formatting command.

ggchappell · on June 7, 2015

> You can’t write a function that returns a list of files because the shell doesn’t know what a “list” is, doesn’t know what “files” are, and couldn’t tell you the difference between a “function” and a “program” if its life depended on it.

Could it be that what this article is really groping towards is Perl? I mean Perl not as it has become, but as it was -- Perl 4 perhaps: basically the capabilities of the shell, augmented with a couple of data structures. So there are strings, lists, hashes, and functions, and that's pretty much it.

golergka · on June 7, 2015

The `ls` example is especially interesting: it made me think about perfomance of pipelines. Imagine a folder with 10^6 files. Listing them would take some time; if you only need 10^3 of files selected by the simplest filter (LIMIT 1000, for example), piping the whole list, in whatever format, to the filter program would be very far from optimal.

What would be a perfect solution is to use iterators instead of lists (the best example that I know of is Linq, but it seems that this approach is common enough).

However, now we don't even exchange data between different programs; now we couple them, running at the same time, by a complicated interface! It seems that if we take one more logical step in this direction, we'll get generics.

Don't you start to feel that this kind of ideal shell, piping different functions together in the most correct way possible already exists, and it's already installed on your computer? I'm talking about interpreted languages like Python, Perl, Javascript, Ruby or whatever else you fancy.

Of course, I don't seriously think that we should abandon shells for language interpreters. It's just that balance between complexity and universality is a very delicate thing, and convention, however bad, is good just because it's a standard.

In the meanwhile, I'm quite happy with the Fish shell.

Camel2 · on June 7, 2015

> It seems that if we take one more logical step in this direction, we'll get generics.

I agree, Unix would have been so much better with generics, ADTs, higher-order functions, monads, higher-kinded types, dependent types and structural pattern matching.

As I always say: "Those who do not understand generics were condemned to invent Unix, poorly."

golergka · on June 7, 2015

Yes, what you mean by sarcasm is exactly what I meant. When you try to improve stuff by going more universal, more abstract way, with data formats that are more "right", that's where you'll find yourself.

So, may be it's actually good that among all the modern tools in our arsenal we actually have something extremely simple and non-generic, which doesn't follow some abstract principle and just works instead.

Mr_T_ · on June 7, 2015

If the Unix guys were smart they would have used an Idris REPL as default shell.

bishop_mandible · on June 7, 2015

Why not Nimlang?

Mr_T_ · on June 7, 2015

Because Nim is not dependently typed. You can't get sh*t done without dependent types.

radiospiel · on June 7, 2015

Well, Unix is prepared to handle that nice enough: when the pipe reader terminates (because it received 1000 lines) the writer would receive SIGPIPE on the next write to the pipe and consequently be killed (as the default, at least.) Case closed.

golergka · on June 7, 2015

Well, `LIMIT 1000` would of course be the simplest case, so closing it isn't really an achievement. What if I want to list all information about files with some mode? `ls` will still send 10^6 lines, carefully formatted, and almost all of them will be discarded with a simplest filter it could perform before reading all additional information from disc.

useerup · on June 7, 2015

> What if I want to list all information about files with some mode? `ls` will still send 10^6 lines, carefully formatted, and almost all of them will be discarded with a simplest filter it could perform before reading all additional information from disc.

Indeed. That's probably one of the reasons PowerShell was designed to execute the commands in-process: The objects being piped are just the object references.

There's still the problem of using native indexes. That hasn't been solved elegantly yet. Reading all file names when wildcards would have eliminated 99% of them using file system metadata seems a waste. Which is probably why "ls" in PowerShell still allows a -Include filter and an -Exclude filter that take wildcards.

anewhnaccount · on June 7, 2015

To execute this automatically in an efficient way we need a query planner which can look at the whole pipeline and decide to use a different set of primitives if the naive/implied ones aren't sufficient. What you're talking about is implemented in relation database management systems but it requires the query planner to know about the whole system - that is it's the opposite of Unix - there is a piece of the system that needs to know about the whole system.

As far as Unix is concerned, the whole list is not stored in memory. Pipes are buffered streams (they are iterators, just with a buffer attached, which makes them more efficient not less).

golergka · on June 7, 2015

> What you're talking about is implemented in relation database management systems but it requires the query planner to know about the whole system - that is it's the opposite of Unix

And that's exactly my point: that this line of thinking about how to make these things right will lead to overcomplicated, bloated system.

Dewie3 · on June 7, 2015

Speaking of bloat (from another comment in this thread):

> I am really curious about how much parsing and formatting occupy *nix source code. (IIRC 30% of ls),

chongli · on June 7, 2015

If you've got a lot of files in a directory and you only want information about a subset of them then you're better off using a command like find:

    find . -maxdepth 1 -perm /g=w -exec ls -ald {} \;

This finds all group-writeable files in the current directory and lists them using ls.

dbbolton · on June 7, 2015

Seems like the article, and several comments here, equivocate Linux with GNU. While they are most commonly used in conjunction with one another, I still don't believe that "this GNU program does too many things; therefore Linux doesn't adhere to the Unix philosophy" is a valid argument.

Myk267 · on June 7, 2015

I think it's easy to misinterpret any of the "Unix philosophies". It's Ye Olde "MIT vs New Jersey" again, and Unix is relentlessly straight out of NJ. Simplicity in the "NJ" sense isn't what you'd expect if you're of the "MIT" mindset.

jstimpfle · on June 7, 2015

That's a good point. Makes me think again of "slow devices" and interrupts leading to EINTR or partial reads/writes. In my eyes a beautifully pragmatic solution to a real world problem, and yet at some point SA_RESTART was devised.

mtrn · on June 7, 2015

> Granted, we’ve developed graphical user interfaces that keep “ordinary users” away from the command line, but we still expect “serious developers” to drop down into a demonstrably inhumane environment to get anything meaningful done.

Maybe I am too familiar with it, but I have yet to encounter a graphical environment that matches the effectiveness of the terminal and command line programs. I saw systems, that tried to replace shell one-liners with a hundred thousand line GUI application, that no one cared to use.

ezy · on June 8, 2015

"Many of the usability issues raised by Don Norman in his 1981 criticism of Unix have gone largely unaddressed"

That's because Don's complaints were not about intrinsic issues, but complaints from someone who had a frustrating time developing a mental model of the components of a system and who tried to use the just-so story of "cognitive design" to justify his inability. The non-point about prompting y/n to delete is particularly misguided.