Hacker News new | past | comments | ask | show | jobs | submit login

This is interesting, yes. If the shell could infer the content type of data demanded or output by each command in a pipeline, then it could automatically insert type coercion commands or alter the options of commands to produce the desired content types.

You're right that it is in fact possible for a command to find the preceding and following commands using /proc, and figure out what content types they produce / want, and do something sensible. But there won't always be just one way to convert between content types...

Me? I don't care for this kind of magic, except as a challenge! But others might like it. You might need to make a library out of this because when you have something like curl(1) as a data source, you need to know what Content-Type it is producing, and when you can know explicitly rather than having to taste the data, that's a plus. Dealing with curl(1) as a sink and somehow telling it what the content type is would be nice as well.




My ultimate use case is a contrived environment where I have the luxury of ignoring otherwise blatant feature-gaps--such as compatibility with other tools (like curl). I've come to the same conclusions about why that might be tricky, so I'm calling it a version-two problem.

I notice that function composition notation; that is, the latter half of:

> f(g(x)) = (f o g)(x)

resembles bash pipeline syntax to a certain degree. The 'o' symbol can be taken to mean "following". If we introduce new notation where '|' means "followed by" then we can flip the whole thing around and get:

> f(g(x)) = (f o g)(x) = echo 'x' | g | f

I want to write some set of mathematically interesting functions so that they're incredibly friendly (like, they'll find and fix type mismatch errors where possible, and fail in very friendly ways when not). And then use the resulting environment to teach a course that would be a simultaneous intro into both category theory and UNIX.

All that to say--I agree about finding the magic a little distasteful, but if I play my cards right my students will only realize there was magic in play after they've taken the bait. At first it will all seem so easy...


The magic /proc thing is a very interesting challenge. Trust me, since I read your comments I've thought about how to implement, though again, it's not the sort of thing I'd build for a production system, just a toy -- a damned interesting one. And as a tool for teaching how to find your way around an OS and get the information you need, it's very nice. There's three parts to this: a) finding who's before and after the adapter in the pipe, b) figuring out how to use that information to derive content types, c) match impedances. (b) feels mundane: you'll have a table-driven approach to that. Maybe you'll "taste" the data when you don't find a match in the table? (c) is not always obvious -- often the data is not structured. You might resort to using extended file attributes to store file content-type metadata (I've done this), and maybe you can find the stdin or other open files of the left-most command in a pipeline, then you might be able to guesstimate the content type in more cases. But obviously, a sed, awk, or cut, is going to ruin everything. Even something like jq will: you can't assume the output and input will be JSON.

At some point you just want a Haskell shell (there is one). Or a jq shell (there is something like it too).

As to the pipe symbol as function composition: yes, that's quite right.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: