Hacker News

Honorable mention of Taco Bell Programming (it fits this genre).

http://widgetsandshit.com/teddziuba/2010/10/taco-bell-progra...

Someone ought to write "Zen and the Art of Unix Tools."




(warning, mandatory HN contrarian comment)

"This is the opposite of a trend of nonsense called DevOps, where system administrators start writing unit tests and other things to help the developers warm up to them - Taco Bell Programming is about developers knowing enough about Ops (and Unix in general) so that they don't overthink things, and arrive at simple, scalable solutions"

It's not possible for developers to know enough about Ops, just as it's not possible for Ops to know enough about development, because they are different jobs. Moreover, devs are doomed to create terrible solutions because of their job, and Ops are doomed to create kludgy hacks for those terrible solutions because of their job. DevOps is just an attempt to get them to talk to each other frequently so that horrible shit doesn't happen as frequently.

Also, the real Taco Bell programming is to use only wget, no xargs. It takes basically every option wget has, and a very reliable machine with a lot of RAM, but you can crawl millions of pages with just that one tool. xargs and find make it worse, because you lose the implicit caching of the single giant threaded process: you waste tons of time and disk space re-fetching the same pages, re-resolving the same hostnames, etc. (And that's Ops knowledge...)
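A sketch of that single-process approach might look like the following; the seed URL, depth, and politeness limits are assumptions for illustration, though the flags themselves are real wget options:

```shell
# One long-running wget does the whole crawl in a single process.
# --no-clobber acts as the on-disk cache: pages already saved are not re-fetched,
# and wget's in-process DNS cache avoids re-resolving the same hostnames.
wget --recursive --level=5 \
     --no-clobber \
     --wait=1 --random-wait \
     --tries=3 --timeout=30 \
     --directory-prefix=crawl \
     https://example.com/
```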

The Zen of Unix is to try to move towards not using the computer at all. One-liners are part of that path, but so is minimizing the one-liner. http://www.catb.org/~esr/writings/unix-koans/ten-thousand.ht...


I don't understand this "devs don't understand Ops" nonsense.

> so you waste tons of time and disk space re-getting the same pages, re-looking up the same hostnames, etc. (And that's Ops knowledge...)

If it's my job to write a web scraper, it's absolutely my job to think about/solve this problem.

Is this a new trend thing?


The difference between dev and Ops is like that between an Italian grandma and a restaurant chef: different experience gives you different knowledge and a different skill set.


Simple, Composable Pieces is practically the whole ethos of both Unix and Functional Programmers, and they both converged on the same basic flow model: Pipelines with minimal side-effects. This naturally leads to a sequential concurrency, which seems to be easy for humans to reason about: Data flows through the pipeline, different parts of the pipeline can all be active at once, and nobody loses track. It doesn't solve absolutely every possible problem, but the right group of pieces (utility programs, functions) will solve a surprising number of them without much trouble.
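As a tiny illustration of that flow model (hypothetical input, standard tools): each stage below is a separate process, all active at once, with data streaming between them and no shared state.

```shell
# Count line frequencies: every stage is a side-effect-free filter,
# and all four processes run concurrently as data flows through.
printf 'b\na\nb\nc\nb\n' | sort | uniq -c | sort -rn
```

The most frequent line surfaces at the top, and swapping any one stage for another filter doesn't disturb the rest of the pipeline.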


Works until time plays a role in the computation you're doing.


>> suppose you have millions of web pages that you want to download and save to disk for later processing. How do you do it?

I don't know enough about the "real way" or the "Taco Bell way," but I'm interested to know: is this doable the way Ted describes in the article, via xargs and wget?


Yes, absolutely. This is absolutely how ~~we~~ many (most?) of us used to scrape web pages in the Dark Ages.


I would assume a combination of

- sed/awk to extract URLs, one per line

- xargs and wget to download each page from the previous output
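Put together, the two steps above might look like this sketch (the input file name, URL pattern, and parallelism level are assumptions, and grep -oE stands in for sed/awk for brevity):

```shell
# 1. Extract URLs from a saved page, one per line, deduplicated.
# 2. Fan the list out to parallel wget workers via xargs.
grep -oE 'https?://[^"<> ]+' index.html | sort -u \
  | xargs -n 1 -P 8 wget --tries=2 --timeout=30 --directory-prefix=pages
```

xargs -n 1 hands each worker a single URL, and -P 8 keeps eight downloads in flight at once.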



