Systems Software Research is Irrelevant (2000) [pdf] (cat-v.org)
62 points by zarakshR on Oct 1, 2022 | 38 comments



I think one of the reasons it's really hard to do any kind of systems innovation is that current systems, which work pretty damn well despite their flaws, have decades of development baked into them.

Let's say I have an idea for some kind of experimental system. The amount of effort to build that idea from scratch and make it relevant to modern computing is astronomical. Unless you're a wealthy company that can hire dozens of engineers to work on a research problem for years, it's much more reasonable to build that idea as some kind of software package on top of existing systems like Linux, or to have it be a component of whatever technology stack the idea is relevant for (web: JS/Ruby/PHP, etc.)


This paper is from 22 years ago; it's not true anymore. The fact that the technology stacks that occur to you are "JS, Ruby, PHP, etc." is strong evidence of that.


Rails is a very relevant modern web technology. It's just Ruby + JS at its core, along with whatever tools get added to make the infra scale up (Docker, AWS, Redis, Kafka, etc.)

There is no widespread modern web without JS at the moment


Agreed, and 22 years ago none of that was true. Rails, Docker, AWS, Redis, and Kafka didn't exist, and JS was unstable and mostly used for image rollovers. Those are ideas that have been built from scratch and made relevant to modern computing since the paper was written.


Yes, and to the author's point and mine, they are just software built on top of existing systems. These technologies solve certain problems very well, but they all run on Linux under the hood using the same base OS primitives.

If you wanted to use an innovative system to redo the web from the ground up, good luck: you'd have to reinvent this entire ecosystem to get to a reasonable equivalent.


The slides gloss "systems" in this context as "Operating systems, networking, languages; the things that connect programs together." See https://doc.cat-v.org/bell_labs/utah2000/utah2000.html if you don't see it in the PDF. He's definitely not talking about only operating system kernels.

JS, Ruby, PHP, Rails, Docker, AWS, Redis, and Kafka are solidly in that wheelhouse (though, to be fair, JS, Ruby, and PHP did exist in 02000, Pike just didn't know they were important). Some of them couldn't have been built with the base OS primitives common in 02000: Docker requires cgroups, for example, and AWS requires virtualization or (originally) paravirtualization. Paravirtualization came out of Keir Fraser's doctoral research.
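To make the cgroups point concrete, here's a hypothetical sketch of the kind of cgroup-v2 plumbing container runtimes build on. The `confine` helper and its defaults are invented for illustration (this is not what Docker literally writes), and writing under the real /sys/fs/cgroup requires root:

```python
# Sketch only: create a cgroup-v2 group, cap its memory, and enroll a
# process in it -- the primitive that didn't exist in 02000.
import os
import pathlib


def confine(pid, cgroup_root="/sys/fs/cgroup", name="demo", mem="100M"):
    """Make a cgroup under cgroup_root, set a memory limit, add pid to it."""
    cg = pathlib.Path(cgroup_root) / name
    cg.mkdir(parents=True, exist_ok=True)
    (cg / "memory.max").write_text(mem)          # memory cap for the group
    (cg / "cgroup.procs").write_text(str(pid))   # move the process in
    return cg
```

Against a real cgroup2 mount this would actually confine the process; pointed at a scratch directory it just shows the file-based interface the kernel exposes.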

You don't have to write a system that has no connection to the existing ecosystem in order to be doing systems software research, or even relevant systems software research.

Also though the Web has been pretty much redone from the ground up since 02000.


> Also though the Web has been pretty much redone from the ground up since 02000.

Rather: an insane, uncontrolled growth of extensions has appeared since 2000. The only serious attempt, in my opinion, to redo the web from the ground up in this timeframe was XHTML 2.0 (https://en.wikipedia.org/w/index.php?title=XHTML&oldid=11002...). Since the arrival of HTML5 (which was "let's standardize the wild zoo of extensions that already exist at least a little bit" rather than a redo from the ground up in an organized way), XHTML 2.0 is dead.


I mean, the hypertext web has been mostly replaced by a web of interactive applications. They even removed support for ftp:// URLs!

In particular, in this context, the browser went from being a user-agent for browsing hypertext (a regular application, if one a bit 3270-like as Pike points out in the talk) to being mainly an execution environment for third-party software. That is, it became "the things that connect programs together". Most new software is developed for this platform, and it's a platform that barely existed in a recognizable form at the time he wrote the talk.


I had lunch with Rob Pike after he gave that talk. I wasted it on arguing about Java, being just a dumb grad student at the time.



I think "systems engineering" has won out over "systems research" - optimizing existing systems instead of creating new ones. Part of it is that we are constrained by our hardware, which forces us into specific paradigms; but I think also systems end up becoming fungible, and at a point it doesn't really matter what RPC protocol you use - just that it's fast.


While it's true we are hardware-bound, it's pretty apparent we are not fully utilizing the hardware all the time. That gap is what research needs to close.


I would say closing that gap is exactly what engineering is - optimizing your existing systems to maximally use the hardware. It's very quantitative and extremely detail-oriented; you usually just improve Linux or add system calls for better performance, instead of baking up a whole new operating system.

Obviously part of this is just arguing over semantics though :)


Squeezing more performance out of existing software is very useful work, but I would say there is still a lot more to be gained from big wins in more efficiently distributing computing power, like cloud computing and containerization (which are the kind of non-obvious ideas that require systems research to happen before implementation).

I'm not a systems researcher on the verge of becoming a millionaire so I don't know what those ideas are but anyone can see the untapped potential is enormous.


The author defines systems research as:

> What is Systems Research these days? Web caches, web servers, file systems, network packet delays, all that stuff. Performance, peripherals, and applications, but not kernels or even user-level applications.

This seems false. DBMS and OS research is pretty interesting today. Look into Andy Pavlo or Onur Mutlu's courses and students to get pointers.


He's not wrong. Computer research is so irrelevant that I haven't read much of it in decades. The things that do bubble up are theoretical security problems that really won't move the needle.

I mean, TCP/IP won, but that doesn't mean that it's the best. It's not even close. Same with the current CPU architectures; they work, they can scale, but still.

The chips today can solve certain kinds of problems really well. Is the scope of our problems limited by what the chips can do? Or does it seem that way because they're so general-purpose that they can handle any kind of problem?


Apache Spark originated from a Berkeley Ph.D. thesis (Zaharia), so it shows that a single person can still move the needle in systems research [1]. It also shows that academic research matters, and that academic innovation interacts with industry and FOSS research: Spark was built as a response to limitations of Hadoop (FOSS research & development), an open-source clone of Google MapReduce (industry research & development). Spark then led to a startup, Databricks, so we've come full circle.

Now this doesn't render Pike wrong; his complaint is that these things do not happen often enough. My take on why: many academics are not good coders, because academic environments view coding as "just engineering" - all they care about is papers. The primacy of the paper (over the system artifact that can be demoed) is one of the sad developments in CS.

Edit: We should spend less time discussing whether Pike is right or wrong - I think he wants to provoke us into proving him wrong by building more radically new systems again and demoing them - let's do that!

[1] https://www.kdnuggets.com/2015/05/interview-matei-zaharia-cr...


We did. More than almost anyone, he did.


In the 2000s we spoke about distributed operating systems (Inferno, etc.). That's where systems research seemed to be headed, yet that never really took off. But maybe we're revisiting the concept now, just coming down from the layer above (e.g. application data types) instead of trying to rethink the OS.

I work on this [1], it kinda looks like systems software, and it doesn't feel stagnant. But then, nobody uses it (yet - I hope).

And there's a plethora of similar ideas (DAT, OrbitDB, etc.).

[1] https://www.hyperhyperspace.org


The Web is our distributed operating system. You say the name of an application, and the latest version of its code gets loaded into your computer over an encrypted and authenticated connection, compiled for your platform, and run in a sandbox that isolates it from all the other applications you're running. You can use any device interchangeably to run it, and all your data is just there.

It's not a very good distributed OS (our data belongs to the applications, and integration between them is very limited) but we've definitely realized the mobile code and portable executable dreams of the 90s.


> we've definitely realized the mobile code and portable executable dreams of the 90s

Fine; now how do we get rid of it, along with the lack of innovation and the entangled group of naive nerds and the salesmen taking advantage of them, who think or pretend the web is about their fscking apps when it was always about content?


Build a better alternative and pack them off to it?


It's so strange to see this get posted every year and have people respond, "Oh, yes, systems software research is still irrelevant," despite all the radical changes in the last 22 years. The problem Pike pointed out in this talk got so, so dramatically solved, it's just unbelievable how solved it is. Consider the "high-end workstation" slide:

01990 software: Unix, X-Windows, Emacs, TCP/IP

02000 software: Unix, X-Windows, Emacs, TCP/IP, Netscape

02022 software: Android, Unix, X-Windows, SurfaceFlinger, Wayland, Emacs, VS Code, TCP/IP, IPv6, ubiquitous TLS, Chromium, Firefox, mpv, V8, Docker, Kubernetes, Dark Souls, SQLite, Hadoop

01990 language: C, C++

02000 language: C, C++, Java, Perl (a little)

02022 language: Python, C, Java, C++, C#, JS, PHP, Objective-C, Golang, Sawzall

The top 30 items on the HN home page right now are about an open-source NewRelic alternative; a library for RPC from the browser to node.js; a UX design curriculum; photosynthesis; this talk; a bash one-liner; an in-browser ping-time tester; using org-mode for to-do lists in Emacs; noise suppression using wasm in Jitsi Meet, an open-source videoconferencing system built on WebRTC and TURN; TikTok tracking you across the web; Active Directory on Azure; USB SuperSpeed; a company hiring; Chinese transnational policing; a self-hosted photo-management system called Lychee using PHP, Laravel, Sass, npm, and Webpack; cyborgs; gambler's ruin; interspecific sociality in nonhuman apes; a detonation rocket test; social aspects of startups; a Microsoft UI bug with devastating consequences; a new approach to keeping trains apart; the international energy market; how to learn to program; gymnastic photographs from 01902; the tenth anniversary of TypeScript; the tradeoffs in Copilot, a deep-learning transformer that completes natural-language code based on a model trained on GitHub; testing React apps with Cypress, with a screencast; salted fish; and the relationship between rationality and wisdom.

Of these 30 items, 11 aren't even about software. Four others are built on systems software that existed in 02000: Emacs, Microsoft Windows, bash, and XMLHttpRequest. The other 15 are built on systems software that didn't exist when Pike wrote this paper: Cypress, React, Git, GitHub, TensorFlow, the transformer model, CUDA, OpenCL, Laravel, node.js, npm, Webpack, Azure, TURN (and NAT traversal in general), WebRTC, V8 and other high-performance JIT compilers for JS, wasm, Android and iOS (where people use TikTok), NewRelic and similar web profiling tools. (I'd say "org-mode" but really org-mode is more like an application than like systems software.)

A developer from 02000 transported forward to now would be totally lost in those 15 out of 19. They'd have no idea what they were even talking about, what kind of problems they were trying to solve.

In 02000, it's easy to forget, even using XMLHttpRequest was an advanced, risky thing. The web was made of documents, not applications. Applications ran on your webserver or, much more likely, your desktop. Since then we've moved to running the vast majority of our software in web browsers and the majority of our servers in AWS, Azure, and Google Cloud, none of which existed in 02000.

The mainstream programming languages of today are Python and JS, which both existed only as tiny niches back then. GPGPU was just beginning; Vulkan, OpenCL, and I think CUDA and even GLSL didn't exist. The most exciting developments of the last few years are things like Docker, Kubernetes, and Stable Diffusion, which all run on platforms that didn't exist in 02000. Not only didn't Intel CPUs support hardware virtualization, even VMWare was just starting to get adoption; virtualization was still mostly only a thing on IBM 360 mainframes.

For better or worse, it's certainly different.


>A developer from 02000 transported forward to now would be totally lost in those 15 out of 19.

Don't you mean "015 out of 019"?


But 9 hasn't been a valid octal digit since ANSI C.


"02000" is a "long now" thing, not octal. Thinking about the next 10,000 years and all that.

https://longnow.org/


It's just another unnecessary byte to transfer; or more, in fact, due to the inevitable wtfs and octal comments.


Thanks for this amazingly informative post.


I'm delighted you enjoyed it!


> Unix  X Windows  Emacs  TCP/IP | Netscape

Well, now we have:

    Linux | Wayland | Vim/Emacs/VSCode | HTTPS | Firefox/Chromium
So that's some sort of progress. The direction/sign is left as an exercise to the reader.


For reference: this site is hosted on Plan 9, using a web server written in shell script.


I think that the intervening 22 years have proved this correct. Just like aeronautics and materials science, computer science seems to have stagnated.


Huh? My team isn't pushing the state of the art by any means, and I have about 6 guys running about 40k hosts globally that are more secure, better configured, etc. than the systems I worked on 20 years ago. Back then, I worked with a few hundred Unix systems as part of a team of like 40.

Systems and storage have advanced significantly.


Computer science may be stagnant, but practical applications of topics researched decades ago have flourished from small seeds into large forests.

I am pretty sure that all the stuff we use since horizontal scaling took over was researched 4+ decades ago and is only now widely used... though when I looked, the CAP theorem is from 1998, so maybe not.


CAP:

https://en.m.wikipedia.org/wiki/CAP_theorem

2 of 3 for Consistency, Availability, Partition tolerance
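The "2 of 3" tradeoff can be sketched as a toy two-replica store (pure Python, names invented for illustration): during a network partition, a write must either be refused (consistent but unavailable) or accepted on one side only (available but inconsistent).

```python
# Toy CAP sketch: two replicas, a simulated partition, and the forced choice.
class Replica:
    def __init__(self):
        self.value = None


def write(replicas, partitioned, value, prefer="consistency"):
    """Write to replica 0; replica 1 is unreachable when partitioned."""
    if partitioned:
        if prefer == "consistency":
            return False           # refuse the write: consistent, not available
        replicas[0].value = value  # accept it: available, replicas now diverge
        return True
    for r in replicas:             # no partition: replicate to everyone
        r.value = value
    return True


replicas = [Replica(), Replica()]
write(replicas, partitioned=True, value="x", prefer="availability")
# replicas now disagree: replicas[0].value == "x", replicas[1].value is None
```

With no partition you get both consistency and availability, which is why the theorem only bites when the network misbehaves.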


The fish rots from the top. Lobbyists in the policy sausage-making machine are likely to blame for the stagnation. The iPhone's introduction brought the results of past industry research back to life - Gorilla Glass, for example. And compare the rocket engine designs of SpaceX's Raptor and Blue Origin's with the SLS's, designed to be reusable but thrown away after a single use.


Neither aeronautics nor materials science has stagnated. I doubt computer science has either.


You're clearly unaware of research on programmable matter!



