Stonebraker Seeks to Invert the Computing Paradigm with DBOS (datanami.com)
89 points by signa11 73 days ago | 47 comments



Looks cool, but if you're building a stack for everyone else to build atop, why would you start with non-bare-metal languages? I see C++ is in there, which is fine depending on who is wielding it, but Java and TypeScript necessarily add overhead that, to me, just doesn't make sense to impose on every system that wants to run on your platform.


Java was used in the first iteration and was replaced by TypeScript.


So they replaced an entire language with a linter?


I worked at one place that had invested heavily in Pick-style databases back when Pick databases still made sense. For some of the crusty old Pick programmers still around, that was the OS: it was their development environment and shell.

https://en.wikipedia.org/wiki/Pick_operating_system

Note: I don't think they had any actual Pick systems left. By the time I got there they were mostly using UniData on Linux, but there was still a lot of UniData on SCO boxes around.


> It is named after one of its developers, Dick Pick.

You gotta be fucking kidding me


And:

> Pick was originally implemented as the Generalized Information Retrieval Language System (GIRLS)

As an aside, though, I really enjoyed working on a Pick system a few decades back.


I have been thinking about that idea (putting a database in front of the operating system, i.e. applications interact with the environment through the database) for a while now. I think it's really neat.

Another good reason is testing and verification. Your application becomes decoupled from the OS resources, and you can verify what it does to them by comparing the database state before and after. Because the OS state is managed by the database, it is formally described and you can verify what your application does to it. In some sense, it's an implementation of the functional core/imperative shell idea.
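To make the "verify by comparing the database" point concrete, here is a minimal sketch using Python's built-in sqlite3. The `files` table, its columns, and `app_under_test` are made up for illustration and are not anything DBOS actually exposes; the only assumption is that the application's side effects are all database writes.

```python
import sqlite3

def snapshot(conn):
    """Return the contents of the (hypothetical) files table as a set of rows."""
    return set(conn.execute("SELECT path, size FROM files"))

def app_under_test(conn):
    """Stand-in for an application whose only side effects are database writes."""
    conn.execute("INSERT INTO files VALUES ('/tmp/report.txt', 120)")
    conn.execute("UPDATE files SET size = 0 WHERE path = '/var/log/app.log'")
    conn.commit()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (path TEXT PRIMARY KEY, size INTEGER)")
conn.execute("INSERT INTO files VALUES ('/var/log/app.log', 4096)")
conn.commit()

before = snapshot(conn)
app_under_test(conn)
after = snapshot(conn)

# The set differences are a complete description of what the app changed.
added = after - before      # rows that are new or carry new values
removed = before - after    # rows that disappeared or were overwritten
```

A test can then assert on `added` and `removed` directly instead of poking at the file system.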

Also, because DBs are transactional, transactions will become a first-class concept in your operating system, which is something that's long overdue for applications (although there are systems such as CICS that have been doing this for some time now).
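As a rough illustration of what transactional resource updates would buy you, here is a sketch with sqlite3. The `quota` table and the `transfer` helper are hypothetical, not part of DBOS or CICS; the point is only that a failed multi-step update leaves no partial state behind.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE quota (owner TEXT PRIMARY KEY, blocks INTEGER CHECK (blocks >= 0))"
)
conn.executemany("INSERT INTO quota VALUES (?, ?)", [("alice", 10), ("bob", 0)])
conn.commit()

def transfer(conn, src, dst, n):
    """Move n blocks from src to dst atomically; return True on success."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("UPDATE quota SET blocks = blocks + ? WHERE owner = ?", (n, dst))
            conn.execute("UPDATE quota SET blocks = blocks - ? WHERE owner = ?", (n, src))
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint fired; the credit to dst was rolled back too.
        return False

ok = transfer(conn, "alice", "bob", 4)        # fits within alice's quota
failed = transfer(conn, "alice", "bob", 100)  # would drive alice negative
state = dict(conn.execute("SELECT owner, blocks FROM quota"))
```

The second call credits bob first and only then fails on alice, so without the rollback bob would end up with blocks that were never taken from anyone.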


Have you heard of PickOS [0]? Maybe similar to what you are describing. I admit I just read the first couple of paragraphs of the article and thought, yeah - PickOS vibes.

[0] https://en.wikipedia.org/wiki/Pick_operating_system


I was a Pick user back in the late 1980s and that was, indeed, my first thought too.


Been following this development here for some time… many earlier threads already have some great comments on the idea.

I find it both interesting and exciting to see a new OS for the cloud era. I do feel that DBOS is not unique, though: besides Pick (already mentioned), other threads have brought up Taurus OS and the Plan 9 project, which also leverages a database on a file system as a core for system resource management.

In my own thinking about a cloud OS, and particularly considering the focus on serverless applications mentioned in the article, I have been inspired by Forth's dictionary as a lightweight data structure concept, where the core state and application contexts are managed through modular dictionaries, each with isolated and secure contexts. The OS has a Core Dictionary, and each application is contained within its own namespace dictionary, etc.
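A toy sketch of the layered-dictionary idea, using Python's `collections.ChainMap` as a stand-in for a Forth-style dictionary search order. The names `core`, `make_namespace`, and the entries are invented for illustration. Note this only isolates writes; it does nothing for the "secure" part, since an app could still reach the core mapping directly.

```python
from collections import ChainMap

# Shared core dictionary: words every application can look up.
core = {"alloc": lambda n: bytearray(n), "version": "0.1"}

def make_namespace(core_dict):
    """Create an app-private dictionary layered over the shared core.

    Lookups search the private layer first, then fall back to the core.
    Writes always land in the private layer.
    """
    return ChainMap({}, core_dict)

app_a = make_namespace(core)
app_b = make_namespace(core)

app_a["greeting"] = "hello from A"   # visible only to app A
app_a["version"] = "A-override"      # shadows the core word for app A only
buf = app_b["alloc"](8)              # app B still resolves core words
```

After this, `app_b["version"]` is still the core value and `"greeting"` is not visible from `app_b`.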

Message passing between applications could be handled efficiently using a tagging/queue system, potentially extending to RDMA for distributed environments. This could offer a lightweight, performance-optimized alternative to current OS options, particularly in resource-constrained or highly distributed systems for serverless infrastructures.
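For the tagging/queue part, something as small as per-tag FIFO queues would do as a first sketch. `TaggedBus` and its methods are hypothetical names, everything is in-process, and the RDMA extension is not modeled at all.

```python
from collections import defaultdict, deque

class TaggedBus:
    """Per-tag FIFO queues; an in-process stand-in for the tagging/queue idea."""

    def __init__(self):
        self._queues = defaultdict(deque)

    def send(self, tag, payload):
        """Append a message to the queue for this tag."""
        self._queues[tag].append(payload)

    def receive(self, tag):
        """Pop the oldest message for this tag, or return None if there is none."""
        q = self._queues.get(tag)
        return q.popleft() if q else None

bus = TaggedBus()
bus.send("app_b.inbox", {"op": "resize", "w": 640})
bus.send("app_b.inbox", {"op": "redraw"})
first = bus.receive("app_b.inbox")
nothing = bus.receive("app_c.inbox")
```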


This isn't a bad idea. However, assuming heterogeneous hardware (CPUs, GPUs, RAM, network, capabilities), the OS problem is primarily a driver problem. Resource management seems like a candidate for a microkernel, and this can expose a relational interface to a distributed OS. The distributed OS is primarily a resource manager and a good fit for a similar relational interface. Given that, besides drivers, state management (tasks) is a relatively big issue, the whole setup is not far from a DB architecture, and most managers/schedulers are arguably easy to generalize to a relational interface. In fact, most distributed systems run on some form of distributed state management / DB (e.g. Raft, ZooKeeper).
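To show what a relational interface to a scheduler might look like, here is a toy sketch with sqlite3 where placement is just a join between tasks and nodes. The schema (`nodes`, `tasks`, and their columns) is invented for illustration and is not taken from DBOS or any real scheduler.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE nodes (name TEXT PRIMARY KEY, free_cpus INTEGER, has_gpu INTEGER);
CREATE TABLE tasks (id INTEGER PRIMARY KEY, cpus INTEGER, needs_gpu INTEGER, priority INTEGER);
INSERT INTO nodes VALUES ('n1', 4, 0), ('n2', 8, 1);
INSERT INTO tasks VALUES (1, 2, 0, 5), (2, 6, 1, 9), (3, 16, 0, 7);
""")

# Candidate placements: every (task, node) pair where the node has enough
# free CPUs and a GPU if the task needs one. Highest priority first, and
# within a task the smallest node that fits first.
placements = conn.execute("""
    SELECT t.id, n.name
    FROM tasks t JOIN nodes n
      ON n.free_cpus >= t.cpus AND n.has_gpu >= t.needs_gpu
    ORDER BY t.priority DESC, n.free_cpus ASC
""").fetchall()
```

Task 3 asks for 16 CPUs and simply drops out of the result, which is the relational way of saying "unschedulable right now".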

This said, I think the more granular approach for the kernel is maybe a cleaner way to think about it. Also, any pragmatic approach would need to take drivers into account and I wonder if it's realistic to assume anything else but major kernels.


As they seem to be targeting only the cloud, the set of drivers you actually need to implement is relatively small. E.g. plenty of "toy" OSes get away with implementing just a few virtio drivers and hence can run on QEMU, KVM, and AWS, which is enough for them.


Fair enough, but wouldn't a virtualization layer in between defeat the purpose?


I'm not quite sure what you mean, but if you target the cloud as the platform (as is stated in the article), then you always have a virtualization layer, even if you provision on something like AWS Bare Metal Instances.

(The reason bare metal instances use virtualization is not obvious: It's so you don't try to reflash the firmware on devices for a persistent attack.)


I'm thinking of Amdahl's law. Any effort to disaggregate OS and kernel for performance is capped by the performance of the virtualization layer (about which I know too little to have any intuition).


Virtualization adds a constant overhead to various I/O operations, usually reckoned to be 2-5%, and nothing to CPU bound processes since the CPU just executes userspace instructions as normal. For AWS the overhead will be less since they use a specialized partitioning hypervisor and a lot of custom hardware assistance, including paravirt I/O devices implemented directly in hardware. This small overhead is almost always a good trade-off for the convenience of virt / cloud, such as easy provisioning, live migration, hardware independence and so on. The DBOS decision to only target the cloud makes lots of sense.
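A back-of-the-envelope version of that claim, as a sketch: if only the I/O share of runtime pays the virtualization tax, the overall slowdown is the I/O fraction times the overhead. The function name and the 40% I/O workload are assumptions for illustration; the 2-5% range is the figure quoted above.

```python
def virtualized_slowdown(io_fraction, io_overhead):
    """Runtime multiplier when only the I/O share of the work gets slower.

    io_fraction: share of native runtime spent in I/O (0.0 to 1.0)
    io_overhead: relative virtualization cost on I/O (0.05 means 5%)
    """
    cpu_part = 1.0 - io_fraction                  # unaffected by virtualization
    io_part = io_fraction * (1.0 + io_overhead)   # pays the overhead
    return cpu_part + io_part

cpu_bound = virtualized_slowdown(0.0, 0.05)   # no I/O, so no penalty
mixed = virtualized_slowdown(0.4, 0.05)       # 40% I/O at the high end of the range
```

So even a fairly I/O-heavy workload at the 5% end comes out around 2% slower overall, which is the sense in which the virtualization layer does not cap much.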



This boils down to a serverless platform with storage and scheduling - which is a great idea!

Add reactivity and you get Convex: https://www.convex.dev/


I think that they are more radical - it seems like they have implemented a unikernel that supports the database service and their claim is that that's all you need for app dev.

I think it looks rather similar (as a component) to something like Snowflake: lots of data-optimised services, but lower level and with more app capability. The problem they will have is that a lot of the benefit they are pushing is already provided by the big cloud data warehouses, and the pain of a total re-implementation rather than a re-architect or lift-and-shift completely terrifies most CIOs.


Where's the code? I see a TypeScript framework for writing database-using programs, but I see nothing resembling a kernel or an OS.

It seems the SaaS is proprietary. This is weird since even the article points out they got early feedback that proprietary systems weren't interesting to people:

> The first version of DBOS was written in Java and used VoltDB, the fast relational database created by Stonebraker over a decade ago. But early feedback from interested parties said a proprietary system was a no-go, so the commercial version was rewritten to use FoundationDB [...]


Interesting - we've been hearing about a "database file system" for some time now, and it's exciting to see it popping up slowly. But DBOS still has a long way to go - it currently runs only on AWS and you are limited to TypeScript for development (which is a curious choice - any idea why?).

> DBOS will resist being on-prem and will resist being POSIX compliant, Stonebraker said ... They may also want things like support for Python and Java programming environments, and support for running in Azure and GCP, which will be determined in the future.


BeOS had a database-based filesystem way back in the early 1990s.

It was slow and difficult to keep up to date. They swapped it out with BFS and the rest was history.

It's possible this will do better than Be's old database system, but this is not a novel idea.


In the article he says that DBs are now much faster.


People are also much happier these days to introduce a factor of 10 slowdown relative to what the hardware can natively support.


IBM big iron systems still use databases as the file system.


Reminds me of what Microsoft was trying to do for Longhorn.

https://en.wikipedia.org/wiki/WinFS



This makes me think of the IBM i (fka OS/400) operating system, where the DB is so tightly integrated with the OS. Originally DB2 did not have a name because it was so tightly part of the OS; it didn't need one until it came time to compare it to other DBs and make standalone offerings of it. So "IBM i" is basically DB2OS, then?


The original (ca. 1988) name of the OS/400 SQL facility was SQL/400.

DB2 was originally an unrelated RDBMS that ran on IBM's MVS mainframe OS, announced in 1983.

Several years later, IBM expanded the use of the DB2 name to refer to a variety of separately-developed RDBMS products (the original DB2 for MVS, an older mainframe database that ran on DOS/VSE and VM/CMS, the AS/400 database, and the AIX RDBMS that eventually became DB2 for Linux, UNIX, and Windows).



Oh, weird. Their old abandoned code is at https://dbos-project.github.io/ https://github.com/DBOS-project with no indication of this move.


Sounds a lot like OS/400 to my ears. A machine interface that virtualizes the hardware and a database on top of that which applications are built upon. So, maybe not that new of an idea?


Wasn't the old "38" ( https://en.wikipedia.org/wiki/IBM_System/38 ) already built upon this concept?


For a thread on this from an earlier discussion, see: https://news.ycombinator.com/item?id=32705776


It's called IBM i these days. DB2 originally didn't have a name because it was part of OS/400... Later the DB2 name was applied and standalone versions were released. "IBM i" is basically DB2OS if we want to talk about flipping the paradigm.


DB2 was not part of OS/400 -- it was for MVS. Its early history is covered in this: https://mcjones.org/System_R/SQL_Reunion_95/index.html


Sounds kind of like what we live in. The universe is the database and life the operating system(s).


And the network is the computer :)


There is still an OS. Calling it a “minimal kernel” in the diagram is just semantic word games.


Why should this be surprising at all?

Big iron RDBMSs have had the enterprise option to run as full OSes for quite some time now.


By 'invert the computing paradigm' they seem to mean 'reinvent IBM MVS and CICS from the 1970s'. Of course, it's still around and dominates some industries, but hey, where's the marketing spin in that?


If current OSes are large enough that it makes sense to put an entire DBMS under them, that sounds like one of the best arguments I've heard for microkernels since the 1980s (1970s?).

(one might imagine that a microkernel schedules cores [time] and provides block access/mapping to storage layers [space]; while the DB layers would build first locking and tables, and then transactions, on top of that; after which come applications...)


That's exactly how QNX works today, except all drivers (e.g. block device access, storage space) are just userspace processes, as is the database layer. But in that case, the OS is not layered on top of the database; they're all peers.


The way it could be different this time is that we have arrived at "The Network is the Computer" age. A datacenter running Facebook can look like one giant computer, so it could make sense to have a distributed operating system to run that on with a distributed database under it. Of course this effort may be unrelated to this view.


The OS currently known as "IBM i" (fka OS/400) is in essence "DB2 OS". DB2 is also available as a "standalone" RDBMS, and they rename it every decade or so for marketing.


Not sure if relational was the optimal choice. A graph db like Neo4j might be more effective.


[flagged]


I wonder how much will be open source, and how much is special sauce.

https://github.com/dbos-project

> This repository is research code and no longer maintained. The students working on it have graduated

This is from "apiary". Lotus also seems experimental and limited to GCP, so I guess not?



