Oxide Computer: Docs

corysama · on July 20, 2023

Oxide's "On the Metal" podcast is an incredibly fun deep dive into technical issues no one should have to deal with as told by people who lived them. "Deep" as in: software DRAM drivers, Ring -1 security, bespoke motherboard designs... I only wish there were more episodes.

https://oxide.computer/podcasts/on-the-metal

I just noticed they have a second podcast. I'll assume it's just as good.

https://oxide.computer/podcasts/oxide-and-friends

bcantrill · on July 20, 2023

For whatever it's worth, I think the reason the doc site was submitted to HN recently is in fact because of our recent episode on the frontend.[0] We have really enjoyed doing Oxide and Friends, and if you're an On the Metal listener, we think you'll find a lot to like!

[0] https://oxide-and-friends.transistor.fm/episodes/the-fronten...

pjmlp · on July 21, 2023

By the way, Robot was initially tested at Nokia Networks Oy, and we on the NetAct business unit weren't that found of it.

Otherwise, another interesting episode.

jdougan · on July 22, 2023

Please bring Jon Blow back!

rtpg · on July 21, 2023

Oxide + Friends is very good. It's very motivating to listen to and gets me excited to try and build new things. Way more positive vibes being put out there than a lot of stuff.

yardie · on July 20, 2023

O+F is really good especially if you can make it to the Discord Server

https://discord.gg/gcQxNHAKCB

LoganDark · on July 20, 2023

Are these available in text or article format?

gorkish · on July 21, 2023

O+F is posted to youtube, I get the auto-generated transcript from there. It's unscripted/unstructured, so if you are primarily wanting to hear discussion of the oxide hardware/software/progress it's pretty difficult to consume as an audio program.

This is not a dig on the program at all; I'm glad they are making the time to produce it, and I'd rather they spend their effort getting racks out the door instead of generating marketing hype.

realitythreek · on July 21, 2023

I always miss the live Discord call and never realized they were published to YT! Thanks for the notice.

ahl · on July 22, 2023

Clearly we need to do much more self-promotion!

helf · on July 20, 2023

This sounds like a great listen. Thanks for mentioning it

howinteresting · on July 20, 2023

[flagged]

dang · on July 21, 2023

We didn't shadowban them, we told them we were banning them: https://news.ycombinator.com/item?id=34916347, we told them why, and it wasn't the reason you've presumed.

When people presume things about moderation, they almost always get it wrong. That's a pity because all anyone has to do to get the correct answer is ask us.

If you see an account that's banned and you don't think it should be, please let us know at hn@ycombinator.com so we can take a look. We've unbanned quite a few accounts that way. In the meantime, you can vouch for their good comments (see https://news.ycombinator.com/newsfaq.html#cvouch). But please don't divert the thread with offtopic noise that is likely to be incorrect.

usefulcat · on July 20, 2023

For that one comment, probably not, but maybe for dozens of previous flagged and/or dead comments

howinteresting · on July 20, 2023

Once you're shadowbanned your comments start off as dead by default and have to be vouched to be visible.

usefulcat · on July 21, 2023

Ah, I wasn't aware of that--thank you for the clarification.

mburns · on July 21, 2023

looking at his (recent) comment history, I can’t imagine this is the one in particular that got him shadow banned.

EvanAnderson · on July 20, 2023

Everything about Oxide's gear sounds like fun. I imagine it must be a bit like what working with minicomputers in 70s thru the 90s was like.

I did a little work in the late 90s with Alpha-based machines. I was impressed at those machines didn't seem like the hack-job crap that PC-based stuff was (with simulated chips from the early 1980's hiding out in dark corners because "compatibility") and still is today. I'm betting working with Sun gear felt similar, though I never got to work with it. Just having an honest-to-God serial console, as opposed to crappy bag-on-the-side things that scrape video memory and pretend to be "legacy" PC input devices, would be an amazing thing.

I'll never be able to work with their stuff because I don't work with Customers at that scale. I'm also vastly unqualified to work for them at their current stage. I suppose maybe someday they'll need field service technicians... I can hope, I guess.

soneil · on July 20, 2023

I really wish they'd do a tour of it, hardware & software. They've gotta be proud as hell of what they've pulled off, it boggles my mind that they're not more eager to show it off.

EvanAnderson · on July 20, 2023

I'd fly out to their factory on my dime and pay to watch a dog-and-pony show and hear a Q&A. I'm that excited about this stuff. Even if I just end up being a Customer of somebody who hosts my VM on this gear it's plenty exciting. It actually feels like a new computer, as opposed to the same old, same old offerings from the incumbents.

newsclues · on July 21, 2023

They need to get Linus Tech Tips to do a video on the machines and people of Oxide.

wmf · on July 21, 2023

And then void the warranty by water cooling it.

newsclues · on July 21, 2023

Drop a sled first, then try to watercool one! It would be fun content.

mst · on July 21, 2023

Given it was only relatively recently they were talking about getting their first deployment out the door, I'd suspect that staying mostly quiet is strategically wise in terms of avoiding having more prospective customers come in the door at once than they can (as yet) service to the standard they want to.

So "they probably really really -want- to show it off, but are showing remarkable restraint for good reason" seems at the very least plausible to me.

Aurornis · on July 21, 2023

> I imagine it must be a bit like what working with minicomputers in 70s thru the 90s was like.

I had a chance to talk to someone who worked at Oxide once. I got excited because my experience matches up nicely with their products on several levels.

The person got excited as well and started talking about how to interview there, but the conversation got kind of weird. They kept emphasizing how important it was to not talk about compensation in the interview. Apparently they pay engineers all the same comp (not bad, but would have been a significant step down from every offer I received during that time of my life) and they select for people who aren't interested in getting paid a lot for their unique skills.

Probably not a bad deal for people who like working in that domain with like minded people. At the time I got some very uneasy feelings from being aggressively coached to not bring up compensation or ask any questions about equity during the interview like it was some unspoken rule that would get me disqualified. Maybe the person was exaggerating, but I found at least one other person with a similar story.

Honestly, their comp would have been awesome if I was a single guy living in a low cost of living location and working remote, but at the time it would have meant giving up quite a bit to work for an early stage startup with high expectations and an unspoken rule that I should never ask about compensation.

bcantrill · on July 21, 2023

This conversation strikes me as unlikely on several levels. First, no one would have coached you on "how to interview at Oxide" because that's not where the process starts -- it starts with you preparing your materials.[0] (Our review of the materials constitutes ~95% of our process.) Second, we have always been very explicit about compensation (that is, we ourselves brought it up early in conversations); no one at Oxide would tell you to "not bring it up" because everyone at Oxide knows that it is a subject dealt with early in the process. And finally, this is all assuming that you were talking to someone before March 2021, when we published our blog post on it.[1] After the blog post, compensation simply doesn't come up: everyone has seen it -- and indeed, our approach to compensation is part of what attracted them to the company!

[0] https://docs.google.com/document/d/1Xtofg-fMQfZoq8Y3oSAKjEgD...

[1] https://oxide.computer/blog/compensation-as-a-reflection-of-...

alexgartrell · on July 21, 2023

The person to whom you are replying clearly meant (IMO) that you shouldn’t ask for more compensation or you will make people act defensive. Frankly, your reply reads a little defensive so maybe that’s not awful advice?

It also seems like this was a spontaneous initial conversation and not part of the process, so I’m not sure why you are suggesting that they made it up.

gary_0 · on July 21, 2023

I didn't realize there are two ways of looking at it until your comment. It didn't occur to me that if you randomly have Oxide on a list of 40 other companies to apply to, and you expect to talk to them about compensation as you would with most others, you're going to have a weird time because they have an uncommon policy.

But on the other hand, Oxide is very up-front about it, and their CTO is happy to go on HN and chat about it. So not knowing about it makes you look like you didn't do your research before the interview, or knowing about it and trying to force the issue anyways makes you look kind of arrogant (if you don't agree with it, you can just not apply).

Aurornis · on July 21, 2023

> It also seems like this was a spontaneous initial conversation and not part of the process, so I’m not sure why you are suggesting that they made it up.

Yes, it was an initial conversation as I said. When I asked about equity compensation I was told that compensation discussions are to be avoided because bringing it up could be considered a negative by the company.

Aurornis · on July 21, 2023

> First, no one would have coached you on "how to interview at Oxide" because that's not where the process starts

I was using “interview” as a generic term for applying to a company, not literally referring to your internal process.

I didn’t interview with or even pursue Oxide after the conversation (and never said I did)

This all came up because I asked the person what the equity compensation was like. Not an unreasonable question when talking about a startup. That’s when they started advising me that I shouldn’t bring it up and it’s not something they talk about. After that I got uncomfortable about pursuing a company that discourages any conversation about compensation to the extent that someone felt necessary to warn me about it when I hadn’t even applied.

> no one at Oxide would tell you to "not bring it up" because everyone at Oxide knows that it is a subject dealt with early in the process.

They were trying to tell me that it was important that I avoid giving the impression that I cared about compensation, as that would be a negative if I talked to anyone else at the company. Just repeating what I was told.

> And finally, this is all assuming that you were talking to someone before March 2021, when we published our blog post on it.[1]

No, they told me to look up the blog post, but I had not read the company blog before taking to this person.

> After the blog post, compensation simply doesn't come up: everyone has seen it -- and indeed, our approach to compensation is part of what attracted them to the company!

Or maybe your approach to compensation is what filters people out of the application pipeline? I don’t think it’s realistic to think that this compensation strategy is what attracts people to the company rather than pre-selecting people out.

I read the blog post, but I feel like I’m missing the equity portion of the conversation still.

Regardless, is it so hard to believe that compensation “simply doesn’t come up” because potential candidates (like me) are sometimes coached to not bring it up? Or that the company’s stance appears to discourage bringing it up? This feels like some circular logic: Nobody brings it up because we discourage people from bringing it up.

bcantrill · on July 21, 2023

It definitely wouldn't be viewed as a negative to talk about it, and in fact our transparency on this topic makes it very easy to talk about directly. So we absolutely don't discourage talking about it -- but it's also true that people for whom the compensation is going to make Oxide impossible do self-select out. And that's okay! People have different needs at different stages of their career, and there are different things that they want; there is nothing wrong with optimizing for compensation -- but it's also true that Oxide is very unlikely to be a fit for someone optimizing for compensation, for many reasons.

apple4ever · on July 22, 2023

> People have different needs at different stages of their career, and there are different things that they want; there is nothing wrong with optimizing for compensation -- but it's also true that Oxide is very unlikely to be a fit for someone optimizing for compensation, for many reasons.

I think you are missing that there are other reasons to avoid Oxide that doesn't have to do with optimizing for compensation. They may very well be optimizing for something else, but compensation is still a data point and while they may take less money, maybe not too much less.

bcantrill · on July 22, 2023

Oh, there are definitely many reasons to not work at Oxide -- not least that it's hard, grueling work! (Indeed, part of our process is getting candidates sufficiently understanding what the work actually entails to allow them to make a decision for themselves.)

petesergeant · on July 21, 2023

Hey Bryan, I enjoyed reading the cash compensation article, but I'm curious about how it meshes with your equity compensation? Is it awarded purely on when people joined? Does everyone who wasn't a founder get the same amount? It feels like you could run into much of the same problems the article points out about transparency and so on if the equity component isn't as straight-forward?

heudhdidh · on July 21, 2023

speaking as someone who dropped on the first round: for the nobodies like me, they say equity is based on time joining. it was early 2021 i think and amount already published on the job or mentioned pretty early (don't really recall)... and it was brutally low considering the pay cut. maybe they keep it to negotiate the somebodys.

jjav · on July 22, 2023

> Apparently they pay engineers all the same comp

Yes, that is a sad point. I love Oxide and everything about them. It is the only company in Silicon Valley that I'm honestly kid-like excited about (I fake the "passion" for others but don't feel it). My partner is probably tired of hearing me drool about this one company I wish I could work for...

But with a family to support, it's never going to happen. The pay cut would be brutal, so I never apply. If I ever become an independently wealthy multimillionaire, the first thing I'll do it apply to Oxide. As long as I need a paycheck, it's impossible.

dist1ll · on July 20, 2023

As someone who's only dealt with commodity server hardware, these specs make me salivate. And all these boot/management TUIs are just so satisfying to look at.

(Sorry to be that guy, but just a friendly suggestion: high contrast dark themes are difficult to read for people with astigmatism. Especially since this is technical documentation, intended to be thoroughly read, you might want to consider a light theme toggle.)

tiffanyh · on July 20, 2023

> As someone who's only dealt with commodity server hardware, these specs make me salivate

Commodity servers have the same specs.

Here’s a reseller of SuperMicro servers, where you can buy similar compute on the cheap.

https://www.siliconmechanics.com/systems/servers/rackform

EvanAnderson · on July 20, 2023

I haven't worked with SuperMicro (aside from having it inside "appliance" devices that I've worked adjacent to), but I assume my experience with Dell and HP commodity servers are similar.

The thick layer of hardware contrivances necessary to maintain IBM PC compatibility is unnecessary for the task of bulk hosting of x86/x64 VMs. There's a lot of hardware and software that just doesn't need to be there.

Bare metal out-of-band management ends up being bolted-on to these "legacy" contrivances (scraping video memory for remote consoles, faking being USB peripherals). A serial console or SSH connection to a service processor would be vastly superior. I can't begin to count how many times an iDRAC "lied" to me about issues with a machine, or how many times the solution was "upgrade the iDRAC firmware and reboot it".

I have been mostly unimpressed with the quality of firmware for motherboards, baseboard management controllers, RAID adapters, NICs, HBAs, power supplies, backplanes, front panels, etc. Every new model of system or component ends up being an exercise in fear / anticipation of problems. The integrator has very little power over the firmware quality and I can be assured that if I do have a firmware-induced issue I'm many, many steps away from actually communicating with somebody who can help.

Granted, maybe if I was buying at the scale of Oxide's prospective Customers I'd have some pull with the integrators, but I'm skeptical of that, even.

Oxide is actually building computers. Putting commodity motherboards into boxes with other commodity components won't ever have the level of attention to detail and integration that Oxide can provide.

Aurornis · on July 21, 2023

> Commodity servers have the same specs.

Probably not a coincidence. It would be interesting to know which ODM they partnered with for the hardware.

I've done some work with SuperMicro in the past. Some of their boards come with extensive headers and customization options right out of the box. They're also happy to work on board level customizations with the right contracts in place.

bcantrill · on July 21, 2023

We didn't work with an ODM: the ODMs were unwilling to contemplate some of the most basic things we needed (e.g., replacing the BMC with a much lighter weight service processor, having a true root-of-trust, etc.) -- let alone the more things we wanted to do (e.g., our own switch). The compute sled and the switch are both of our own design and look nothing like what you'll find from an ODM; if you're curious in the details, we have discussed them quite a bit in our Oxide and Friends podcast.[0][1][2]

[0] https://oxide-and-friends.transistor.fm/episodes/tales-from-...

[1] https://oxide-and-friends.transistor.fm/episodes/the-sidecar...

[2] https://oxide-and-friends.transistor.fm/episodes/bringup-lab...

mst · on July 21, 2023

Please consider generating some sort of automated transcript from these.

I'd hope your target audience would understand the limitations of such a thing, and I'm probably not the only person who'd rather read than listen even with the obvious caveats.

(these days automated transcripts seem to be no harder to mentally fix up the errors in as I read them than "somebody typing fast on a software keyboard and suffering the inevitable tyop and autocorrupt related issues" is, though of course others' mileage may vary)

actionfromafar · on July 21, 2023

They could probably have someone proofread the transcripts, there aren't that many episodes.

mst · on July 21, 2023

I was going for "set up some code once and don't think about it again" to maximise the odds of the idea sounding tempting.

Proofreading would set up an expectation on the part of readers that it -had- been proofread and corrected and therefore a commitment to perform a repeated "boring but important" task going forwards for whoever's doing said proofreading.

That way would likely lie either delayed transcripts or never getting to initial activation energy to provide anything at all.

So I think "add a quick bit of code to your podcast publishing workflow and a CAVEAT IN BIG LETTERS" is better to do first.

If it turns out enough people care about the transcript, doing it a more labour intensive nicer way later is something they can decide, well, later.

Shipping is feature zero, as ever.

actionfromafar · on July 21, 2023

I hate bad transcripts.

mst · on July 23, 2023

The automatic transcribers have (pretty recently) reached the point where I'd rather have their output than not.

This came as something of a surprise to me - six months ago I'd likely have been enthusiastically agreeing with you.

As an example, the transcript tab on https://www.thebulwark.com/podcast-episode/tom-nichols-jack-... was pretty readable to me in spite of the errors. Whether you'll find it the same is, of course, a separate question.

intelVISA · on July 21, 2023

As much as I wish Oxide the best, most of us here would probably prefer our own libre, hand-crafted OS atop Coreboot.

(Or maybe I speak for myself only...)

sjdmdlakziggy · on July 21, 2023

They are not even close to the same specs.

upon_drumhead · on July 21, 2023

https://www.siliconmechanics.com/system/rackform-a235.v8.1/2...

This is pretty darn close to the "Gimlet" Compute Sleds

One 225W TDP 64-core AMD Milan CPU 1 TiB of DRAM across 16 DIMMs 12 front-facing hot-swappable PCIe Gen 4 U.2 storage devices 2 internal M.2 devices 2 ports of 100 GbE networking

Point for point, the same hardware as specified.

benjaminleonard · on July 20, 2023

All of our sites and indeed the web console itself are driven by the same design system and we are planning on a light theme as soon as we can.

Thanks!

all2 · on July 20, 2023

> sorry to be that guy ... astigmatism

Darkreader is a lovely plugin that might make your life much easier: https://addons.mozilla.org/en-US/firefox/addon/darkreader/

I believe there is a Chrome extension, too.

It lets you set BG, FG, sepia, total contrast, etc. It is quite a neat piece of kit.

pxc · on July 21, 2023

I recommend Midnight Lizard as a backup for difficult sites. It's a little heavier/slower, but sometimes works on pages where Dark Reader doesn't. You can configure Midnight Lizard not to apply to all sites by default, then selectively turn it on where Dark Reader fails.

bpye · on July 21, 2023

I would never have thought to use an extension intended for adding dark mode, to instead view dark mode websites in a light mode, but that makes total sense!

pxc · on July 21, 2023

> high contrast dark themes are difficult to read for people with astigmatism.

Are you yourself affected by this? I have astigmatism and I keep hearing this, but I've never experienced it. If you are affected, do you keep your screen at high brightness? I'm wondering if it doesn't happen to me only because my astigmatism is mild, or if the fact that I tend/have to keep my screens at relatively low brightness plays a role.

Incidentally, for different reasons, high contrast dark themes can be problematic for me as well (especially at high brightness). Dark Reader and Midnight Lizard are essential for me in keeping contrast in a comfortable range.

benjaminwootton · on July 21, 2023

I was about to ask the same question. I think I have moderate astigmatism and find dark mode easier on the eyes.

jna_sh · on July 21, 2023

I have astigmatism and loathe dark mode. I find low contrast and soft colours easiest.

jxf · on July 21, 2023

> boot/management TUIs

Where are you seeing these in the docs? Or are those somewhere else?

dist1ll · on July 21, 2023

Here https://docs.oxide.computer/guides/system/initial-rack-setup

drbawb · on July 20, 2023

I am super interested in learning more about the storage subsystem! I figured they'd be using ZFS, given the people involved, but it appears they've also gone ahead and built a clustered FS (crucible) on top of it? I figured something like that would be necessary to handle fault tolerance at the gimlet level. (Losing an entire shelf / drive controller, etc.) Getting ZFS to go multi-node is surely a neat trick.

Second to that I just want to say the presentation of these docs is top notch. (I so desperately wish I was the target customer for these systems; reading these docs makes me want to do terrible things to my electrical service and play with one of these racks.)

kaliszad · on July 21, 2023

They use Crucible on top of ZFS. https://github.com/oxidecomputer/crucible I don't think they have anything for S3-like service but there are other options for that, e.g. https://garagehq.deuxfleurs.fr or MinIO. I am not sure whether they have their own SSDs or use of the shelf SSDs just with their firmware or something.

yencabulator · on July 21, 2023

They implemented a block store with replication from scratch? That's kinda brave, considering that that's a project big enough to justify full startups for!

Timothycquinn · on July 24, 2023

However, the folks at Oxide are at the top of the game for this space with dozens of years of experience in building and testing such systems. Secondly Oxide's crucible stack is completely written from scratch in Rust, which dramatically reduces failure modes common to such stacks, which are often written in C / C++.

ori_b · on July 21, 2023

They're not the first company to do that. https://panzura.com/ did something similar.

solarkraft · on July 20, 2023

Finally I actually understand what they're building. Now I must ask: Why?

On-prem servers aren't a new invention. The market seems pretty saturated (and shrinking). Virtualization isn't a new invention either. The market seems pretty well served, at least commercially. Can the integration of both be a convincing enough advantage?

The management UI certainly looks nice; it's something I'd like to have on my KVM box at home (any good Proxmox alternatives?). I don't see why it'd have to be bound to an enormous server.

faitswulff · on July 20, 2023

One benefit is that they're competing in an industry where the time to get a sled up and running for compute can be measured in weeks (they say they've heard up to 90 days) whereas their solution is basically plug and play - Bryan was trying to get Steve to admit set up took "hours" whereas Steve was hedging and saying customers could get started "within a week."

They go into more details on their podcast, and this section in particular covers the bootstrap time: https://youtu.be/5P5Mk_IggE0?t=3381 Pretty fascinating stuff.

throwawaylinux · on July 21, 2023

Do I have it right that they ship their own hypervisor (that's based on maybe Solaris, not KVM) as firmware? Let's assume a very small team can compete on a technical level, it still seems like it could cut out a lot of the potential market.

I can't imagine that large cloud / "web scale" companies would want that. Most want a fair bit of control of their own hypervisor and management stacks based around KVM. And "enterprise" type companies are going to have issues with certification I would have thought -- will RedHat, Microsoft, SAP, Oracle, etc certify their supported products on top of this hypervisor? Seems like a difficult and expensive process.

So what's left? Companies that support their own virtual machine software but don't support their own hypervisor and don't like what's available from vmware or Microsoft or RedHat. A small niche. Or are my assumptions wrong?

panick21_ · on July 21, 2023

I think quite often when we assume 'most want to fair bit of control' is just not true. Enterprises want something that just works, they want control if they can't have something that just works.

If you have a team that is struggling building an internal cloud with all this control (and problems) and all this commodity hardware (and its problems) then maybe they would be happy to switch to something that just works.

> And "enterprise" type companies are going to have issues with certification I would have thought -- will RedHat, Microsoft, SAP, Oracle, etc certify their supported products on top of this hypervisor? Seems like a difficult and expensive process.

If that was the case and nobody running any of these would run their rack, then I wouldn't think they would not have received any funding. But I don't know enough about these certification process to really comment.

> Companies that support their own virtual machine software but don't support their own hypervisor and don't like what's available from vmware or Microsoft or RedHat

Non of these come with a fully integrated rack.

The competition would be somebody willing to buy a rack of Dell servers with VMWare software. Or somebody willing to buy a rack of Dell server and then use RedHat and set up all their own cloud style infrastructure.

throwawaylinux · on July 21, 2023

> I think quite often when we assume 'most want to fair bit of control' is just not true. Enterprises want something that just works, they want control if they can't have something that just works.

That's not what I'm assuming here. Read carefully, I divide the market into 3 categories. Those who support their own VM image software and hypervisors, those who support neither, and those who support VM image but not hypervisor.

First is Amazon, Google, Facebook and the like (and it's not an assumption we can see their public contributions to KVM, QEMU, etc., and hear their talks about some of what they use internally). Second is "enterprise" who wants something that just works. Third is ? and would they want to support their software on a niche hypervisor?

> If that was the case and nobody running any of these would run their rack, then I wouldn't think they would not have received any funding. But I don't know enough about these certification process to really comment.

Well it is the case that enterprise (supported) software is not just supported on any hypervisor. https://access.redhat.com/articles/973163 RHEL runs on their own KVM as well as MS, VMware, some cloud vendors. Some application software also gets certified to hardware and hypervisors, not just operating system (e.g., SAP does this).

> Non of these come with a fully integrated rack.

It's not fully integrated if it doesn't come with the guest software though, is it?

> The competition would be somebody willing to buy a rack of Dell servers with VMWare software. Or somebody willing to buy a rack of Dell server and then use RedHat and set up all their own cloud style infrastructure.

Right. And the problem for Oxide is that the competition will have fully certified and supported operating system and application software for their virtual machines.

kaliszad · on July 21, 2023

The hypervisor is based on Bhyve from FreeBSD + Propolis in user space. Illumos actually has/ had KVM and there is a talk by Bryan Cantrill where he speaks about the porting effort. All of that information is readily searchable.

jitl · on July 21, 2023

Public cloud like AWS is a premium product. When your web scale business wants to increase profits by cutting costs and you’ve already done a few passes making your software run faster, owning your own metal & renting colo space starts looking like a big avenue for savings. Especially if you’re doing something bandwidth intensive, where AWS makes you pay through the nose. I think we’ll see a fair number of companies move back towards owning metal, especially if the metal is super easy to manage. I think the Oxide pitch is making owning racks sensible for ~2000 eng companies instead of ~20,000 eng companies.

chamakits · on July 20, 2023

I'm not affiliated with them, but I recall this being marketed at some point as giving you the flexibility and customization powers that the Googles and Facebooks of the world have with their on-prem infrastructure without needing to have as deep of a dedicated staff as they do to just this which allowed them to develop all their custom tooling in the first place.

Basically if you are on-prem, and you are dissatisfied with what you are getting out of today's onprem sellers. Things like bad firmware with slow update cycles, issues with rack/power supplies/cabling/interconnecting systems. Closed down systems that don't allow much customization, etc. They are open sourcing a lot of their work along the way

Again, I'm not affiliated with them, and my info may be outdated so take it with a grain of salt. But that's how I've seen them for some time.

wmf · on July 20, 2023

I would say Oxide is inflexible and non-customizable since they have exactly one hardware configuration and few software features at this point. Their claim is more that their rack works and everything else on the market is full of bugs.

throwaway2037 · on July 21, 2023

I'm not an infra engineer, but this claim "everything else on the market is full of bugs" might be the killer app. Of course, it needs to be true. What if they iterate to an insanely stable embedded code base (BIOS, etc.)? Then, continuously upgrade the hardware to use the latest CPU/RAM/NVME. I could see that being very valuable.

yencabulator · on July 21, 2023

I fully expect there will be data-loss bugs and poor performance during recovery in their in-house distributed block storage solution. That's just in the nature of the problem domain, and this is all new code:

https://github.com/oxidecomputer/crucible

rtpg · on July 21, 2023

The short answer is that on-prem is used by a lot of companies for many reasons that go beyond "legacy/people don't know about cloud" (and even go beyond "regulatory environments"!), and these boxes are meant for people who have serious requirements.

I'd recommend the O+F ep posted in sibling, but I think here the pitch is "well you need this hardware anyways right? How about buying one that's easy to use and doesn't take a month to get working?" All built by people who are so obsessed with root cause analysis that they've ended up writing their own firmware, running on an OS where these people are common contributors.

solarkraft · on July 21, 2023

Thanks, I think now I get the excitement (from the engineering perspective more than the business one)! "Fixing everything" is probably everyone's dream.

mhh__ · on July 20, 2023

YOLO - sometimes it's worth paying more money for something someone has actually sat down and designed properly.

throwawaylinux · on July 21, 2023

So what has been designed properly and what hasn't?

mhh__ · on July 21, 2023

The random piles of servers in most offices above a certain headcount

bpye · on July 21, 2023

It feels like they’re trying to give people the benefits that hyperscale cloud operators have, without being one?

steveklabnik · on July 24, 2023

This is the most succinct pitch for Oxide, the problem is that many people do not know what the hyperscalers even are, so it doesn't land with a lot of folks.

superb_dev · on July 21, 2023

That is exactly their goal. Hyperscale for the rest of us

strangemonad · on July 20, 2023

A dumbed down interpretation. What most people can buy off the shelf for servers more or less looks like a pc shoved into an odd looking case (1u, 3U rack). To get anywhere near the cost / performance of the big players you need something that’s designed for the data center server workloads.

throwaway2037 · on July 21, 2023

Stupid question (no trolling, I promise): Could this product be valuable to public cloud vendors, like AWS, Google, Oracle, IBM, etc? Or even medium sized ones, like Linode? My thinking: Could this be the cloud inversion moment akin to TSMC and outsourcing semi-conductor manuf?

littlestymaar · on July 21, 2023

For big vendors, no. Because what oxide does is basically sell the kind of server that the hyperscalers have been building for themselves internally. But for smaller cloud provider, who aren't running custom hardware made for hyperscale and are instead using of the shelf servers with all their flaws, then it could make a lot of sense.

dbish · on July 21, 2023

Probably not. These vendors have massive supply lines that are designed and tuned to their needs.

dasil003 · on July 22, 2023

I am definitely not an infra engineer, but I see it as an attempt to fix problems that have accumulated over decades as standards calcified and no single vendor was able to improve due to interoperability issues. Some of these things can’t be tackled without a full end-to-end solution, hence the full server. It doesn’t mean components couldn’t be swappable in the future, but at the beginning it’s best to develop on a narrow set of hardware until a solid baseline of stability is reached. Sort of an Apple-like approach for large scale servers.

panick21_ · on July 21, 2023

Cars aren't a new invention either, and yet people build new ones.

There is an existing market, it is large. Making a product for that market that is just simply 'really good' has potential to make money.

Almost all great computer companies started into markets where things already existed that were comparable.

goalonetwo · on July 20, 2023

Oxide is an HN darling. I have never seen anything even close to negative about them here. I hope someone writes a case-study about this. It seems to be a mix of the charisma of the team/founders and their product that makes everyone love them.

I would put Wireguard/Tailscale in the same category as well.

Tuna-Fish · on July 21, 2023

Here's the big negative things I can come up with quickly:

1. It's not clear the market case closes, because while there are customers who would greatly benefit from their systems, those same customers are also very averse to change, meaning that most of them will probably sit out for the first few generations to see if they have staying power. If everyone does that, it becomes a self-fulfilling prophecy.

2. ... but this would have been fine, a few years ago. It's not fine in the current market situation, where there is much less easy money looking for a place to go. I sure hope they are sitting on a long runway.

3. Speaking of, they are very late in their execution. A SM4 platform that only shipped in 2023 is not great. I really hope they are far along on their SP5 development. (... But this also dampens current sales. I bet a lot of potential customers are thinking that a SP5 Oxide Rack seems a much more appealing than a SP4 one, so why not wait?)

But the reason they are not discussed that much is that people who understand the market are all really, really hoping that they pull it off. Because the current situation in server hardware is dire. In every thread people who don't know much about servers ask why not get a similar system from Dell or Supermicro or whoever. The answer is that the commodity servers are pieces of shit that requires significant local engineering resources to manage. Software quality of firmware is generally horrendous, and when this is pointed out to the vendors, their answer is that they know, but everyone else is just as bad, wontfix. A significant draw of the cloud is that all that pain goes away, because the hyperscalers realized how shit everything was and fixed it in their systems. It just never trickled down to the market below them.

panick21_ · on July 21, 2023

There are just very few people regularly buy 500k-1M in Server hardware and have a good understanding of the market and the finances involved in the alternative.

owenmarshall · on July 21, 2023

> I would put Wireguard/Tailscale in the same category as well.

Sometimes the hype exists for a good reason - as a paying enterprise Tailscale customer, I’ll fight you if you ever suggest I have to staff people to admin an IPsec or SSL-based VPN again.

HL33tibCe7 · on July 20, 2023

fly.io too, although the scales have fallen from the eyes of HN to some extent after their repeated outages

panick21_ · on July 21, 2023

It partly because they have a good podcast that is perfectly aligned with what many people on HN like.

Other then that, the code is open source. They also help sponsor both Rust and OpenFireware conferences.

azinman2 · on July 20, 2023

But are people buying their machines?

intvocoder · on July 20, 2023

They only started shipping this month... so, jury's out?

monocasa · on July 20, 2023

They recently shipped their first rack IIRC.

LoganDark · on July 20, 2023

no, companies are :)

newsclues · on July 21, 2023

Given they shipped their first machine, it seems so.

Roark66 · on July 21, 2023

It sound nice, but I've been building systems like that since 2012 (usually vmware vCloud director plus custom code). Using hardware like Fujitsu's cx1000, Nexus 1000 distributed vswitch, brocade network switches, and San storage. The systems I built back then could dynamically provision separate network/conpute/storage segments for PCI compliance and more. If I were to build something like this today I'd probably consider using Kubernetes too and integrating it with public cloud for scale out. This way a business can have their "own" cloud that is much more cost effective at certain scales and access to "unlimited" public cloud resources at the same time (at the cost of increase in system complexity).

Is this just more of the same, or is there some innovation there? It is interesting that computing trends go in centralise/decentralise cycles. We can observe this as far back as 1980s. I can't wait for the next "decentralise" cycle as I'm under an impression a lot more innovation happens during that phase.

panick21_ · on July 21, 2023

I think the vision is that a company doesn't need to heir and pay you and not pay vmware and they can have that kind of thing turn key.

The innovation is mostly in the integration between the hardware and software of both the server and the switch.

This podcast goes into what that enables: https://www.youtube.com/watch?v=AkWh2Sms3aw

> I can't wait for the next "decentralise" cycle as I'm under an impression a lot more innovation happens during that phase.

I think we are beyond that cycle. Centralization and decentralization happen at the same time. We have much more much cheaper chips now everywhere that do a lot more, and then we have local and distributed compute deployed everywhere and we also have large datacenters at the center of it.

Neiter datacenters nor distributed compute is gone go away anytime soon.

benrockwood · on July 20, 2023

It's great to see Oxide shipping. They have an incredibly talented team thats worked very hard for a long time in the best tradition of Sun Microsystems.

dsies · on July 21, 2023

Not being pedactic, I swear :)

Can you elaborate on the Sun comparison? I am a huge fan of Sun and what they did for computing at large - designing hardware, creating specs, their contributions to evolving unix and so on. I'm not sure how Oxide compares. Unless you're talking about "in the spirit of Sun".

cbarrick · on July 21, 2023

I'm not the one you replied to, so I don't know what they mean specifically about the "tradition of Sun", but

Bryan worked at Sun where he helped create DTrace.

After the Oracle acquisition, he left to join Joyent as VP of Engineering and then CTO. Steve was COO of Joyent. And I have heard similar comparisons between Joyent and Sun.

jeffrallen · on July 21, 2023

Sun was a combination hardware and software shop, which Bryan appreciated and has tried to replicate at Oxide. The only reason they have a chance today is because the hardware/firmware interface in most servers is terrible quality.

0x6c6f6c · on July 20, 2023

> Oxide Computer Model 0, also known as the “0x1”

So is it 1 or is it 0..

OJFord · on July 20, 2023

rafram · on July 21, 2023

It was a 0 in both names, but cosmic rays flipped one of the bits.

haimez · on July 20, 2023

Hopefully they end up with a German market only SKU before the next generation of hardware ships, otherwise it’s a real missed opportunity.

codetrotter · on July 20, 2023

I wonder how much the smallest, cheapest configuration will cost. I would really love to buy one of these. But I suspect it will cost 100x more than I can afford heh.

yencabulator · on July 21, 2023

Supermicro equivalent to what one of those sleds might be is around $14k. Times 32, plus switches etc, plus proprietary development margin, plus lack of volume manufacturing benefits.

https://www.siliconmechanics.com/system/rackform-a235.v8.1/2...

wmf · on July 20, 2023

Estimates in the previous thread were at least half a million.

codetrotter · on July 20, 2023

Yeah that’s about 100x more than I can afford exactly.

I think I will go back to my little collection of RPi Compute Module machines. And maybe some day in the very distant future I can buy big boy servers lol

masklinn · on July 21, 2023

Depending on your exact wants, I think there was an LTT video recently on how you can purchase older server blades for very reasonable prices, and upgrade them using also reasonably priced used server-grade CPUs and RAM. Obviously you won’t have the bang of new hardware, but if what you want to do is play around with enterprise-class systems…

mwcampbell · on July 21, 2023

I'm a little surprised that, according to the docs, "the Oxide rack does not come with any preloaded machine images" [1]. But I guess it makes sense that the initial, early-adopter customers wouldn't have a problem with rolling their own VM images. I'm at least glad that Oxide didn't decide to start by only allowing VMs that use a pre-defined set of pre-made images in some custom format.

[1]: https://docs.oxide.computer/guides/creating-and-sharing-imag...

wmf · on July 21, 2023

The preloaded images would all be Solaris anyway. ;-)

jclulow · on July 21, 2023

We don't call it Solaris anymore! :P

59nadir · on July 21, 2023

Surely Illumos? Solaris is probably dead to most people involved, a husk of what it was and could have been.

earthboundkid · on July 21, 2023

Does anyone know an order of magnitude for the price of a rack?

sitkack · on July 21, 2023

500k-1M-1.5M

wmf · on July 20, 2023

It's cool to see Oxide shipping RIFT routing well before networking vendors.

ocdtrekkie · on July 21, 2023

As an on-prem sysadmin, I'm not sure I will ever in my life work on an environment that has the minimum specifications of an Oxide system. Every entire datacenter I've worked in has less total capacity than a single "sled" here.

Are there intentions to go smaller than this?

steveklabnik · on July 24, 2023

> Are there intentions to go smaller than this?

(my personal perspective as an Oxide employee)

Not in the near future, when a lot of the design relies on the benefits of scale, scaling down kind of removes some of the benefits. But also just like, things are still early, there's not a lot of choice period.

That said, we do have a lot of fans who want to buy something, and it would be cool to figure out how to do that. But also, we gotta like, make and ship the primary product. So we'll see.

gorkish · on July 21, 2023

As someone who has been attempting to smash computers together into 'hyperconverged infrastructure' since before Y2K, I could not be happier for the buzz around Oxide; hopefully we'll see smaller scale versions of the concept out soon -- something in the 3-10 node size for SMB and/or a small scale-out system for SOHO. It seems insane to me that the tech to do this with off the shelf free software exists, but there's no way to buy it ready to go at small scale. Go get a quote for an HCI vmware buildout and DISMAY.

ThinkBeat · on July 20, 2023

I recently saw a horrible presentation by Microsoft on how they built the OpenAI supercomputer on Azure. (with custom bits).

How would an Oxide computer systems look like for for an "OpenAI super computer" compared to Azure?

pitaj · on July 21, 2023

That kind of magnitude of compute is more datacenter level. But either way, you're probably better off buying one of the all-in-one tightly integrated solutions from Nvidia or Intel.

oxide · on July 21, 2023

This is great, thanks for bringing this to my attention. Cool name, too.

javajosh · on July 20, 2023

I'm particularly impressed with their anti-tamper measures. "For each server sled, shine a light into the cubby to look for any physical tampering or damage."

https://docs.oxide.computer/guides/system/rack-installation-...

rhinoceraptor · on July 21, 2023

    The rack also comes with built-in security features to ensure all hardware and software are genuine Oxide products:

    purpose-built hardware root of trust (RoT) – present on every Oxide server and switch – cryptographically validates that its own firmware is genuine and unmodified
    encryption of data at rest via internal key management system built on the RoT
    trust quorum establishment at boot time to ensure the cryptographically-derived rack secret is verified before unlocking storage

https://docs.oxide.computer/release-notes/system/1-0-0

soneil · on July 20, 2023

That's pretty standard? Anything that's going straight into 400V (I assume) deserves a good eyeball before it's energised. I doubt their "root of trust" knows how to protect the system from a dead mouse in the bus bars.

javajosh · on July 21, 2023

I'm not worried about dead mice; I'm worried about tampering. I would have hoped there'd be more guidance than "just look and see if anything looks weird". A determined threat actor can easily side-step that check with, for example, careful desoldering/resoldering technique.

scq · on July 21, 2023

I suspect this step is intended to protect you from FedEx, not the NSA.

gorkish · on July 21, 2023

If this is seriously your level of supply chain scrutiny, being an Oxide early adopter probably isn't the best idea.

yencabulator · on July 22, 2023

I think for the unit price of $1M per rack or more, you can easily afford an armed guard to escort the crate from the factory.

Tuna-Fish · on July 21, 2023

> Anything that's going straight into 400V (I assume)

I think the sleds blind mate to a 55V DC common bus.

engagthe · on July 20, 2023

I’d love to see someone provide a turnkey managed bare metal container platform, complete with L4 / L7 routing. I haven’t heard if oxide has a container play, and I suspect it may require virtualization based on their choice of host OS.

doctorpangloss · on July 20, 2023

By "L4 / L7 routing" do you mean specifically something that works like Amazon's Application Load Balancer and Network Load Balancer for Kubernetes? And by "turnkey managed" you mean, you """just""" """configure""" """IP addresses""" and it """all just works"""?

You can certainly install Ubuntu on a very powerful machine with a WAN interface (e.g., a NIC connected to a residential cable internet connection). Then, use something like k0s to provision other bare metal workers. Those three steps, and you've got a "bare metal container platform." You don't need a "LoadBalancer", you can specify that nginx-controller runs on the host network of specifically the machine with the WAN interface and configure its service's external IP to the WAN IP, and now you support Ingress.

But how do you imagine having multiple LoadBalancer resources without multiple IPs? And how do you imagine having multiple IPs without ARIN? The turnkey challenging part is the public IPv4 addresses, not the platform.

dilyevsky · on July 21, 2023

> You don't need a "LoadBalancer", you can specify that nginx-controller runs on the host network of specifically the machine with the WAN interface and configure its service's external IP to the WAN IP, and now you support Ingress.

And if you need to service that machine or it goes down?

> But how do you imagine having multiple LoadBalancer resources without multiple IPs? And how do you imagine having multiple IPs without ARIN?

Most colos/transit providers will happily lend you their IPs

lwhsiao · on July 20, 2023

Anyone know if a documentation framework is used for this site, or whether it is bespoke?

dcre · on July 20, 2023

It’s bespoke. Remix + AsciiDoc + OpenAPI.

ChrisArchitect · on July 20, 2023

What's new here

wmf · on July 20, 2023

I think these docs were just published.

slater · on July 20, 2023

Yeah, few days ago:

https://docs.oxide.computer/release-notes/system

dang · on July 20, 2023

OK, let's document that in the title above. Thanks!

benjaminleonard · on July 21, 2023

These have been online for a few months actually, though the release notes have only just been added.

whalesalad · on July 21, 2023

Would be cool to play with this hypervisor on some lab hardware.

mkeeter · on July 21, 2023

It's been done before!

See the excellent "Oxide at Home" blog post [1], and HN discussion [2]

[1] https://artemis.sh/2022/03/14/propolis-oxide-at-home-pt1.htm... [2] https://news.ycombinator.com/item?id=30671447

ThinkBeat · on July 20, 2023

Can you configure your Oxide mainframe for different uses? Like max it out with 90% being GPUs?

I like the idea, though it may be dumb, if you could fill it with different types of processors for what you need.

Sort of like chiplets but at a different scale

sam_bristow · on July 21, 2023

I don't believe they have any GPU option in this initial rack. In one of the podcasts they said something about none of the GPU vendors being open enough to allow the kind of deep integration they're going for. It might change in future of course.

paywallasinbeer · on July 20, 2023

After giving the "Known Behavior and Limitations" a scroll here https://docs.oxide.computer/release-notes/system/1-0-0 It seems a little... half baked? Especially for a company whose specialty is the integrated management platform. Interested to see customer reviews.

convolvatron · on July 20, 2023

this looks like a pretty normal bug list