I like the idea: a small computer that monitors and controls your big computer. But I hate the implementation. Why are they all super-secret firmware blobs? Why can't I just install my Linux of choice and run the manufacturer's software? That would still suck, but not as badly as the full-stack nonsense they foist on you at this point.
They're special firmware blobs because the OEMs generally aren't building their BMCs from scratch the way they might with their main boards and other components. They're usually getting the BMC SoC from the likes of Aspeed and others, who are the ones keeping them closed up. I've tried to get the magic binaries and source for various projects but have given up because there are so many layers of NDAs and sales gatekeepers. I'm not entirely sure who makes the Dell BMCs, but I know Supermicro bundles Aspeed (at least they did with older generations of their main boards).
I agree with you that you should be able to run whatever since in the end it’s just another computer, but the manufacturers believe otherwise since there’s “valuable IP” or whatever nonsense (insert rollseyes emoji here).
There are open specs like Redfish, but they still don't get to the heart of the matter.
AMI sells a BMC software stack: https://www.ami.com/megarac/
Intel and smaller manufacturers were unhappy about always paying the AMI tax, so Intel created OpenBMC as a hedge against AMI's monopoly for small manufacturers. I have heard OpenBMC has users at Facebook, Google, IBM, ByteDance, and Alibaba.
Dell owns their own stack in iDRAC; I have heard most of their systems are Nuvoton-based. I suspect Dell pays big bucks to keep their systems at feature parity with the other options, and they view it as an investment.
There are also silicon devices on the motherboard with drivers that can't be shared, so it's not surprising that companies don't share source in a way that would be useful.
If you want a system with a BMC you can experiment with, try the asrock-e3c246d4c; it looks like there are hobbyists who have it running coreboot and OpenBMC (impressively).
Aspeed aren't the ones keeping them closed - OpenBMC running on Aspeed chips is an option chosen by some system vendors. It's a vendor choice whether to go with an open platform or a closed BMC from AMI etc.
Yeah they are utter garbage. For years you had to use Java 6 with absolutely every modern security measure turned off in both the JVM runtime itself and your browser to access Dell DRACs. Accept expired certs, run unsigned code, I'm sure this is all fine ...
I mostly work in the cloud now but when I last had to manage a bunch of physical machines we had a physically separate network accessed via its own VPN to get onto the BMCs. Because yeah, the security situation was a joke.
I found that if you leave your BMC port unplugged on a Supermicro, it'll conveniently bridge itself to whatever other Ethernet is plugged in, meaning an outage of your management network may roll over onto another network unintentionally.
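If memory serves, many Supermicro boards let you check and pin the BMC's LAN mode with an OEM IPMI raw command (documented for several X9/X10/X11 generations; treat the exact bytes as an assumption and verify against your board's manual before relying on them):

```
# Query the current BMC LAN mode: 00 = dedicated, 01 = shared/onboard, 02 = failover
ipmitool raw 0x30 0x70 0x0c 0

# Pin it to dedicated so the BMC never falls back onto the shared NIC
ipmitool raw 0x30 0x70 0x0c 1 0
```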
I'd put money on there being pre-auth vulnerabilities in those things, judging by the engineering quality.
On a much less-important note, it might explain some weirdness I'm seeing lately with one of my home Supermicro servers. (The docs say the BMC should only listen on one port, but the switch still sees some degree of responsiveness on the normal non-management port when "off".)
Ran a fleet of servers with terrible BMCs. We kept those on a well-sealed-off private network.
Woe betide you if you run into one of the BMC implementations that shares a host network interface; no separate cable! These things are terrible from a security standpoint.
Was responsible for trying to improve the management and operation of a large fleet of BMCs for a while. Plenty of bugs, and the pace of releases is slow. :(
Definitely an area where a more open ecosystem would improve the pace of innovation.
We need a Linux for BMCs. Oxide is working on one, but I'd like to see a contender fielded from the seL4 community, along with some other folks. For example, why doesn't Wind River have one already?
GPL can also turn manufacturers away. I would rather have variation in the possible BMC operating systems instead of sticking linux everywhere and contributing to a monoculture.
And how would this affect the original open BSD-based BMC firmware which would be available? If the corporations want to maintain their own fork, it's on them.
The original BSD release will stay open and free for everyone to utilize.
How can you trust that a closed source OpenBSD fork is secure when there's no way to audit the quality (or lack thereof) in the firmware the vendor gives you?
If it's GPL you can at least interrogate the code release and make an informed decision
Actually, why aren't they literally normal little computers? Like, fully open, bring your own OS computers. All it needs is some peripherals - Ethernet, its own host USB, gadget mode USB to present mouse/kb/storage to the main computer, video capture card, some GPIO to control power - but there's nothing all that special there; then you just install Debian or w/e and control the perfectly standard interfaces at will.
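To illustrate the "gadget mode USB" piece: on a little computer whose USB controller supports device mode, presenting virtual storage (or a keyboard/mouse) to the host is just Linux configfs plumbing, which is essentially how devices like PiKVM present virtual media. A minimal sketch, assuming a board with an OTG/device-capable port and the libcomposite module; the ISO path is a hypothetical example:

```
modprobe libcomposite
cd /sys/kernel/config/usb_gadget
mkdir -p g1 && cd g1
echo 0x1d6b > idVendor                    # Linux Foundation
echo 0x0104 > idProduct                   # Multifunction Composite Gadget
mkdir -p strings/0x409 configs/c.1/strings/0x409
echo "diy-bmc virtual media" > strings/0x409/product
echo "config 1" > configs/c.1/strings/0x409/configuration
mkdir -p functions/mass_storage.usb0
echo /root/install.iso > functions/mass_storage.usb0/lun.0/file   # hypothetical image
ln -s functions/mass_storage.usb0 configs/c.1/
ls /sys/class/udc | head -n1 > UDC        # bind to the first USB device controller
```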
The interface between the BMC and motherboard is unique to each motherboard, especially for "added value" features that some servers have. DC-SCM is working on standardizing this but I don't know how interoperable it will be.
Personally I have some PiKVMs at home. While they're not at full parity with iDRAC Enterprise, and it takes some extra effort to get ATX power control, I have STRONGLY been considering dropping them into our DC to at least replace the Dell IP KVM, which frankly is a security nightmare.
It may not be "enterprise" enough for a given employer, but it's not hard to replicate most of this functionality with cheap (and open) hardware. For example, I had a case that called for a BMC which I solved with a spare Raspberry Pi 3B, a very cheap capture device and a "smart plug". Total cost of materials was about 30 euro, and (for me) it wasn't any harder to operate than an iDRAC.
It would also be sooo much easier to automate everything if it just ran a slightly customized Debian install instead of... whatever the fuck abomination the manufacturer made.
The repo claims that the servers themselves throttle the GPUs, but isn't it the GPUs themselves that can throttle or maybe the OS? Neither of those are controlled by the server (hopefully) so is there a different system at play here?
I can actually answer this (it's how I stumbled onto the repo): it's done through a signal from the motherboard called PWRBRK (Power Brake), pin 30 on the PCIe connector. It tells the PCIe device to stay in a low-power mode; in the case of Nvidia GPUs that's about 50W (300MHz out of 2100MHz in my case).
You can check whether it's active using `nvidia-smi -q | grep Slowdown`, as shown in the post.
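For reference, a sketch of the kind of output to expect under the throttle-reasons section (field names as in recent nvidia-smi versions; exact labels and spacing vary by driver):

```
$ nvidia-smi -q | grep Slowdown
        HW Slowdown                       : Active
            HW Thermal Slowdown           : Not Active
            HW Power Brake Slowdown       : Active
        SW Thermal Slowdown               : Not Active
```

"HW Power Brake Slowdown : Active" is the PWRBRK case described above.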
No, that's controlled by the server: try `lspci -vv` on any Linux system and look at the link speed and width, e.g. `LnkSta: Speed 8GT/s, Width x2` (x2 means 2 lanes).
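A quick way to compare what the card can do against what it actually negotiated (the device address 01:00.0 is a placeholder, and the output is abbreviated):

```
$ sudo lspci -vv -s 01:00.0 | grep -E 'LnkCap:|LnkSta:'
        LnkCap: Port #0, Speed 16GT/s, Width x16, ASPM L1, Exit Latency L1 <64us
        LnkSta: Speed 8GT/s (downgraded), Width x2 (downgraded)
```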
Besides the speed, you can have another problem with lane limitations.
For example, AMD CPUs have a lot of lanes, but unless you have an EPYC, most of them are not exposed, so the PCH tries to spread its meager set among the devices connected to your PCI bus, and if you have a x16 GPU, but also a WiFi adapter, a WWAN card and a few identical NVMe drives, you may find that only one of the NVMe drives benchmarks at the throughput you expect.
> For example, AMD CPUs have a lot of lanes, but unless you have an EPYC, most of them are not exposed, so the PCH tries to spread its meager set among the devices connected to your PCI bus, and if you have a x16 GPU, but also a WiFi adapter, a WWAN card and a few identical NVMe drives, you may find that only one of the NVMe drives benchmarks at the throughput you expect.
Most AM4 boards put an x16 slot direct to the CPU, and an x4 direct linked NVMe slot. That's 20 of the 24 lanes; the other 4 lanes go to the chipset, which all the rest of the peripherals are behind. (There's some USB and other I/O from the cpu, too). AM5 CPUs added another 4 lanes, which is usually a second cpu x4 slot.
Early AM4 boards might not have a cpu x4 NVMe slot, and those 4 cpu lanes might not be exposed, and the a300/x300 chipsetless boards don't tend to expose everything, but where else are you seeing AMD boards where all the CPU lanes aren't exposed?
> Early AM4 boards might not have a cpu x4 NVMe slot, and those 4 cpu lanes might not be exposed, and the a300/x300 chipsetless boards don't tend to expose everything
I'm sorry, I oversimplified and said "most of them" when I should have said "not all of them"; 20/24 is more correct for B550 chipsets (the most common for AM4), rather than trying to generalize.
I'm still not quite sure what you're trying to say?
Lanes behind the chipset are multiplexed, and you can't get more than x4 throughput through the chipset (and the link speed between the cpu and the chipset varies depending on the chipset and cpu). But that's not a problem of the CPU lanes not being exposed, it's a problem of "not enough lanes" or more likely, lanes not arranged how you'd like. On AM4, if your GPU uses x16, and one NVMe uses x4, then everything else is going to be squeezed through the chipset. On AM5, you usually get two x4 NVMe slots, but again everything else is squeezed through the chipset; x670 is particularly constrained because it just puts a second chipset downstream of the first chipset, so you're just adding more stuff to squeeze through the same x4 link to the CPU.
Personally, I found that link to be more confusing than just reading through the descriptions on wikipedia for a particular Zen version. For example https://en.wikipedia.org/wiki/Zen_3 ... just text search in the page for "lanes" and it explains for all the flavors of chips how many lanes, and how many go to the chipset. Similarly the page for AMD chipsets is pretty succinct https://en.wikipedia.org/wiki/List_of_AMD_chipsets#AM5_chips...
There's a reason why so many motherboard makers avoid putting a block diagram in their manuals and go for paragraphs of legalese instead, and laziness is only half of it.
> Most AM4 boards put an x16 slot direct to the CPU, and an x4 direct linked NVMe slot. That's 20 of the 24 lanes; the other 4 lanes go to the chipset, which all the rest of the peripherals are behind. (There's some USB and other I/O from the cpu, too). AM5 CPUs added another 4 lanes, which is usually a second cpu x4 slot.
> For example, AMD CPUs have a lot of lanes, but unless you have an EPYC, most of them are not exposed, so the PCH tries to spread its meager set among the devices connected to your PCI bus, and if you have a x16 GPU, but also a WiFi adapter, a WWAN card and a few identical NVMe drives, you may find that only one of the NVMe drives benchmarks at the throughput you expect.
example from my X670E board:
* first NVMe = 4x Gen 5
* second NVMe = 4x Gen 4
* 2 USB ports connected to the CPU (10/5 Gbit)
and EVERYTHING ELSE goes through a 4x Gen 4 PCIe link, including 3 additional NVMe slots, 7 SATA ports, a bunch of USBs, a few 1x PCIe slots, networking, etc.
PCIe devices can only draw a limited wattage until the host clears them for higher power. There is also a separate power brake mechanism (an optional part of PCIe) mentioned in the article, which was proposed by Nvidia for PCIe, so it seems likely their GPUs support it.
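You can see the slot power budget the platform advertises in the slot capabilities of the upstream port, something like the following (the port address is a placeholder and the output is abbreviated; field names are from lspci's PCIe capability decoding):

```
$ sudo lspci -vv -s 00:01.0 | grep -E 'SltCap|PowerLimit'
        SltCap: AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug- Surprise-
                Slot #1, PowerLimit 75.000W; Interlock- NoCompl+
```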
There are a number of valid thermal-dissipation engineering reasons why you don't want to load the heat-producing parts of a 1U server beyond what it was designed for.
This article doesn't mention at all what the max TDP of each GPU is, which makes me suspicious. Or things like the max TDP of the CPUs (such as when running a prime-number multi-core stress benchmark to load them to 100%) combined with the total wattage of the GPUs.
If you have never built an x86-64 1U dual socket server from discrete whitebox components (chassis, power supply, 12x13 size motherboard, etc) this is harder to intuitively understand.
I would recommend that people who want four powerful GPUs in something they own themselves to look at more conventional sized server chassis, 3U to 4U in height, or tower format if it doesn't need to be in a datacenter cabinet somewhere.
I'm dealing with something similar. I wanted to use Redfish to clear out hard drives, but storage is not standardized across different vendors. Dell has a secure erase, HPE Gen10 has Smart Storage, and anything older doesn't have any useful functionality in its Redfish API. What a mess. So I need to use PXE booting and probably WinPE to do this.
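For what it's worth, the Redfish schema does define a standard Drive.SecureErase action on Drive resources; whether a given BMC implements it is another matter. A sketch of what the call looks like, with the host, credentials and Dell-style resource IDs as placeholders (the drive path differs per vendor and generation):

```
curl -sk -u root:password -X POST \
  -H 'Content-Type: application/json' -d '{}' \
  'https://idrac.example/redfish/v1/Systems/System.Embedded.1/Storage/RAID.Integrated.1-1/Drives/Disk.Bay.0/Actions/Drive.SecureErase'
```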
> The automatic system cooling response for third-party cards provisions airflow based on common industry PCIe requirements, regulating the inlet air to the card to a maximum of 55°C. The algorithm also approximates airflow in linear foot per minute (LFM) for the card based on card power delivery expectations for the slot (not actual card power consumption) and sets fan speeds to meet that LFM expectation. Since the airflow delivery is based on limited information from the third-party card, it is possible that this estimated airflow delivery may result in overcooling or undercooling of the card. Therefore, Dell EMC provides airflow customization for third-party PCIe adapters installed in PowerEdge platforms.
You need to use their RACADM interface to update the minimum LFM for your card.
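From memory it looks roughly like the following; treat the attribute group and values as assumptions and check `racadm help` and Dell's third-party PCIe cooling documentation for your generation:

```
# Inspect the per-slot LFM settings
racadm get system.pcieslotlfm.1

# Switch the slot to a custom airflow target and raise the minimum LFM
racadm set system.pcieslotlfm.1.lfmmode Custom
racadm set system.pcieslotlfm.1.customlfm 300
```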
HP is way worse than Dell. Fans at full speed I can handle. Servers permanently throwing errors because a part isn't HP branded, that's peak asshole design. So is refusing to measure status of drives if I use a third party drive sled (which is some folded metal, and some LEDs on a flex PCB) AND throwing errors about them.
The wifi thing is slightly understandable because FCC requires you to limit your radiated emissions, and when you do the certification you have to control the entire configuration to pass the testing, which means that your radio is paired with your antenna cable and antenna in the body of the device. Allowing people to replace the radio without a paired antenna cable and antenna could cause radiated emissions to fall outside of the spectrum allowed by the FCC. It's dumb in practice but at least somewhat understandable in principle.
And it doesn't just stop you or refuse to activate the card; it intentionally causes the hardware I purchased to malfunction for as long as I attempt to use something they didn't sell me.
Of course, I can use a USB wifi card, no problem. Just the convenient PCIe slot inside the system, specifically designed to enable a wide variety of functions aside from wifi, is locked down in the BIOS, just in case I wanted to use a 3rd-party card of a specification they don't happen to sell.
Lenovo did that to reduce their customer's ability to buy PC expansion devices from other manufacturers, in hopes of making more aftermarket sales in the future.
It used to be very common across all OEMs, at least in the consumer space. I remember having to flash a modded BIOS on my 2012 HP laptop so it would boot with a WiFi + BT card. Back in those days it was uncommon for a laptop to have BT.
5u system, ML350Gen9. I ignore the errors, but of course that means that if a real error pops up there, I won't know. It's a lower-urgency server of my own so it's ok, but annoying as well in production.
I see the Z6 is a workstation unit, they're going to be more flexible there.
Don't forget how Dell included/includes DRM in their laptop chargers to prevent customers from buying cheap aftermarket replacements. Of course the wire for the DRM functionality is as thin as possible and is always the first thing to break.
I don't know what it's like currently, but when I stopped using Dell laptops a few years ago they wouldn't even boot if the battery was dead and you plugged in a third party charger or a charger with a busted data cable.
Other manufacturers do worse and prevent boot if your PCI ids aren't on a positive list.
This is for example present on ThinkPads, and while you could patch the BIOS before, Intel Boot Guard now prevents you from doing that "for your own protection" :)
I hope the MSI leak contains actual Boot Guard keys for Intel 11th gen+, and can be used to allow "unauthorized" PCI modules on modern ThinkPads!
What worked for some older Think-series machines (haven't tried it on my ThinkPad though) is to update the BIOS (it can be to the same version) and change the serial nr. to all zeroes (the update script asks if you want that). That got rid of the wifi whitelist I encountered.
Could you please explain which serial? Can you run dmidecode and tell me which handle/UUID is all zeroes?
Even if it may not directly apply to current ThinkPads, it implies the UEFI module might have other conditionals before it goes on to check the positive list, something that should be easy to verify by reversing the LenovoWmaPolicyDxe.sct PE32.
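For reference, the fields being asked about can be read on a running system with dmidecode, e.g.:

```
# System serial and UUID (SMBIOS type 1), plus the baseboard serial (type 2)
sudo dmidecode -t system | grep -E 'Serial Number|UUID'
sudo dmidecode -t baseboard | grep 'Serial Number'
```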
If you're interested, I have a dump of my TP BIOS flash unmodified, and then the same version that I paid some Russian guy to modify so the whitelist is removed and an extra menu is unlocked. I basically sent them the dump, and after a "donation" I got the patched dump back.
Would love to learn how to figure out what the changes are, because with my limited knowledge I didn't figure anything out.
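One hedged starting point, assuming both images are raw flash dumps: unpack each with UEFIExtract (part of the UEFITool project) and diff the resulting trees to narrow down which modules were touched, then feed just those into a disassembler. Output directory naming may differ by UEFIExtract version:

```
# Unpack both dumps into trees of firmware volumes and modules
UEFIExtract original.bin all
UEFIExtract patched.bin all

# List which extracted files differ (the GUID-named directories identify the changed DXE modules)
diff -rq original.bin.dump patched.bin.dump
```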
If you are interested in doing other modifications yourself, also read about UEFI variables: how they control menu options and how they can be tweaked with just GRUB.
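The GRUB route usually means a patched GRUB build with a setup_var command; just for poking around, the variables are also visible from a running Linux system via efivarfs. A small sketch, where the GUID shown is the common AMI "Setup" variable and Lenovo firmware may store its options elsewhere:

```
# UEFI variables exposed by the kernel
ls /sys/firmware/efi/efivars/ | grep -i setup

# Peek at the raw bytes of the Setup variable (the first 4 bytes are the attribute flags)
hexdump -C /sys/firmware/efi/efivars/Setup-ec87d643-eba4-4bb5-a1e5-3f3e36b20da9 | head
```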
The MSI leak comments like https://sizeof.cat/post/leak-intel-private-keys-msi-firmware... mention that the Boot Guard keys may be common to other manufacturers: "It is assumed that the keys for downloading the guard are not limited to compromising MSI products and can also be used to attack equipment from other manufacturers using Intel’s 11th, 12th and 13th generation processors (for example, Intel, Lenovo and Supermicro boards are mentioned)"
Restrictive nonsense seems common in the server space, unfortunately. HPE does similar things. IIRC they disabled certain features if you used non-HPE-"approved" hard drives.
Dell also does this with their EMC storage arrays; it's meant to push you towards their pro services. You are supposed to tell the array to order drives for you from pro services, and someone from some nameless MSP contracted with Dell installs them for you at a 10x markup.
Are there even any good server vendors? Dell, HPE and Lenovo do their lock-in shit. Supermicro's BMC is pretty bad. xFusion is totally-not-Huawei-I-pwomise. There's a few more that come to mind but all of them are niches like HPC and don't really do sales on a small scale.
I observed this behavior in a Dell system about a decade ago, but based on experience over the last 5 years or so with PowerEdge servers, installing a third-party GPU no longer triggers the (extremely loud) maximum fan-speed response.
Because they were the most well known of the affected companies and they didn't really do much to repair the goodwill the issue cost them.
Sure, they "put up $300 million for repairs" according to https://www.theguardian.com/technology/blog/2010/jun/29/dell..., but I bet the lions share of that went to their largest purchasers, so the people who blew their school's IT budget on Dell computers were just SOL.
They also stop producing laptop batteries after a few years while refusing to let the laptop charge 3rd party battery replacements, significantly limiting the useful life of their laptops.
You can actually call a small business rep and get them to order it for you. They still produce them or else have them in stock, just don’t sell them online.
Nominally done for your protection. Lowering power (clock) and heat load (fast fan) for unapproved gear prevents things from going dead and getting people REALLY mad and likely reduces warranty claims.