Understanding QEMU Devices (2018) (qemu.org)
180 points by sipofwater 4 months ago | hide | past | favorite | 58 comments



QEMU and Bochs were my first forays into virtualization / emulation way back (maybe around the early 2000s? I can't remember!).

Although the emulation / virtualization market has grown, with more and more options available today, QEMU was (and still is) one of the most awesome projects out there.


Back in the day I ran Bochs in all its 4 Bogomips glory on a university IP address and went to IRC channels where script kiddies were "trading root".

I let them go first, and watched through an instrumented terminal how they clumsily installed a rootkit, then inevitably refused to give anything in return and laughed calling me a noob.

Their laughter was short lived.

I had even spent quite a bit of effort kludging the kernel to report much higher specs than Bochs could deliver, but all that effort was wasted because no one knew how to check.


A lot of the options (especially the free ones) are either using qemu or using ideas that were developed early (not actually first) for qemu like virtio. There are just a lot more layers on top these days, and not always for the better.


Proxmox is great as a FOSS hypervisor, but their docs for doing pretty much anything advanced are just "here's a qemu command".


Hmm, if you are referring to the `qm` command, that is not from QEMU itself but a Proxmox CLI tool :) https://git.proxmox.com/?p=qemu-server.git;a=blob;f=PVE/CLI/...


QEMU is used in basically every single hardware vendor today and has been since I've been in virtualization/containerization tech (2010+).

I've only seen VMware (GSX/ESX) at Windows shops for things like big Exchange clusters, etc. Every CDN I've worked at used qemu.


There's still a lot of Xen out there


If you’re on a Mac, UTM is an excellent wrapper around Qemu.

https://mac.getutm.app/


You can choose Apple virtualization in UTM instead of QEMU too. Apple virtualization is optimized for M1+.


That's a bit like saying "instead of ext4 you can use an SSD" in that the things involved span multiple layers. When you select the option to use Apple virtualization framework in UTM you're still using QEMU, what you're changing is the backend QEMU is using for the CPU virtualization.



QEMU has a Hypervisor[0] backend these days, called “hvf”.

https://wiki.qemu.org/Features/HVF

[0] “Hypervisor” is a “sibling” to the Virtualization framework. IMHO, the naming is incredibly confusing (:
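
To make the backend choice concrete, here is a minimal Python sketch of how a wrapper might pick a value for QEMU's `-accel` option. The function names `pick_accel` and `accel_args` are made up for illustration, and the rule is deliberately simplified: it assumes hardware acceleration (`hvf` on macOS, `kvm` on Linux) is usable whenever guest and host architectures match, and falls back to the TCG emulator otherwise. A real tool would also check that the accelerator is actually available on the host.

```python
def pick_accel(host_os: str, host_arch: str, guest_arch: str) -> str:
    """Return a value for QEMU's -accel option (simplified, illustrative rule)."""
    # Hardware-assisted virtualization needs the guest and host to share
    # an architecture; otherwise QEMU has to emulate the CPU with TCG.
    if host_arch != guest_arch:
        return "tcg"
    if host_os == "Darwin":
        return "hvf"  # Hypervisor.framework backend on macOS
    if host_os == "Linux":
        return "kvm"
    return "tcg"


def accel_args(host_os: str, host_arch: str, guest_arch: str) -> list[str]:
    """Build the accelerator part of a QEMU command line."""
    return ["-accel", pick_accel(host_os, host_arch, guest_arch)]


if __name__ == "__main__":
    # An Apple Silicon Mac running an AArch64 guest versus an x86_64 guest.
    print(accel_args("Darwin", "aarch64", "aarch64"))
    print(accel_args("Darwin", "aarch64", "x86_64"))
```

The architecture strings here follow QEMU's binary naming (`aarch64`, `x86_64`), which is an assumption of this sketch; the host OS may report different names.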


What is the definitive new-comer friendly guide to QEMU? Not just about using it but also understanding its internals (say to add new instructions to a supported ISA etc)?


Unfortunately there is none. QEMU is a large project and doesn't have much formal design or API documentation. On the other hand it's not big enough (compared for instance to the Linux kernel) to have a wider community interested in trying to provide internals documentation for newcomers.

Our general advice is "look at the existing code for the bit you're interested in to see how it works". You can sometimes find descriptions of the overall architecture online in third party blog posts and the like, but if they're more than a few years old then be wary that they might be out of date -- they're likely to be right in general principles and wrong in details, because things change.

For adding new instructions to an existing ISA: the first couple of sections of https://www.qemu.org/docs/master/devel/index-tcg.html are relevant here. Depending on the target it might or might not use decodetree (decodetree is much easier to add a new insn to, but some older targets still do by-hand switch-statement based decoding.) Look at how an existing insn that is similar to what you want to do works.
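
For readers who have not seen table-driven decoding before, the general idea (match the fixed bits of an instruction word, then extract the operand fields) can be sketched in a few lines of Python. This is not decodetree syntax and not QEMU code: the 16-bit instruction set, the `Pattern` tuple and the `decode` function are all invented for illustration.

```python
from typing import Dict, NamedTuple, Optional, Tuple


class Pattern(NamedTuple):
    name: str
    mask: int                            # which bits are fixed
    value: int                           # what those fixed bits must equal
    fields: Dict[str, Tuple[int, int]]   # field name -> (shift, width)


def extract(insn: int, shift: int, width: int) -> int:
    """Pull an unsigned bit field out of an instruction word."""
    return (insn >> shift) & ((1 << width) - 1)


# A made-up 16-bit ISA: the top nibble is the opcode.
PATTERNS = [
    Pattern("add", 0xF000, 0x1000, {"rd": (8, 4), "rs": (4, 4), "rt": (0, 4)}),
    Pattern("addi", 0xF000, 0x2000, {"rd": (8, 4), "imm": (0, 8)}),
]


def decode(insn: int) -> Optional[Tuple[str, Dict[str, int]]]:
    """Return (name, operands) for the first matching pattern, else None."""
    for p in PATTERNS:
        if (insn & p.mask) == p.value:
            operands = {k: extract(insn, s, w) for k, (s, w) in p.fields.items()}
            return p.name, operands
    return None  # undefined instruction
```

In this toy, adding an instruction means adding one entry to `PATTERNS`, whereas a hand-written switch statement needs new control flow for each case. That is roughly why the table-driven style is easier to extend.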

Implementing CHERI in particular is going to be pretty awful, because the things it does (like 128-bit pointers) break various assumptions QEMU makes. The University of Cambridge forked QEMU to add CHERI support for MIPS and RISC-V and I think also AArch64: https://www.cl.cam.ac.uk/research/security/ctsrd/cheri/cheri... -- but the changes are pretty invasive and also not likely to be very fast. (The fork looks like it's based on 6.0, so three years old now.)

(If anybody is interested in trying to write up some documentation for QEMU's internals (either a general overview/roadmap or something on a particular subsystem), I'd be happy to code-review patches that add something to the "Developer Information" subsection of our manual.)


> QEMU is a large project and doesn't have much formal design or API documentation.

This is bonkers to me considering how it’s used in industry.


Linux (+ KVM) doesn't either.

Well, they have reasonable documentation for certain external APIs (syscalls, boot parameters, sysfs files, etc). But not internal API documentation or "formal design".

Certain things are sketched and outlined, and certain things have detailed documentation, but as a whole there is no "formal design" of the system.

It's not really bonkers though because it turns out that formal designs don't necessarily make better software. Or rather, the formal designs that academia might have taught. There is a formal design, it's the code.

20 or 30 years ago, there was this big push that formal designs should be the key piece of work and you should be able to press a button and generate the application from the design automatically. Turns out they were so wrong they basically went 360 back to right again and that's what we do. It's just that the design doesn't look like some crazy incomprehensible executable-UML, but programming languages. Which are quite legible, precise, and unambiguous (at least compared to English), and make very good languages to write designs in.

(The place where they are still wrong is the idea that you don't need to know or care about any of the fine detail in order to make a good design. Once you accept that, then specifying the design with code is pretty reasonable.)


Indeed, I was going to point out the lack for KVM as well. The same is also very true when it comes to Linux networking stuff. One of the most difficult things I've ever had to do was complicated networking stuff with KVM/qemu VMs when I had nobody to ask or talk to about it. There are enormous swaths of undocumented surface (or lightly documented by a blog post that may or may not be accurate anymore, and is nowhere near comprehensive). One of my biggest hopes for LLMs like GPT-4 is the ability to improve on this, though as of right now it hallucinates like mad. The more niche the case, the worse it gets too.


Wake me up when those industry users want to pay somebody to improve the developer documentation :-)


Well, every time this line of thinking comes up, I don't believe there is a gofundme, indiegogo, patreon, etc to which I could donate. Because I for sure think that would be a good investment for future generations, but you are correct that I almost certainly couldn't convince my employer to spend the money. I'd guess that's partially because they don't directly benefit from qemu, setting aside the daily use of buildkit which for sure does. Come to think of it, I'd guess Docker(Mirantis?) is BY FAR the most "you really, really should be a corporate sponsor" of qemu


Well, you can donate to the project (there's a paypal link at the bottom of https://www.qemu.org/sponsors/ which donates to the Software Freedom Conservancy earmarking it as being for QEMU), *but* doing that won't cause somebody to be paid to work on the project (it can cover random project expenses like CI usage, I think). Mostly our sponsorship is either "in-kind" (access to compute hosts, hosting downloads, cloud compute credits etc) or else is sponsorship to help pay for the annual KVM Forum conference.

In general there is no mechanism for "pay money to have work happen" because pretty much all non-hobbyist QEMU developers are doing it because they're paid by some company (Red Hat, Linaro, etc etc etc) to do that work as their full time job. So they're not in the market for random small side jobs.


Out of curiosity, how does one reach you? (Saw no info/contact details in your profile.)

Also, where do QEMU people hang out online? AFAICT the IRC channel is not very active. (Based on few and random visits, so I could be wrong.)


The primary nexus for QEMU developers is the qemu-devel mailing list. (Very high traffic because it's also used for patchmails.) The irc channel is a bit more variable and tends in particular to be quiet outside UK/Europe working hours, just because most QEMU devs happen to be Europe based.

I discourage private emails sent direct to me on QEMU topics (because they should generally be to public lists so other community members can answer them or benefit from the answer), but you can find me on the mailing lists and irc.


Ok, cool!


I implemented a bit of an STM32 and it was a chore and a half. I've noticed 2 things with the code base:

1. It's C but they really want C++. QEMU wrote their own class system, foreach loops, containers, etc. And because of that, when I tried to use actual C++, compilation failed due to how many reserved keywords were used in headers and other mess.

2. As noted in other comments, copy, paste and modify. It got me where I needed to go, but it was a slog.

Eventually I had gdb debugging my qemu build, and gdb debugging the program I was running. I could even connect in from the STM32 IDE, which was nice.


Start with libvirt, it provides a full GUI around QEMU operations. Run ps to see the underlying QEMU commands it runs. Inspect the XML files to understand how it builds machines.


A coworker came up with a similar idea: We started a VM using Lima, then ran ps to see what args it passed to QEMU. It was enlightening!
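
If you want something more structured than eyeballing `ps` output, a small script can group the arguments into option/value pairs. The sketch below is a rough heuristic, not a real QEMU option parser: it assumes Linux, where `/proc/<pid>/cmdline` holds the NUL-separated arguments, and it treats any token that does not start with `-` as the value of the preceding option. The function names are invented for this example.

```python
from typing import List, Optional, Tuple


def tokens_from_proc(raw: bytes) -> List[str]:
    """Split the contents of /proc/<pid>/cmdline into argument strings."""
    return [tok.decode("utf-8", "replace") for tok in raw.split(b"\0") if tok]


def group_options(tokens: List[str]) -> Tuple[str, List[Tuple[str, Optional[str]]]]:
    """Return (binary, [(option, value-or-None), ...]) using a simple heuristic."""
    binary, rest = tokens[0], tokens[1:]
    grouped: List[List[Optional[str]]] = []
    for tok in rest:
        starts_option = tok.startswith("-")
        if starts_option or not grouped or grouped[-1][1] is not None:
            grouped.append([tok, None])   # new option (or a stray positional)
        else:
            grouped[-1][1] = tok          # value for the previous option
    return binary, [(opt, val) for opt, val in grouped]


if __name__ == "__main__":
    # Synthetic sample; on a real system read open(f"/proc/{pid}/cmdline", "rb").
    sample = b"\0".join([
        b"qemu-system-x86_64", b"-m", b"2048", b"-accel", b"kvm",
        b"-nographic", b"-drive", b"file=disk.qcow2,if=virtio",
    ]) + b"\0"
    binary, options = group_options(tokens_from_proc(sample))
    print(binary)
    for opt, val in options:
        print(f"  {opt} {val or ''}")
```

Reading `/proc/<pid>/cmdline` avoids the quoting ambiguity of `ps`, which joins arguments with spaces. A positional argument that directly follows a flag with no value would be misattributed by this heuristic.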


I agree but I think you meant "Virt Manager"? afaik libvirt is an API, not even a CLI and definitely not a full GUI. For a CLI, virsh is your guy.


There's also virt-install, which is part of the Virt Manager package. https://github.com/virt-manager/virt-manager/blob/main/man/v...

  # Download the Debian 12 arm64 cloud image into libvirt's per-user image directory
  curl --output-dir "$HOME/.local/share/libvirt/images/" -LO https://cloud.debian.org/images/cloud/bookworm/20240507-1740/debian-12-genericcloud-arm64-20240507-1740.qcow2
  # Create a 20 GiB disk backed by that image and boot it with cloud-init
  virt-install --import --osinfo debian12 --disk size=20,backing_store="$HOME/.local/share/libvirt/images/debian-12-genericcloud-arm64-20240507-1740.qcow2" --controller type=scsi,model=virtio-scsi --cloud-init



I am looking for some hand-holding with this. The documentation seems more for reference (albeit a bit lack-luster if I may say so).


If there are any specific topics that you'd like covered, please ask.


What are your goals? Are you interested in emulation (i.e. running a VM that uses another architecture than your physical computer) or in virtualization? (i.e. running a VM that uses the same architecture)


Well both. I'm interested in implementing an ISA extension (not sure if you know about CHERI). Also, there are reference implementations for aarch64 and risc-v that I'd like to understand.


Just curious, what about simulation? I heard that simulation is more serious than emulation and targets, say, pipeline-level emulation, but maybe it's just a fancier word?


Beats me. What does simulation mean in this context?


Ah nevermind then, probably just a synonym for emulation.


probably just start using Proxmox, as it's a pretty beginner-friendly FOSS hypervisor with extensive docs and forums, and it's largely a wrapper around qemu.

Their docs often include equivalent qemu commands for any UI actions.

For anything the UI can't do yet, they only give the QEMU command


> so the QEMU developers have added the virtio-net card (a PCI hardware specification, although no bare-metal hardware exists yet that actually implements it)

Things have changed since 2018. BlueField DPUs provide virtio network [1] and block [2] devices in the host PCI space. These can be passed through to VMs using vfio-pci in the host.

1. https://docs.nvidia.com/networking/display/bluefieldvirtione...

2. https://docs.nvidia.com/networking/display/bluefield2snap380

Disclaimer: I work for NVIDIA, have used BlueField DPUs, but have not used the virtio feature.


It can act as a PCIe root as well, is that correct? I assume by connecting it to the input of a PCIe switch with other disks attached downstream, but would love to learn more and what PCIe switches are known to work.


The models of BlueField DPUs I’ve worked with have no storage (aside from eMMC to boot the DPU) directly attached. They typically run SPDK, then SPDK block devices are presented to the host as NVMe namespaces or virtio-blk devices. If using NVMe-oF with RDMA, this can provide a zero copy IO path to remote storage.

For NVMe, the devices may be presented to the host as PFs or VFs. I assume but do not know that it would be the same for virtio devices.


Don't feel obligated to answer if you shouldn't, but how seriously is Nvidia taking Linux nowadays? With the rise of ML is Linux being seen as an important support target?


NVIDIA's business these days is dominated by datacenter products that are almost exclusively used with Linux. This has had almost no effect on their approach toward dealing with/integrating with Linux. They did follow in AMD's footsteps and adopted a driver model with an open-source kernel component and closed-source userspace components.


QEMU is a great piece of software. I use it regularly to debug various cross compiled kernel images under gdb without requiring actual ARM hardware (buildroot + custom kernel + gdb).
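
For anyone who wants to try this, here is a rough Python sketch that assembles such a command line. It assumes a 32-bit ARM guest on QEMU's `virt` machine and a kernel image called `zImage`; the comment above does not say which board or image was used, so treat those as placeholders. `-gdb tcp::PORT` opens QEMU's gdbstub and `-S` keeps the CPU halted until a debugger tells it to run.

```python
import shlex
from typing import List


def debug_argv(kernel: str, port: int = 1234, machine: str = "virt") -> List[str]:
    """QEMU arguments for booting a kernel with the gdbstub listening."""
    return [
        "qemu-system-arm",
        "-M", machine,
        "-kernel", kernel,
        "-append", "console=ttyAMA0",  # serial console on the virt board
        "-nographic",
        "-gdb", f"tcp::{port}",        # listen for gdb on this TCP port
        "-S",                          # do not start the CPU until gdb continues
    ]


def gdb_commands(port: int = 1234) -> List[str]:
    """Commands to type into a cross gdb that has the kernel's vmlinux loaded."""
    return [f"target remote localhost:{port}"]


if __name__ == "__main__":
    print(shlex.join(debug_argv("zImage")))
    for line in gdb_commands():
        print(line)
```

After connecting, you can set breakpoints and `continue` as usual. The sketch only prints the commands and leaves running them to you.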


qemu is a treasure and reading its source to learn how computers work is very fun.


> Understanding QEMU...

Best of luck


Followed by libvirt, VirtIO, KVM, qcow2. If only there was one book or course to pull it all together.


"USB disk as /dev/sda on a not-rooted smartphone using Termux, QEMU, Alpine Linux": https://news.ycombinator.com/item?id=40507319


This is such a fantastic description of what is going on underneath the hood - it took me quite a while to understand how qemu works, wish I had seen this before!


> QEMU actually has the ability to glue together a lot of different host formats (raw, qcow2, qed, vhdx, …)

> and protocols (file system, block device, NBD, Ceph, gluster, …)

Yeah, and it's awesome. With qcow2 images mounted via nbd, I was able to manipulate ddrescue images without modifying the original. Truly one of the most useful pieces of software ever written. I even use it on Android with Termux to use and test x86_64 software.
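
One way this can work (an assumption; the exact setup is not spelled out here) is a copy-on-write qcow2 overlay whose backing file is the rescued image, for example one created with `qemu-img create -f qcow2 -b rescue.img -F raw overlay.qcow2`, where both file names are placeholders. Writes land in the overlay and the backing file stays untouched. The Python sketch below reads the backing file name from the fixed fields at the start of a qcow2 header; `fake_overlay` only builds a synthetic header for demonstration and does not produce a usable image.

```python
import io
import struct
from typing import BinaryIO, Optional

QCOW2_MAGIC = b"QFI\xfb"
# Big-endian fields at the start of the header: magic, version,
# backing_file_offset, backing_file_size, cluster_bits, virtual size.
HEADER = struct.Struct(">4sIQIIQ")


def read_backing_file(image: BinaryIO) -> Optional[str]:
    """Return the backing file name recorded in a qcow2 header, if any."""
    image.seek(0)
    raw = image.read(HEADER.size)
    if len(raw) < HEADER.size:
        raise ValueError("file too short to be a qcow2 image")
    magic, _version, bf_offset, bf_size, _cluster_bits, _size = HEADER.unpack(raw)
    if magic != QCOW2_MAGIC:
        raise ValueError("not a qcow2 image")
    if bf_offset == 0:
        return None  # standalone image, no backing file
    image.seek(bf_offset)
    return image.read(bf_size).decode("utf-8")


def fake_overlay(backing_name: str) -> io.BytesIO:
    """Build a synthetic header for demonstration only (not a usable image)."""
    name = backing_name.encode("utf-8")
    offset = 512  # arbitrary position after the header, chosen for this sketch
    header = HEADER.pack(QCOW2_MAGIC, 3, offset, len(name), 16, 1 << 30)
    return io.BytesIO(header.ljust(offset, b"\0") + name)


if __name__ == "__main__":
    print(read_backing_file(fake_overlay("rescue.img")))
```

On a real file you would pass `open("overlay.qcow2", "rb")`; `qemu-img info` reports the same information and is the safer tool for actual work.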

I wish there was a step by step porting and implementation guide. I tried to port the open source Sensor Watch board to QEMU in order to facilitate software development for it. Didn't get very far. I'm not particularly knowledgeable about electronics but I had all the hardware documentation, I feel like it should have been enough.


Warning for humans! If you are trying to run a virtual PC on QEMU, let's say on bare metal hosting from Hetzner, you will very soon discover that QEMU is dead slow without an actual graphics card, or at best you will get all kinds of funky missing-library error messages on Ubuntu and other OSes in very surprising spots.

Had a very good experience simulating a K8s cluster with QEMU, aka studying K8s the hard way, once I figured out how networking actually works between virtual machines and how domains can be assigned with an external proxy.


Why would qemu care about having a graphics card? Do you mean that whatever system you were running inside qemu expected a GPU and was slow without it?


If the OP was running a virtual desktop on a server with no GPU then they probably had to fall back to software rendering which can be slow. This isn't QEMU's fault per se; if you physically don't have a GPU then you don't have a GPU.


correct. thank you for phrasing it better


Hi, one probably really wants to use libvirt rather than qemu directly. That way you can create your VMs remotely with a GUI (virt-manager) using an ssh-based libvirt URL, or a CLI (virsh), and it will handle all the right parameters for qemu, the required networking setup, etc. Check it out!
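
For reference, an ssh-based libvirt URL has the form `qemu+ssh://user@host/system`. A tiny Python helper to build one might look like the sketch below; the function name `libvirt_ssh_uri` and the host and user in the example are made up for illustration.

```python
from typing import Optional


def libvirt_ssh_uri(host: str, user: Optional[str] = None, session: bool = False) -> str:
    """Build a libvirt connection URI for the QEMU driver tunnelled over ssh."""
    authority = f"{user}@{host}" if user else host
    # "system" talks to the system-wide daemon, "session" to the per-user one.
    path = "session" if session else "system"
    return f"qemu+ssh://{authority}/{path}"


if __name__ == "__main__":
    uri = libvirt_ssh_uri("vmhost.example.com", user="admin")
    print(f"virsh -c {uri} list --all")
    print(f"virt-manager -c {uri}")
```

Both `virsh` and `virt-manager` accept the URI through `-c` (`--connect`), so the same string works for the CLI and the GUI.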


as @wmf correctly stated "If the OP was running a virtual desktop on a server with no GPU then they probably had to fall back to software rendering which can be slow. This isn't QEMU's fault per se; if you physically don't have a GPU then you don't have a GPU."

and 99% of cheap servers in the wild don't have a GPU or even a hardware graphics card


yes, i connect with virt-manager and ssh-based libvirt url.

my main case was to scrape multiple messengers and apps with desktop UI only.

everything must run and render on server.


> Had a very good experience simulating a K8s cluster with QEMU, aka studying K8s the hard way, once I figured out how networking actually works between virtual machines and how domains can be assigned with an external proxy.

This is an awesome use of QEMU! I'm both interested in learning K8s and what goes on under the hood at the kernel level because I do cloud connected IoT stuff, so I'll definitely use that!

Is there any kind of "build the kernel from scratch" project for that kind of stuff?


I was inspired by the article "Kubernetes The Hard Way" ... can't find the original article now

Had to study concepts like "bridged networking with libvirt" https://linuxconfig.org/how-to-use-bridged-networking-with-l...

No way I would recommend my way to study k8s ... it was just a pet project for a greater good.

Now in 2024 it's better to start from projects like minikube or k3s (my preference) on a local computer

https://minikube.sigs.k8s.io/docs/ https://docs.k3s.io/

and later use terraform to provision infra

or

Forget it all and just use managed Kubernetes from cheaper providers like DigitalOcean

The result of inflicting pain on myself: I do value the work done by people who provide stable managed Kubernetes and upgrade it flawlessly.



