As I understand it, when you open an OpenGL application inside Xvnc (et al) on :1, a shim is LD_PRELOADed that quietly redirects the OpenGL initialization calls so they create an offscreen buffer associated with Xorg at :0 (which is presumably sitting on a 3D-capable GPU).
Said Xorg can be displaying a black screen (or be in use for arbitrary other purposes); no windows are ever displayed on it. Theoretically, avoiding compositing WMs may aid performance.
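As a minimal sketch of what that looks like in practice, assuming VirtualGL (the usual implementation of this LD_PRELOAD trick, not named above): the GL client is started inside the VNC session on :1 but told to render on the GPU-backed Xorg at :0.

    # Inside the Xvnc session (DISPLAY=:1); vglrun preloads the shim and
    # -d selects the 3D X server that actually owns the GPU.
    DISPLAY=:1 vglrun -d :0 glxgears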
I am using it with qemu to allow Linux guests to use the host's GPU for 3D acceleration. It works fairly well, except that there seems to be a memory leak somewhere in the path, because over time qemu VMs with it enabled grow beyond the memory allocated to them.
This is the qemu 4.2.0 configure command line that built what I needed to support this (Debian 10's qemu did not have virgl support). Spice did not work for me; SDL did.
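The actual configure line isn't reproduced here; as a rough sketch, a qemu 4.2.0 build with virgl and SDL output enabled, plus a guest launched with it, would look something like the following (the target list and memory size are assumptions; the flag names are real configure/runtime options):

    # Build qemu with virglrenderer + OpenGL + SDL support
    ./configure --target-list=x86_64-softmmu \
        --enable-kvm --enable-opengl --enable-virglrenderer --enable-sdl
    make -j"$(nproc)"

    # Launch a guest with virgl acceleration over an SDL window
    qemu-system-x86_64 -enable-kvm -m 4G \
        -vga virtio -display sdl,gl=on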
Oh, and when I say it is working "fairly well", I mean that a guest with a single vCPU runs a spinning cube with no sweat, not to mention glxgears etc. Just for a test I have run 4 guests (one per physical core), each doing gears and a spinning cube, in addition to the host itself doing a spinning cube, with no issue.
If only I could figure out how to fix this memory leak.
Arg, only noticed your reply yesterday then the tab got buried!
Thanks for the info. I probably don't have the hardware setup to run VMs at the moment (chronically low on RAM + do not have a GPU). I will be very interested to play with this when that changes in the future.
The one question (if you notice this) I do have is: how quickly does the memory leak happen? And can just running glxgears do it?
I'm running a 32GB system with two browser VMs getting 4GB each.
The work VM, which is typically connected to GitHub and work Gmail, needs to be restarted once every couple of days.
The play VM, which would have Chromium with a dozen tabs open, can need a restart after anywhere from ~20 min to ~1 hour depending on the content in the tabs.
I notice the memory leak when Sublime, which I run on the host itself, starts dropping key repeat speed (I run "xset r rate 400 50"). Killing the VMs or restarting them makes everything go back to normal.
I've run glxgears for ~2 hours now with no significant leak. I will probably leave it overnight to see whether something that simple can serve as a leak reproducer.
I'm going to speculate that the key is heavy usage of GLX, something that only Chromium does here, since I left a desktop with an xterm running top for two days on a play VM and no leak was detected. So it is not just virgl in QEMU being exposed to the guest, initialized by the guest and driven by the guest via virtio; the trigger is actually using virgl on the host to do rendering. Maybe the virtual context creation/release cycle?
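If it is the context churn, a hypothetical way to test that theory without Chromium is to hammer context creation and teardown directly: glxinfo sets up and destroys a GLX context on every run, so looping it inside the guest exercises that path far harder than a single long-lived glxgears does.

    # Run inside the guest; watch the qemu process RSS on the host while it loops.
    while true; do glxinfo > /dev/null; done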
P.S. I originally made this work on my super light laptop (i7-6500U with 8G), so you don't need a beefy rig for it, although there's a bug in recent Intel GPU libraries used by Mesa that sometimes locks it up. I initially attributed it to something I did, but it was not:
"Depending on the content in tabs". Might be interesting to graph the leak rate (in KB/min or MB/min) and maybe either record the screen or limit to one or two websites at a time to narrow down what causes the most issues. Besides video playback, CSS animations come to mind as a potential source of GLX acceleration caused by random/mainstream websites.
Virtual context creation+release could indeed be the problem. FWIW https://github.com/cyrus-and/chrome-remote-interface makes using Chrome's remote debug protocol (to open and close tabs) utterly trivial; or you could just run multiple concurrent browsers with --user-data-dir=somewhere (probably actually much easier to work with).
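For the --user-data-dir route, a sketch (profile paths and URL are placeholders): each instance gets its own profile, so they run as fully separate browsers inside the guest and multiply the GLX context churn.

    for i in 1 2 3 4; do
        chromium --user-data-dir="/tmp/profile-$i" "https://example.com" &
    done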
Chromium aside, another potentially useful trigger may be Unity games - the last(/first/only) time I tried to fire one up on my laptop (i5-3360M + HD Graphics 3000) I was scrambling for ^C^C^C while everything slowed to a crawl and the system climbed (impressively quickly) past 97°C ;) (thankfully(!) nothing froze - thanks for the Intel lockup reference), so perhaps that could be a very effective one.
I'll keep your email address in mind; email is a bit of a long-term sore point due to Gmail constantly slowing everything to a crawl, and I'm currently using 8GB of swap (need to close some tabs), so I replied here to be able to respond in a timely manner.