I wanted something like this in the embedded space (Cortex-M0/M4). One challenge I encountered was the amount of additional memory and scratch space required. What does that look like with this?
The C code that applies a patch can do so using just a few hundred bytes of RAM, depending on the chosen compression algorithm. Heatshrink is probably a good choice for an MCU.
There are two kinds of patches: normal and in-place. A normal patch requires the old firmware to remain readable until the patching procedure is completed. An in-place patch is designed to write the new firmware to the same memory area that the old firmware is read from.
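To give a feel for why the RAM footprint can stay small, here is a rough sketch of a streaming apply loop for the normal case (old firmware readable throughout, new firmware written elsewhere). Everything in it is made up for illustration: the two-opcode patch format, the `patch_io` callbacks, and the RAM-backed `mem_*` helpers are assumptions, not the real patch format or the library's API, and there is no decompression step. The point is only that the patcher needs one small chunk buffer, no matter how large the firmware is.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy opcodes, invented for this sketch (not the real patch format). */
enum { OP_COPY = 0x01, OP_DATA = 0x02 };

#define CHUNK_SIZE 32u /* the only working buffer the patcher needs */

/* Hypothetical I/O callbacks; on an MCU these would wrap flash drivers. */
struct patch_io {
    int (*read_old)(void *ctx, size_t offset, uint8_t *buf, size_t len);
    int (*read_patch)(void *ctx, uint8_t *buf, size_t len); /* sequential */
    int (*write_new)(void *ctx, const uint8_t *buf, size_t len);
    void *ctx;
};

/* Applies the toy patch chunk by chunk. Returns 0 on success, -1 on error. */
int apply_patch(const struct patch_io *io, size_t patch_size)
{
    uint8_t chunk[CHUNK_SIZE];
    uint8_t hdr[2];
    size_t consumed = 0;

    assert(io != NULL);

    while (consumed < patch_size) {
        /* Each record starts with [opcode][length]. */
        if (io->read_patch(io->ctx, hdr, 2) != 0) return -1;
        consumed += 2;
        size_t remaining = hdr[1];
        size_t old_offset = 0;

        if (hdr[0] == OP_COPY) {
            /* COPY carries a 16-bit little-endian offset into the old image. */
            uint8_t off[2];
            if (io->read_patch(io->ctx, off, 2) != 0) return -1;
            consumed += 2;
            old_offset = (size_t)off[0] | ((size_t)off[1] << 8);
        } else if (hdr[0] != OP_DATA) {
            return -1; /* unknown opcode */
        }

        while (remaining > 0) {
            size_t n = remaining < CHUNK_SIZE ? remaining : CHUNK_SIZE;
            if (hdr[0] == OP_COPY) {
                if (io->read_old(io->ctx, old_offset, chunk, n) != 0) return -1;
                old_offset += n;
            } else {
                /* DATA: literal bytes follow in the patch stream. */
                if (io->read_patch(io->ctx, chunk, n) != 0) return -1;
                consumed += n;
            }
            if (io->write_new(io->ctx, chunk, n) != 0) return -1;
            remaining -= n;
        }
    }
    return 0;
}

/* RAM-backed stand-ins for flash, only for trying the sketch on a host. */
struct mem_ctx {
    const uint8_t *old; size_t old_size;
    const uint8_t *patch; size_t patch_size, patch_pos;
    uint8_t *out; size_t out_cap, out_len;
};

int mem_read_old(void *ctx, size_t offset, uint8_t *buf, size_t len)
{
    struct mem_ctx *m = ctx;
    if (offset > m->old_size || len > m->old_size - offset) return -1;
    memcpy(buf, m->old + offset, len);
    return 0;
}

int mem_read_patch(void *ctx, uint8_t *buf, size_t len)
{
    struct mem_ctx *m = ctx;
    if (len > m->patch_size - m->patch_pos) return -1;
    memcpy(buf, m->patch + m->patch_pos, len);
    m->patch_pos += len;
    return 0;
}

int mem_write_new(void *ctx, const uint8_t *buf, size_t len)
{
    struct mem_ctx *m = ctx;
    if (len > m->out_cap - m->out_len) return -1;
    memcpy(m->out + m->out_len, buf, len);
    m->out_len += len;
    return 0;
}
```

In this sketch `read_old` can be called with any offset at any time, which is exactly why the old image has to stay intact until the normal patch is done. An in-place variant would have to guarantee that nothing it still needs to read has already been overwritten, and this toy loop makes no attempt at that.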