A Pure HTTP/3 Alternative to MQTT-over-QUIC in Resource-Constrained IoT (arxiv.org)
108 points by belter on June 27, 2021 | 37 comments



Ugh. The point of MQTT is that it's simple! It's trivial to implement on even 8-bit microcontrollers. Adding MQTT messaging to embedded stuff is a joy and even setting up a broker is a piece of cake. It's fantastic.

HTTP/3 isn't simple! It's a huge pain in the ass! Why would anyone want to do this to themselves on embedded‽

The only answer I can come up with is, "it lets us keep doing things the way we've always done them... With web servers."


"trivial to implement" if you don't need encryption. Which you kinda do most of the time. Not everything runs in the design constraints of an 8bit micro.

It would also be nice if you'd at least pretend to engage with the arguments in the paper, instead of making up imagined reasons.


Isn’t the recommended approach to use application-level encryption in MQTT? That seems mostly fine to me for a spec with simplicity as a goal.


It's also trivial to tunnel an unencrypted protocol over TLS.

https://man.openbsd.org/tls_connect.3

Or even

https://www.stunnel.org/
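
For instance, a minimal stunnel client-mode config that wraps plaintext MQTT in TLS (service name and broker hostname are made up for illustration):

    [mqtt-tls]
    client = yes
    accept = 127.0.0.1:1883
    connect = broker.example.com:8883

The device speaks plain MQTT to localhost:1883 and stunnel handles the TLS to the broker.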


I ran into a lot of difficulty when dealing with devices that needed a bridge to access the internet. The device doesn't have a TCP stack and requires a socket-like connection to an intentional man-in-the-middle.

Implementing payload end to end encrypted MQTT was easy enough.

Any thoughts on TLS for that scenario? At best I could have a simulated socket, but always to an untrusted middle man.


Turns out ONE answer is that DTLS is TLS for datagram connections (i.e. UDP), and it's pretty easy to forward along UDP packets at a proxy (as there is no need at all to track what was sent or needs a retry).

It should be noted (if anyone reads this) that DTLS has protections and reliability on packets to establish the connection, but once it’s established it just goes back to being UDP although now encrypted, so you need to be ok with losses.

If you need messaging on top of that, CoAP adds the counters, retries, and reliability to UDP, making it a little more like TCP, though the features to get this live one level higher, in the CoAP packet. It's a little weird, but it gets you relatively reliable request/response HTTP-like exchanges (CoAP Confirmable, CON) and also event-driven async MQTT-like ones (Non-confirmable, NON). So that's pretty cool.
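
A hedged sketch in C of the CoAP (RFC 7252) 4-byte fixed header; the 2-bit Type field is what switches between reliable (CON) and fire-and-forget (NON) semantics on the same UDP packet layout (the function name is mine):

    #include <stdint.h>

    enum coap_type { COAP_CON = 0, COAP_NON = 1, COAP_ACK = 2, COAP_RST = 3 };

    /* Build the 4 fixed header bytes: version 1, the given type, an empty
       token, a method/response code, and a message ID used to match
       retransmissions and ACKs. */
    static void coap_header(uint8_t hdr[4], enum coap_type type,
                            uint8_t code, uint16_t msg_id)
    {
        hdr[0] = (uint8_t)((1u << 6) | ((unsigned)type << 4)); /* Ver=1 | Type | TKL=0 */
        hdr[1] = code;                                         /* e.g. GET = 0.01 = 0x01 */
        hdr[2] = (uint8_t)(msg_id >> 8);
        hdr[3] = (uint8_t)(msg_id & 0xff);
    }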

Although that's moot for me: GCP and Azure support CoAP, but for some reason Amazon seems to refuse to, and I'm locked into Amazon for other reasons.


You're not going to be running openssl or stunnel on an 8-bit microcontroller, though.

mbedtls/polarssl is usually the library of choice for microcontrollers (or, really, things that are not POSIX), but I'm not sure even that will run on an 8-bit micro.


> You're not going to be running openssl or stunnel on an 8-bit microcontroller, though.

True, you may need BearSSL, as with this port: https://github.com/OPEnSLab-OSU/SSLClient

It's a bit fat for many 8-bit microcontrollers -- needs 110kB flash and 7kB RAM -- but you can get low-end Cortex cores that it'll work just fine with.


But that'd negate many of the benefits of QUIC!


>"trivial to implement" if you don't need encryption. Which you kinda do most of the time. Not everything runs in the design constraints of an 8bit micro.

What? Encryption is a completely different step from framing and message semantics.

>It would also be nice if you'd at least pretend to engage with the arguments in the paper, instead of making up imagined reasons.

This is just absolutely silly and stupid. HTTP3 is not tied to the encryption used. I suggest you review your understanding before contributing more.


The authors do mention in their findings that:

"...While QUIC has been proven by different researchers to be more advantageous than TCP for MQTT traffic, our hypothesis was that further gains may be found through a pure H3 solution. The reasoning behind this hypothesis was rooted in QUIC’s original design intent and various optimizations specifically for carrying HTTP traffic. This turned out to be the case for performance, but posed a trade-off for resource consumption.

... Performance indicators were investigated in addition to network and device level overhead. MQTT-over-QUIC was found to put marginally less strain over the network than H3, but H3 offered a key performance savings of 1-RTT in the time to first data frame..."


What about CoAP? It supports DTLS too.


Exactly. I'm always sceptical of protocols where the header size is 3x bigger than my body. I just implemented MQTT via NB-IoT, and I had to remove the subscribe receive part.

My other custom protocol fits into a single TCP frame, which fits into about 5 radio frames, I think. Of course I compress the hell out of it, and for MQTT clients I just put a small MQTT bridge on the host translating the body. The battery should last 10 years after all, and TCP costs 95% of all power. The fewer packets, the longer it lasts.

With HTTP/3 I would give it 2 years max. And many more dropped packets. I'm also pretty sure that I don't have the RAM for HTTP/3.


HTTP/3 is actually simple if you just look at that layer and don't include QUIC itself. It's far easier than HTTP/2, since the transport layer does the multiplexing; H3 just defines some simple header and body frames on top, which are IMHO not harder to decode than looking for a `\r\n\r\n` in HTTP/1. The main challenge is header encoding via QPACK, which mostly overwhelms you with encoding options (Huffman, static tables, dynamic tables on another stream, etc.). But at least one can now opt out of supporting dynamic tables without losing interoperability.
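
For what it's worth, a minimal sketch in C of what that framing amounts to (function names are mine): an HTTP/3 frame header is just two QUIC variable-length integers (RFC 9000), type and payload length.

    #include <stddef.h>
    #include <stdint.h>

    /* Decode one QUIC varint; returns bytes consumed, 0 if buffer too short. */
    static size_t quic_varint(const uint8_t *buf, size_t len, uint64_t *out)
    {
        if (len == 0) return 0;
        size_t n = (size_t)1 << (buf[0] >> 6);  /* 2-bit prefix: 1, 2, 4 or 8 bytes */
        if (len < n) return 0;
        uint64_t v = buf[0] & 0x3f;             /* strip the length prefix */
        for (size_t i = 1; i < n; i++)
            v = (v << 8) | buf[i];
        *out = v;
        return n;
    }

    /* An HTTP/3 frame header is type then length (RFC 9114):
       DATA = 0x00, HEADERS = 0x01. Returns header size, 0 on short buffer. */
    static size_t h3_frame_header(const uint8_t *buf, size_t len,
                                  uint64_t *type, uint64_t *paylen)
    {
        size_t a = quic_varint(buf, len, type);
        if (a == 0) return 0;
        size_t b = quic_varint(buf + a, len - a, paylen);
        return b ? a + b : 0;
    }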


> ‽

Amazing what one can learn[0] on a Sunday morning in unexpected places.

0: https://en.wikipedia.org/wiki/Interrobang


"The rise and fall of the interrobang" :-)

https://www.economist.com/the-economist-explains/2014/10/01/...



I want a DDS implementation on microcontrollers. I really want them to be able to speak ROS2. That would be the ultimate awesomeness, since it needs no broker/server/master; devices on a network can just discover and talk to each other in a decentralized fashion.


Doesn't this exist? Isn't that the whole point of RTI's "Micro Edition" or whatever they call it?


I think I saw one for STM32 but they're hard to work with. Really looking for one that works on ESP32 ...


> HTTP/3 isn't simple! It's a huge pain in the ass! Why would anyone want to do this to themselves on embedded‽

Why would anyone want to do it on desktop? Let alone server-server APIs?

HTTP 2 and 3 are spec monstrosities, which currently, effectively don't work.

HTTP 2 is for example slower than HTTP 1.1 over modern TLS even on near perfect connections.

HTTP 3 seems to be even slower in real world use.

I strongly believe Google fudged its "real world trial data" to push the IETF to adopt it.


> HTTP 2 and 3 are spec monstrosities

A valid concern.

> which currently, effectively don't work.

Citation needed.

> HTTP 2 is for example slower than HTTP 1.1 over modern TLS even on near perfect connections.

H2 attempts to solve a number of problems present in HTTP 1.1 when your connection _isn't_ perfect -- head-of-line (HOL) blocking comes to mind, although since it still uses a single TCP connection, it's not a complete solution. It also provides header compression which can yield significant performance wins (bandwidth usage, latency if it reduces RTTs, CPU usage although this is always a mixed bag with compression).

H3 solves the transport-layer HOL blocking issue, and has a number of other RTT-reducing features (e.g. collapsing the TCP 3-way handshake -> TLS handshake -> actual sending of requests sequence into a single combined handshake, while mandating the use of TLS, which admittedly can have performance impacts compared to, say, unencrypted HTTP/1.1).
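
Back-of-the-envelope (assuming TLS 1.3 and no 0-RTT resumption on either side):

    TCP + TLS 1.3: 1 RTT (SYN/SYN-ACK) + 1 RTT (TLS) + 1 RTT (request) = 3 RTTs to first response byte
    QUIC / H3:     1 RTT (combined transport + TLS handshake) + 1 RTT (request) = 2 RTTs

which lines up with the 1-RTT saving in time to first data frame the paper reports; with 0-RTT resumption, QUIC can carry the request in the very first flight.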

It does have a number of aspects which could negatively impact performance depending on your use case (e.g. userspace congestion control, always-on encryption).

If you're dealing with small resources without HTTP pipelining where individual HTTP requests can and will complete in 1 RTT (and parallelization is limited because of dependencies between resources, e.g. an HTML page contains an iframe which contains a JS file which fetches an image), then your end-user latency will be dominated by the number of RTTs imposed by the underlying transport.

And even if you include pipelining (assuming a non-buggy implementation, which inherently limits concurrency in order to reap its benefits), then you're still subject to HOL blocking, both request- and connection-level, which is problematic in moderate loss scenarios (wifi, mobile).


> H2 attempts to solve a number of problems present in HTTP 1.1 when your connection _isn't_ perfect

by multiplexing many pseudo-connections down one TCP pipe. I still don't quite understand how that was meant to improve performance over a lossy link.

It seemed to me that Http2 was meant to be a file transfer protocol, but designed by people who don't really quite understand how TCP performs in the real world (especially on mobile)


> I still don't quite understand how that was meant to improve performance over a lossy link.

It does!

First, consider the case that HTTP pipelining addresses: lots of small independent requests. By packing multiple requests into the same connection, you can avoid handshakes, TCP slow start, etc. Perhaps more importantly, if your connection has more bytes in it and therefore more packets to ACK, then you can actually trigger TCP's detection of packet loss via duplicate acknowledgments, as opposed to waiting for retransmission timeouts.

Further, browsers typically limit the number of parallel requests, so you potentially have another source of HOL blocking if you've exhausted your connection pool (as happens much more quickly without pipelining).

That said, pipelining is also subject to HOL blocking: let's say that the first external resource to be requested is some huge JS file (not all that uncommon if you're putting <script> tags in your <head>). That needs to finish loading before we can fetch resources that are actually needed for layout, e.g. inline images.

H2 provides a handful of innovations here: one, it allows for prioritization of resources (so you can say that the JS file isn't as important); two, multiplexing allows multiple underlying responses to progress even if that prioritization isn't explicitly in place.

Yes, it still runs into problems with lossy links (as that's a problem with TCP + traditional loss-based CC algs like New Reno / CUBIC). But it's also better than the status quo either with HTTP/1.1 + connection pooling or with pipelining. And it has the noted advantage over QUIC in that it looks the same as H1 to ISPs.


> It seemed to me that Http2 was meant to be a file transfer protocol, but designed by people who don't really quite understand how TCP performs in the real world (especially on mobile)

And not only mobile; the same applies to server-to-server connections over near-perfect inter-DC links.

Even gigabit-and-above links drop packets... because that's how TCP is supposed to work!

The higher the speed, the more aggressive the curve used by the congestion control algorithm.

In other words, it will hit drops fast, be in the drop zone for longer, and this will recur more frequently than on slower links.

In fact, faster congestion protocols for speeds above 500 Mbps get their speed by having A LOT MORE drops, not fewer.
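
For concreteness, CUBIC (RFC 8312) grows its window as a cubic function of the time since the last drop:

    W_cubic(t) = C * (t - K)^3 + W_max
    K = cbrt(W_max * (1 - beta_cubic) / C)

with beta_cubic = 0.7 and C = 0.4 by default, where W_max is the window size at the last loss event. Past the plateau at W_max the growth accelerates cubically, so a high-bandwidth flow probes its way into the next drop quickly.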

Any protocol designed for the TCP transport must have been designed to work in concert with TCP mechanics, not fight it or try to work around it.

This is a perfect example of how A/B test voodoo led people down the rabbit hole.


> Any protocol designed for the TCP transport must have been designed to work in concert with TCP mechanics, not fight it or try to work around it.

There is a reality to running code on other people's hardware: they impose limitations on network traffic that you cannot change. Example: home-grade routers do NAT, but have table limitations which make it hard to keep multiple connections open. The connections get dropped, and users complain, not realizing it was due to them having bought a cheap router. Other actors in the network path between the client and the server impose their own limitations, making it hard to have multiple connections (corporate firewalls being another big one).

HTTP/2 is a compromise, admitting that it is not possible to go fix or replace all this hardware. I worked on a few HTTP/2 implementations, and the complexity it incurs is significantly less than the alternatives.


Sometimes when you come up with a tidbit that the entire world of experience contradicts, it's healthy to stop and think "wait, am I full of shit?" Of course, you may not be full of shit, but let's say the odds are you are.

What "real world data" you have to contradict Google's and basically everyone's personal experience?

I have sites that are unbearable over HTTP/1.x and instant over HTTP/2. But maybe I was fooled by Google.


I think QUIC's design around handling network switches is a primary advantage.

Devices are not static anymore, and tolerating a switch from WiFi to mobile etc. is a big plus.


> HTTP 2 is for example slower than HTTP 1.1 over modern TLS even on near perfect connections.

> HTTP 3 seems to be even slower in real world use.

[citation needed]


I think the disconnect between your experience and others’ may involve the complexity of the sites you’re talking about. If you have a simple site that has a few images, HTTP 1 may be nearly equal to HTTP 2/3. If you have a complex site with a lot of resources, and your server software allows you to push resources you know the client will need before the client even gets around to asking for them, HTTP 2/3 can be a clear winner.


I think you are speaking about HTTP/2 Push, which has been abandoned [1] (also discussed on HN [2]).

[1] https://evertpot.com/http-2-push-is-dead/

[2] https://news.ycombinator.com/item?id=25283971


No, it's not Push.

As a practical matter if your page requires a large number of resources (say 50+), and you are measuring the time to complete loading all of them, HTTP/2 loads much faster than HTTP/1.1. There are some nice demo sites which show this, with pages full of small images.

This is because all requests can be issued in parallel over HTTP/2 as soon as the client knows what to request, and this reduces the number of round trip times before all resources complete loading.

Even without Push, the client usually knows all the requests after it receives the HTML or sometimes the CSS.

With HTTP/1.1, the client requests are bottlenecked in a queue limited by the number of parallel TCP+TLS connections. Some sites boost this limit by using extra subdomains. But running 50+ TCP connections bursting at the same time is not efficient, even if it does bring down the latency by parallelising. With that many, there will be a burst of congestion and possible packet loss at the low bandwidth receiving end. And the 3-way handshake and TLS handshake needed for each new TCP+TLS means the parallel request latencies are a multiple of HTTP/2 request latency.

HTTP/2 also has request header compression, which makes a difference when upstream bandwidth from the client is limited.


MQTT is simple. You open a single connection, and that's about it. You can use an off-the-shelf TCP implementation, which you almost certainly don't have to worry about. The hard part is managing certificates.

Even though I don't have to, I'm pretty sure I could implement an MQTT client on my own. I'm not sure I could do it with HTTP/3, and that's before I implement the pub/sub bits on top. And you still have to manage certificates.
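
To illustrate how small the wire format is, a hedged sketch in C of building a QoS 0 MQTT 3.1.1 PUBLISH packet (the function name is mine, and it assumes the remaining length fits in a single varint byte):

    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    /* Fixed header + length-prefixed topic + raw payload; that's the whole
       packet. Returns total bytes written, or 0 on overflow. */
    static size_t mqtt_publish_qos0(uint8_t *buf, size_t buflen, const char *topic,
                                    const uint8_t *payload, size_t paylen)
    {
        size_t tlen = strlen(topic);
        size_t rem = 2 + tlen + paylen;  /* topic length field + topic + payload */
        if (rem > 127 || buflen < 2 + rem) return 0;
        uint8_t *p = buf;
        *p++ = 0x30;                     /* PUBLISH, DUP=0, QoS 0, RETAIN=0 */
        *p++ = (uint8_t)rem;             /* remaining length, single varint byte */
        *p++ = (uint8_t)(tlen >> 8);     /* topic name: 2-byte big-endian length */
        *p++ = (uint8_t)(tlen & 0xff);
        memcpy(p, topic, tlen);  p += tlen;
        memcpy(p, payload, paylen);
        return 2 + rem;
    }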

But reading the paper, there are some glaring issues. Firstly, using MQTT over QUIC is not the greatest choice: they don't compare it to raw MQTT over TCP. They seem to be labouring under the assumption that UDP is "faster". Given that ideally your message size for MQTT is smaller than the MTU, in most cases UDP has little advantage (depending on how you cost the connection initialisation).

Second, peak memory usage for the HTTP/3 implementation is 10 MB; that's more RAM than an ESP32 can address (4/8 MB). I'm not sure what kind of target hardware they are aiming for.

The total data transmitted per 1 kB of message appears to be off as well. I'm not sure how 1 kB of message turns into 24 kB of network traffic.

In short, it might be an alternative, but not as implemented in this paper.


The stated point was to compare MQTT vs. H3, so using QUIC as a common transport for both application protocols makes perfect sense.


While I think MQTT has its issues, and HTTP with response body streaming (as in gRPC) is a legit solution for many pub/sub use cases, I can't see how QUIC fits into the "resource constrained" category: QUIC is a complicated protocol, and implementations have to keep a lot of state around, e.g. the state of individual streams, the connection state, tracking which information was sent in which packets and has to be acknowledged, etc.

Quic implementations usually do that via dynamic allocations - and pretty much all the libraries do a lot of them.

For resource-constrained systems you usually want the opposite: no dynamic allocations, to avoid the chance of running out of memory or being subject to memory fragmentation issues. You might be able to work around this a bit with a custom allocator that is only used by one particular library, but the guarantees are still on the low end.
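
A hedged sketch in C of that custom-allocator workaround: a fixed-block pool over a static arena, so one library can allocate without ever touching the heap (sizes are illustrative, not taken from any particular QUIC stack):

    #include <stddef.h>
    #include <stdint.h>

    #define POOL_BLOCKS 64
    #define BLOCK_SIZE  256

    /* Each block doubles as a free-list node while unused. */
    typedef union block {
        union block *next;
        uint8_t bytes[BLOCK_SIZE];
    } block_t;

    static block_t arena[POOL_BLOCKS];
    static block_t *free_list;
    static int initialized;

    static void pool_init(void)
    {
        for (int i = 0; i < POOL_BLOCKS; i++) {  /* thread all blocks onto the list */
            arena[i].next = free_list;
            free_list = &arena[i];
        }
        initialized = 1;
    }

    /* Bounded "malloc": fails instead of fragmenting or growing the heap. */
    static void *pool_alloc(size_t n)
    {
        if (!initialized) pool_init();
        if (n > BLOCK_SIZE || free_list == NULL) return NULL;
        block_t *blk = free_list;
        free_list = blk->next;
        return blk;
    }

    static void pool_free(void *p)
    {
        block_t *blk = p;
        blk->next = free_list;
        free_list = blk;
    }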

TCP, in comparison, is easier to implement with static resources. TLS is also easier: it mainly requires a full TLS-record-sized buffer for sending and receiving (or at least 16 kB for receiving). And even that has shown itself to be a big implementation hurdle for many devices.


> Quic implementations usually do that via dynamic allocations - and pretty much all the libraries do a lot of them.

Any mission critical system or true embedded system will have some maximum limits defined and will reserve a table in memory with static allocation. It’ll be well packed and cache friendly too.

I bet within a few years there will be a "Show HN" for some uQUIC written in Zig that does just this.


If you're interested in MQTT over constrained networks with embedded IoT devices, you should take a look at what's being done to formalise MQTT-SN (MQTT for Sensor Networks).



