Hacker News
Mitmproxy 11: Full HTTP/3 Support (mitmproxy.org)
388 points by mhils 3 months ago | 75 comments

It’s great to see that Mitmproxy is still being developed - it indirectly made my career.

Back in 2011, I was using it to learn API development by intercepting mobile app requests when I discovered that Airbnb’s API was susceptible to Rails mass assignment (https://github.com/rails/rails/issues/5228). I then used it to modify some benign attributes, reached out to the company, and it landed me an interview. Rest is history.
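
For readers unfamiliar with the bug class: mass assignment happens when a handler copies every client-supplied parameter onto a model object. The linked issue is about Rails; the sketch below only illustrates the general idea in Python, and the `User` fields, `ALLOWED_FIELDS`, and function names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class User:
    name: str
    email: str
    is_admin: bool = False  # should never be settable by the client


def update_unsafe(user: User, params: dict) -> User:
    # Vulnerable: every key the client sends is written to the model.
    for key, value in params.items():
        if hasattr(user, key):
            setattr(user, key, value)
    return user


ALLOWED_FIELDS = {"name", "email"}  # hypothetical allowlist


def update_safe(user: User, params: dict) -> User:
    # Only allowlisted attributes are copied; everything else is ignored.
    for key, value in params.items():
        if key in ALLOWED_FIELDS:
            setattr(user, key, value)
    return user
```

With an intercepting proxy, adding an extra field such as `is_admin` to a request body is enough to exercise the unsafe path.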

It's absolutely insane how many core devs argued against the change there.

To this day it remains incredibly useful to me, and weirdly obscure to people who I would've thought should know better.

Sometimes it's easier to use mitmproxy with an existing implementation than to read the documentation!

> Rest is history


Only slightly related ...

> Chrome does not trust user-added Certificate Authorities for QUIC.

Interesting. In the linked issue, the Chrome team says:

> We explicitly disallow non-publicly-trusted certificates in QUIC to prevent the deployment of QUIC interception software/hardware, as that would harm the evolvability of the QUIC protocol long-term. Use-cases that rely on non-publicly-trusted certificates can use TLS+TCP instead of QUIC.

I don't follow the evolution of those protocols, but I am not sure how disallowing custom certificates has anything to do with the "evolvability" of the protocol ...

Does anyone know what those _reasons_ are?

If I were to guess, it's to allow Google freedom in experimenting with changes to QUIC, since they control both the client and large server endpoints (Google Search, YouTube etc).

They can easily release a slightly tweaked QUIC version in Chrome and support it on e.g. YouTube, and then use metrics from that to inform proposed changes to the "real" standard (or just continue to run the special version for their own stuff).

If they were to allow custom certificates, enterprises using something like Zscaler's ZIA to MITM employee network traffic would risk breaking when Google tweaks the protocol. If the data stream is completely encrypted and opaque to middleboxes, Google can more or less do whatever they want.

Kinda related: https://en.wikipedia.org/wiki/Protocol_ossification

Companies that use something like Zscaler would be highly likely to block QUIC traffic to force it onto TCP.

That’s exactly what Google is hoping will happen. If QUIC is blocked entirely, there’s no risk that small tweaks to the QUIC protocol will break Google’s websites for any companies using these tools.

Well, my company is doing it already. They split VPN traffic depending on the target domain (mostly for benign reasons), and they can't do that with QUIC, so they have to block QUIC traffic.

What benign reason could there possibly be that isn't better based on IP addresses rather than domains?

When these kinds of VPN clients split traffic based on domains, they do it with some tricks, either via DNS or by capturing traffic in the browser, or similar things.

But to do split VPN with IP addresses, they need to create an IP route in the VPN client. If you just have a couple of IPs, it's fine, but if you have a couple hundred targets, you're gonna break some guy's Windows or Mac machine by sending that huge routing table.

Also, there are targets that change IP addresses. For example, AWS Elastic Load Balancers change IP addresses sometimes (if nothing has changed in recent years; I haven't deployed ELBs in a while...).

Middle boxes (https://en.m.wikipedia.org/wiki/Middlebox) are a well known source of protocol stagnation. A protocol with extensibility usually needs the client and server to upgrade, but with middle boxes there are N other devices that potentially need updating as well. Where the user (client) and service provider (server) are motivated to adopt new feature sets, the owners of middle boxes might be far less so. In net, it makes it hard for protocols to evolve.

Perhaps they're referring to this famous objection of financial institutions to TLS 1.3, motivated by them not wanting to update their MitM software needed for compliance: https://mailarchive.ietf.org/arch/msg/tls/CzjJB1g0uFypY8UDdr...

TLS 1.3 breaks MITM boxes because a client can establish a session key outside of the network with the middlebox and continue using it afterwards in the middlebox’s network.

> I don't follow the evolution of those protocols, but I am not sure how disallowing custom certificates has anything to do with the "evolvability" of the protocol ...

One of the reasons for developing HTTP/2 and HTTP/3 was that it was so difficult to make changes to HTTP/1.1: middleware relied heavily on implementation details, so it was hard to tweak things without inadvertently breaking people. They're trying to avoid a similar situation with newer versions.

QUIC exists to improve ad deliverability; granting users freedom would counteract that goal.

How does QUIC improve ad deliverability?

The entire protocol puts corporate/institutional needs first and foremost to the detriment of human person use cases. HTTP/3 makes all web things require CA TLS and means that if something in the TLS breaks (as it does every couple years with root cert expirations, version obsolescence, ACME version obsolescence, etc) then the website is not accessible. Because there's no such thing as HTTP+HTTPS HTTP/3, self-signed HTTPS HTTP/3, or even, as in this case, custom CA TLS HTTP/3. It's designed entirely around corporate/institutional needs and is a terrible protocol for human people. HTTP+HTTPS websites can last decades without admin work. HTTP/3 websites can only last a few years at most.

If it was about institutional needs, surely it would make it easier to mitm for middleboxes? The biggest opposition to QUIC came from big corporations and other institutional players

Different players. Google and other ad companies are bigger than old companies trying to MITM their users.

This doesn't have much to do with QUIC. If HTTP/3 was based upon another transport protocol, you'd have the exact same problems.

You can use QUIC with custom certs without any trouble.

QUIC isn't just some transport protocol though: it's weird. These restrictions are based in the QUIC libs, not in UDP (which is the QUIC transport protocol).

And while you can use QUIC with custom certs in a technical sense if you do the compile flags and build your own universe, 99.9999% of the people on Earth with their standard QUIC lib implementations (most use the same two) will be unable to connect to it.

I don't know what you're talking about, but I just imported a QUIC library and used it with a self-signed certificate. No extra steps required, either on the server or the client side.

Yes, the protocol is weird, compared to TCP. It has many extra features and one restriction, which is mandatory TLS, which I wouldn't even consider skipping anyway. Still nothing to do with ads.

He was arguing that 99.99% of users cannot use your stuff, because Chrome specifically doesn't allow snakeoil/self-signed certs for QUIC, and TLS encryption is mandatory.

If you compare that to the graceful Connection: Upgrade handshake in HTTP/1.1 and WebSockets, for example, this would've been much better because there is no isolation based on tools and libraries, only based on the trust chain of the certificates. If a new version of the protocol breaks, it automatically falls back with both parties knowing about it. If the QUIC protocol changes on the client side, good luck finding that out.

Then the OP used a bad framing, because it was apparent to me that they opposed QUIC in general, not QUIC in Chrome.

Either way, I still fail to see how this relates to the original complaint that QUIC somehow leads to ads.

QUIC was developed by an ad company.

"Either way, I still fail to see how this relates to the original complaint that QUIC somehow leads to ads."

"HTTP/3 uses QUIC, a transport layer network protocol which uses user space congestion control over the User Datagram Protocol (UDP). The switch to QUIC aims to fix a major problem of HTTP/2 called "head-of-line blocking": because the parallel nature of HTTP/2's multiplexing is not visible to TCP's loss recovery mechanisms, a lost or reordered packet causes all active transactions to experience a stall regardless of whether that transaction was impacted by the lost packet." (Wikipedia)

I'm a text-only browser user; I also use TCP clients with a localhost TLS forward proxy. When I visit a website I only request resources manually and only from a single domain at a time. This works great for me. But obviously it is not compatible with ads. This is because ads require websites that cause so-called "modern" browsers to automatically request resources (ads) from multiple domains "simultaneously", i.e., "in parallel". As such, I see no ads. Not a problem, I can live without them.

However, for those who let these browsers request ads automatically, there are problems. "The web", as they experience it, gets slower. Because although the requests to multiple domains may be executed "simultaneously", and the HTTP/1.1 protocol allows for pipelining requests, these browsers cannot process the responses that arrive out of order.

HTTP/1.1 pipelining does work. Outside the browser, when requesting non-advertising resources, for example. I have used HTTP/1.1 pipelining outside the browser for over 20 years. It works beautifully for me. I am only requesting resources from a single domain and I want the responses in order. So, right there, we can see that HTTP/3 is addressing the needs of advertising companies and their partners rather than web users like me who are requesting non-advertising resources.

As the web became infested with ads/tracking and became slower as a result, the ad companies and CDNs, e.g., Google and Akamai, sought to "make the web faster" by introducing a new HTTP protocol for the so-called "modern" browser. (This browser, as we know, is more or less under the control of an advertising company, the same one introducing the protocol.)

Now, a web user (cf. institutional player in the online advertising industry) might conclude the easiest way to "make the web faster" is to remove what is making the web slow: ads.

But the "solution" chosen by the advertising company was to keep the ads and try to make it easier for the so-called "modern" browser to process out-of-order responses from multiple domains, e.g., ad servers, faster.

Will HTTP/3 and its use of QUIC "lead to ads"? That is for the reader to decide. I think it is safe to say it will not lead away from them and it could help cement the ad infestation of the web even more.

tl;dr HTTP/1.1 is optimal for sequential information retrieval from a single domain. HTTP/3 is optimal for resource, e.g., advertising, retrieval from multiple domains "simultaneously".

HTTP/1.1 is no different from HTTP/2 and even HTTP/3 when it comes to multiple domains. The optimization strategies are all aimed at solving simultaneous resource requests from a single host (not domain but close enough) - should be obvious when you consider that each host requires at least one separate socket connection anyway.

In practice http/1.1 with 1 connection per request even encouraged making more domains/subdomains if you wanted to have more simultaneous requests

The ad company wants to make ads browsing easier, therefore it created HTTP/3. I can entertain that statement.

The ad company created QUIC to make HTTP/3 possible. That's also certainly true.

What follows: the ad company created QUIC because they wanted to make ads easier.

But "QUIC exists to improve ad deliverability" is true only in the shallowest of senses, similar to "nuclear power plants exist to drop bombs on civilians" just because research on nuclear power was driven in large part by war needs. In reality, nuclear power has its own uses beyond the military.

Similarly, QUIC taken on its own merits does not have anything to do with ads. It's just another general purpose protocol.

BTW, multiple streams will not make it any faster to load ads from third parties. Head-of-line blocking only affects resources within a single TCP connection, which can only ever be to one server. That means QUIC's streams do nothing to make loading Google's third party ads easier.
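
A toy model of that point, as a sketch only: all packets below travel over one connection to one server, and the only thing that changes is whether ordering is enforced across the whole connection (TCP-like) or per stream (QUIC-like). The trace and the function name are made up for illustration.

```python
def delivery_times(arrivals, per_stream):
    """arrivals: list of (seq, stream, arrival_time), one entry per packet.

    Returns {seq: time at which the packet can be handed to the application}.
    """
    delivered = {}
    for seq, stream, t in sorted(arrivals):
        # Packets that must arrive first: all earlier ones (TCP-like),
        # or only earlier ones on the same stream (QUIC-like).
        blockers = [
            at for s, st, at in arrivals
            if s < seq and (not per_stream or st == stream)
        ]
        delivered[seq] = max([t] + blockers)
    return delivered


# Hypothetical trace: packet 1 (stream "A") is lost and retransmitted at t=50.
trace = [(0, "A", 0), (1, "A", 50), (2, "B", 2), (3, "B", 3), (4, "A", 4)]

tcp_like = delivery_times(trace, per_stream=False)
quic_like = delivery_times(trace, per_stream=True)
```

In the TCP-like case the packets of stream "B" wait for the retransmission even though nothing of theirs was lost; in the QUIC-like case only stream "A" stalls. Neither case involves a second server or domain.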

> We explicitly disallow non-publicly-trusted certificates in QUIC to prevent the deployment of QUIC interception software/hardware, as that would harm the evolvability of the QUIC protocol

For Chrome at least..!

That has nothing to do with ad deliverability.

There was a case of Kazakhstan installing certs to MITM its citizens a couple of years ago, and a bunch of cases where bad actors can social-engineer people into installing a certificate.

I think that because of the KZ case, browsers, and Chrome especially, went for using only their own cert store instead of the operating system's.

Browsers responded by blacklisting the Kazakh certificate the same way they blacklist the certificates that came with pre-installed spyware on laptops from shit vendors like Lenovo. You don't need to block all certificates to protect against a well-known bad certificate.

If your company requires communications to be monitored, the typical enforcement is a custom company CA installed on company equipment. Then they intercept TLS and proxy it.

Those proxies tend to be strict in what they accept, and slow to learn new protocol extensions. If Google wants to use Chrome browsers to try out a new version of QUIC with its servers, proxies make that harder.

It can seem confusing but it all makes sense when you realise Chrome is designed to work for Google, not for you. I remember people switching their Grandmas to Chrome 15 years ago when they could've chosen Firefox. Many of us knew this would happen, but convenience and branding is everything, sadly.

> Chrome is designed to work for Google, not for you.

Maybe more accurately “chrome is designed to work for you in so far as that also works for google”. I share the long standing dismay that so many willingly surrendered their data and attention stream to an ad company.

I don't really think Firefox cares about having users. The one killer feature Chrome has is being able to access all your state by logging into your Chrome account. Firefox refuses to provide this basic service which will allow you to seamlessly use your data on Firefox and then eventually stop using Chrome. I wish Firefox nothing but the worst.

I may be feeding the trolls, but not only is there a sync mechanism, at least with Firefox you can self-host[1] such a thing, thereby doubly ensuring the data isn't used for something you disagree with

If you're going to say that Firefox doesn't care about having users, point out its just stunningly stupid memory usage, blatantly stale developer tools (that one hurts me the worst because the Chrome dev-tooling is actually open source, so there's nothing stopping them from actually having Best In Class dev tooling other than no-fucks-given), or the har-de-har-har that comes up periodically of bugs that have been open longer than a lot of developers have been alive

1: https://github.com/mozilla-services/syncstorage-rs#running-v...

Don't care about self hosting. That's not a feature to me, it's a burden. I would rather some cloud provider do that for me; thankfully Google does it for free and the convenience is much appreciated. It's the same reason I'd rather put my personal code on GitHub than on some hard drive in the basement which may die at any time.

Perhaps you interpreted my comment as that one must self-host, versus what I intended which is "you can use theirs, or you can use yours, depending on your paranoia level". I thought to include that distinction because some folks believe that Chrome is merely a data exfiltration and ad delivery vector created by the biggest Ad Tech on the planet and therefore don't trust them to be good stewards of arguably the most sensitive thing a modern user creates: browser history

Of course one can just use Firefox Sync out of the box <https://support.mozilla.org/en-US/kb/sync>; even Mozilla has not yet stooped so low as to require opening a terminal just to use Firefox or its Sync component

Again, I don't care that Firefox wants me to build up my profile on one device and then transfer that to some other device. I already have my profile. Let me log in to my Chrome profile and use it. I don't want to buy into your shitty ecosystem even if it gives freedom (that I don't want or care about).

I find Firefox's memory usage and dev tooling better than Chrome.

My Firefox installs on my various computers have a shared profile, so what are you on about?

Do http/2 and http/3 offer any benefits if they are only supported by the reverse proxy but not the underlying web server? Most mainstream frameworks for JS/Python/Ruby don't support the newer http standards. Won't the web server be a bottleneck for the reverse proxied connection?

Yes, because http/2 or http/3 will improve the reliability of the connection between the client and the reverse proxy. The connection between the reverse proxy and the underlying web server is usually much faster and more reliable, so that part would benefit much less from being upgraded to http/2 or http/3.

the transport between reverse proxy <-> backend is not always http, eg python w/ uwsgi and php w/ fastcgi.

And even when it is HTTP, as other commenters said, the reverse proxy is able to handshake connections to the backend much more quickly than an actual remote client would, so it's still advantageous to use http/2 streams for the slower part of the connection.

> the transport between reverse proxy <-> backend is not always http, eg python w/ uwsgi and php w/ fastcgi.

That's just called a web server and not a reverse proxy then. Both are just evolutions of CGI.

Probably not, but mitmproxy is not a reverse proxy for any production purpose. It’s for running on your local machine and doing testing of either low-level protocol or web security stuff.

> mitmproxy is not a reverse proxy for any production purpose

At a startup I was working on a few years ago, I set up mitmproxy in dev and eventually if memory serves right I also sometimes enabled it in prod to debug things.

That being said, we did not have a lot of users. We had in fact very very few users at the time.

I’ve been patiently waiting for someone to write a howto that uses mitmproxy to transparently obtain acme certificates for any web servers that are behind it.

I’d totally pay a cloud provider to just do this and forward requests to my port 80 or 443 with self signed certificates.

Https+acme is already open to this attack vector, so why inconvenience myself by pretending it is not?

In our setup, TLS was already being terminated by Nginx or Caddy (I don’t remember which, but it was one of those two) sitting in front of another web server on the same host.

So inserting mitmproxy into the setup was just a case of putting it between the Nginx or Caddy that did TLS termination, and the web server that served the backend API. So to mitmproxy it was all plain HTTP traffic passing through it, locally on the same machine.

I bound the mitmweb web UI to the VPN interface so that us devs could connect to the dev server with VPN and then have access to the mitmweb web UI to inspect requests and responses.

Something not mentioned: web browsers limit the number of connections per domain to 6. With HTTP/2 and later, they will use a single connection for multiple concurrent requests.

Yes. Besides other performance benefits, HTTP/3 saves a full round trip per connection because QUIC combines the transport and TLS handshakes, which are separate steps with TCP.
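
A back-of-the-envelope version of the round-trip saving, as a sketch under simplifying assumptions: no packet loss, no TCP Fast Open, zero server processing time, and textbook handshake counts. The function names are made up for illustration.

```python
# Round trips spent on setup before the first HTTP request can be sent.
# Simplified textbook numbers: no packet loss, no TCP Fast Open.
SETUP_RTTS = {
    "TCP + TLS 1.2": 1 + 2,
    "TCP + TLS 1.3": 1 + 1,
    "QUIC (new connection)": 1,
    "QUIC (0-RTT resumption)": 0,
}


def time_to_first_response_ms(setup_rtts, rtt_ms):
    # One extra round trip for the request/response exchange itself.
    return (setup_rtts + 1) * rtt_ms


def compare(rtt_ms):
    # Map each stack to the time until the first response arrives.
    return {
        name: time_to_first_response_ms(n, rtt_ms)
        for name, n in SETUP_RTTS.items()
    }
```

Calling `compare(100)` models a 100 ms mobile link, where the difference is most visible. On a same-host or same-datacenter hop between a reverse proxy and its backend the round trip is tiny, so the saving there is correspondingly small.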

Depends. If they're running on the same box, the reverse proxy will be able to initiate tcp connections to the web server much more cheaply. Even if they're just in the same datacenter, the lower round trip latency will reduce the time for establishing TCP connections. Plus, the proxy might be load balancing across multiple instances of the backend.

Also browsers limit the number of HTTP/1.1 requests you can have in flight to a specific domain

The limit is much higher for proxies, though.

With a reverse proxy the browser doesn't know it's talking to one

One of the main promises of HTTP/3 is better performance under worse network conditions (e.g. no head-of-line blocking as in HTTP/2, connection migration, 0-RTT). For all of that HTTP/3 between client and proxy is really great. HTTP/3 between proxy and server is not required for that.

Yes, for HTTP/3, since it handles network issues better. HTTP/2 is of more doubtful value since it can choke really badly on packet loss.

http/3 seems to be an excellent opportunity to optimize HTMX or any of the libraries which leverage HTML fragments like JSX. The obvious advantage of http/3 is for gaming.

The servers which run the frameworks have to support HTTP/3. In most cases the advantages should be transparent to the developers.

I’m curious what about HTTP/3 is particularly advantageous with HTMX?

A common use case of HTMX is sending fragments when scrolling.

Since http/3 uses udp to send the fragments, duplicate packet information doesn’t have to be sent.

Kind of funny that the newer protocol effectively works in the opposite direction of GraphQL.

Network congestion management is gonna be wild in the coming decade with the proliferation of UDP-based protocols.

Unfortunately there is still the issue[1] of fingerprinting. Until it can spoof the TLS handshake of a typical browser, you get these "Just a quick check..." or "Sorry, it looks like you're a bot" pages on about 80% of the web.

[1]: https://github.com/mitmproxy/mitmproxy/issues/4575
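
To make "fingerprinting the TLS handshake" concrete: the thread does not say which method these sites use, so the sketch below assumes JA3, one published scheme that hashes a handful of ClientHello fields. The field values are made up and do not correspond to any real browser.

```python
import hashlib


def ja3_fingerprint(version, ciphers, extensions, curves, point_formats):
    """Build a JA3-style string from ClientHello fields and hash it with MD5."""
    fields = [
        str(version),
        "-".join(map(str, ciphers)),
        "-".join(map(str, extensions)),
        "-".join(map(str, curves)),
        "-".join(map(str, point_formats)),
    ]
    ja3_string = ",".join(fields)
    return ja3_string, hashlib.md5(ja3_string.encode()).hexdigest()


# Hypothetical ClientHello values, for illustration only.
browser_like = ja3_fingerprint(771, [4865, 4866, 4867], [0, 10, 11, 13], [29, 23, 24], [0])

# Same cipher suites in a different order, as another TLS library might send them.
proxy_like = ja3_fingerprint(771, [4867, 4866, 4865], [0, 10, 11, 13], [29, 23, 24], [0])
```

The two hashes differ even though the same cipher suites are offered, which is why a proxy that re-originates the connection with its own TLS stack can look different from the browser behind it.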

> Until it can spoof the TLS handshake of a typical browser, you get these "Just a quick check..." or "Sorry, it looks like you're a bot" pages on about 80% of the web.

Evidently Firefox is not a typical browser anymore.

Thanks for the shoutout to Hickory. It’s always fun to see what people build with it. Nice work!

Thank you for your work on Hickory! It's super exciting to see how PyO3's Python <-> Rust interop enables us to use a production-grade DNS library with Hickory and also a really solid user-space networking stack with smoltcp. These things wouldn't be available in Python otherwise.

I wonder, can I use it like Privoxy/Proxomitron/Yarip? E.g. can I strip out script tags from specific sites, which I request with my browser (Ungoogled Chromium), using Mitmproxy as a Proxy? And how will this affect performance?

In theory: yes. In practice: mitmproxy is written in Python so there will be a delay because of the language not being all that fast. When you're visiting web pages with hundreds of small delays, you'll notice.

That said, for many people who care about this stuff, this could be an option. There's nothing preventing you from doing this technically speaking.

There's a small risk of triggering subresource integrity checks when rewriting Javascript files, but you can probably rewrite the hashes to fix that problem if it comes up in practice.
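
A minimal sketch of what that could look like as a mitmproxy addon script, under some assumptions: `TARGET_HOSTS` is a hypothetical site list, the regex is deliberately naive and will miss edge cases, and type hints are left out so the file also loads without mitmproxy installed. `sri_hash` shows how an integrity value would be recomputed after rewriting a file.

```python
import base64
import hashlib
import re

# Hypothetical list of sites to clean up.
TARGET_HOSTS = {"example.com"}

# Naive pattern: matches <script ...>...</script> pairs, not every edge case.
SCRIPT_RE = re.compile(r"<script\b[^>]*>.*?</script\s*>", re.IGNORECASE | re.DOTALL)


def strip_scripts(html):
    return SCRIPT_RE.sub("", html)


def sri_hash(body):
    # Subresource Integrity value: algorithm prefix + base64 of the raw digest.
    digest = hashlib.sha384(body).digest()
    return "sha384-" + base64.b64encode(digest).decode("ascii")


class StripScripts:
    def response(self, flow):
        # mitmproxy calls this hook once the full response has been read.
        if flow.request.pretty_host not in TARGET_HOSTS:
            return
        if "text/html" not in flow.response.headers.get("content-type", ""):
            return
        flow.response.text = strip_scripts(flow.response.text)


addons = [StripScripts()]
```

Saved as, say, `strip_scripts.py` (a hypothetical filename), it would be loaded with `mitmproxy -s strip_scripts.py`. Every HTML response for the listed hosts then passes through the Python hook, which is where the per-request delay mentioned above comes from.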

Is mitmproxy an alternative to Fiddler?

Pretty much. What I like about mitmproxy is how easy it is to write a Python plugin to intercept/modify requests and responses.
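
For a sense of scale, a minimal addon sketch with one hook per direction. The header names are hypothetical, and type hints are omitted so the file needs no imports; mitmproxy picks up the module-level `addons` list when the script is loaded with `-s`.

```python
class TagTraffic:
    """Minimal mitmproxy addon: one hook per direction."""

    def request(self, flow):
        # Runs before the request is forwarded upstream.
        flow.request.headers["x-debug-proxy"] = "1"  # hypothetical header

    def response(self, flow):
        # Runs before the response is returned to the client.
        if flow.response.status_code >= 500:
            flow.response.headers["x-upstream-error"] = "true"  # hypothetical


addons = [TagTraffic()]
```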


Oh no, the summer intern didn't close the issue

