See this page fetch itself, byte by byte, over TLS (subtls.pages.dev)
1341 points by gmac on May 10, 2023 | 173 comments



While we all (HN audience) know roughly what kind of things are going on, seeing it in all the delicate details is fascinating.

Just thinking that all these bytes and everything that's happening have 6 more layers below all the way to physical electrical/electromagnetic flow, all routed potentially through many different ISP backbones and possibly international routes, all happening in roughly less than 1/10 of a second is even more fascinating.


It reminds me of being a teenager in the late 90s and the early days of the internet. I discovered entirely on my own that I could telnet to port 80, type "GET / HTTP/1.1\n\n" and the server would send me the headers + page content. Shortly after I discovered the same worked for SMTP.

I was very far from the first person to have this revelation but it was definitely an eye-opening "there is no magic, it's all just software" moment for me. It fundamentally changed the way I think about computers at every level and inspired me to look "under the hood" at everything... how CPUs implement superscalar OOO execution. How atomic operations work at the CPU level. What a syscall actually is. How subroutines are called (and calling conventions). How dynamic linkers work.

You don't have to be an expert at all these things but it is like a superpower to understand the basics of every layer.


> It fundamentally changed the way I think about computers at every level

I had the exact same revelation also in the late 90s! (~1998 for me). I was already telnetting into servers a bunch and was getting into running an Apache server. I remember the moment I typed "GET / HTTP/1.1" into port 80 so clearly because it suddenly turned the "web" and "HTTP" into something comprehensible with composable parts I could understand at a deeper level.

In our current world of SSH and HTTPS, it seems less likely the new generation will have the same experience. But we also have browser developer tools nowadays, which make it so much easier and more motivating for someone to start to learn about the web and JavaScript. In the 1990s I had to use Proxomitron to painstakingly modify a webpage's JavaScript or HTML, but these days it's dead simple.


`openssl s_client` is underused, and it's less clunky now that it supports host:port syntax in place of separate args. Encouraging people to think of it as the modern replacement for telnet would help.


Wow, thanks for showing me `openssl s_client`. I just gave this a try and you're right, it was quite easy!

   openssl s_client -connect example.org:443
   [TLS details...]
   GET / HTTP/1.1
   Host: www.example.org
   <newline>
   <newline>
   [headers and HTML response!]
The one thing trickier nowadays is that almost all of the time you need to send the `Host:` header for things to work. Took me a sec to realize that, since in 1998 it was almost never necessary.


Glad to spread the knowledge, and glad you gave it a shot! It really is that easy, and host headers have been pretty regularly required anyway.

Slightly more interesting is using it to access internal sites, and setting up your own TLS roots and chains for personal or corporate infrastructure. In practice, while useful for internal use, I generally recommend everyone use LetsEncrypt and public names for even internal APIs when they cross team boundaries, because it's just easier.
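For the internal-roots case, here's a minimal Node sketch of connecting to a host whose cert chains to a private root (the hostname and ca.pem path are made up):

    // Hypothetical example: TLS to an internal host, trusting a private root.
    import { connect } from "node:tls";
    import { readFileSync } from "node:fs";

    const socket = connect(
      {
        host: "intranet.corp.example",       // invented internal hostname
        port: 443,
        ca: readFileSync("ca.pem"),          // your private root cert (PEM)
        servername: "intranet.corp.example", // SNI; must match the cert
      },
      () => {
        console.log("negotiated:", socket.getProtocol(), socket.getCipher().name);
        socket.end();
      }
    );
    socket.on("error", (err) => console.error(err.message));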


> In our current world of SSH and HTTPS, it seems less likely the new generation will have the same experience.

Pushing encryption into the transport layer a la QUIC could solve this, if not for the spurious dependency on user-hostile TLS instead of a simpler PKI. SSH would become telnet over a QUIC stream, which could be used with a QUIC-enabled netcat (say). HTTP/3 could have been either 1.1 or 2, just over QUIC, but this wasn't pursued.


I think this is one of the most fundamental things a person can learn about software engineering: there is no magic. If something happens, it was part of the operation of written code and an exchange of data somewhere.

My daughter is starting to get to an age where she’s inquisitive about how magical things work, and I usually respond by asking “how _could_ it work?” And we talk a lot about what actually does what.


Being even somewhat conversant in some of the more popular protocols is also like a superpower that the newbs don't really have, too. The ability to use telnet or nc to answer the question "aside from any other piece at any level in the stack that could be going wrong, can I even talk to the HTTP server" helps you eliminate a lot of possibilities about what's going on when troubleshooting something.
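In that spirit, here's a rough Node equivalent of the telnet/nc check (example.org as a stand-in host):

    // Can I even talk to the HTTP server? Raw TCP, minimal HTTP/1.1.
    import { createConnection } from "node:net";

    const sock = createConnection({ host: "example.org", port: 80 }, () => {
      sock.write("GET / HTTP/1.1\r\nHost: example.org\r\nConnection: close\r\n\r\n");
    });
    sock.on("data", (chunk) => process.stdout.write(chunk));
    sock.on("error", (err) => console.error("couldn't even connect:", err.message));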


Same! I didn't even know anything about telnet or programming. I was using Klik&Play or MultiMedia Fusion to make games, and some component supported TCP, so I opened a port using that and just for fun connected to it with a browser. Then I saw a request, so I used that the other way around on a real web server and it worked. Same thing with SMTP.


> Klik&Play

That's a program I haven't heard of in a long time. I wonder if you can still get it running in a VM since it was just shareware with a nag screen if you said you were using it for "educational purposes"


> What a syscall actually is

I've only regarded it as a literal system call: the lowest possible level, "language agnostic" API that does a thing in the OS. Do you have some deeper insight?


"Syscalls" is a topic in systems programming. You can Google "syscall faq" for example.

https://blog.packagecloud.io/the-definitive-guide-to-linux-s...

Understanding how syscalls are invoked from user space typically involves knowing what calling conventions are, knowing what an ABI is, etc.

https://man7.org/linux/man-pages/man2/syscall.2.html


At the most basic level a system call is: loading the arguments into the ABI-specified registers, then triggering an interrupt. Some architectures have a specific syscall or syscall-like instruction that is more optimized than a generic software interrupt, but conceptually it is similar.

The syscall/interrupt instruction transitions to supervisor/kernel mode and moves the execution pointer to the configured location in the kernel.

If this sounds kinda like switching threads or processes, you would be right. But if you had to pay that context switching cost 2x on every syscall, it would kill performance. Most OSes use a split address space as an optimization here: every userspace process has the kernel's memory mapped in the upper half, but with protection bits that make it inaccessible to userspace. That way, when a syscall is issued there is no need to change the active page table entries or flush the TLB: the kernel is already mapped, and now that the CPU is in supervisor mode those kernel pages are accessible.

The CPU decides what code gets control via the interrupt table, which itself can only be configured in supervisor mode. That is what prevents a userspace process from hijacking the CPU. User-mode code doesn't have permission to modify the register that points at the interrupt handler tables, nor the memory containing them. Thus by definition any syscall/interrupt will jump to kernel code.

The kernel entrypoint then often has a COPYIN/COPYOUT process that will treat certain register values as pointers and copy the data into the kernel's address space when required (or copy it out to a caller provided buffer).

For reference, pre-emptive multitasking is related. The kernel's scheduler configures a hardware timer interrupt. The configuration of this timer can only be done in supervisor mode. So once the current thread's timeslice is up, the timer fires and the CPU changes the instruction pointer to the kernel's configured timer interrupt handler. User-mode code can't prevent the timer from firing nor change what code the CPU will jump to. The scheduler routine saves the current context to memory, loads the next thread's context (registers, instruction pointer, page tables, etc.), updates the timer's next deadline, then "returns" from the interrupt... only the instruction pointer is now in a different thread (or a different process with different memory entirely), so the CPU "returns" to a different piece of code. If all goes correctly it "returns" to the instruction just beyond the one that completed when that thread last got pre-empted, so from that thread's POV execution was continuous.


The difference between a syscall and a library function call is that a syscall crosses protection boundaries. Implementations differ, but where a library (even the lowest-level OS library like libc) runs in the context of the application and can be invoked with a regular "store pointer and jump" method call, a syscall usually involves transferring control to the kernel through a software interrupt.
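You can watch that boundary being crossed even from a high-level runtime. A tiny Node sketch (save as demo.mjs and run it under `strace -e trace=write node demo.mjs` on Linux):

    // fs.writeSync on fd 1 is a thin wrapper that ends in the write(2)
    // syscall; strace shows the user-to-kernel transition described above.
    import { writeSync } from "node:fs";
    writeSync(1, "hello via write(2)\n"); // fd 1 = stdout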


I was lazier and just did "GET / HTTP/1.0\n" and saved one character :P

Edit: I am probably wrong about "1.0", might have been that I just did "GET /" and saved 8+ characters. I was just trying to make a funny remark about "Single line request" vs "Multi-line request"


The craziest thing is, if you’re using Ethernet, there’s a very high chance you have an actual physical data connection with the other computer on the other side of the world. A giant physical web blanketing the Earth.


Well, that's not new with computers or the internet. We had that ever since the analog telephone system. Even telegraphs before that for a smaller set of points and routes.


I imagine we're somewhere near peak density though, or maybe even past it. Wireless will slowly remove your fingers from lying on that physical web.


Really no. The RF world has nothing on fiber and copper for density.

There is almost unlimited desire for bandwidth in the access layer, and oversubscription ratios are still very, very high.

The transition to 800G ports and then post-800G is not even warmed up yet. You can run the entire country of Australia on a backbone of two dozen routers today because of the lack of endpoint access bandwidth.

We are at a place where bandwidth is once again plentiful in the core … not so much on the edge. Sort of like 2003 when oc768 created that situation.


Why isn't wireless physical? Perhaps it's not massive :)


In what sense? What's a physical data connection? There's a shit ton of routers and switches in between, it's not like there's an electrical connection.


You could run your finger across from my keyboard to yours, without a single break in the continuous physical connection between them. Connector clips included, of course. ;)


I understand that you are being poetic, but just in case someone reads this as fact: you are describing a dedicated circuit, which is what telephones used. The internet works on packet switching, so there are numerous little breaks between sender and receiver as your data is routed along a "connection".


No, I'm talking about the physical layer of the OSI model, and including the mechanical connections between those physical interfaces. You're talking about the link layer.

Unless your backbone/computer has a wireless hop, a literal, uninterrupted, physical chain of physical electronic devices, physically connected to one another with wires/cables, goes from my keyboard to yours. This is literal, not poetic. I'm not saying a galvanic connection. I'm saying a physical connection where, if nothing was bolted down, and high tension cables were used, I could pull and you would feel it.


AT&T had wireless microwave towers for phones and TV, so I imagine there was a period near the end of its life where some dial-up connections weren't physically connected:

https://99percentinvisible.org/article/vintage-skynet-atts-a...


Working for a Midwest dialup isp in the early 2000s, we definitely served some of our smaller POPs with PTP wireless backbones, thanks in part to vast expanses of flat land with fairly tall structures dotted throughout.


Yes, and if the comment implied a purely electrical connection, that is likely not the case either, as there are electrical-to-optical transitions (and vice versa) throughout.


I don't know, I alternate between thinking that's remarkable and not so much. I could run that connection via the power grid too, or the water supply (if we were in the same city).

Edit: I concede it's actually pretty remarkable. It is like the nervous system of the planet. Sure there's not a purely electrical continuity, but neuron synapses don't have an electrical nature either.


Are you sure? As far as I'm aware a single city might have its electricity network split in a few macro areas, and not really connected to each other. The Internet, on the other hand, needs all nodes to be connected somehow (and I bet the vast majority of connections are physical).



This is true but on a much larger scale than the city. For example, the United States has only three electrical grids (East, West, and Texas). Within those areas, not only are all electricity users in some sense connected, they are all /synchronous/, meaning that the 60hz coming out of your wall peaks and troughs at the same exact time as the 60hz at your cousin's house two states away.

The grids also have interconnections with one another, but they happen via DC to avoid needing to synchronize the whole continent.

In a literal sense, modernity is about using energy to coordinate human activity across vast distances at minute time scales. We normally think about this in terms of transportation, telegraphy, telephony, print, broadcasting, and the internet but it's also true of the motors in our washing machines and factories.


Somebody might be running wireless in there at some point. But I agree it's a neat thought.


Never thought of it this way. Right!


Similarly, every time you step onto a road, you're stepping on a contiguous strip of pavement that spans the entire continent


My lifetime of road trips is just a really inefficient flood fill


I often daydream about this one and wonder if there is continuous asphalt from me (Sweden) down to South East Asia or if there are any breaks.

When I search for answers right now, AH1 looks like a candidate, but no confirmation https://en.wikipedia.org/wiki/AH1


Even more remarkable with rail.


Incredible, really, isn't it?


Assuming artificial superintelligence is possible, imagine how its "experience" would be in contrast to ours: it would be orders of magnitude faster in both thinking and execution. We would be like some weird plants to it, plants that take a year just to move from one location to another.

Truly fascinating.


This was beautiful. Two things stood out to me:

* There's more stuff than I expected which exists only for backwards compatibility. That's a lot of bytes when you add it all up over the whole Internet.

* There's a lot of "expect N more bytes for a data structure of type T1" messages followed immediately by "expect N-k more bytes for a data structure of type T1::T2". I assume this is because there's other stuff that could go in that spot, but it still looks strange.

I'm sure all of this is necessary and important; I just found it really fascinating to peek under the covers. It's nice to be able to have it all (mostly) just work thanks to the tireless efforts of many engineers and protocol developers.


That kind of stuff is necessary for binary protocols to evolve in a compatible way. When a TLS 1.4 is defined, we need a window where clients and servers can still negotiate 1.3 until both have been upgraded. And 1.3 had to find ways to be compatible with 1.2, and so forth. Decades of that kind of evolution are guaranteed to leave some marks in the protocol.

But let's keep a sense of proportion. I find it hard to worry about "wasting" maybe 0.01% of total internet bandwidth on a few extra bytes here and there when that's necessary to keep the internet working at all, while on the other hand, for no end-user benefit, we don't hesitate to waste maybe 15% (after compression) by insisting on using text-based formats for the payload in all our web standards.

In fact, I'd love to see a back-of-the-envelope calculation of how many tons of CO2 would have been saved in total if HTML was a well-engineered binary format. (Including bandwidth, storage, parsing on the client etc.) The number must be insane.


The thing about html is that it's a verbose text format on the surface, but it compresses incredibly easily, and support for gzip is widespread. If you're concerned about size, brotli is better again.


> The thing about html is that it's a verbose text format on the surface, but it compresses incredibly easily, and support for gzip is widespread. If you're concerned about size, brotli is better again.

That's why I said it wastes 15%, not 100%. Whenever text-based formats and binary formats are compared, the results after compression (of both) are usually in that range. You can debate those precise numbers, they might be lower for the brotli/HTML combination. That won't really change the point I was making in context: That a few extra bytes for backwards compatibility in the TLS handshake pale in comparison to the amount of waste we accept for encoding the payload.


I dare say that the human readability which accounts for that extra size allowed many people to learn the basics of how the internet works, namely HTML and its compatriots. Mankind would be poorer without this enriching experience, even if it resides mostly in the past, and for me personally 15% (or 100%) extra could easily justify it.

Plus, add the ease of debugging it (before you bring in some JavaScript monstrosity that hacks around it and will probably stop working weeks after its paid support ends).


A better example is HTTP, which is text, but read by humans much less frequently.


Only HTTP/0.9, HTTP/1.0 and HTTP/1.1 are text in the sense you probably mean.

HTTP/2 and HTTP/3 are binary formats, they are semantically very similar to the older formats in some sense, but you would benefit from more tools to examine them properly because human readability was not the priority.


You’re saying that compressed HTML, because it is text, wastes 15% (could be however much, the number doesn’t really matter) over some binary format that would express the same content?

What binary format would/could that be? I’m just not seeing how (if it expresses the same content) it could be smaller than compressed HTML.


They're comparing compressed text vs compressed binary, apples to apples. While text compresses amazingly well, binaries aren't 100% entropic themselves. They usually also benefit from compression. For example, the de facto "executable" format for the Java runtime is a compressed archive (a jar file).
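A rough experiment along those lines that anyone can run (a Node sketch; the data and the binary layout are invented, so treat the numbers as illustrative only):

    // gzip the same records as JSON text vs. a naive fixed-width binary.
    import { gzipSync } from "node:zlib";

    const records = Array.from({ length: 10_000 }, (_, i) => ({ id: i, temp: 20 + (i % 10) }));

    const json = Buffer.from(JSON.stringify(records));

    const bin = Buffer.alloc(records.length * 8); // two uint32s per record
    records.forEach((r, i) => {
      bin.writeUInt32LE(r.id, i * 8);
      bin.writeUInt32LE(r.temp, i * 8 + 4);
    });

    console.log("JSON:", json.length, "gzipped:", gzipSync(json).length);
    console.log("binary:", bin.length, "gzipped:", gzipSync(bin).length);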


The compression doesn't help memory usage or cache locality on the receiving end. HTTP is terrible, and JSON and other web technologies are terrible. The amount of performance left on the table by code written with no concept of efficiency is enormous.

A terrible waste.


The thing is that you need additional CPU power to compress and decompress that data. Maybe that's not much by today's standards, but accumulated it could be a significant number.


The alternatives are not free either, and would probably accumulate different costs as well.


>In fact, I'd love to see a back-of-the-envelope calculation of how many tons of CO2 would have been saved in total if

• we didn't include massive JS libraries that are not truly necessary

• we didn't track users' every move and report that data back


> That kind of stuff is necessary for binary protocols to evolve in a compatible way. When a TLS 1.4 is defined, we need a window where clients and servers can still negotiate 1.3 until both have been upgraded. And 1.3 had to find ways to be compatible with 1.2, and so forth. Decades of that kind of evolution are guaranteed to leave some marks in the protocol.

It's necessary because people are incompetent, and because overall the market rewards them for incompetence. From the outset, TLS provided a trivial version negotiation mechanism, but it was easier to ignore it and write incompatible garbage, especially for so-called "middleboxes" often sold as a drop-in security "solution" for businesses.

So when it came time to ship TLS 1.1, it was soon discovered that in practice you can't just say "Hi, I speak TLS 1.1" which would be a couple of bytes - that won't work, you need to find some other way to quietly signal to competent people that you know the newer protocol. So they did, slightly weakening the security in the process, and this continued into TLS 1.2 where browsers began doing "Fallback" which was a risky but sadly necessary process where you give up attempting the new protocol altogether sometimes, thus opening yourself up to supposedly obsolete attacks.

By TLS 1.3 things had become so bad that TLS 1.3 essentially begins, as you can see if you inspect the data shown on that page, by pretending we're speaking TLS 1.2 and then saying we want to negotiate an optional "extension" to TLS 1.2 which is where we confess we actually speak TLS 1.3.

Every single packet of TLS 1.3 encrypted data is also wrapped in a TLS 1.2 layer saying "Don't mind me, I'm just application data". Why? Because we can't confess we're not speaking TLS 1.2, ever, and if we said we were doing TLS 1.2 crypto system stuff the same incompetent garbage software would try to get involved because it "understands" (badly) how to speak TLS 1.2, so we just pretend it missed the negotiation phase, this is just application data, nothing to see.

And it works. That's crucial. It's why we did all this, and yet it reveals that because the products people bought were developed incompetently they wouldn't even have detected serious attacks anyway, let alone prevented them. Need to sneak 40GB of stolen financial data over a network "protected" by this Genuine Marketing Leading Brand Next Generation Firewall? Don't worry, just label it "application data" with no explanation and it'll be completely ignored.

If you've ever watched a Lock Picking Lawyer video on Youtube, it was like one of the ones where it's several minutes so you expect it'll be hard to pick, but then you find out he's actually so disgusted by the lacklustre security of this $150 "Pick Proof High Security Lock" that although he rakes it open in 2 seconds with a cheap tool, and then shims it open with a discarded Redbull can, and then knocks it open with a hammer, and then uses a purpose built bypass tool to open it instantly in a single flowing motion, he also takes time to disassemble it and show you that the manufacturer fucked up, wasting material solving a non-existent problem and in the process making the lock much worse, which is why the video was so long.

Learning from their experience with these "Security" products for TLS 1.3, the QUIC people designed QUIC specifically with the intent that you can't even tell what version it is unless you're the client or the server, and then they shipped a new QUIC version to check that works even though they don't really need one yet, so that they don't have to do this whole dance again every few years.


While I mostly agree, I wouldn’t attribute buying middleboxes (or building shitty middleboxes) to incompetence. This is about doing the minimum viable effort for compliance, not about actual security, and everyone who bought or sold these boxes knew that.


In my opinion we should consider doing things just for compliance without any real benefit as incompetence. I know this isn’t the reality we live in at the moment however. Way too many things are just box-checking and I find that frustrating.


Is not having the box checked a ‘real benefit’?


From what I understand of the parent's comment, there wasn't even the minimum viable effort for compliance, because the version checking system doesn't comply with the TLS standard.


I think _hl_ is thinking of compliance in terms of compliance with a government regulation, or internal company policy, rather than compliance with the TLS standard.

Large organisations may even have a department named compliance whose job is to ensure the whole company obey certain policies, or at least in most cases, that they avoid being fined / prosecuted / shut down for not complying.

In practice this shades over, Compliance may include the 3rd party contractor who wanders into a C suite meeting, wearing somebody else's badge to point out that er, company security policy isn't being obeyed, you there, Bob, this badge I'm wearing is your badge, why are you in this meeting without your badge and why is your badge, now around my neck, still working even though you lost it last week?

But it will also include Sarah, who would really rather be playing Solitaire, and is implementing the policy that everybody needs to check this box which says "I have no outside interests in violation of Federal Rule 1234.567". What is Federal Rule 1234.567 ? Sarah doesn't know and doesn't care. She also has no clue what you should do if you in fact do have outside interests which violate this rule, she just wants you to check the box.

When it turns out the company has over 800 staff in violation of 1234.567 the lawyers will insist "But we checked none of them were violating the rule" and so the company didn't break the rules.

People like Sarah caused the installation of the shitty systems that, you're right, do not comply with the TLS specification. But it's not really Sarah's fault; people like her are always going to exist. We should make it harder to get this wrong than to get it right, so that laziness results in success.


I agree it's probably not Sarah's fault, but certainly the company would be at fault for incorrectly implementing the TLS standard. Especially if they advertised TLS support to their customers and users.


> In fact, I'd love to see a back-of-the-envelope calculation of how many tons of CO2 would have been saved in total if HTML was a well-engineered binary format.

Far, far less than would be saved if HTML was still expected to be human-readable. Things like React (server-side and client-side: think of all those poor <div>s!) waste far more resources than decompressing and parsing HTML.


Don't forget to add a few billion dollars saved because of how easy it is to debug HTML, and how easy it is for people to start. You press Ctrl+U and see the source. There were not many platforms where you could just view the underlying instructions... when I grew up, some 8-bit machines had cartridges to hack, but let's face it, 6502 or Z80 assembly is nowhere near as friendly as HTML.


Ctrl+U could just make the displayed representation equally easy to read. There are many binary protocols/formats that are a breeze to work with if you have the right tooling. You never see the 0xa 0x8 0xf. Your tools will either show you what the bytes represent or that there was a parsing error. Those parsing errors would be rare in the average case, just like debugging strange Unicode issues in HTML today.


Exactly; when you click the padlock icon, the browser shows you the parsed representation of the X.509 certificate chain, not the ASN.1 bytes.


At that point, how big is the win from a binary protocol versus compression? Wouldn't a binary version just be a mapping between HTML tags and a bit representation, which a Huffman compression dictionary can recreate on the fly, but better?


There are a fair number of benchmarks out there showing the difference between compressed JSON and things like protobuf, msgpack, and cbor for HTTP APIs. It's a decent stand-in for this, so it can give you a general idea of the scale.

It's also something you can easily build and test for yourself today. The compression ratio for that test really depends on the content of the JSON, but that would be equally true for some theoretical binary HTML. From my real-world experience, msgpack came out slightly ahead of compressed JSON, with a non-negligible CPU advantage too.

These differences really are extremely minor. But when the tooling is designed well and is easy to use, these minor performance wins are free: something you never notice or have to deal with directly.


My argument has not been well made -- nor was effort made to understand it -- but the textual nature of HTML extends much further than seeing it; it's also about editing HTML with whatever you want. You can slap together a webpage in Notepad.


Most binary formats are easier, not harder, to parse.

It’s sad. An entire generation or two that doesn’t know the basics.


Most text formats are easier to parse but in incorrect ways.


> In fact, I'd love to see a back-of-the-envelope calculation of how many tons of CO2 would have been saved in total if HTML was a well-engineered binary format.

I wonder if it is feasible to create something like that. Because, a binary format requires specialized tooling, which needs to be created and maintained.

But maybe, with further adoption of WebAssembly, HTML will be less important in 10 - 15 years. :)


You wonder if it's feasible to create what, a binary format?


I wonder if there is a significant CO2 reduction from binary over plaintext once you figure in the additional work that a binary format requires. As such, I wonder if it is feasible to create such a report, as it will never be able to factor everything in.

You assume that there will be a significant reduction. I assume that we don’t know for sure.


If HTTP or HTML were binary formats, thousands of embryonic engineers would not have been able to learn them by hacking code into telnet or Notepad, and the web would now be even more concentrated in the hands of the few players who are busily engaged in fucking it to death.


TLV encoding. It can be a pain to implement correctly.


Since I didn't know the acronym: https://en.wikipedia.org/wiki/Type%E2%80%93length%E2%80%93va...

In fact, a key part of the subtls library is a simple class to make this relatively painless (it's currently just called `Bytes`, but perhaps something like `TLVCoder` would be more informative).

It reserves space for the length field, and provides a callback to write it automatically once you're done writing the value: https://github.com/jawj/subtls/tree/main#navigating-the-code. It also enables the indented and annotated output options.
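A sketch of that reserve-then-backfill idea in TypeScript (not subtls's actual API, just the shape of it):

    // Write a placeholder length, write the value, then backfill the length.
    class TLVWriter {
      buf = new Uint8Array(1024);
      pos = 0;

      writeUint8(b: number) { this.buf[this.pos++] = b; }

      // Reserves 2 length bytes; returns a callback that fills them in
      // with the number of bytes written since the reservation.
      reserveLength(): () => void {
        const lenPos = this.pos;
        this.pos += 2;
        return () => {
          const len = this.pos - lenPos - 2;
          this.buf[lenPos] = len >>> 8;       // big-endian uint16
          this.buf[lenPos + 1] = len & 0xff;
        };
      }
    }

    const w = new TLVWriter();
    w.writeUint8(0x16);            // type byte
    const end = w.reserveLength(); // length placeholder
    w.writeUint8(0x01);            // ...value bytes...
    end();                         // backfill the real length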


Not trying to diminish its importance, but isn't it slightly sad how much of the total payload is devoted to encryption?

Also, and I know this is slightly pedantic, but this is only speaking to layer 7. If you really wanted to watch everything byte by byte, you'd need everything upwards of layer 2, where we go from bits on the wire to bytes and frames. If you've not used Wireshark before, run it yourself and load some pages.


I don't think much is wasted on encryption. Most is wasted on backwards compatibility in the form of negotiating protocols, algorithms, extensions, etc.

If there were forever only one encryption algorithm in use, we wouldn't need most of this stuff.


> If there were forever only one encryption algorithm in use, we wouldn't need most of this stuff.

Coincidentally, this is why a protocol like WireGuard is much simpler than OpenVPN. It's a clean-sheet design, so its developers had the luxury to design it with hardcoded modern crypto primitives without any revision or extension. Protocol negotiation is skipped entirely, since it's often error-prone and susceptible to attacks...


How will we go to newer Wireguard versions then? Throw it all away and use a wholly new product? Or have a Wireguard 2.0, where all participants need to upgrade in lockstep, with no backwards compatibility (actually sounds attractive: some churn from time to time, but none of the avoidable headaches)?


> Or have a Wireguard 2.0, where all participants need to upgrade in lockstep, with no backwards compatibility (actually sounds attractive: some churn from time to time, but none of the avoidable headaches)?

I think it's basically the plan.

Even with protocol negotiation, replacing an existing crypto algorithm would be a breaking change anyway. Furthermore, it should be possible to preserve backward compatibility using a reserved field, without heavyweight, full-fledged protocol negotiation. Entering a different code path if the message is tagged "version 2" should be enough.
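In miniature (the wire format here is invented purely for illustration):

    // Dispatch on a version byte / reserved field, then branch.
    function parseV1(body: Uint8Array) { /* existing v1 handling */ }
    function parseV2(body: Uint8Array) { /* new code path, same outer framing */ }

    function parseMessage(buf: Uint8Array) {
      switch (buf[0]) { // first byte: version tag
        case 1: return parseV1(buf.subarray(1));
        case 2: return parseV2(buf.subarray(1));
        default: throw new Error(`unknown version ${buf[0]}`);
      }
    }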


Yeah, I'm pretty sure this is not the byte stream as seen by the ethernet port or Wifi/Cellular antenna.

So it's quite a misleading title if the reader takes 'byte by byte' literally.


As Computer Science undergrads we just finished our courses on Computer Networks in Depth and Cryptography, and it feels really good to be able to at least know what is going on in every output step of the page. Amazing demonstration for beginners! Thank you for sharing.

btw, I was thinking that if we had a visual interface with the request details (as in: which local switch, router, AS, DNS server or host the request is at), it might be even more intuitive and easy to learn/understand. I don't think the exact position of a packet can be traced that easily on a web request, but if it can, something like this probably already exists out there somewhere.


This brings back memories of when I created a web-based version of Spotify[1], using their (custom) encrypted protocol over TCP[2].

Notably: back then I used Flash to speak TCP, with a memory-leaky and bug-ridden mechanism to transfer byte buffers between JS and Flash.

[1] https://news.ycombinator.com/item?id=2556118 [2] https://github.com/EmielM/spotifyontheweb-crypt


How did you reverse engineer a custom encrypted protocol? I wouldn't even know where to begin.


This is a really beautiful way to see something we take for granted all the time. The fact that computers can do all this in the blink of an eye (sans network speed) is just mind blowing.


One kinda related thing I find constantly amazing is that your processor running at x GHz will have had several more ticks in the time it takes the photons to go from your screen to your eyes.

(Light travels at about 30cm/1ft per nanosecond)


In game development we can abuse this fact with things like client-side prediction in authoritative online environments. The server plays along with the client in a no-graphics mode, replaying the inputs from the client and validating those moves; if all goes well, it sends back the response, allowing for client action all before the blink of an eye. In the case of an invalid move, the server yanks the client back to the preceding state. Pair that with some lag compensation and you've got yourself a stew.


What kind of latency budget do you have to do all that? Even at like 15 fps I can't imagine you can provide a non-janky experience when this "server yanks client back to valid state" move is done. I mean, obviously you guys have figured it out; the math just doesn't quite make sense to someone like me who's not in gaming.


Not an expert but:

The lag compensation is, more intuitively, just allowing the client to move on its own without needing server verification yet. The server maintains the authoritative game state, but (assuming no cheating is going on) the clients are able to move as they think they will. In the background, the client is actually sending its inputs to the server, and they technically haven't happened until the server verifies them (however many milliseconds later), but we allow the client to move on its own regardless.

The yanking back only happens if the client's new state (position etc.) doesn't match the server's state.
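A toy TypeScript sketch of that loop (all names invented): apply inputs immediately, buffer them, and when an authoritative state arrives, drop the acknowledged inputs and replay the rest.

    type Input = { seq: number; dx: number };
    type State = { x: number; lastSeq: number };

    const pending: Input[] = [];
    let predicted: State = { x: 0, lastSeq: 0 };

    const applyInput = (s: State, i: Input): State => ({ x: s.x + i.dx, lastSeq: i.seq });

    function onLocalInput(i: Input) {
      pending.push(i);
      predicted = applyInput(predicted, i); // move now; don't wait for the server
    }

    function onServerState(auth: State) {
      // Drop inputs the server has already processed...
      while (pending.length && pending[0].seq <= auth.lastSeq) pending.shift();
      // ...then re-predict from the authoritative state. The "yank" is only
      // visible if this replayed result differs from the old prediction.
      predicted = pending.reduce(applyInput, auth);
    }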


Think of it like the NTP protocol that machines use to sync wall time. The idea is that there's an authoritative source of data, in this case the local time on the server. Your machine also does its best to independently track and maintain the current time. But it may be fast or slow, or experience some other issue that causes it to drift. And the longer it's left by itself without syncing to an NTP server, the larger that potential drift.

So the server and the client are both independently calculating the state. But whatever the server calculates is authoritative, and the client's version is just a local simulation that is very accurate, but not 100%. Especially when it comes to other players, where your client is typically just executing under simplistic assumptions of things like constant forward movement.

However, luckily for the client, it receives a data sync from the authoritative server several times a second, so all that potential drift gets quickly corrected while the differences are still extremely small. For your own actions, where your client had access to all your input, the differences are probably even too small to notice. For other players though, it will be more noticeable, especially if there's an abrupt change in input.


Do you have more detailed links about that?



The Valve paper posted is amazing but huge shoutout to Photon (not affiliated) as their docs break it down really well. Their Bolt and now Fusion offerings are a boon!

https://doc.photonengine.com/fusion/current/tutorials/host-m...


My understanding is that this puts a fundamental limit on how fast CPU clocks can be. If the CPU die has a diameter of 3 centimeters, and the clock speed is a hypothetical 20 GHz, light (and electrical signals) can traverse at most half the size of the die in the course of one clock cycle. It's hard to imagine how such a CPU could work, since components at opposite sides of the die cannot communicate within a cycle.
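The arithmetic checks out (a one-liner to convince yourself):

    const c = 3e8;          // m/s, speed of light in vacuum
    const cycle = 1 / 20e9; // s, one clock at a hypothetical 20 GHz
    console.log(c * cycle * 1000, "mm per cycle"); // 15 mm: half of a 3 cm die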


Definitely nitpicking here, but for a modern pipelined CPU there's already lots of components of the die that aren't designed to be able to exchange data within a single clock. I don't see what would be fundamentally different about laying out a CPU die such that operations that need to happen in the same clock are physically nearby, and it seems like it would (at worst) have similar properties to multi-socket and pipelined CPUs today where cache coherency operations and branch mispredictions have multi-cycle penalties associated with them. I'm pretty sure that multi-socket CPUs can't talk in less than one clock cycle anyway.


I only had an intro course in CPU architecture, but AFAIK pipelining depends on parts of the pipeline running on the same clock. I'd say CPU manufacturers already sidestepped this limitation by using multicore dies.


That just means you can't have a single simple clock source fan out in a simple way across the entire chip, but that kind of design performs badly anyway.

There are lots of ways to have multiple clock sources kept in sync or to alter the clock's phase to compensate for distance, so you can keep the entire CPU synchronized even if crossing it takes multiple clock cycles.


No, it doesn't. And that's putting aside that there is an entire field of clockless designs, asynchronous logic, free-running clocks, etc.

You should think of this as: "when the CPU is processing an instruction, what does it need to track to dispatch and retire it?" Designing to a clock tick is one version, and kind of where things were in the 1970s. It's an educational model.


That's not nitpicking at all. The argument about light-nanoseconds crossing a chip is very common and completely wrong.


Was going to say the same. My mind was fairly blown when I realized that light only travels approx 4 inches in a clock cycle.

Then again, every time my computer pauses for multiple seconds, knowing how many billions of clock cycles I'm waiting for doesn't help my already challenged patience… haha.


Not to mention eons required for our optical organics to start registering the photons!


The race is about to begin. I wait for the screen to flash green, finger hovering over the enter key. It flashes and I press the button immediately, and the race is over.

In this time, my year-old consumer graphics card has performed more floating-point calculations than my entire town could in their entire lives.

Occasionally I work this stuff out and it's frankly hard or impossible to truly appreciate just how fast these things are.

(36 TFlops, 250ms reaction time = 9e12 calculations. One calculation every four seconds from birth to age 100 for 10,000 people is about 8e12 in total)


> The race is about to begin. I wait for the screen to flash green, finger hovering over the enter key. It flashes and I press the button immediately, and the race is over.

And you missed your window to get into BIOS, you need to reboot and try again. /s


I'm fairly sure this is more than one person's recurring nightmare.


No lie, my eye twitched.


And once you understand how fast modern computers are, you realize how absurd it is when software is still slow. How is it that Microsoft Teams takes 15 seconds to load? How many billions of calculations does it take to put a list of names on the screen?

Never forget what they took from you.


Although it can be laziness/incompetence wasting cycles, sometimes the problem is that only the CPU got so much faster, and only in certain ways.

If Teams talks to a remote server, it doesn't matter how fast your CPU is; it takes time for the question to get to the server and time for the answer to get back. Now yes, maybe they write code which goes:

Hi! / Yes? / I'm a Teams client / That's nice / And you? / I'm the Teams server, what username? / SamBankmanFried / OK, and password? / Hunter2 / OK, you're logged in SamBankmanFried / OK do I have new #catpics ? / Nope / How about #emergency ? / Nope / How about #general ? / Yes, six messages ...

And that's a lot of round trips whereas they could have done:

Hi, I'm a Teams client, hopefully you're a Teams server / Yes, I'm the Teams server, what username & password ? / SamBankmanFried password Hunter2, also summarise my new stuff / Hello SamBankmanFried, you have six #general messages ...

Which is far fewer round trips.
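The same contrast in code (a sketch; the api() helper and endpoints are invented): each sequential await pays a full round trip, while the batched call pays one.

    declare function api(path: string, body?: unknown): Promise<unknown>;
    declare const user: string, password: string;

    // Chatty: four round trips
    await api("/login", { user, password });
    await api("/channel/catpics/unread");
    await api("/channel/emergency/unread");
    await api("/channel/general/unread");

    // Batched: one round trip
    await api("/login-and-summarise", { user, password });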


And then you factor in the occurrence of different systems running at different clock speeds having to interchange across several interfaces in a way that the entire communication somehow stays afloat and even produces feedback.


Very cool! Reminds me of The Illustrated TLS Connection, which was a helpful resource for me when I had to implement a variation of it.

https://tls12.xargs.org/


> https://tls12.xargs.org/

in the first paragraph on the page there is this link:

> It’s inspired by [The Illustrated TLS 1.3 Connection](https://tls13.xargs.org/) and Julia Evans’ [toy TLS 1.3](https://jvns.ca/blog/2022/03/23/a-toy-version-of-tls/).


Great resource, thanks for sharing.


Absurdly fascinating. I should probably stop reloading the site now!

Also, is there a reason why on HN this domain just shows itself as "pages.dev"? That's the root of Cloudflare Pages, so it's probably best to also show the subdomain, like HN does for github.io.


We have to specifically mark a domain for sharding like that. I've done so for pages.dev now. Thanks!


I wonder if it's an out-of-date public suffix list? That's how I'd decide when to show a subdomain. pages.dev was added a few years back:

https://github.com/publicsuffix/list/pull/1093


The people responsible for the PSL existing would like everybody to stop inventing more uses for it. That is, this is clearly a terrible idea but it is better than nothing, so here it is, but please invent something that's not a terrible idea.


Are you sure? Looking at their website[1] I see:

> Highlight the most important part of a domain name in the user interface

Which is my suggestion above.

> If you are using it for something else, you are encouraged to tell us, because it helps us to assess the potential impact of changes

Which sounds cautiously supportive of additional use cases.

[1] https://publicsuffix.org/


Ryan Sleevi has written about this before on Hacker News and here's his list https://github.com/sleevi/psl-problems

It's definitely possible that Ryan would consider using this for HN a reasonable choice, because it's mostly cosmetic, but in general you should just not add more dependencies.


Thanks, that's an interesting read. It feels pessimistic to me (understandable given all the problems) and I'm not sure it's really offering viable alternatives.

For cosmetics, the PSL feels like a close-enough solution, and it's available today. At the very least, it looks like a good way to bootstrap a list used to decide on cosmetic styling, with manual amendments where the PSL has gaps or unneeded entries. Cross-thread, it looks like dang is building such a list, just without the initial PSL bootstrap to get it started. I wonder if there's anything on the PSL we wouldn't want on the HN list?


Very good! This would be very helpful for students in every "intro to networks" class on computer science courses, as a way to visualise something that can seem very abstract.


The Cisco Packet Tracer software is also good for this, as is Wireshark.


I couldn't quite follow the symmetric key derivation.

First there's something called "handshake key computations", which generates a full set of encryption and mac keys. It seems to be entirely based on secp256r1 (NIST P-256) key share. I understand that this part is not protected against man-in-the-middle, right?

Then there's "application key computations", what are those values computed from? This process creates another complete set of encryption and authentication keys, and I assume this set is protected against man-in-the-middle.


You might find this useful (it was a helpful source when implementing): https://tls13.xargs.org/. Both sets of keys should be MITM-proof.
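For a feel of the machinery, both sets are derived with HKDF; here's a heavily simplified WebCrypto sketch of one derivation step (real TLS 1.3 uses HKDF-Expand-Label with specific labels and transcript hashes at each stage of the key schedule):

    async function hkdf(secret: Uint8Array, salt: Uint8Array, info: Uint8Array): Promise<Uint8Array> {
      const key = await crypto.subtle.importKey("raw", secret, "HKDF", false, ["deriveBits"]);
      const bits = await crypto.subtle.deriveBits(
        { name: "HKDF", hash: "SHA-256", salt, info },
        key,
        256, // bits of output: e.g. key material or a further secret
      );
      return new Uint8Array(bits);
    }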


Thanks for the link.

But I believe that the first set of keys/IVs is not protected whatsoever. Anyone can run the "Server Key Exchange Generation", as it doesn't require any secrets such as certificate private keys, only random inputs.

EDIT: Specifically, at least up to this point anyone can impersonate the server.


Yes. Maybe it's easier to understand what's going on once you see the strategy in terms of a conversation. Suppose I'm Alice and I want to securely talk to Bob with TLS 1.3.

The key exchange (KEX) allows me to agree a secret S with somebody I'm talking to over the network. I don't know if they're Bob, but whoever they are, eavesdroppers can't discover this secret.

Now, we (Alice, and the maybe Bob) can communicate securely with our secret S, nobody can MITM this conversation, they don't know S.

Bob sends public documents (a certificate) which say that Bob knows some private key whose signatures I can verify with a published public key mentioned in the certificate. That's no big deal; they're public documents, anybody could have those, but then...

Bob also sends a transcript. The transcript is of our KEX earlier, and it's signed using Bob's private key, which means I, Alice, can verify this is a signed transcript of our conversation. I was in the conversation, so I can confirm it's identical.

Having verified this signed transcript, I know that Bob was in the conversation I had earlier, with the unknown maybe-Bob, and it stands to reason that's because this is Bob.

Notice that if Mallory, a hypothetical MITM, tried to attack us, this won't work. The conversation between me (Alice) and Mallory must involve parameters that Mallory chose, if Mallory just forwards Bob's parameters to Alice and vice versa, Mallory doesn't learn S and so is cut out of the picture. But, if Mallory changes the parameters Alice and Bob see, then the transcript that Bob signed won't match the KEX that Alice experienced, so the connection is aborted.
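In WebCrypto terms, Alice's final check is essentially a signature verification over her own record of the handshake. A simplified sketch (real TLS 1.3 signs a context string plus a transcript hash, and the algorithm comes from the certificate):

    async function verifyTranscript(
      serverPubKey: CryptoKey,  // from Bob's certificate
      transcript: Uint8Array,   // Alice's own record of the handshake
      signature: Uint8Array,    // from Bob's CertificateVerify message
    ): Promise<boolean> {
      return crypto.subtle.verify(
        { name: "ECDSA", hash: "SHA-256" }, // assuming an ECDSA cert key
        serverPubKey,
        signature,
        transcript,
      );
    }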


This makes a lot of sense.

Slightly stupid question but maybe you know the exact reason: could we apply symmetric encryption only after Alice has determined that maybe-Bob is the real Bob? That is only after Bob sends the certs and the signed transcript of key exchange? I'm calling this a stupid question because I see no real advantage, but I wonder if there are clear disadvantages.


If the message from Bob, proving his identity, was sent in the clear, we're needlessly telling people that Alice is talking to Bob.

In fact, earlier TLS versions did send the certificates in the clear, which is a lesser version of this problem; TLS 1.3 no longer does that, and everything after "Hello" is encrypted.

IETF Best Common Practice 188 (RFC 7258), "Pervasive Monitoring is an Attack", says that for Internet protocols we should assume that surveillance is inherently an attack on the Internet and pro-actively defeat it by designing protocols not to leak information unnecessarily.

Now, in this specific case, the Hello message from Alice likely has SNI (Server Name Indication) so probably Alice is saying "Hi Bob" at the start in the clear, but (a) Not necessarily, in some cases this won't happen, so no need to tell us it's Bob later (b) There is ongoing work to fix that, Encrypted Client Hello, which will leverage DNS to give Alice keys she can use to make it harder to know, so there's no reason to make it worse.


The subtls hack is an amazing idea. I've been wanting to build a TLS Playground for a while (play around with CAs, cert issuance, client certs etc) and this would be great for it.


Subtls author and OP here. That sounds great — feel free to ping me if useful! (Assemble these into an email address: george mackerron com).


It would be good to be able to point this at any domain; a common issue when debugging cert stuff is working out which cert is being presented, and which TLS level, cipher suites, etc. are being negotiated.


That’s true, but I was wary of running what would essentially be an open TCP relay. Also, I don’t currently bundle a comprehensive set of root certs in subtls, nor support all ciphers or signing methods.


https://github.com/jawj/subtls (A proof-of-concept TypeScript TLS 1.3 client) is implemented with the SubtleCrypto API.

TIL about SubtleCrypto https://developer.mozilla.org/en-US/docs/Web/API/SubtleCrypt... :

> The SubtleCrypto interface of the Web Crypto API provides a number of low-level cryptographic functions. Access to the features of SubtleCrypto is obtained through the subtle property of the Crypto object you get from the crypto property.

  decrypt()
  deriveBits()
  deriveKey()
  digest()
  encrypt()
  exportKey()
  generateKey()
  importKey()
  sign()
  unwrapKey()
  verify()
  wrapKey()
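For instance, a minimal digest call (runnable in a modern browser console or recent Node, where crypto is a global):

    const data = new TextEncoder().encode("hello");
    const digest = await crypto.subtle.digest("SHA-256", data);
    const hex = [...new Uint8Array(digest)]
      .map((b) => b.toString(16).padStart(2, "0"))
      .join("");
    console.log(hex); // 2cf24dba5fb0a30e... (SHA-256 of "hello")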
Can SubtleCrypto accelerate any of the W3C Verifiable Credential Data Integrity 1.0 APIs? vc-data-integrity: https://w3c.github.io/vc-data-integrity/ ctrl-f "signature suite"

> ISSUE: Avoid signature format proliferation by using text-based suite value The pattern that Data Integrity Signatures use presently leads to a proliferation in signature types and JSON-LD Contexts. This proliferation can be avoided without any loss of the security characteristics of tightly binding a cryptography suite version to one or more acceptable public keys. The following signature suites are currently being contemplated: eddsa-2022, nist-ecdsa-2022, koblitz-ecdsa-2022, rsa-2022, pgp-2022, bbs-2022, eascdsa-2022, ibsa-2022, and jws-2022.

But what about "Kyber, NTRU, {FIPS-140-3}? [TLS1.4/2.0?]" i.e. PQ Post-Quantum signature suites? Why don't those need to be URIs, too?


Very nice, thank you.

There is an older document (2009) which I liked very much and which taught me the details of an HTTPS connection (in a quite leisurely way): http://www.moserware.com/2009/06/first-few-milliseconds-of-h...


This is great, would be useful to be able to point it at any domain.


For those looking for in-depth info on TLS, see also https://www.feistyduck.com/books/bulletproof-tls-and-pki/


I remember back in the day some software companies typically asked a question like "What happens (under the hood) when you type a URL into your web browser and hit enter?", and you'd have to walk them through every step you could remember. This takes it to the next level.


That’s pretty neat. I’m all for things that make these systems more visual.

I do chuckle that, for all of the advancements we’ve made with the web and 3D graphics, we’re still mostly worried about stringing together little squiggly characters and writing test cases or print statements to see what’s actually going on.


Since they add annotations to what is actually loaded, shouldn't the loading continue forever?


This site is fetched twice: once by your browser, and another time by the JS code after you press the button. For example, the JS loader specifies that only one cipher suite is to be used (TLS_AES_128_GCM_SHA256), while your browser will definitely support more than one. Any additional traffic is again fetched by your browser.


If the annotations were added by the server, yes. Not if they are added client side.


As a network guy I am blissed out by this, thank you for sharing. Often when I study the concepts I repeatedly have that same shock of how much is going on under the hood across large distances at tremendous speed.


Ugh, TLS certificates are such flaming garbage. There is zero reason for them to be in a machine-first format like ASN.1. Here we are with billions of computers spewing human-first formats like JSON at each other, and we can't even use them for human-only info like TLS certificate fields.

But of course the complete incomprehensibility of TLS certs is a feature, not a bug: creating and signing your own is so difficult and arcane that it generates business for certificate vendors, even for certs for private use.


> even for certs for private use.

This has been a concern of mine. Paired with incompatibility in clients[1] it's easy to mess up.

I'm building a service[2] and related tooling that makes it easier to use public CAs for private use. I'm in early stages, but feedback has been good so far.

[1] https://alexsci.com/blog/name-non-constraint/

[2] https://alexsci.com/blog/improve-https-1/


Revealing my non-nerd credentials: when I click "Get this page, byte by byte" I just see:

> We begin the TLS handshake by sending a client hello message (source):

and nothing else seems to happen, no matter how long I wait. I thought I might have to click the button again, but it just seems to display the same message, no matter how often I click it. What else should I be doing?


Huh, that’s odd. Are you able to open your browser dev tools and check for JS errors in the console?


I tried in Firefox, Safari, and Chrome. Here it is from Chrome:

> index.js:2231 WebSocket connection to 'wss://ws.manipulexity.com/v1?address=subtls.pages.dev:443' failed:

> ws @ index.js:2231

> index.js:2235 ws error: Event [I didn't expand this, but let me know if you need it]

> connection closed index.js:2238


Hmm. Could you be behind some kind of interfering proxy that doesn't like WebSockets? Does this demo page work for you? https://libwebsockets.org/testserver/


> Hmm. Could you be behind some kind of interfering proxy that doesn't like WebSockets?

I'm at work, so it's possible, but I'm not sure.

> Does this demo page work for you? https://libwebsockets.org/testserver/

It shows me a rapidly incrementing number, that resets when I click "reset counter". Nothing obviously happens when I click "send junk". Is that what should happen?


> Hmm. Could you be behind some kind of interfering proxy that doesn't like WebSockets?

I tried again from home, and it worked fine, so that must be it.


Great!


This reminds me of when I first discovered Wireshark (called Ethereal back then). It's always a good day when I have to break it out and still fun. I would try to reverse engineer the protocols. It's nice to see the annotations here as a lot of it doesn't make much sense without it.


I'm annoyed TLS 1.2 compatibility mode is still required. Is there a way to turn it off, accepting that people with incompetent network administrators can't access your website, or have shitty middleboxes forever ruined TLS?


The problem is that you do not know where the middleboxes are. Removing support might therefore block users in weird and silent ways. The middlebox might be outside the user's control, and then nothing can be done to solve it. Blocking TLS 1.2 on a small blog is no big deal, but it's not something FAANG can risk.

One way of solving this is by circumventing the middleboxes. This is more or less the way QUIC works. QUIC uses UDP, which avoids most middleboxes and their failed attempts at being helpful. So QUIC can upgrade to a new version of TLS without middleboxes ruining it, while TLS over TCP is stuck (at least for the moment). This brings with it other issues, but hey, one problem solved at least.


If TLS 1.3 can't make it through, I doubt QUIC will. I'm no Fortune 100 sysadmin, but if I were that worried about network traffic, I wouldn't allow some unknown UDP protocol to go out unnoticed.

I don't know where these middleboxes are, but this doesn't seem that important until people report having issues. It's been five years since TLS 1.3 was introduced, and even longer since the survey that led to the compatibility mode was conducted; many companies with bad middleboxes will probably have had to upgrade them by now.


It's not a problem of being paranoid about TLS. It's a problem with well-intentioned middleboxes which worked at the time, but now do harm instead. It's easy to say that people should have updated by now, but that's hard to force when nobody fully knows what will break (IPv4 to IPv6, for example).

QUIC works better because UDP hasn't gotten the same treatment as TCP and TLS. It brings with it other problems, one example being NATs' sometimes poor handling of long-lasting UDP connections. But QUIC has functionality to handle that.

> I wouldn't allow some unknown UDP protocol to go out unnoticed

A bit of a compromise has been made in this regard by using something called a spin bit.

https://greenbytes.de/tech/webdav/draft-ietf-quic-spin-exp-l...


I've been on several networks that blocked all outgoing UDP traffic because "it's for file sharing and we don't allow that". It's just as easy to say UDP somehow gets special treatment in an environment where protocol parsing DPI is the norm. I've even used infuriating networks that explicitly sabotaged QUIC because their middlebox couldn't parse it, whereas standard TLS 1.3 worked just fine.

The problematic middleboxes are the ones that don't forward packets they can't parse. If they correctly identified traffic as "TLS but too recent to parse" and let the packets flow through, we wouldn't have this problem. For that reason I strongly doubt that anywhere these boxes are employed UDP traffic somehow goes by unnoticed, because there's layer 3 filtering going on wherever these boxes fail.


>I've been on several networks that blocked all outgoing UDP traffic because "it's for file sharing and we don't allow that".

I think we're thinking about different things when it comes to middleboxes. It's not about networking rules on the LAN; it's more about middleboxes that do queue management, traffic priority, etc. at MAN and WAN levels. If a network administrator wants to block something, then so be it.

>For that reason I strongly doubt that anywhere these boxes are employed UDP traffic somehow goes by unnoticed

UDP doesn't go unnoticed in these cases, but it is more or less ignored in a way that TCP is not. UDP can't be blocked at the WAN and MAN level, as it was widely used before middleboxes became a thing. New protocols, however, can't be introduced; hence the QUIC solution. If you're interested in more of this, MPTCP is also an interesting example.


Middleboxes that merely apply traffic shaping don't need to parse TLS headers, though. For optimising HTTPS flows, tcp/443 is good enough.

UDP often gets special treatment in that it gets dropped more often when the uplink becomes saturated. After all, UDP has no delivery guarantee, so dropping the packets is less likely to cause retransmissions and other noise. DNS traffic may be excluded from this treatment, but I'd expect such shapers to also implement a transparent caching DNS proxy for performance improvements.


They shouldn't have to, but they do. Like I said, look at MPTCP and some of the issues it has to see other examples of middleboxes in action.


You can definitely just require TLS 1.3 with no backwards compatibility.


Awesome work. It's a bit of an intersection of instructional material, technical material, and art.

TLS debugging is infamous, and this goes a long way toward improving understanding. Great work here.


lol, this page did not work at all for me, and the only addon I have on Firefox is uBlock Origin, which I disabled to no avail. Did anyone else find a workaround?


Working for me with uBlock Origin enabled.


Nice, super helpful for CS students.


Contrasting this with what happens over plain HTTP would be fascinating as well.


Well, the equivalent for plain HTTP is very short. The client sends the GET request, shown near the end, and the server sends back the payload, the same as you see decrypted.


Does anyone know what the NEL and Report-To headers are for?



The secret life of packets.


Maybe there should be a diagram on the right to show the processes.


This was cool. Are there other tools/websites like this?


Would this make it a TLS quine?


Hack game Ff


84 76 68 82

(ASCII for “TLDR”)



